AI Applications
Specific applications and solutions of artificial intelligence in various fields
A Stable Diffusion web interface built on the Gradio library. It gives users a friendly browser interface for running the Stable Diffusion model and visualizing its output.
An advanced natural language processing model library built for Jax, PyTorch and TensorFlow. It provides a rich set of pre-trained models and tools to help users achieve better results and performance in natural language processing tasks.
A curated collection of Chinese prompt templates for ChatGPT that make chats more interesting. The templates let ChatGPT play different roles, such as a Linux terminal, translator and proofreader, technical interviewer, storyteller, talk show performer, writing coach, or psychological counselor. With them, users can get started with ChatGPT quickly, extend it to more scenarios, and get a richer chat experience.
A development tool for building AI applications. Dify provides APIs for plugins and datasets, plus an interface for prompt engineering and visual operations, giving developers and researchers convenient tools for building feature-rich AI applications.
A powerful, modular GUI with a graph/node interface that provides stable and scalable user interface components. The project emphasizes customization and usability, so developers can quickly build attractive, feature-rich graphical applications.
A free, open-source online textbook. "Dive into Deep Learning" (Chinese edition) is the Chinese version of the textbook for the Introduction to Deep Learning course taught at the University of California, Berkeley in the spring semester of 2019. The book gives a comprehensive introduction to deep learning: it explains the principles behind the algorithms and also lets readers run them for an interactive learning experience.
A curated list of computer science video courses covering introductory computer science, data structures and algorithms, systems programming, software engineering, artificial intelligence, machine learning, and more, including open courses from well-known universities and courses by professional lecturers.
A reverse engineering project aimed at studying and exploring the working principles of the GPT-4 and GPT-3.5 models. It provides users with an opportunity to understand the internal mechanisms of GPT models and promotes research and improvement of generative pre-trained models.
🤖 An open-source, high-performance chatbot framework that supports speech synthesis, multimodal input, and an extensible function-calling plugin system. Users can deploy a private ChatGPT/LLM web application for free with one click, adding chatbot features to their own projects.
A rich, leading and practical OCR tool library
Ultralytics provides a new deployment tool that supports the conversion of YOLOv8 models from PyTorch to various platforms, including ONNX, OpenVINO, CoreML and TFLite. For computer vision developers and researchers, this tool can help quickly deploy YOLOv8 models to different platforms, improving the performance of model applications.
A tool for building customized low-code large language model (LLM) workflows with a drag-and-drop UI, built on LangchainJS. It simplifies developing and deploying LLM flows: users design their own workflows by dragging and dropping components, which improves development efficiency.
Microsoft Azure Cloud Advocates offer a 12-week, 24-lesson artificial intelligence course. The course covers: different approaches to artificial intelligence, including the "good old" symbolic approach with knowledge representation and reasoning (GOFAI); neural networks and deep learning, which are at the core of modern artificial intelligence, with the concepts illustrated using code in two of the most popular frameworks, TensorFlow and PyTorch; neural architectures for working with images and text, covering recent models though possibly not the very latest state of the art; and less popular artificial intelligence approaches, such as genetic algorithms and multi-agent systems.
A practical real-world face restoration algorithm. It utilizes the rich prior knowledge from pre-trained face GANs (such as StyleGAN2) for blind face restoration, improving the quality and realism of the restored faces.
A speech generation model designed for dialogue scenarios, mainly used for LLM assistant conversations, conversational audio, and video introductions. It supports synthesizing speech from mixed Chinese and English text, and its voice quality is good enough that the generated speech is hard to tell apart from a real voice.
A Go tutorial. "Go Programming Language Tutorial" is a book for anyone interested in learning the Go programming language, and it aims to be one of the most comprehensive collections of learning resources currently available.
A privacy-first personal knowledge management system. It supports fine-grained block-level references and Markdown WYSIWYG editing, with real-time rendering, mathematical formulas, charts, export to HTML and Markdown files, and AI writing, and it runs across platforms.
A lightweight, free, open-source third-party client for YouTube. It requires neither the Google Services Framework nor a YouTube account login, and supports 4K video playback, picture-in-picture mode, searching for videos, audio, channels, and playlists, and downloading videos, audio, and subtitles.
An open-source OCR application for image-to-text recognition, based on PaddleOCR. It is completely free and works offline. It supports text recognition from screenshots, batch import of images, and horizontal and vertical text, and it can automatically ignore watermark areas. It runs on Windows 10.
The Wisdom of Asking Questions - How To Ask Questions The Smart Way
A powerful Windows screenshot and screen recording tool that supports screenshot, screen recording, OCR text recognition, image watermark addition, content upload, address sharing, color adjustment, image editing, video format conversion, and other functions.
Your AI second brain. A copilot to get answers to your questions, whether they be from your own notes or from the internet. Use powerful, online (e.g. gpt4) or private, local (e.g. mistral) LLMs. Self-host locally or use our web app. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
A selection of papers, technical articles and well-known blogs related to data science and machine learning, covering 24 technical directions such as data engineering, natural language processing, computer vision, reinforcement learning, etc. Most of the articles come from world-renowned universities and enterprises.
A community-supported supercharged version of paperless: scan, index, and archive all your physical documents.
A project for writing reusable computer vision tools. Through this project, users can more easily create and manage the tools and processes needed for computer vision applications. Whether it's dataset preparation or model training, Supervision provides tools to help developers.
Helps Python developers integrate image text recognition more efficiently and conveniently.
An interactive deep learning book that provides code, math, and discussions across multiple frameworks. This project has been adopted at over 500 universities in 70 countries around the world, including Stanford University, Massachusetts Institute of Technology, Harvard University, Cambridge University, etc. It provides rich resources and an interactive learning experience for learning deep learning.
A leading stable diffusion model creative engine that enables professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technology. The solution provides an industry-leading web interface as well as support for use via CLI on the terminal. Due to its efficient and flexible features, it can serve as the foundation for a variety of commercial products to meet the needs of different scenarios.
A fully open-source enterprise-level instant messaging project implemented in Python. It has powerful performance and rich functions, equivalent to the open-source version of Slack, supporting drag-and-drop file upload, code highlighting, Markdown syntax, application integration, and other functions. Zulip supports multiple platforms including Web, PC, iOS, and Android, and is used by many well-known companies to improve team communication and office efficiency.
An assistant built towards GPT-4-level multimodal capabilities. LLaVA combines natural language processing and computer vision so that it can understand and process both language and visual information, enabling more complex tasks and conversations. The project points to a direction for next-generation intelligent assistants that better understand and meet user needs.
PDFMathTranslate is a versatile tool designed for translating scientific PDF documents while preserving their original structure, including formulas, charts, tables of contents, and annotations. It supports multiple languages and integrates various translation services. The project offers multiple usage options, including a command-line tool, an interactive GUI, and Docker deployment, making it accessible for diverse user needs. It also provides online demos for quick testing and supports advanced features like partial document translation, multi-threading, and custom prompts. PDFMathTranslate is continuously updated with experimental backends and improved functionalities, ensuring flexibility and efficiency for academic and professional use.
A large collection of tutorials and example code on computer vision, deep learning, and artificial intelligence.
Unilm is a large-scale self-supervised pre-training model across tasks, languages and modalities. It is pre-trained by self-supervised learning, which enables the model to be transferred to different tasks and languages, with wide application value. The design goal of Unilm is to provide a unified pre-training model that can handle various natural language processing tasks, such as machine translation, text summarization, question answering, etc.
A Chinese translation aimed mainly at engineers teaching themselves software development, current students, and Internet professionals who plan to move into the computer industry.
A collection of practical machine learning and Python open-source projects and tools, more than 900 in total, covering data visualization, natural language processing, text and image data, web crawling, and more.
A carefully curated set of resources on Chinese large language models (LLMs), including open-source fine-tuned Chinese models, base models, datasets, fine-tuning frameworks, inference and deployment frameworks, evaluation methods, and related tutorials across various vertical domains.
A project that provides Chinese LLaMA models and instruction-tuned Alpaca large models. The models build on the original LLaMA, expanding its vocabulary with Chinese tokens and continuing training on Chinese data to improve the model's understanding of Chinese semantics. The project also fine-tunes on Chinese instruction data, which significantly improves the model's ability to understand and follow instructions.
Converts the MXNet code implementations in the original "Dive into Deep Learning" book to PyTorch implementations.
A simple, practical tool for removing image backgrounds and creating cutouts. Rembg uses advanced algorithms to separate foreground from background quickly and accurately, and it is suitable for both personal and commercial use, whether for producing product images with transparent backgrounds or for creative design work.
A project for 3D real-time radiance field rendering. It employs Gaussian splatting technology to achieve high-quality radiance field rendering, suitable for the fields of graphic rendering and visualization. This project provides valuable tools and resources for developers engaged in research and application of real-time rendering.
A project that accelerates the training of NeRF models. NVIDIA's open-source technology can train a NeRF model of a fox in just 5 seconds: starting from static 2D images, a neural network is quickly trained to produce a scene that can be zoomed in on and viewed clearly from any angle.
The ML YouTube Courses project is dedicated to providing users with the latest machine learning and artificial intelligence courses, all of which can be found on YouTube. By aggregating various educational resources, this project offers learners and practitioners a convenient platform to easily browse, filter, and select course content that suits their learning needs. Whether you are a beginner or a professional, ML YouTube Courses is an ideal choice for discovering quality machine learning educational resources.
A large language model for the financial domain, trained on democratized, internet-scale data made available through FinNLP and the FinNLP website. The project aims to bring robust natural language processing to the financial sector, giving analysts, traders, and researchers more accurate language model support for financial tasks.
A list of open-source libraries related to PyTorch on GitHub, containing learning tutorials, examples, etc.
A collection compiled by a master's student at South China University of Technology, containing technical tutorials, notable articles, and video tutorials related to deep learning.
An Arknights game assistant. Based on image recognition technology, it can complete all of the game's daily tasks with one click.
A GPT-based document query tool designed for real-time interaction with documents. It enables users to perform document queries using natural language, providing detailed explanations and answers. DocsGPT leverages advanced natural language processing capabilities to make document understanding and querying more intuitive and user-friendly.
A flexible and interesting JavaScript file upload library
ChuanhuChatGPT is an open-source chatbot project based on Transformers that provides dialogue generation and a range of pre-trained models. Developers can use ChuanhuChatGPT to quickly build interactive chatbots with natural-sounding conversation for a variety of applications.
A browser extension for immersive bilingual web page translation, giving users bilingual text as they browse for a better reading experience.
An NLP tutorial covering 13 commonly used models with code implementations, such as CNN, RNN, and Transformer. Most of them are compatible with both TensorFlow and PyTorch.