haotian-liu / LLaVA

Uncategorized

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Stars: 22.3k · Forks: 2.5k · Issues: 1.1k · Contributors: 47 · Watchers: 158
Tags: gpt-4, chatbot, chatgpt, llama, multimodal, llava, foundation-models, instruction-tuning, multi-modality, visual-language-learning, llama-2, llama2, vision-language-model
Language: Python
License: Apache License 2.0 (Apache-2.0)

Project Description

An assistant built toward multimodal GPT-4 level capabilities. LLaVA (Large Language and Vision Assistant) combines natural language processing and computer vision, letting users interact with the model through both text and images. Trained with visual instruction tuning, it is designed to understand and reason over language and visual information jointly, enabling more complex tasks and conversations. The project points toward next-generation intelligent assistants that better understand and meet user needs.
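As a quick illustration of how the project can be used, below is a minimal inference sketch adapted from the repository's quick-start documentation. The module paths (llava.mm_utils, llava.eval.run_llava), the checkpoint name, and the example image URL reflect the repo's docs at the time of writing and may change between releases.

```python
# Minimal LLaVA inference sketch (adapted from the repo's quick-start docs).
# Requires the llava package from this repository to be installed.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "liuhaotian/llava-v1.5-7b"  # a released LLaVA checkpoint

# eval_model expects an argparse-style args object; a throwaway class stands in here.
args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": "What is shown in this image?",
    "conv_mode": None,
    "image_file": "https://llava-vl.github.io/static/images/view.jpg",
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)  # runs the model on the image and prints its answer
```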
