Project structure of: THUDM/CogVLM

CogVLM Interactive demos, dataset notes, and utilities for the CogVLM and CogAgent visual language models
- basic_demo CLI demos, web demo with CogAgent, chat app for text generation
  - cli_demo_hf.py CLI demo with CogAgent, Vicuna tokenizer, argparse, GPU support
  - cli_demo_sat.py CLI for text generation, distributed chat app with language support.
  - web_demo.py Create web demo using Gradio, CogVLM, CogAgent.
- composite_demo Interactive chat and image-based composite language model demo.
  - client.py Python client for composite language model interaction
  - conversation.py Conversation class with role, content, image, translation support.
  - demo_agent_cogagent.py CogAgent conversation agent demo
  - demo_chat_cogagent.py Chatbot with image processing and Chinese model integration
  - demo_chat_cogvlm.py Chat demo with image processing and Conversation object initialization.
  - main.py Chat application using CogAgent and CogVLM models with image uploads and prompts.
  - utils.py Dynamic indexing AI image analysis tool
- dataset.md Bilingual visual instruction dataset for CogVLM v1.0 training
- finetune_demo Fine-tunes models, evaluates performance in demonstrations
  - evaluate_cogagent.sh Trains CogAgent chat model with DeepSpeed, evaluation enabled.
  - evaluate_cogagent_demo.py Evaluate CogAgent demo: Fine-tune transformer model for text generation
  - evaluate_cogvlm.sh Fine-tunes CogVLM model with 8 GPUs, saves checkpoints every 200 iterations.
  - evaluate_cogvlm_demo.py Evaluate CogVLM model performance.
  - finetune_cogagent_demo.py Fine-tunes model, applies decoding strategies, calculates accuracy.
  - finetune_cogagent_lora.sh Fine-tunes CogAgent model using DeepSpeed, NCCL, and CUDA.
  - finetune_cogvlm_demo.py Trains CogVLM models, generates chat responses, evaluates.
  - finetune_cogvlm_lora.sh Fine-tunes CogVLM with LoRA and parallelism.
  - test_config_bf16.json BF16-enabled Batch Config for Model Training
- openai_demo OpenAI Chatbot Demo: FastAPI, CogVLM, CogAgent, GPU Memory
  - openai_api.py FastAPI app: Chat functionality, endpoints, GPU memory, CORS, device check
  - openai_api_request.py OpenAI API chatbot simulator for CogVLM and CogAgent.
- README.md Improved AI models with HuggingFace support and templates.
- requirements.txt Python project requirements: NLP, deep learning, data visualization, web dev.
- utils Utility scripts and functions for various processes.
  - merge_model.py Merges fine-tuning parameters into a pre-existing model.
  - models Efficient CLIP models and mixin for parallelism
    - __init__.py Imports CogAgent and CogVLM models.
    - cogagent_model.py Initializes GLU CogAgent model with ViT and ExternalVisionMixin, fine-tuning capabilities.
    - cogvlm_model.py CogVLM model initialization and fine-tuning.
    - eva_clip_L_hf.py CLIP Vision model init & layer config, Eva2LargeEncoder for EVAVisionTransformer.
    - eva_clip_model.py Efficient Transformer Model for CLIP
    - mixin.py Transformer mixin for model parallelism
  - split_dataset.py Splits, shuffles, and ensures reproducibility for specified files.
  - utils Utility scripts and functions for various processes.
    - __init__.py Imports functions from CogVLM libraries for various processes.
    - chat.py Process images, apply optional processors, chat mode.
    - dataset.py Loads .jpg images, converts to RGB, extracts text from names, returns data and ids.
    - grounding_parser.py Overlay images with text and boxes using grounding parser.
    - language.py PyTorch language model utilities with tokenization and preprocessing.
    - vision.py Image preprocessor using Torchvision transforms