The shelf

Collection of automated documentation.

Generated with prometheous

Cybergod (Autonomous Computer)
- AppAgent Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
- CogVLM A state-of-the-art-level open visual language model | Multimodal pre-trained model
- cybergod Autonomous computer program that can do anything without human operators.
- gpt4v-browsing Web Scraping with GPT-4 Vision API and Puppeteer
- gpt-eyes I GAVE GPT-4 EYES!
- GPT-4V-Act AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
- self-operating-computer A framework to enable multimodal models to operate a computer.
- SingularGPT Automate your device with ChatGPT; make your device act more like a human.
- Video-Pre-Training (VPT) Learning to Act by Watching Unlabeled Online Videos
Image models
- CLIP (Contrastive Language-Image Pretraining) Predict the most relevant text snippet given an image
- DALL-E PyTorch package for the discrete VAE used for DALL·E.
- DALLE2-pytorch OpenAI's updated text-to-image synthesis neural network, in Pytorch
Utils
- git_atomic_commit Fix any issue detected by git fsck, caused by any git operation, once and for all
- lazero AGI helper libraries that may help with AGI development and research, similar to Google's AutoML-Zero
- cf The comprehensive framework
- lazer Make everything executable, analyzable, controllable.
- lazero (legacy) Automatic information gathering, understanding and source code generating.
- metalazero A cross-platform lazero implementation.
- lazero_android Lazero for Android
Q* (Q-Star)
- q-transformer Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, out of Google DeepMind
- open_qstar Transformer-based LLM structurally infused with Q-Learning and A* heuristic search algorithms
- q-star A reinforcement learning-based framework for intelligent agents using Microsoft AutoGen.
- mcts-for-llm This is a pip package implementing Reinforcement Learning algorithms in non-stationary environments supported by the OpenAI Gym toolkit.
Superalignment
- automated-interpretability Language models can explain neurons in language models
- weak-to-strong Can weak model supervision elicit the full capabilities of a much stronger model?
Audio models
- whisper Robust Speech Recognition via Large-Scale Weak Supervision
Tree of Thoughts
- graph-of-thoughts Solving Elaborate Problems with Large Language Models
- lmql-tree-of-thoughts LMQL implementation of tree of thoughts
- tree-of-thoughts Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
- LLM_Tree_Search Alphazero-like Tree-Search can guide large language model decoding and training
Embodied Intelligence
- Voyager An Open-Ended Embodied Agent with Large Language Models
- RoboGen A generative and self-guided robotic agent that endlessly proposes and masters new skills.
- act-plus-plus Imitation Learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
- mobile-aloha Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Media content automation
- vced Automatically identify the video segments that match your text description and use them to edit the video
- pyjom Social media automation project
- PaddleVideo Awesome video understanding toolkits based on PaddlePaddle.
- google-research Google Research
- autoup Automatically make and upload videos to bilibili.com
- autowork Automate the entire process of video production
- DynamiCrafter Animating Open-domain Images with Video Diffusion Priors
Long Context Transformer
- RWKV-LM The RWKV Language Model
Multimodal Transformer
- MultiModalMamba A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance Multi-Modal Model.
- Gemini The model that will "eclipse ChatGPT"
- gato A Generalist Agent
- NExT-Chat An LMM for Chat, Detection and Segmentation
Robotic Transformer
- robo_transformers Library for Robotic Transformers. RT-1, RT-X-1, Octo
- RT-2 New model translates vision and language into action
- AutoRT Embodied Foundation Models for Large Scale Orchestration of Robotic Agents
- open_x_embodiment Robotic Learning Datasets and RT-X Models
- robotics_transformer A collection of code files and artifacts for running Robotics Transformer or RT-1.
- RT-X Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"
RAG and Automated Documentation
- my_blog_source AI assisted blog metadata generator
- local_rag Chat with Your Multiple PDFs on Your Local System
- autodoc Experimental toolkit for auto-generating codebase documentation using LLMs
- prometheous AI generated documentation and RAG
- write-the AI-powered Documentation and Test Generation Tool
Miscellaneous
- Kacket A toy Racket/Scheme code analyzer written in Kotlin.
- he4o HE: the "Spiral Entropy Reduction Machine"
- linear_programming Linear Modeling and Debugging