chatgpt
GPT4 is out.
https://chat.forchange.cn
https://aigcfun.com
https://ai.askai.top
besides decent processors, RAM and an optimized runtime, one should store the model weights on SSDs in order to load LLMs fast.
now colossalai supports chatgpt-style training on a single gpu, using open-source code
check humata for paper QA and information extraction/language understanding from PDF files
the syntax of chatgpt’s response is obviously markdown.
if chatgpt blocks us just because we are using the static ip of the corp’s wifi, we can connect through our phone’s hotspot instead.
Microsoft’s EdgeGPT needs you to open it in the Edge browser and join the waitlist for the new Bing. there is a 3rd party API here
Merlin is an extension based on ChatGPT which is available for free in all countries, with 11 free queries each day. Pro subscriptions incoming.
Rallio67 builds datasets for RLHF and has released multiple chatgpt-like models on huggingface, namely joi, chip and rosey, all based on pythia or neox-20b. laion people tend to offload part of the model to CPU in order to run these huge models properly.
KoboldAI considers OPT and GPT-Neo generic LMs. specialized models, like the NSFW ones, may serve some purposes better.
many alternatives, but many are specialized in marketing and content generation, some are chatgpt replica, like chatsonic (with google knowledge) and youchat (from you.com (awesome!))
open assistant now has a data collection website, in which you can only perform tasks given and earn points (working for free? nah?)
it is advised to run this chatgpt program through libraries instead of manually, to prevent issues.
my account has been banned from trying chatgpt. though it is not going to be free forever, you need to moderate your input (with multi-language support, not only english but also chinese) using some api to prevent similar incidents. also, some topics outside of the blacklist are banned intentionally, so you need to check whether the model is really producing an answer. if not, you should avoid the topic or change the way of asking.
moderation via the official openai api, perspective api (free), or via projects like content moderation deeplearning, bert text moderation, bert-base-uncased-hatexplain, toxic-bert, copilot-toxicity and multilingual-hate-speech-robacofi, trained on datasets like hate_speech_offensive, toxicity (by surge-ai, a dataset labelling workforce) and multilingual-hate-speech
from my point of view, this is a service you cannot replicate at home: it either requires smaller models with a different architecture, or crowd-sourced computational power.
reportedly chatgpt is powered by ray, which increases parallelism.
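a minimal sketch of the idea: score each prompt before sending it, and only forward the ones under a threshold. the scorer here is a toy word-list stand-in (`toy_score`, `BLOCKLIST` and the threshold are all made up for illustration). in practice you would swap in one of the apis or models listed above, whose real interfaces are not shown here.

```python
from typing import Callable

# hypothetical placeholder list; a real setup would call a moderation api or model
BLOCKLIST = {"badword"}


def toy_score(text: str) -> float:
    """Toy scorer (assumption): fraction of words that appear in BLOCKLIST."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in BLOCKLIST)
    return hits / len(words)


def make_gate(score_fn: Callable[[str], float], threshold: float = 0.5) -> Callable[[str], bool]:
    """Build a function that returns True when a prompt is safe to send."""
    def is_allowed(prompt: str) -> bool:
        return score_fn(prompt) < threshold
    return is_allowed


gate = make_gate(toy_score, threshold=0.2)
for prompt in ["hello world", "badword here"]:
    print(prompt, "->", "send" if gate(prompt) else "skip or rephrase")
```

because the gate only depends on a `score_fn(text) -> float`, the same wrapper works for english and chinese input as long as the scorer behind it is multilingual.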
bigscience petals colab and petals repo
discord chatroom for reproducing chatgpt
since many different models are derived from the same original pretrained language model, opendelta can save disk space by freezing the main parameters and only tuning a few of them.
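a toy illustration of why this saves disk space: if the base stays frozen, each derived model only needs to store the parameters that changed. this is not opendelta's api, just plain dicts. the parameter names and values are made up.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def extract_delta(base: Params, tuned: Params) -> Params:
    """Keep only the parameters that differ from the frozen base."""
    return {name: value for name, value in tuned.items() if base.get(name) != value}


def apply_delta(base: Params, delta: Params) -> Params:
    """Rebuild the tuned model from the shared base plus its small delta."""
    merged = dict(base)
    merged.update(delta)
    return merged


# hypothetical parameters: only the head was tuned, the encoder stayed frozen
base = {"encoder.weight": [0.1, 0.2, 0.3], "head.bias": [0.0]}
tuned = {"encoder.weight": [0.1, 0.2, 0.3], "head.bias": [0.5]}

delta = extract_delta(base, tuned)
restored = apply_delta(base, delta)
print(delta)
```

with many derived models, the base is stored once and each variant costs only the size of its delta.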
this gpt seems really good. currently only api access.
but it is provided by openai which is no longer so “open” in the sense of “open-source”.
stability.ai is providing alternative open-source implementations of SOTA AI algorithms, which includes carper.ai, eleuther.ai, dreamstudio, harmonai (audio), laion.ai (datasets and projects)
viable approaches to chatgpt
in my view, chatgpt is just specialized in chat, or socialized in other words.
the elo rating system is the key to facebook’s social network and to many zero-sum games. basically it is some revolution rating system. to run such a rating system effectively, one should use it along with classifiers and embeddings.
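for reference, the standard elo update can be sketched as below. the k-factor of 32 and the starting rating of 1500 are common conventions picked for illustration, not something from these notes.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b).

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # zero-sum: whatever A gains, B loses
    return rating_a + delta, rating_b - delta


new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

the same update works for any pairwise comparison, e.g. a human picking the better of two model answers, which is one way a rating like this could connect to preference data.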
according to the training process of instructgpt and webgpt, we know that gpt has learned more by interacting with people (multiple QA), doing self-examination (learning a reward model) and performing actions (searching and quoting on web).
RLHF
chainer, prompt engineering
langchain extends llms with advanced prompts, llm wrappers, actions, databases and memories
RL algorithms, tools for providing feedback
Awesome-RLHF paper and code about RLHF
Efficient few-shot learning with Sentence Transformers, used by FewShotRLGPT (no updates till now?)
RLHF models
non-language models
language models
chatrwkv pure rnn language model, with chinese support
blenderbot2 a bot which can search internet, blenderbot3 is US only. install ParlAI then clone ParlAI_SearchEngine. tutorial
promptCLUE based on T5, created by clueai, trained on pCLUE
openchatgpt-neox-125m trained on chatgpt prompts, can be tested here, trained from pythia
copycat chatgpt replicate
medicine-chatgpt shit sick of COVID-19
baby-rlhf both cartpole and language model
textrl 100+stars
PaLM-RLHF claims RETRO will be integrated soon?
RL4LMs with multiple rl methods
webgpt-cli interface openai api to browse web and answer questions
lm-human-preferences by openai
rlhf-magic using trlx (supports GPT3-like models) which has PPO and ILQL (as trainable model)
trl only has PPO on GPT2
Tk-Instruct T5 trained on natural instruct dataset. is it trained on RLHF systems?
datasets
whisperhub collection of chatgpt prompts by plugin
dataset building tools
open-chatgpt-prompt-collective
crowd-kit purify noisy data
reward models
rankgen scores model generations given a prefix (or prompt)
electra-webgpt-rm and electra-large-reward-model are based on the electra discriminator
GPT3-like models
galactica is opt trained on scientific data
bloomz and mt0 trained on xP3 (multilingual prompts and code)
T0PP T0 optimized for zero-shot prompts, despite being much smaller than GPT-3
RETRO another model with GPT-3 capabilities with fewer parameters?
gpt3 is basically a scaled-up gpt2 with sparse attention (alternating dense and locally banded sparse layers), which enables it to generate long sequences
metaseq provides OPT, which is basically GPT3
GPT-JT altered in many ways, trained on natural instructions huggingface space
Bloom large language model by bigscience
autonomous learning
autonomous-learning-library doc and repo
Gu-X doing god-knows-what experiments
analysis about how to make such model
gpt3 is capable of imitation (cause it is unsupervised.)
but! if you want to get things done (when you really need it!), you’d better use some aligned AI.
two similar models by openai: webgpt and instructgpt
about instructgpt
it is first fine-tuned on supervised datasets, then a reward model is trained on human preference comparisons, then the reward model scores the model’s responses to prompts and these scores drive reinforcement learning with PPO.
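the reward model step is usually trained with a pairwise comparison loss, -log(sigmoid(r_chosen - r_rejected)), which as far as i understand is also what the instructgpt paper describes. a minimal sketch on plain floats (the function name and example scores are made up; a real setup would compute the rewards with a neural network):

```python
import math


def pairwise_reward_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(reward_chosen - reward_rejected)), computed stably.

    The loss is small when the human-preferred answer gets the higher reward.
    """
    diff = reward_chosen - reward_rejected
    # softplus(-diff) = log(1 + exp(-diff)), rewritten to avoid overflow
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# hypothetical scores for a preferred and a rejected answer to the same prompt
print(pairwise_reward_loss(2.0, 0.0))  # ranking agrees with the human label: low loss
print(pairwise_reward_loss(0.0, 2.0))  # ranking disagrees: high loss
```

once trained, the reward model's scalar output is what PPO tries to maximize in the last step.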
details on webgpt environment
guess: create states by performing actions, then generate templates to allow model filling blanks.
Our text-based web-browsing environment is written mostly in Python with some JavaScript. For a
high-level overview, see Section 2. Further details are as follows:
• When a search is performed, we send the query to the Microsoft Bing Web Search API, and
convert this to a simplified web page of results.
• When a link to a new page is clicked, we call a Node.js script that fetches the HTML of the
web page and simplifies it using Mozilla’s Readability.js.
• We remove any search results or links to reddit.com or quora.com, to prevent the model
copying answers from those sites.
• We take the simplified HTML and convert links to the special format
【<link ID>†<link text>†<destination domain>】, or
【<link ID>†<link text>】 if the destination and source domains are the same. Here,
the link ID is the index of the link on the page, which is also used for the link-clicking
command. We use special characters such as 【 and 】 because they are rare and encoded
in the same few ways by the tokenizer, and if they appear in the page text then we replace
them by similar alternatives.
• We convert superscripts and subscripts to text using ^ and _, and convert images to the
special format [Image: <alt text>], or [Image] if there is no alt text.
• We convert the remaining HTML to text using html2text.
• For text-based content types other than HTML, we use the raw text. For PDFs, we convert
them to text using pdfminer.six. For all other content types, and for errors and timeouts, we
use an error message.
• We censor any pages that contain a 10-gram overlap with the question (or reference answer,
if provided) to prevent the model from cheating, and use an error message instead.
• We convert the title of the page to text using the format <page title> (<page domain>).
For search results pages, we use Search results for: <query>.
• When a find in page or quote action is performed, we compare the text from the command
against the page text with any links stripped (i.e., including only the text from each link).
We also ignore case. For quoting, we also ignore whitespace, and allow the abbreviated
format <start text>━<end text> to save tokens.
• During browsing, the state of the browser is converted to text as shown in Figure 1(b).
For the answering phase (the last step of the episode), we convert the question to
text using the format <question>■, and follow this by each of the collected quotes
in the format [<quote number>] <quote page title> (<quote page domain>)
<double new line><quote extract>■.
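two of the steps quoted above, the link format and the 10-gram overlap check, are simple enough to sketch. this is my guess at a minimal version, not the paper's code: the function names, whitespace tokenization, lowercasing and the square-bracket replacements are all assumptions.

```python
from typing import List, Set, Tuple


def format_link(link_id: int, text: str, dest_domain: str, source_domain: str) -> str:
    """Render a link as 【id†text†domain】, or 【id†text】 for same-domain links."""
    # the special brackets must not appear in page text; the replacements are assumed
    text = text.replace("【", "[").replace("】", "]")
    if dest_domain == source_domain:
        return f"【{link_id}†{text}】"
    return f"【{link_id}†{text}†{dest_domain}】"


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """All contiguous n-token windows; empty if there are fewer than n tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def has_ngram_overlap(page_text: str, question: str, n: int = 10) -> bool:
    """True if the page shares any n-gram with the question (page gets censored)."""
    page_tokens = page_text.lower().split()
    question_tokens = question.lower().split()
    return bool(ngrams(page_tokens, n) & ngrams(question_tokens, n))


print(format_link(3, "docs", "example.org", "example.com"))
print(format_link(4, "next page", "example.com", "example.com"))
```

a page for which `has_ngram_overlap` returns True would be replaced by an error message, matching the censoring rule in the quoted list.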
access via api
https://github.com/altryne/chatGPT-telegram-bot
https://github.com/taranjeet/chatgpt-api
https://github.com/acheong08/ChatGPT
https://github.com/vincelwt/chatgpt-mac
https://github.com/transitive-bullshit/chatgpt-api
https://github.com/rawandahmad698/PyChatGPT
models like chatgpt
lfqa retrieval-based generative QA
lm-human-preferences by openai
trl Train transformer language models with reinforcement learning based on gpt2
trlx A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) by CarperAI
RL4LMs A modular RL library to fine-tune language models to human preferences
PaLM-rlhf-pytorch claims to be basically chatgpt with palm
gpt-gmlp claims that integrating gpt with gmlps uses less ram, so it can be trained on a single gpu
tk-instruct with all models by allenai can be multilingual, trained on natural instructions
there’s a ghosted repo named instructgpt-pytorch found in bing but no cache preserved, also an empty repo called InstructFNet wtf?
AidMe Code and experiment of the article AidMe User-in-the-loop Adaptative Intent Detection for Instructable Digital Assistant
cheese Used for adaptive human in the loop evaluation of language and embedding models.
Kelpie Explainable AI framework for interpreting Link Predictions on Knowledge Graphs
GrIPS Gradient-free, Edit-based Instruction Search for Prompting Large Language Models
squeakily nlp datasets cleaner
gpt-j
super big bilingual model GLM-130B
multi-modal deeplearning paper collections
bloom a huge model like gpt-3
note that gpt-2 is somewhat inferior to gpt-3 since it has far fewer parameters
dialogue-generation Generating responses with pretrained XLNet and GPT-2 in PyTorch.
personaGPT Implementation of PersonaGPT Dialog Model
DialoGPT Large-scale pretraining for dialogue