Use ipython instead of python to test this code. You get better parameter hints and dynamic completion, just like what you saw in brownie.
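Use dataset.set_transform (or dataset.with_transform) instead of dataset.map to save loading time: the transform runs lazily when rows are accessed instead of preprocessing the whole dataset up front.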
Image processing
Many language models resize, reshape and pad the input image into a 224x224 square and feed it into ViT directly.
To simplify the pipeline, we recommend splitting the image into a grid of fixed-size square patches, such as 2x2 or 4x4.
Alternatively, skip the ViT embedding step and use Fuyu-8b directly, or reuse its architecture (FuyuForCausalLM) and processor (FuyuProcessor), because it supports arbitrarily sized images.
```python
import numpy as np
from empatches import EMPatches

image = np.random.rand(512, 512, 3)
emp = EMPatches()
patches, indices = emp.extract_patches(image, patchsize=128, overlap=0)
print(patches)  # a list of numpy arrays, total 16 items
print(indices)  # [(x_start, x_end, y_start, y_end), ...], total 16 items
```
Convert fixed-size patches into embeddings
The embeddings from ViT cannot be used directly by an LLM. Instead, use LayerNorm and Dense layers as a simple adaptor.
The first token is the class token. It is randomly initialized, processed by the transformer together with the patch tokens, and output as a summary of the full image, so it can be extracted as the image embedding.
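A minimal sketch of such an adaptor, written in plain Python so the arithmetic stays visible. The function names, the toy dimensions and the fixed weights are illustrative assumptions; in a real model the same two steps would typically be `torch.nn.LayerNorm` followed by `torch.nn.Linear`, with weights learned during training.

```python
import math


def layer_norm(vector, eps=1e-5):
    """Normalize one embedding vector to zero mean and unit variance."""
    mean = sum(vector) / len(vector)
    variance = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(variance + eps) for v in vector]


def dense(vector, weights, bias):
    """Project a vector with a weight matrix of shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def adapt(vit_embedding, weights, bias):
    """LayerNorm followed by Dense: maps a ViT embedding to the LLM embedding size."""
    return dense(layer_norm(vit_embedding), weights, bias)


# Toy sizes (assumption): a 4-dim "ViT" embedding projected to a 2-dim "LLM" embedding.
vit_embedding = [1.0, 2.0, 3.0, 4.0]
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
bias = [0.0, 0.0]
llm_embedding = adapt(vit_embedding, weights, bias)
print(len(llm_embedding))
```

The output length is set by the number of rows in `weights`, which is how the adaptor bridges a 768-dim ViT embedding and whatever hidden size the LLM uses.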
Proof:
```
224x224 is the shape of input image
16x16 is the patch size
224/16 = 14
14*14 + 1 = 197
```
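The arithmetic above can be checked with a small helper. The function name is hypothetical; it only assumes square images, square patches and a single class token, as in the calculation.

```python
def vit_sequence_length(image_size: int, patch_size: int) -> int:
    """Number of tokens ViT sees: one per patch plus one class token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


sequence_length = vit_sequence_length(224, 16)
print(sequence_length)  # 197
```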
```python
import torch
import transformers

# use torch.rand (uniform in [0, 1)), not torch.randn (sample from normal distribution)
image = torch.rand(3, 224, 224)  # chw
model_name = "google/vit-base-patch16-224-in21k"
processor = transformers.AutoImageProcessor.from_pretrained(model_name)  # for processing image
# do_rescale=False: use this parameter when passing values ranging from 0 to 1
inputs = processor(image, do_rescale=False, return_tensors="pt")
# inputs = processor(pil_image, return_tensors="pt")  # can also handle pil image
model = transformers.ViTModel.from_pretrained(model_name)
with torch.no_grad():
    outputs = model(pixel_values=inputs["pixel_values"])
embeddings = outputs.last_hidden_state[:, 0, :]  # class token, torch.Size([1, 768])
```
Audio processing
A useful field related to speaker diarization in video processing is visual entity recognition, which can help you identify anime or movie characters across different frames.
When unsure, the agent should consult online search engines, subtitles and already recognized entities for classification. Once a dataset has been created, one can train a YOLO model to speed up the process and use it alongside popular person or anime head detection models.
In most videos speakers and visuals are aligned, so you can identify speakers first and then identify the characters. Remember that long recordings need a special diarization pipeline, and that speaker features must be shared between audio files for cross-audio diarization.
In a multilingual context, you may want to use speaker detection models like pyannote. Diart is a speech processing library built on it that can be used in real time, with speaker diarization, voice activity detection and training pipelines.
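One way to share speaker features across audio files is to keep a registry of reference embeddings and match every new speaker embedding against it. The sketch below is an assumption about how that could look, not the method of any particular library: the class name, the cosine threshold and the three-dimensional toy embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerRegistry:
    """Keep one reference embedding per known speaker across audio files."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumption: must be tuned on real data
        self.speakers = {}          # global speaker id -> reference embedding

    def assign(self, embedding):
        """Return the id of the best matching known speaker, or register a new one."""
        best_id, best_score = None, self.threshold
        for speaker_id, reference in self.speakers.items():
            score = cosine(embedding, reference)
            if score >= best_score:
                best_id, best_score = speaker_id, score
        if best_id is None:
            best_id = f"speaker_{len(self.speakers)}"
            self.speakers[best_id] = embedding
        return best_id


registry = SpeakerRegistry()
first = registry.assign([1.0, 0.0, 0.1])   # from audio file A
second = registry.assign([0.0, 1.0, 0.0])  # a different voice
third = registry.assign([0.9, 0.1, 0.1])   # from audio file B, close to the first voice
print(first, second, third)
```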
Whisper-streaming uses the LocalAgreement algorithm to segment chunks of audio and merge common patterns.
The Whisper architecture comprises an audio encoder and a transcription decoder. The output of the encoder is fed into every cross-attention layer of the decoder. For feature extraction, you only need the encoder.
You pass a single-channel audio amplitude array to the audio feature extractor, which expects a predetermined sample rate. If the sample rate does not match, you need to resample the audio.
Different audio transformers choose different context window sizes. Like LLMs, they can be streamed. However, during training they must use a fixed context size.
For Whisper, the context size is 30 seconds. Configurable at: transformers.WhisperFeatureExtractor(chunk_length=30, ...)
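A rough sketch of the agreement idea, assuming the policy confirms only the prefix on which two consecutive transcripts of the growing audio buffer agree. The class and function names are hypothetical and this is not the whisper-streaming code itself.

```python
def common_prefix(a, b):
    """Longest common prefix of two word lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class LocalAgreement:
    """Commit words only once two consecutive hypotheses agree on them."""

    def __init__(self):
        self.committed = []  # words already emitted, never revised
        self.previous = []   # hypothesis from the previous update

    def update(self, hypothesis):
        """Take the latest transcript of the buffer, return the newly confirmed words."""
        agreed = common_prefix(self.previous, hypothesis)
        new_words = agreed[len(self.committed):]
        self.committed.extend(new_words)
        self.previous = hypothesis
        return new_words


policy = LocalAgreement()
first = policy.update("the quick brown".split())
second = policy.update("the quick brown fox jumped".split())
third = policy.update("the quick brown fox jumps over".split())
print(first, second, third)
```

In this run nothing is emitted after the first update, the first three words are confirmed by the second, and only "fox" is added by the third because the two latest hypotheses disagree on the following word.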
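To illustrate the resampling step, here is a deliberately simple sketch that uses linear interpolation on a plain Python list. It is an assumption-laden toy: it applies no low-pass filter, so real pipelines would normally rely on a dedicated resampler such as `librosa.resample` or `torchaudio.functional.resample`. The function name and the 44.1 kHz to 16 kHz example are illustrative.

```python
def resample_linear(samples, source_rate, target_rate):
    """Resample a mono signal by linear interpolation between neighbouring samples."""
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)
    target_length = int(len(samples) * target_rate / source_rate)
    resampled = []
    for i in range(target_length):
        position = i * source_rate / target_rate  # fractional index in the source
        left = int(position)
        right = min(left + 1, len(samples) - 1)
        fraction = position - left
        resampled.append(samples[left] * (1 - fraction) + samples[right] * fraction)
    return resampled


# One second of a 44.1 kHz ramp, converted to the 16 kHz a feature extractor might expect.
audio_44k = [i / 44100 for i in range(44100)]
audio_16k = resample_linear(audio_44k, 44100, 16000)
print(len(audio_16k))
```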
For AST, it is 10.24 seconds. You can find more info about input and output sizes here. Configurable at: transformers.ASTFeatureExtractor(max_length=1024, ...)
These numbers can be found in the respective processor parameters.
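The fixed context size can be sketched as a pad-or-trim step on the raw samples. The helper name is hypothetical, the 16 kHz rate is assumed to be the Whisper feature extractor default, and the 10 ms frame shift used for the AST arithmetic is likewise an assumption.

```python
def pad_or_trim(samples, sample_rate, chunk_seconds):
    """Force a mono signal to exactly chunk_seconds by zero-padding or truncating."""
    target_length = int(round(sample_rate * chunk_seconds))
    if len(samples) >= target_length:
        return list(samples[:target_length])
    return list(samples) + [0.0] * (target_length - len(samples))


WHISPER_SAMPLE_RATE = 16000  # assumption: default rate of the Whisper feature extractor
WHISPER_CHUNK_SECONDS = 30   # chunk_length=30 from the text above

short_clip = [0.1] * (5 * WHISPER_SAMPLE_RATE)  # 5 seconds of audio
window = pad_or_trim(short_clip, WHISPER_SAMPLE_RATE, WHISPER_CHUNK_SECONDS)
print(len(window))

# AST: max_length=1024 frames at an assumed 10 ms frame shift gives 10.24 seconds.
ast_seconds = 1024 * 10 / 1000
print(ast_seconds)
```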
simply because the original note on chatgpt is too long, we start a new one, with more topics and more resources.
visit poe.com for a bunch of free chatbots, including GPT-4
chathub is a browser plugin with which you can use ChatGPT, Bing and Bard
ts_server supports a bunch of models like GPT-J, GPT-NeoX, GPT-Neo, OPT, Fairseq GPT, M2M100, CodeGen, GPT2, T5, RWKV, LLAMA and Stable Diffusion, used by textsynth.com
to manage python versions and environments, pyenv and venv are lightweight, while miniconda and mamba are more sophisticated.
javascript code for extracting model list from huggingface personal homepage:
```javascript
var arr = [];
for (var i of document.getElementsByTagName("h4")) {
    var t = i.innerText;
    var tlist = t.split("/");
    var t0 = tlist[0];
    var t1 = tlist[1];
    arr.push(`| [${t1}](https://huggingface.co/${t}) | unknown | unknown | ${t0} |`);
}
// one markdown table row per line
console.log(arr.join('\n'));
```
pynput works as both input event listener and actor. the listener may report some strange keycodes when pressing modifier keys on windows.
note that special care is needed to align the mouse location with the screenshot size
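a minimal sketch of that alignment, assuming the mismatch is a pure scale factor between the screenshot resolution and the coordinate space the mouse controller uses (as on HiDPI displays). the function name and the sizes are hypothetical.

```python
def screenshot_to_screen(x, y, screenshot_size, screen_size):
    """Map a pixel in the screenshot to the coordinate space the mouse API uses."""
    shot_w, shot_h = screenshot_size
    screen_w, screen_h = screen_size
    return round(x * screen_w / shot_w), round(y * screen_h / shot_h)


# Hypothetical HiDPI setup: the screenshot is 2880x1800 but the mouse moves in 1440x900.
target = screenshot_to_screen(1440, 900, (2880, 1800), (1440, 900))
print(target)
```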
ViT-pytorch can be used in many ViT-based models, listed and implemented in the repo.
spaces
openai universe (blog post here) and starter agents. remotes use the vnc protocol plus a reward protocol that sends json over websocket (it can also send actions). they prefer TigerVNC, maybe that will send the existing monitor instead of invisible ones.
gym is classic and modular. atari-py enables old games
retro deprecates universe, but might help with general computer controlling AI systems since they are compatible. humans don’t play games all day and night. beware of this and don’t turn the model into a heavy gamer.
there is no point in recording terminal input/output when using tools like vim. get screenshots, keystrokes and mouse clicks instead (using ttyd, gremlins.js or monkey.js). tkterminal won’t do. it is just a thin wrapper around subprocess.run
speaking of browsers, you can spin up a novnc server and let gremlins.js do its job.
download official delta weights via [magnet](magnet:?xt=urn:btih:a7fac57094561a63d53eed943f904abf24c6969d&dn=Vicuna-13B-HF-fp16-delta-merged_2023-04-03&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=udp%3a%2f%2ftracker-udp.gbitt.info%3a80%2fannounce&tr=udp%3a%2f%2ftracker1.bt.moack.co.kr%3a80%2fannounce&tr=udp%3a%2f%2ftracker.tiny-vps.com%3a6969%2fannounce&tr=udp%3a%2f%2ftracker2.dler.org%3a80%2fannounce&tr=udp%3a%2f%2fopentracker.i2p.rocks%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.altrosky.nl%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.theoks.net%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.dler.org%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.torrent.eu.org%3a451%2fannounce&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a6969%2fannounce&tr=https%3a%2f%2fopentracker.i2p.rocks%3a443%2fannounce&tr=http%3a%2f%2ftracker.openbittorrent.com%3a80%2fannounce&tr=udp%3a%2f%2ftracker.moeking.me%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.monitorit4.me%3a6969%2fannounce&tr=udp%3a%2f%2f9.rarbg.com%3a2810%2fannounce)
chatgpt retrieval plugin chops documents into chunks, processes them into vectors and searches them using one of many vector search backends. it is hosted as a fastapi service.
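the chunk, vectorize and search flow can be sketched in a few lines. this is a toy under stated assumptions, not the plugin's code: a bag-of-words counter stands in for a real embedding model, an in-memory list stands in for the vector search backend, and all function names are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, chunk_size):
    """Chop a document into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vector, embed(chunk)), reverse=True)
    return ranked[:top_k]


document = ("whisper encodes raw audio into features "
            "vit splits an image into patches "
            "the plugin stores vectors in databases")
chunks = chunk_words(document, 6)
best = search("how does vit handle an image", chunks)
print(best)
```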
datasets
assistant dialogue
botbots dataset (two chatgpt talking to each other), created by using datasetGPT (LLM automation tool)
online paraphrase tool, translation based, can process chinese
chatgpt says creativity matters, such as hackers, video producers and your evil sex shit.
you said LDA and LSA are helpful for paraphrasing
Yes, LDA and LSA are two methods that can be used to generate paraphrases of a given text. LDA (Latent Dirichlet Allocation) is a probabilistic model that can be used to identify topics in a text and then generate new sentences that express the same topics in a different way. LSA (Latent Semantic Analysis) is a method that uses singular value decomposition to identify the underlying meaning of words in a text and then generate paraphrases by replacing words with their closest semantic equivalents. Both LDA and LSA can be helpful for generating paraphrases, but they each have their own strengths and limitations.
can you demonstrate how to use LSA for paraphrasing in python
I’m sorry, but I am not currently able to browse the internet or access any external resources. As a large language model trained by OpenAI, my knowledge is based solely on the text that I have been trained on, and I don’t have the ability to run code or demonstrate specific examples. However, I can provide a general overview of how LSA could be used for paraphrasing in Python.
To use LSA for paraphrasing in Python, you would first need to create a matrix of word vectors for the words in the text you want to paraphrase. This matrix would represent the semantic relationships between the words in the text, and would be used to identify the underlying meaning of each word.
Next, you would use singular value decomposition (SVD) to decompose the matrix into three matrices, which represent the different factors that contribute to the meaning of the words in the text.
Once you have the three matrices, you can then use them to identify the most closely related words for each word in the text, and generate paraphrases by replacing words with their closest semantic equivalents.
This is just a brief overview of how LSA could be used for paraphrasing in Python. There are many different ways to implement LSA, and the exact details of the implementation would depend on the specific requirements of your project.
continue
I’m sorry, but I’m not sure what you would like me to continue with. If you have a specific question or topic you would like to discuss, I would be happy to help. Otherwise, I can provide more information about LSA and how it can be used for paraphrasing if that would be helpful. Let me know what you would like to know.
how to use LDA for paraphrasing
To use LDA for paraphrasing, you would first need to train an LDA model on a large corpus of text. This would involve using a machine learning algorithm to identify the topics that are present in the corpus, and then generating a set of “topic vectors” that represent the distribution of those topics across the text.
Once you have trained the LDA model, you can then use it to generate paraphrases of a given text by identifying the topics present in the text and then replacing words with synonyms or related words that express the same topics in a different way. This process can be repeated multiple times to generate multiple paraphrases of the same text.
Here is a brief example of how LDA could be used for paraphrasing in Python:
```python
from gensim import corpora, models

# Load the text you want to paraphrase
text = "The quick brown fox jumps over the lazy dog."

# Tokenize the text and create a dictionary of words
tokens = text.split(" ")
dictionary = corpora.Dictionary([tokens])

# Train an LDA model on the text (id2word maps word ids back to words)
lda_model = models.LdaModel([dictionary.doc2bow(tokens)], num_topics=1, id2word=dictionary)

# Generate paraphrases of the text
for i in range(3):
    paraphrase = lda_model.show_topics(num_topics=1, num_words=len(tokens), formatted=False)
    print(" ".join([word[0] for word in paraphrase[0][1]]))
```
This example trains a simple LDA model on a single sentence and then generates three paraphrases of the sentence by replacing words with their closest semantic equivalents. Of course, this is just a simple example to illustrate the basic concept of using LDA for paraphrasing. In practice, you would need to use a much larger corpus of text to train the LDA model and generate more accurate paraphrases.
demonstrate using LSA to paraphrase in python, using gensim if needed
As I mentioned earlier, I am not currently able to browse the internet or access external resources, so I cannot demonstrate using LSA to paraphrase in Python using gensim or any other specific libraries. However, I can provide a general overview of how LSA could be used for paraphrasing in Python.
To use LSA for paraphrasing in Python, you would first need to create a matrix of word vectors for the words in the text you want to paraphrase. This matrix would represent the semantic relationships between the words in the text, and would be used to identify the underlying meaning of each word.
Once you have the matrix of word vectors, you can use singular value decomposition (SVD) to decompose the matrix into three matrices, which represent the different factors that contribute to the meaning of the words in the text. The gensim library includes an implementation of SVD that you can use for this purpose.
Once you have the three matrices, you can then use them to identify the most closely related words for each word in the text, and generate paraphrases by replacing words with their closest semantic equivalents. Here is an example of how this could be done in Python:
```python
import numpy as np
from gensim import corpora, matutils

# Load the text you want to paraphrase
text = "The quick brown fox jumps over the lazy dog."

# Tokenize the text and create a dictionary of words
tokens = text.split(" ")
dictionary = corpora.Dictionary([tokens])

# Create a dense term-document matrix (terms x documents) from the bag-of-words corpus
word_vectors = [dictionary.doc2bow(tokens)]
term_document = matutils.corpus2dense(word_vectors, num_terms=len(dictionary))

# Use SVD to decompose the matrix; each row of U is the latent vector of one term
# (numpy performs the SVD here, since gensim.matutils has no svd function)
U, S, VT = np.linalg.svd(term_document, full_matrices=False)

# Generate paraphrases of the text
for i in range(3):
    paraphrase = []
    for word in tokens:
        # Identify the closest semantic equivalents for each word
        closest_words = U.dot(U[dictionary.token2id[word]])
        closest_words = sorted(enumerate(closest_words), key=lambda x: x[1], reverse=True)
        closest_word = closest_words[0][0]
        # Replace the word with its closest semantic equivalent
        paraphrase.append(dictionary[closest_word])
    print(" ".join(paraphrase))
```
This example trains an LSA model on a single sentence and then generates three paraphrases of the sentence by replacing words with their closest semantic equivalents. Again, this is just a simple example to illustrate the basic concept of using LSA for paraphrasing. In practice, you would need to use a much larger corpus of text to train the LSA model and generate more accurate paraphrases.
how to use transformer for paraphrasing
The transformer is a type of neural network architecture that has been used for a wide range of natural language processing tasks, including paraphrasing. To use a transformer for paraphrasing, you would need to train the transformer on a large corpus of text using a machine learning algorithm.
During training, the transformer would learn to identify the underlying meaning of words and phrases in the text and then generate paraphrases by replacing words with synonyms or related words that express the same meaning in a different way.
Here is a brief example of how a transformer could be used for paraphrasing in Python:
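```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the text you want to paraphrase
text = "The quick brown fox jumps over the lazy dog."

# Assumption: "t5-small" stands in for any seq2seq checkpoint and is not fine-tuned for
# paraphrasing ("distilbert-base-cased" is encoder-only and cannot be loaded as seq2seq)
model_name = "t5-small"

# Tokenize the text using a pre-trained transformer model
# (the "paraphrase: " prefix is an illustrative assumption)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokens = tokenizer("paraphrase: " + text, return_tensors="pt").input_ids

# Load a pre-trained transformer model
transformer = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Generate paraphrases of the text (sampling so that the three outputs can differ)
for i in range(3):
    paraphrase = transformer.generate(tokens, do_sample=True, max_new_tokens=32)
    paraphrase = tokenizer.decode(paraphrase[0], skip_special_tokens=True)
    print(paraphrase)
```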
This example uses a pre-trained transformer model to generate three paraphrases of a single sentence. Of course, in practice you would need to train the transformer on a much larger corpus of text to generate more accurate paraphrases. The exact details of the implementation would depend on the specific requirements of your project.
how to generate catchy title from keywords using language model
To generate a catchy title from a set of keywords using a language model, you would need to train the language model on a large corpus of text that includes a variety of titles. During training, the language model would learn to identify the underlying meaning of words and phrases and the characteristics of catchy titles.
Once the language model has been trained, you can then use it to generate a catchy title from a set of keywords by providing the keywords as input to the model and then selecting the generated title that is the most catchy and relevant to the keywords.
Here is a brief example of how this could be done in Python:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the keywords you want to use to generate the title
keywords = ["quick", "brown", "fox"]

# Tokenize the keywords using a pre-trained language model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokens = tokenizer.encode(" ".join(keywords), return_tensors="pt")

# Load a pre-trained language model
language_model = AutoModelForCausalLM.from_pretrained("gpt2")


# Hypothetical placeholder scorers (the original answer left them undefined)
def is_catchy(title):
    return len(title.split()) <= 8


def is_relevant(title, keywords):
    return all(keyword in title.lower() for keyword in keywords)


# Generate candidate titles from the keywords (sampling so the candidates can differ)
candidate_titles = []
for i in range(3):
    title = language_model.generate(tokens, max_length=8, do_sample=True,
                                    pad_token_id=tokenizer.eos_token_id)
    title = tokenizer.decode(title[0], skip_special_tokens=True)
    candidate_titles.append(title)

# Select the most catchy and relevant title
selected_title = candidate_titles[0]
for title in candidate_titles:
    if is_catchy(title) and is_relevant(title, keywords):
        selected_title = title
        break
print(selected_title)
```
This example generates three candidate titles from a set of keywords using a pre-trained language model, and then selects the most catchy and relevant title as the final title. Of course, this is just a simple example to illustrate the basic concept of using a language model to generate a catchy title from keywords. In practice, you would need to use a much larger corpus of text to train the language model and generate more accurate and diverse titles.