use txtai to do NLU and recommend things to people
topic discovery/acquisition
trending topics
baidu search trending
sogou trending
bilibili trending
wechat trending
toutiao trending
tencent trending
netease trending
youtube trending
reddit trending
twitch trending
popular topics
baijiahao popular topics
bilibili popular topics
douyin popular topics
personal/customized topics
tencent qq customized (can associate with mail)
wechat customized
bilibili per user customized
dog/cat video generation
make render engine runnable
issues:
video length too long (10 mins)
cause: an error in the speed calculation.
bgm somehow not in sync (too broad bpm/clip ranges?)
analyze the peaks (abrupt changes) in the bgm and pick the louder ones, using pyloudnorm to measure audio volume
    pip3 install pyloudnorm

    import soundfile as sf
    import pyloudnorm as pyln
    data, rate = sf.read("0055014.wav")  # load audio (with shape (samples, channels))
    print(data.shape)
    meter = pyln.Meter(rate)  # create BS.1770 meter
    loudness = meter.integrated_loudness(data)  # measure loudness
    print(loudness)
place video on the loudest points; detect abrupt changes with talib, or just take the derivative and a gaussian average
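A minimal sketch of the "derivative and gaussian average" idea above, standard library only. It swaps in a windowed RMS level in dB as a stand-in for pyloudnorm loudness, smooths that curve with a Gaussian kernel, takes the first difference, and keeps the largest positive jumps as candidate cut points. All function names, the window sizes and `min_gap_s` are assumptions, and the input is assumed to be a mono list of float samples (mix down the `(samples, channels)` array first).

```python
import math

def rms_envelope(samples, rate, window_s=0.4, hop_s=0.1):
    """Return (times, levels_db): RMS level in dB per sliding window (time = window center)."""
    if not samples:
        return [], []
    win = max(1, int(window_s * rate))
    hop = max(1, int(hop_s * rate))
    times, levels = [], []
    for start in range(0, max(1, len(samples) - win + 1), hop):
        chunk = samples[start:start + win]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        levels.append(20 * math.log10(max(rms, 1e-9)))  # floor avoids log(0) on silence
        times.append((start + len(chunk) / 2) / rate)
    return times, levels

def gaussian_smooth(values, sigma=2.0):
    """Gaussian-weighted moving average, renormalized at the edges."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    out = []
    for i in range(len(values)):
        acc = norm = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(values):
                acc += w * values[j]
                norm += w
        out.append(acc / norm)
    return out

def abrupt_rises(times, levels, top_k=5, min_gap_s=1.0):
    """Times of the top_k largest upward jumps in the smoothed level curve."""
    smooth = gaussian_smooth(levels)
    deriv = [smooth[i + 1] - smooth[i] for i in range(len(smooth) - 1)]
    order = sorted(range(len(deriv)), key=lambda i: deriv[i], reverse=True)
    picked = []
    for i in order:
        if deriv[i] <= 0 or len(picked) == top_k:
            break
        t = times[i + 1]
        if all(abs(t - p) >= min_gap_s for p in picked):  # keep peaks apart
            picked.append(t)
    return sorted(picked)
```

The returned times could then be used as the points where a new clip starts; whether RMS is close enough to perceived loudness for this purpose is untested here.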
video too repetitive (small corpus?)
subtitles are not removed and the active region is not cropped (reviewer’s resources not used? but i would rather advise doing it directly, since it requires less computational power)
no minimum motion threshold (reviewer’s fault? also recommended to do this in the producer)
remove all watermarks, subtitles and crop video boundaries accordingly
source video and audio (infinite; the basic test is to find 500 sources at once without duplicates, the second test is to find 500 without duplicates twice), improve highlight algorithm
find 500 songs without duplicate at once
find 500 songs no duplicate twice
find 500 animal videos without duplicate
find 500 animal videos no duplicate twice
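The two sourcing tests above can be sketched as a small harness. This is only an illustration: `fetch_batch` is a hypothetical callable that returns a list of source ids (song ids, video ids) on each call, and `max_rounds` is an assumed safety limit so a dried-up source cannot loop forever.

```python
def collect_unique(fetch_batch, target, seen=None, max_rounds=100):
    """Pull batches until `target` ids that were never seen before are collected."""
    seen = set() if seen is None else seen
    fresh = []
    for _ in range(max_rounds):
        if len(fresh) >= target:
            break
        batch = fetch_batch()
        if not batch:  # source exhausted
            break
        for item_id in batch:
            if item_id not in seen:
                seen.add(item_id)
                fresh.append(item_id)
                if len(fresh) == target:
                    break
    return fresh

def passes_two_round_test(fetch_batch, target=500):
    """Basic test: `target` unique ids at once. Second test: do it again with no overlap."""
    seen = set()
    first = collect_unique(fetch_batch, target, seen)
    second = collect_unique(fetch_batch, target, seen)
    return (len(first) == target and len(second) == target
            and len(set(first) | set(second)) == 2 * target)
```

Sharing one `seen` set across both rounds is what makes the second round stricter than the first; persisting that set to disk would extend the check across runs.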
generate appropriate title, cover, info and tags
collect feedback after the post
find some shocking fonts for cover and subtitle, english and chinese
make that karaoke effect
make ass with karaoke effect with lrc files
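A rough sketch of turning line-level lrc into ass karaoke events. Since plain lrc only carries one timestamp per line, the per-word (or per-character, for text without spaces) `\k` durations are split evenly here, which is an assumption and not real syllable timing. The function names and the `last_line_s` default are hypothetical; lines carrying several timestamps are not handled, and the `[Script Info]`, `[V4+ Styles]` and `[Events]` headers still have to be written around the returned lines.

```python
import re

LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")

def parse_lrc(text):
    """Return sorted [(start_seconds, lyric)]; metadata tags like [ar:...] are skipped."""
    entries = []
    for raw in text.splitlines():
        m = LRC_LINE.match(raw.strip())
        if m and m.group(3).strip():
            start = int(m.group(1)) * 60 + float(m.group(2))
            entries.append((start, m.group(3).strip()))
    return sorted(entries)

def ass_time(seconds):
    """Format seconds as H:MM:SS.cc (centiseconds), as used in ass events."""
    cs = int(round(seconds * 100))
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"

def karaoke_events(entries, last_line_s=3.0, style="Default"):
    """Build Dialogue lines with {\\k<centiseconds>} tags, evenly split per unit."""
    events = []
    for i, (start, lyric) in enumerate(entries):
        end = entries[i + 1][0] if i + 1 < len(entries) else start + last_line_s
        spaced = " " in lyric
        units = lyric.split() if spaced else list(lyric)
        total_cs = int(round((end - start) * 100))
        base, extra = divmod(total_cs, len(units))
        parts = []
        for j, unit in enumerate(units):
            k = base + (1 if j < extra else 0)  # spread the remainder over the first units
            parts.append(f"{{\\k{k}}}{unit}")
        text = (" " if spaced else "").join(parts)
        events.append(
            f"Dialogue: 0,{ass_time(start)},{ass_time(end)},{style},,0,0,0,,{text}"
        )
    return events
```

Each line is shown until the next one starts, so instrumental gaps would need explicit end times if the lyric should disappear earlier.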
make lyrics sync logic fluent, according to what was learned from the karaoke effects
make selected video clips fluent, no abrupt cuts, maybe we need pyscenedetect?
text to video, template based video generator (this is perhaps the most complex video generator ever. do it with caution, it might also include the flipcard, narrator and slideshow based generators)
generator models subarchitecture (subcategories of template based generators)
flipcard
slideshow (video and audio, might also include the dog&cat video!)
narrator
summarized video
policy evasion, NSFW filters
remove all hints from image, video, audio and script that may lead to copyright issues
analyze the media content and metadata, relationships
analyze danmaku
paraphrase the script
cut the crap and understand each clip’s meaning
process the video clips, like changing the human figure, changing faces, stylizing the video, adding 2d to 3d effects
process the audio clips, like changing voice, adding sound effects, separating audio/music tracks, ducking
index, retrieve and align video and audio content according to our collected database
retrieve and align video and audio according to our smart search agent (keyword extractor, related words) and do live compilation
qq managing
mitm chats in friends
mitm chats in groups
source and send pictures to qzone
source and send pictures to chat
reduce posting frequency by group size and feedback
post video links relevant to the group topic
personal info collecting and email/sms bulk sending
avoid mail being trashed or turned into junk
collect and make mail templates for mail posting
voice changer
vst based voice changer
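train or find a decent voice generator: mature female ("yujie") voice corpus, soft boyish ("xiaoshou") voice corpus
please look for these on bilibili or in qq groups, or in any other relevant places, thanks
live streaming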
source the video
if the source is the same site, prefer videos older than one month and audio older than half a month
prepare some space for storing live streaming data
note: in order to install libraries and dependencies, you need ubuntu inside termux. create shortcuts and aliases to launch ubuntu. link files and directories into the ubuntu proot filesystem. also never attempt to update kali, since that will break shit, especially by bumping python versions.
schedule the training on minute basis first for complete test, then schedule it on fixed time per day.
for qq client: whenever a new sentence is added, dump the latest 500 consecutive sentences while holding the filelock; do not block or stop running if GPT is not responding
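One way the client side could look, as a sketch under assumptions. The lock here is a crude lock file built on `os.open` with `O_EXCL`, used as a stand-in for whatever filelock implementation the project actually uses; `SentenceBuffer`, `lock_file` and the `.lock`/`.tmp` suffixes are hypothetical names. If the lock cannot be taken within the timeout, the dump is skipped so the client keeps running.

```python
import os
import time
from collections import deque
from contextlib import contextmanager

@contextmanager
def lock_file(path, timeout=10.0, poll=0.05):
    """Crude cross-process lock: held for as long as `path` exists."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(path)
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)

class SentenceBuffer:
    """Keeps the latest `size` consecutive sentences and dumps them on every add."""

    def __init__(self, dataset_path, size=500, lock_timeout=10.0):
        self.dataset_path = dataset_path
        self.lock_timeout = lock_timeout
        self.recent = deque(maxlen=size)

    def add(self, sentence):
        """Return True if the dataset file was rewritten, False if the dump was skipped."""
        self.recent.append(sentence)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough history yet
        try:
            with lock_file(self.dataset_path + ".lock", timeout=self.lock_timeout):
                tmp = self.dataset_path + ".tmp"
                with open(tmp, "w", encoding="utf-8") as f:
                    f.write("\n".join(self.recent) + "\n")
                os.replace(tmp, self.dataset_path)  # atomic swap for readers
        except TimeoutError:
            return False  # trainer holds the lock (or it is stale): skip, never block forever
        return True
```

A stale lock left behind by a crashed process would make every dump time out, so some cleanup policy for old lock files is still needed.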
for gpt2 server: (where to train? how to prevent the machine from overheating? for how long?)
rename the dataset while holding the filelock
always keep the latest 2 models, remove those not readable first, then delete older ones.
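The retention rule above can be sketched as follows. It assumes each model is a single file in one directory and that "latest" means newest modification time; `is_readable` is a hypothetical callback (for example, one that tries to load the checkpoint and returns False on failure).

```python
import os

def prune_models(model_dir, is_readable, keep=2):
    """Delete unreadable model files first, then everything but the `keep` newest.

    Returns (kept_paths, removed_paths), with kept_paths ordered newest first.
    """
    paths = [os.path.join(model_dir, name) for name in os.listdir(model_dir)]
    paths = [p for p in paths if os.path.isfile(p)]
    readable, removed = [], []
    for path in paths:
        if is_readable(path):
            readable.append(path)
        else:
            os.remove(path)  # broken checkpoints go first
            removed.append(path)
    readable.sort(key=os.path.getmtime, reverse=True)
    for path in readable[keep:]:
        os.remove(path)  # then the older readable ones
        removed.append(path)
    return readable[:keep], removed
```

Removing unreadable files before counting means a half-written newest checkpoint never pushes a good older one out of the kept set.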
if train on CPU, still need to limit training time, sleep while doing so. GPU for sure we need sleep during training, and do not use VRAM for other applications.
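A possible shape for the time limit and the sleeps, as a sketch only. `train_step` is a hypothetical callable that runs one small unit of training; the `work_s` and `rest_s` defaults are made-up numbers, and `clock`/`sleep` are injectable mainly so the loop can be tested without waiting.

```python
import time

def train_with_budget(train_step, max_seconds, work_s=60.0, rest_s=20.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Run train_step repeatedly for at most ~max_seconds, resting after each work burst.

    Returns the number of steps executed. Rest time counts against the budget.
    """
    start = clock()
    burst_start = start
    steps = 0
    while clock() - start < max_seconds:
        train_step()
        steps += 1
        if clock() - burst_start >= work_s:
            sleep(rest_s)  # let the CPU/GPU cool down
            burst_start = clock()
    return steps
```

The budget is only checked between steps, so a single long `train_step` can overshoot `max_seconds`; keeping steps short keeps the limit tight.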
debug the consecutive group reply thresholding protocol
reply according to individual description and group description
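promote both our own and other people's videos or content at the same time, collect recommendation feedback, and gradually reduce how often other people's videos or content are recommended
when promoting a video, highly-liked comments from other people's videos, animated GIFs, audio or short videos can be added first, then send the xml
add the ability to repeat (echo) images; add the ability for chatlocal to return images
add a feedback feature: judge whether a message was helpful based on the replies in the group after it was sent
use txtai or another information retrieval (semantic search, smart search) tool to replace the levenshtein history based reply logic; the search must include context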
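To illustrate only the "include context" part of the retrieval idea, here is a standard-library sketch. It does not use txtai: character-bigram cosine similarity is swapped in as a plainly weaker stand-in for embedding similarity, so the scoring would be replaced by a real semantic index in practice. All names, `ctx_weight` and `min_score` are assumptions.

```python
import math
from collections import Counter

def bigrams(text):
    """Character-bigram counts (language agnostic, works for Chinese too)."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1)) or Counter(text)

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(history, context=2):
    """For each past message store (message vector, preceding-context vector, next message)."""
    index = []
    for i in range(len(history) - 1):
        ctx = " ".join(history[max(0, i - context):i])
        index.append((bigrams(history[i]), bigrams(ctx), history[i + 1]))
    return index

def best_reply(index, message, context_text="", ctx_weight=0.3, min_score=0.2):
    """Return the historical follow-up whose message and context best match the query."""
    q_msg, q_ctx = bigrams(message), bigrams(context_text)
    best_score, best = 0.0, None
    for msg_vec, ctx_vec, reply in index:
        score = ((1 - ctx_weight) * cosine(msg_vec, q_msg)
                 + ctx_weight * cosine(ctx_vec, q_ctx))
        if score > best_score:
            best_score, best = score, reply
    return best if best_score >= min_score else None
```

Scoring the message and its preceding context separately keeps a long context from drowning out the message that is actually being answered.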
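a repeater bot cannot bring a dead group back to life, but proactive pushing can: push long, monologue-style conversations into the group; they must not come from the same group, the topic must be related, filter out too negative ones
pull people into other groups, preferably people reached through multiple accounts that do not share groups but whose topics overlap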
add an involution option: allow appending unqualified replies to the input message, separated by spaces.
add an extend conversation option: allow replying with more than one sentence at a time (proper delay needed) -> could be achieved by using a GPT2-like generation model