pyjom producer

video and audio need to be analyzed separately.

audio can be processed in chunks or as split tracks, while video can be iterated frame by frame.
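a minimal sketch of the two iteration styles, assuming the audio is already decoded into a flat sequence of samples and the video into a sequence of frames. the names `iter_chunks` and `iter_frames` are hypothetical helpers, not part of pyjom:

```python
from typing import Any, Iterable, Iterator, Sequence, Tuple


def iter_chunks(
    samples: Sequence[float], sample_rate: int, chunk_seconds: float
) -> Iterator[Tuple[float, Sequence[float]]]:
    """Yield (start_time_in_seconds, chunk) pairs covering the whole track."""
    # Assumption: fixed-length chunks; the last chunk may be shorter.
    chunk_size = max(1, int(sample_rate * chunk_seconds))
    for start in range(0, len(samples), chunk_size):
        yield start / sample_rate, samples[start:start + chunk_size]


def iter_frames(frames: Iterable[Any], fps: float) -> Iterator[Tuple[float, Any]]:
    """Yield (timestamp_in_seconds, frame) pairs, one frame at a time."""
    for index, frame in enumerate(frames):
        yield index / fps, frame
```

both generators attach a timestamp, so audio and video results can be lined up again after they have been analyzed separately.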

Video Database

Video Database For Video Generation

A fastai/PyTorch package for unpaired image-to-image translation.


audio-visual segmentation, video attention mechanism

only segment video objects that make sounds, video/audio combined segmentation:


video object tracking and segmentation unified framework:


video object segmentation that handles long videos with ease:


when removing video watermarks, remember to ease in and out. that is, do not stop blurring immediately at the end mark; instead, extend the blur time and decrease the blur level incrementally. the start mark needs an ease-in as well: begin blurring ahead of the start mark and increase the blur level incrementally.
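a minimal sketch of such an ease-in/ease-out envelope. the linear ramp, the `ramp` length and the function name `blur_strength` are assumptions for illustration; any monotonic easing curve would do:

```python
def blur_strength(
    t: float, start: float, end: float, ramp: float = 0.5, max_strength: float = 1.0
) -> float:
    """Return the blur level at time t for a watermark visible in [start, end].

    The blur begins `ramp` seconds before `start` and fades out over
    `ramp` seconds after `end` (linear easing, an assumption).
    """
    if ramp <= 0:
        # No easing requested: hard on/off at the marks.
        return max_strength if start <= t <= end else 0.0
    if t <= start - ramp or t >= end + ramp:
        return 0.0
    if t < start:
        # Ease in ahead of the start mark.
        return max_strength * (t - (start - ramp)) / ramp
    if t > end:
        # Ease out after the end mark.
        return max_strength * ((end + ramp) - t) / ramp
    return max_strength
```

the returned value can be multiplied by the blur kernel size or sigma of whatever blur filter is applied to each frame.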

descriptive information generation from video/image:




video understanding/captioning:











action recognition:






The remaining data contains only texts, danmaku, likes, titles, intros, comments, tags, and image/video analysis results (short descriptions). Video can only be generated from generated metadata or given rules. Find similar words, similar danmaku, similar features and comments, or their opposites, according to the selected topic and main idea.

Analyze each video once it is downloaded: mark its highlights, and analyze its texts and danmaku. Get video segments and audio segments.

Collect pictures/videos by given rules, for example finding somebody's head, a given number of likes, or keywords.

Split the audio and grab the main speaker. Clone the voice and perhaps change the gender.

Split the video and do human/image segmentation if a human/target is found. Put it onto another human/target’s background, masking the original human, with similar areas and movements.

Analyze video with off-topic (offline) and on-topic (online) sources.

Remove watermark according to username.

Generate danmaku and generate video accordingly. Generate texts and generate video accordingly. Do face swap, talking head and human/image segmentation accordingly.

GPT-2 and Text Generation

Content Usage

Use the original transcript for paraphrasing, while using danmaku for joke generation.

Mmdetection And Mmd Dancing

3D avatar motion generation, video generation, virtual idol, VTuber:


human pose detection:


opengl recording:






download expose models:


smpl-x model download:


model zoo:


mmd auto tracking:




smplx expose alternative body tracker:


face tracking:


anime face detector:



anime facial features:


repair anime images:


paint manga from sketch (with color blocks):


if we can re-trace the actions/expressions performed by vtubers, we can monetize those “highlight cuts”.

you can first find points in datasets, generate mmd videos from them, and then create a training set. you can also generate poses from raw video and then create a dataset.

found by accident while browsing for MMD: a repo with a great many stars, which turns out to be an instance detection/segmentation library.


rendering mmd can be done with an mmd viewer like https://github.com/benikabocha/saba, or with a renderer like blender or unity. we must bake physics before dancing.

found another dedicated renderer for mmd, with bullet physics:


found interesting repo of poetry composing:


mediapipe/paddlevideo alike:


three.js has multiple loaders:



render MMD using saba lib:



music based dance:






Articulated Animation

This one is dead simple. Use a real human to talk as you go.


Install Vscode Extensions By Id



Converting Partial Zhihu Viral Content

just add a trailing sentence.


Python Media Automation

we first see the world, get the observation and respond in the form of content. it is a feedback loop.

to search components in videos, first take screenshots then do image search, then use the keywords to get the source video.

breakdown approach:

granularize every step, showing all the possible ways to get content created, and then optimize it against standards.

filter approach:

establish some topics, create topic specific approaches to arrange the content, choose the best among all topics.

are they compatible? are you sure it is modular, scalable and extensible?

novices have a few unpolished ideas and are waiting to realize them in code. but this lacks the feedback loop, so you are unable to change yourself according to the reaction. the breakdown approach must be used to automate the optimization, while the topic based approach is simple to start with.

to avoid copyright issues, search on google.

the topic based approach assumes the public always has something in common, so you only search for specific things at first. topics are easy to control, static and consistent. the breakdown approach is where the evolution begins.

let’s assume our topic is pets on weibo. there are different kinds of pets, and the content creators differ from each other. all we do is download and upload. we get descriptions from our viewers, video play counts and various feedback. we improve the sources based on our feedback, searching for more untouched content and more mixes such as video/audio crossing.

the breakdown approach is demonstrated first-hand with our actor-critic model. we first view all possible posts from all sources, find what’s interesting and repost it to our target platform. this is likely to be cheating. we then choose our sources and our approach to modification again, based on feedback. topics are generated from the very first step.

the model of interests, which generates the topic, is the key to the breakdown approach. we eventually have to construct a breakdown approach to boost our searches in every aspect. feedback is one of those key features. we eventually have to view the content with the machine. i suggest using the breakdown approach now.

anatomy of the post:

first, it must be postable, according to our mandatory order. it should not be taken down or banned for a long time. ban detection is required and usually simple to test.

second, it should be the most profitable. we prefer only those tasks which give the most output. occasionally we choose something fresh despite lower expectations.

third, it should be resourceful. consistently holding an audience across a series of videos is undoubtedly competitive. this can be reached by using our creativity engine, based on comments and imagination, to realize the unrealized.

we have not yet found anything systematic that gives the full detail of such an automated content creation system. we only pick up the pieces. it is important to make the entire design flexible and to create miniature tests to fabricate the system. like any other famous writer/director, you could only name it but not reproduce it.

hands on the approach: no matter whether it is inspired by anyone or anything, it is time to begin and to complete the feedback loop.
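one way to read "prefer the most output, occasionally choose something fresh" is an epsilon-greedy choice. this is a swapped-in standard technique, not something the notes specify; `choose_task`, `explore_rate` and the payoff dictionary are hypothetical:

```python
import random
from typing import Dict, Optional


def choose_task(
    expected_payoff: Dict[str, float],
    explore_rate: float = 0.1,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick the best-paying task, but sometimes pick a random one instead."""
    rng = rng or random.Random()
    tasks = list(expected_payoff)
    if rng.random() < explore_rate:
        # Exploration: try something fresh despite lower expectations.
        return rng.choice(tasks)
    # Exploitation: take the task with the highest expected payoff.
    return max(tasks, key=expected_payoff.get)
```

with `explore_rate=0.0` the choice is purely greedy; raising it trades short-term output for more chances to discover a better task.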

not a pipe, but a loop.

we demonstrate the loop using fake data first, then real data. maybe the initial topic is also meant to be fake data. real world data is too stochastic for us to imagine. better to construct something specific.
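a minimal sketch of the loop driven by fake data. the function name `run_feedback_loop`, the "try every topic once, then repeat the best" rule and the fake view counts are all assumptions made for illustration:

```python
from typing import Callable, Dict, List, Tuple


def run_feedback_loop(
    topics: List[str], get_feedback: Callable[[str], float], rounds: int = 4
) -> Tuple[Dict[str, float], List[str]]:
    """Post, observe feedback, update the estimate, and choose again."""
    scores = {topic: 0.0 for topic in topics}  # running mean of feedback
    counts = {topic: 0 for topic in topics}
    history = []
    for _ in range(rounds):
        untested = [topic for topic in topics if counts[topic] == 0]
        # Try every topic once, then keep posting the best-scoring one.
        topic = untested[0] if untested else max(topics, key=lambda t: scores[t])
        views = get_feedback(topic)  # fake data now, real platform data later
        counts[topic] += 1
        scores[topic] += (views - scores[topic]) / counts[topic]
        history.append(topic)
    return scores, history


# Fake feedback source standing in for real view counts.
FAKE_VIEWS = {"cats": 100.0, "dogs": 40.0}
scores, history = run_feedback_loop(["cats", "dogs"], FAKE_VIEWS.__getitem__)
```

swapping `FAKE_VIEWS.__getitem__` for a function that reads real statistics turns the dummy loop into the real one without changing its shape.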

Cats Video With Lyrics_2

Cats video with lyrics (Algorithm)

Finally, the cats.

Harvest videos by tags and authors, categorize them, compute the views divided by time, model the predicted views, and select only the best original videos to upload.

Views on other platforms and on other channels do not matter in the long term. We might be tempted to embrace them because of the numbers seen elsewhere, but we need real feedback on our own channels.

Filters can be generated from regular expressions and common patterns found in text, and then indexed. Filters are generated from data. We index filters and record them according to views.

I am selling myself cheap. Maybe not.
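a minimal sketch of the "views divided by time" ranking. it uses the raw rate as a stand-in for the predicted-views model, which the notes do not specify; `VideoStats`, `views_per_hour` and `select_best` are hypothetical names:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoStats:
    video_id: str
    views: int
    hours_online: float


def views_per_hour(video: VideoStats) -> float:
    """Views divided by time; very new videos are clamped to one hour."""
    return video.views / max(video.hours_online, 1.0)


def select_best(videos: List[VideoStats], top_k: int = 1) -> List[VideoStats]:
    """Return the top_k videos with the highest view rate."""
    return sorted(videos, key=views_per_hour, reverse=True)[:top_k]
```

the clamp to one hour is an assumption that keeps a video posted minutes ago from dominating the ranking with a handful of views.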
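a minimal sketch of recording filters according to views, assuming each post is a `(text, views)` pair and each filter is a regular expression. `score_filters` is a hypothetical helper and the mean is just one possible statistic:

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def score_filters(
    patterns: List[str], posts: Iterable[Tuple[str, int]]
) -> Dict[str, float]:
    """Map each pattern to the mean views of the posts it matches."""
    compiled = {pattern: re.compile(pattern) for pattern in patterns}
    totals = defaultdict(lambda: [0, 0])  # pattern -> [sum of views, match count]
    for text, views in posts:
        for pattern, regex in compiled.items():
            if regex.search(text):
                totals[pattern][0] += views
                totals[pattern][1] += 1
    # Patterns that match nothing are left out of the index.
    return {p: total / count for p, (total, count) in totals.items()}
```

sorting the resulting dictionary by value gives an index of which filters have historically selected the most-viewed content.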

Using the same strategy of searching the web.

api of weibo (this is news!):



we need videos. you can download them from baidu ai cloud.

we need feedback. we need this since this is how we try things.

anything can be hackish, especially for this damn “we media”. anything can be popular, from basic bash scripts to regular expressions, depending on how you look at it. we shall not feed our audience more of one thing than another unless we know for sure it is our target.

it is the power of the mirror! the feedback!

create a fake, dummy project full of skeletons and then realize every step till we are done.

