Autonomous Machines & Society.

2022-09-12
Audio And Music Tools

Audio and music processing tools.

Related links:

🔗 Audio, Music, Radio, and Podcast Streaming Apps

🔗 Audio and Music Tools

🔗 Free Audio and Music

AI Apps for Audio

AI Music Separation

AI Music Recognition

Audio Creation

Audio and Music Programming

AI Audio and Music Generator

Audio and Music Creation and Sharing

  • Beepbox : BeepBox is an online tool for sketching and sharing instrumental melodies.

  • Jummbox : Jummbox is Beepbox modification (online tool for sketching and sharing chip tune melodies).

WASM Audio Processing

Audio Player

Online Audio Player

Peer to Peer Audio

Audio Editor

Desktop Audio Editor

  • Audacity, #opensource

Online Audio Editor

Other Audio Tools

Working with Virtual Audio Tools

Audio Recording - Transcription

Benefits/drawbacks:

  • We cannot listen to the audio output from playback apps

VAC Audio Recording

Use cases:

  • Record Zoom Webinar to Speechtexter, Google Docs etc.

Audio Input Mixing


Use cases:

Audio Output Mixing


Setting:

Use cases:

  • Transcribe (and listen to) a Zoom seminar by sending it to both the speaker and Speechtexter or Google Docs

  • Translate a YouTube video using Google Translate

Zoom Audio Recording/Transcription

Zoom Recording Transcription

Google Meet Audio Transcription

But we cannot use Speechtexter and Google Docs at the same time

We cannot use Speechtexter and Voice Note at the same time

We can use Google Meet together with Speechtexter/VoiceNote/Dictanote

We can use Zoom together with Speechtexter/VoiceNote/Dictanote


2022-09-12
Song Recognition, Music Recognition Api, Music Discovery, Audio Search, Audio Fingerprint

music discovery

find music by combining choices of experts

build recommendation engine using spotify api

use machine learning to predict and recommend music

similar song matching using last.fm or spotify

netease cloud music song recognition (网易云听歌识曲)

demo

self-hosted

shazam algorithm

blog: create your own shazam

findit another shazam clone

audiotag.info

portify recognize local music and add them to spotify playlist

wav_to_info.py

audio identifier

nowspinning get song info and cover art

shazam

shazamio reverse engineered shazam api with many supported functions

audiorec shazam client for linux, with cli support

midomi houndify soundhound

to get track info from soundhound (no cookie):

https://www.midomi.com/api/track?trackID=100282107076607645
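A minimal sketch of calling the track endpoint above from Python. The URL and the `trackID` parameter come from the example link; that the response body is JSON is an assumption, and the helper names `build_track_url` and `fetch_track_info` are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the example link above.
MIDOMI_TRACK_ENDPOINT = "https://www.midomi.com/api/track"


def build_track_url(track_id: str) -> str:
    """Build the track-info URL for a given track ID."""
    return f"{MIDOMI_TRACK_ENDPOINT}?{urlencode({'trackID': track_id})}"


def fetch_track_info(track_id: str, timeout: float = 10.0) -> dict:
    """Fetch track info without cookies (assumes a JSON response body)."""
    with urllib.request.urlopen(build_track_url(track_id), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Only `fetch_track_info` touches the network; `build_track_url` can be checked offline.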

the houndify music recognition api:

wss://houndify.midomi.com/

some lib for midomi found over here

it can also bridge yandex music recognition api

houndify also has an api for that, but it requires credits

free credits per day: 100

soundhound now requires 1 credit

soundhound python sdk


2022-09-12
Mastering Video And Article Uploads: Tips For Cover Images, Introductions, Tags, And More

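Workflow for crawling and analyzing video material

Focus on mainstream platforms

A mainstream platform is one that already has a ready-made search api.

Without an api you need playwright, which may take longer and drifts further from the core task of finding material.

Mainstream platforms currently all have ready-made high-level apis to connect to; if you cannot find an api, the platform is probably not mainstream.

Mainstream platforms fall into two categories: mainstream media and search engines.

To search niche platforms, it is better to go through a search engine with advanced search parameters, then analyze the returned links to see whether they can be downloaded directly.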

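Features the api should have

Search: highly customizable search that can avoid duplicate results

Video information extraction (duration, play count, description, cover, subtitles, title, tags, comments)

Related-video recommendations and homepage recommendations

Trending lists, hot searches, and search autocompletion

Video download, preferably without watermark, plus subtitles

For platforms where content is published, upload support is also needed

Upload videos: cover, description, tags, collection info, subtitles

Upload articles and images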

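Which keywords to use

Finding the material that best fits the current generation framework takes your own experimentation and summarizing.

Of course, you can also find suitable keywords through a feedback loop based on keywords, comments, and play counts, or with neural networks, machine learning, graph databases, or recommendation algorithms.

Tags

keyword -> video -> similar videos

a viewer -> the same author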

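How to analyze a video

First crop out picture-in-picture, then remove watermarks and text, improve image quality, and raise the frame rate. If subtitles need to be extracted and recognized, run recognition on a specified region. If speech recognition is needed, separate the vocals first and check the fluency of the recognized text (foreign-language speech or songs may not come out fluent).

Filter by fixed criteria and trim the duration and canvas, for example by duration, volume, optical flow, text area, whether a person is present, and how much the person moves.

Separating vocals usually needs matching subtitles, plus voice changing and detecting how many speakers there are (with multiple speakers, speech recognition may not work properly and text fluency drops) and whether each is male or female.

If music or BGM is needed, generally do not extract it directly from the video. Instead, find keywords in the description and search a dedicated music platform. The music may also need filtering by category, play count, and comment feedback.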
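The clip filtering step above (duration, volume, text area, presence of a person) could be sketched roughly like this. The `ClipStats` fields, the function name, and every threshold are hypothetical placeholders; the measurements themselves would have to come from separate analysis tools.

```python
from dataclasses import dataclass


@dataclass
class ClipStats:
    """Per-clip measurements (hypothetical fields, filled in by other tools)."""
    duration_s: float
    mean_volume_db: float
    text_area_ratio: float  # fraction of the frame covered by text, 0..1
    has_person: bool


def passes_filter(
    clip: ClipStats,
    min_duration_s: float = 2.0,
    max_duration_s: float = 15.0,
    min_volume_db: float = -40.0,
    max_text_area_ratio: float = 0.2,
    require_person: bool = False,
) -> bool:
    """Return True if the clip meets every criterion (thresholds are guesses)."""
    if not (min_duration_s <= clip.duration_s <= max_duration_s):
        return False
    if clip.mean_volume_db < min_volume_db:
        return False
    if clip.text_area_ratio > max_text_area_ratio:
        return False
    if require_person and not clip.has_person:
        return False
    return True
```

Optical flow and motion amplitude are left out here; they would be extra fields checked the same way.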


2022-09-12
Use Pyscenedetect Dynamically In Program

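For videos that are simply concatenated clips, this algorithm is as precise as a scalpel.

One or two frames at the start or end may look off, but this can be fixed by adjusting start and end.

Beware of videos with frequent cuts: the pieces may belong to the same short segment, and if their order is not shuffled they may trigger copyright detection.

Even after usable segments have been selected, within a single video being produced you should still sample at intervals, for example keeping at least 5 seconds between two segments. Do not simply extract all segments in one go; this avoids repeated content and copyright problems.

Of course, videos with fades or transitions may need other detection methods.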

from scenedetect import open_video, SceneManager
from scenedetect.detectors import ContentDetector
from scenedetect.video_splitter import split_video_ffmpeg


def split_video_into_scenes(video_path, threshold=27.0):
    # Open our video, create a scene manager, and add a detector.
    video = open_video(video_path)
    scene_manager = SceneManager()
    scene_manager.add_detector(
        ContentDetector(threshold=threshold))
    # Run detection over the whole video, then collect (start, end) pairs.
    scene_manager.detect_scenes(video, show_progress=True)
    scene_list = scene_manager.get_scene_list()
    # Write one output file per detected scene using ffmpeg.
    split_video_ffmpeg(video_path, scene_list, show_progress=True)
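The spacing rule from the notes above (trim a little off each end, keep at least 5 seconds between picked segments) could look roughly like this. It works on plain `(start_seconds, end_seconds)` tuples; the function name and the `trim_s` default are assumptions for illustration.

```python
from typing import List, Tuple

Scene = Tuple[float, float]  # (start_seconds, end_seconds)


def sample_scenes(scenes: List[Scene], min_gap_s: float = 5.0, trim_s: float = 0.1) -> List[Scene]:
    """Pick scenes in order, trimming both ends and enforcing a minimum gap.

    `scenes` must be sorted by start time.
    """
    picked: List[Scene] = []
    last_end = None
    for start, end in scenes:
        # Trim the boundaries to drop the odd frame or two at the cut.
        start_t, end_t = start + trim_s, end - trim_s
        if end_t <= start_t:
            continue  # scene too short to survive trimming
        # Skip scenes that start too soon after the previously picked one.
        if last_end is not None and start_t - last_end < min_gap_s:
            continue
        picked.append((start_t, end_t))
        last_end = end_t
    return picked
```

With PySceneDetect, the input tuples could be built from the scene list, for example `[(s.get_seconds(), e.get_seconds()) for s, e in scene_list]`.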


2022-09-12
Elinks/Lynx With Python: How To Speed Up Headless Website Browsing/Parsing/Scraping With Cookies

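newscrawl: an open-sourced enterprise-grade public-opinion news crawler project. It supports running any number of crawlers with one click, scheduled crawler tasks, and batch crawler deletion; one-click crawler deployment; visual crawler monitoring; configurable crawler allocation strategy for clusters; 👉 ready-made one-click docker deployment docs, with the pitfalls already worked through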

general news extractor for extracting main content of news, articles

pip3 install gne

first of all, set it up with a normal user agent
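A minimal sketch of the user-agent step, assuming plain `urllib` is enough for the target site. The user-agent string is only an example value and the helper names are hypothetical. `GeneralNewsExtractor().extract(html)` is the documented gne entry point; it is imported lazily so the rest runs without gne installed.

```python
import urllib.request

# Example desktop browser user agent (an assumed value, swap in your own).
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36"
)


def build_request(url: str, user_agent: str = DEFAULT_USER_AGENT) -> urllib.request.Request:
    """Create a request that presents itself as a normal browser."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download the page body as text (assumes UTF-8, ignores bad bytes)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="ignore")


def extract_article(html: str) -> dict:
    """Run gne on already-downloaded HTML (requires `pip3 install gne`)."""
    from gne import GeneralNewsExtractor  # lazy import

    return GeneralNewsExtractor().extract(html)
```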

even better, we can chain it with some customized headless puppeteer/phantomjs (do not load video data), dump the dom when ready, and use elinks/lynx to analyze the dom tree.

to test if the recommendation bar shows up:

https://v.qq.com/x/page/m0847y71q98.html

to make web page more readable:

https://github.com/luin/readability

load webpage headlessly:

https://github.com/jsdom/jsdom

https://github.com/ryanpetrello/python-zombie


2022-09-12
Exploring FFmpeg's Advanced Encoding, Conversion, And Audio Functionality

ffmpeg one liners

For all snippets, check documentation for details and settings.

speed up ffmpeg encoding

ffmpeg speedup cli flags

ffmpeg -threads 4 -crf 28 -preset ultrafast

encode video from a V4L2 device, using specified settings.

x265 worked somewhat better here and produced fewer skips (although it uses 10x the CPU compared to x264)

ffmpeg -f video4linux2 -framerate 30 -input_format mjpeg -video_size 1920x1080 -i /dev/video6 -c:v libx265 -preset ultrafast -c:a none -crf 20 out.mp4

Convert a raw YUYV422 frame from my USB “microscope” to PNG:

other valid pixel formats are e.g. rgb24 or yuv420p

ffmpeg -f rawvideo -video_size 2592x1944 -pixel_format yuyv422 -i input_yuyv422_2592x1944.dat -f image2 output.png

copy a portion of a video by stream copy, without re-encoding. The output might need to use the same container as the input

ffmpeg -i $input -ss $seek_to_seconds -t $output_length -c:v copy -c:a copy $output
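When the cut is driven from a program, the same command can be built as an argument list, which avoids shell quoting problems. This is only a sketch: the function names are hypothetical and the flags are the ones from the one-liner above.

```python
import subprocess
from typing import List


def build_copy_cut_command(input_path: str, start_s: float, length_s: float, output_path: str) -> List[str]:
    """Build the stream-copy cut command shown above as an argv list."""
    return [
        "ffmpeg",
        "-i", input_path,
        "-ss", str(start_s),   # seek position in seconds
        "-t", str(length_s),   # output length in seconds
        "-c:v", "copy",        # copy video without re-encoding
        "-c:a", "copy",        # copy audio without re-encoding
        output_path,
    ]


def cut_without_reencoding(input_path: str, start_s: float, length_s: float, output_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_copy_cut_command(input_path, start_s, length_s, output_path), check=True)
```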

Export and show h.264 MVs

ffplay -flags2 +export_mvs input.mkv -vf codecview=mv=pf+bf+bb

Show motion vector estimate on any input:

ffplay $input -vf mestimate=epzs:mb_size=16:search_param=32,codecview=mv=pf+bf+bb

Select one frame in every 10, scale the presentation timestamps by 0.1 (0.1*PTS, i.e. 10x playback speed), deshake, and drop the audio

ffmpeg -i MOV_3147.mp4 -vf 'select=not(mod(n,10))',setpts=0.1*PTS,deshake=edge=blank:rx=64:ry=64:blocksize=4:contrast=31 -tune grain -crf 17 -an wolken-2-deshake.mkv

Extreme high quality deshake:

ffmpeg -i input.mkv -vf 'select=not(mod(n,20))',setpts=0.05*PTS,mestimate=hexbs,vidstabdetect=shakiness=10:result=transforms.trf

ffmpeg -i input.mkv -vf 'select=not(mod(n,20))',setpts=0.05*PTS,mestimate=hexbs,vidstabtransform=crop=black:smoothing=0:optzoom=0

or for pass 2

ffmpeg -i MOV_3147.mp4 -vf 'select=not(mod(n,20))',setpts=0.05*PTS,vidstabtransform=crop=black:smoothing=180:optzoom=0:interpol=bicubic -an -vcodec libx265 -crf 16 -tune grain wolken-2-deshake.mkv

Tonemap an ITU-R BT.2020 (HDR, wide gamut, usually 4K) video to BT.709 (1080p)

Also see: https://stevens.li/guides/video/converting-hdr-to-sdr-with-ffmpeg/

ffmpeg -i file.mkv -vf zscale=t=linear:npl=100,format=gbrpf32le,zscale=p=bt709,tonemap=tonemap=hable,zscale=t=bt709:m=bt709:r=tv,format=yuv420p -crf 20 -acodec copy output.mkv

Extract all frames as .jpg

ffmpeg -r 1 -i file.mp4 -r 1 frames_%05d.jpg

Create video from all the frames (missing any codec spec):

ffmpeg -r 30 -i frames_%05d.jpg output.mp4

XXX the following were copied from

https://hbish.com/ffmpeg-one-liners/

Get information about an audio/video file

ffmpeg -i file.mp3

Convert video into images

ffmpeg -i video.avi image_output%d.jpg

Split audio files

Generate a section of the audio from the 30 second mark (start, -ss) for 15 seconds (duration, -t)

ffmpeg -f mp3 -i input.mp3 -ss 00:00:30 -t 00:00:15 output.mp3

Convert avi to animated gif

ffmpeg -i video.avi output.gif

Add audio to a video file

ffmpeg -i music.mp3 -i video.avi output.mpg

Extract audio from video file

Useful for extracting music from youtube videos

avi to mp3

ffmpeg -i video.avi -vn -ar 44100 -ac 2 -ab 192k -f mp3 output.mp3

flv to mp3

ffmpeg -i video.flv -ar 44100 -ac 2 -ab 192k -f mp3 output.mp3


2022-09-11
Motion Vector Estimation, Motion Vector Export, Ffmpeg Advanced Usage

motion vector estimation, motion vector export, ffmpeg advanced usage

use cases

to detect hard-coded subtitles, crop the region and detect sudden changes

can also use pyscenedetect to do the job
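The crop-then-detect-changes idea for hard-coded subtitles, as a dependency-free sketch. Frames are assumed to be grayscale images stored as nested lists of ints; the region format, the threshold, and all function names are hypothetical. In practice the frames would come from a decoder such as pyav or OpenCV.

```python
from typing import List, Tuple

Frame = List[List[int]]             # grayscale rows of pixel values 0..255
Region = Tuple[int, int, int, int]  # (top, bottom, left, right), end-exclusive


def crop(frame: Frame, region: Region) -> Frame:
    """Cut the subtitle region out of a frame."""
    top, bottom, left, right = region
    return [row[left:right] for row in frame[top:bottom]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized crops."""
    total, count = 0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def subtitle_change_indices(frames: List[Frame], region: Region, threshold: float = 20.0) -> List[int]:
    """Return indices of frames whose subtitle region differs sharply from the previous frame."""
    changes: List[int] = []
    prev = None
    for i, frame in enumerate(frames):
        cur = crop(frame, region)
        if prev is not None and mean_abs_diff(prev, cur) > threshold:
            changes.append(i)
        prev = cur
    return changes
```

Changes outside the cropped region are ignored, which is the point of cropping first.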

pyav

docs

pip3 install av

remove/detect silence

… silencedetect A->A Detect silence.

… silenceremove A->A Remove silence.
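silencedetect reports its findings as `silence_start:` and `silence_end:` lines in the ffmpeg log (stderr). A small parser for those lines might look like this; the function name is hypothetical, and the log text would come from something like `ffmpeg -i input.mp4 -af silencedetect=noise=-30dB:d=0.5 -f null -`, where the threshold values are only examples.

```python
import re
from typing import List, Tuple

_START = re.compile(r"silence_start:\s*(-?[0-9.]+)")
_END = re.compile(r"silence_end:\s*(-?[0-9.]+)")


def parse_silence_intervals(log_text: str) -> List[Tuple[float, float]]:
    """Pair up silence_start / silence_end lines into (start, end) seconds."""
    intervals: List[Tuple[float, float]] = []
    start = None
    for line in log_text.splitlines():
        m = _START.search(line)
        if m:
            start = float(m.group(1))
            continue
        m = _END.search(line)
        if m and start is not None:
            intervals.append((start, float(m.group(1))))
            start = None
    return intervals
```

A trailing `silence_start` with no matching end (silence running to the end of the file) is dropped in this sketch.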

frame interpolate

ffmpeg -y -i "/root/Desktop/works/pyjom/tests/random_giphy_gifs/samoyed.gif" \
    -vf "minterpolate,scale=w=iw*2:h=ih*2:flags=lanczos,hqdn3d" \
    -r 60 ffmpeg_samoyed.mp4

motion estimation

to get mosaic motion vectors and visualize:

ffmpeg -i "/root/Desktop/works/pyjom/tests/random_giphy_gifs/samoyed.gif" \
    -vf "mestimate=epzs:mb_size=16:search_param=7, codecview=mv=pf+bf+bb" \
    mestimate_output.mp4 -y

get help

on specific filter:

ffmpeg -h filter=showspectrumpic

on all filters:

ffmpeg -filters

crop detection, picture in picture (PIP) detection

ffmpeg -i "/root/Desktop/works/pyjom/samples/video/LiEIfnsvn.mp4" \
    -vf "mestimate,cropdetect=mode=mvedges,metadata=mode=print" \
    -f null -
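cropdetect logs a suggested `crop=w:h:x:y` value for the frames it analyzes. To turn the captured log into a single crop rectangle, one option is to take the most frequent suggestion. The sketch below assumes the log text was captured from stderr; the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Optional, Tuple

_CROP = re.compile(r"crop=(\d+):(\d+):(\d+):(\d+)")


def most_common_crop(log_text: str) -> Optional[Tuple[int, int, int, int]]:
    """Return the most frequent (w, h, x, y) suggestion, or None if there is none."""
    counts = Counter(
        tuple(int(v) for v in m.groups()) for m in _CROP.finditer(log_text)
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Taking the mode rather than the last value makes the result less sensitive to a few dark or static frames.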

scene change detection

ffmpeg -hide_banner -i "$file" -an \
    -filter:v \
    "select='gt(scene,0.2)',showinfo" \
    -f null \
    - 2>&1
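The command above prints one showinfo line per selected frame, each carrying a `pts_time:` field. A sketch for collecting those timestamps from the captured output (function name hypothetical):

```python
import re
from typing import List

_PTS_TIME = re.compile(r"pts_time:\s*([0-9.]+)")


def parse_scene_change_times(log_text: str) -> List[float]:
    """Collect pts_time values (seconds) from showinfo lines only."""
    times: List[float] = []
    for line in log_text.splitlines():
        if "showinfo" not in line:
            continue  # skip progress lines and other filters' output
        m = _PTS_TIME.search(line)
        if m:
            times.append(float(m.group(1)))
    return times
```

Each returned time is a frame where the scene score exceeded the threshold given to `select` (0.2 in the command above).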

extract motion vectors

ffmpeg can estimate motion vectors, but the estimate cannot be exported; it is for internal use only.

the mp4 format provides motion vector information, so we may not need a GPU to get this ‘optical flow’ data.

extract by using ffmpeg apis

mv-extractor Extract frames and motion vectors from H.264 and MPEG-4 encoded video.

extract from mp4 file

mpegflow for easy extraction of motion vectors stored in video files

mv-tractus: A simple tool to extract motion vectors from h264 encoded videos.

take screenshot at time:

ffmpeg -ss 01:10:35 -i invideo.mp4 -vframes 1 -q:v 3 screenshot.jpg

video denoise filters:

dctdnoiz fftdnoiz hqdn3d nlmeans owdenoise removegrain vaguedenoiser nlmeans_opencl yaepblur

super-resolution, resampling:

deeplearning model, tensorflow

env LD_LIBRARY_PATH=/root/anaconda3/pkgs/cudatoolkit-10.0.130-0/lib/:/root/anaconda3/pkgs/cudnn-7.6.5-cuda10.0_0/lib/:$LD_LIBRARY_PATH \
    ffmpeg -i "/root/Desktop/works/pyjom/samples/video/LiEIfnsvn.mp4" \
    -y -vf \
    "sr=dnn_backend=tensorflow:model=./sr/espcn.pb,yaepblur" \
    supertest.mp4

use standard scale method:

ffmpeg -y -i "/root/Desktop/works/pyjom/tests/random_giphy_gifs/samoyed.gif" \
    -vf "minterpolate,scale=w=iw*2:h=ih*2:flags=lanczos,hqdn3d" \
    -r 60 ffmpeg_samoyed.mp4

options:

‘fast_bilinear’

Select fast bilinear scaling algorithm.

‘bilinear’

Select bilinear scaling algorithm.

‘bicubic’

Select bicubic scaling algorithm.

‘experimental’

Select experimental scaling algorithm.

‘neighbor’

Select nearest neighbor rescaling algorithm.

‘area’

Select averaging area rescaling algorithm.

‘bicublin’

Select bicubic scaling algorithm for the luma component, bilinear for chroma components.

‘gauss’

Select Gaussian rescaling algorithm.

‘sinc’

Select sinc rescaling algorithm.

‘lanczos’

Select Lanczos rescaling algorithm. The default width (alpha) is 3 and can be changed by setting param0.

‘spline’

Select natural bicubic spline rescaling algorithm.

‘print_info’

Enable printing/debug logging.

‘accurate_rnd’

Enable accurate rounding.

‘full_chroma_int’

Enable full chroma interpolation.

‘full_chroma_inp’

Select full chroma input.

‘bitexact’

Enable bitexact output.


2022-09-11
Unlocking The Power Of Ai: Exploring Deep Learning Datasets And Resources

datasets for deeplearning, classification and evaluation

100 audio and video datasets

video datasets

youtube8m


2022-09-11
Random Giphy Gifs

gif ratings

official doc

vulgar rate:

r > pg-13 > pg > g

while ‘y’ means accepting everything; it does not stand for ‘youth’ or ‘young’

implementation

giphy has many extensible apis. i guess most media platforms are all the same (complex enough), but we have to start somewhere though…

giphy has ‘clips’ now. clips are gifs with sound, just like short videos.

beta key limitations:

1000 requests per day, 42 requests per hour
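A client-side guard for the two limits above could be sketched like this. Whether giphy counts requests in sliding windows or fixed calendar windows is not stated here, so the sliding-window behaviour is an assumption, and the class name is made up.

```python
from collections import deque


class RateLimiter:
    """Sliding-window limiter for the beta key limits quoted above (window style assumed)."""

    HOUR_S = 3600.0
    DAY_S = 86400.0

    def __init__(self, per_hour: int = 42, per_day: int = 1000) -> None:
        self.per_hour = per_hour
        self.per_day = per_day
        self._stamps = deque()  # timestamps (seconds) of allowed requests

    def allow(self, now: float) -> bool:
        """Record and allow a request at time `now`, or refuse it."""
        # Forget requests older than one day.
        while self._stamps and now - self._stamps[0] >= self.DAY_S:
            self._stamps.popleft()
        if len(self._stamps) >= self.per_day:
            return False
        in_last_hour = sum(1 for t in self._stamps if now - t < self.HOUR_S)
        if in_last_hour >= self.per_hour:
            return False
        self._stamps.append(now)
        return True
```

In real use `now` would be `time.monotonic()`; passing it in keeps the class easy to test. Both limits matter because 42 requests per hour over 24 hours is 1008, slightly above the daily cap.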

or just use the public beta key? is that subject to the rate limit?

this public beta key is trashed.

var PUBLIC_BETA_API_KEY = 'dc6zaTOxFJmzC';

is this a public api key? maybe it is both an api key and an sdk key.

Gc7131jiJuvI7IdN0HZ1D7nh0ow5BU6g

api keys:

IoJVsWoxDPKBr6gOcCgOPWAB25773hqP

lTRWAEGHjB1AkfO0sk2XTdujaPB5aH7X

sdk keys:

6esYBEm9OG3wAifbBFZ2mA0Ml6Ic0rvy

sXpGFDGZs0Dv1mmNFvYaGUvYwKX0PWIh

to use api:

https://github.com/austinkelleher/giphy-api

to use sdk:

https://github.com/Giphy/giphy-js/blob/master/packages/fetch-api/README.md

find public api keys inside html:

window.GIPHY_FE_MOBILE_API_KEY = "L8eXbxrbPETZxlvgXN9kIEzQ55Df04v0"
window.GIPHY_FE_WEB_API_KEY = "Gc7131jiJuvI7IdN0HZ1D7nh0ow5BU6g"
window.GIPHY_FE_FOUR_O_FOUR_API_KEY = "MRwXFtxAnaHo3EUMrSefHWmI0eYz5aGe"
window.GIPHY_FE_STORIES_AND_GIPHY_TV_API_KEY = "3eFQvabDx69SMoOemSPiYfh9FY0nzO9x"
window.GIPHY_FE_DEFAULT_API_SERVICE_KEY = "5nt3fDeGakBKzV6lHtRM1zmEBAs6dsIc"
window.GIPHY_FE_GET_POST_HEADERS_KEY = "e0771ed7b244ec9c942bea646ad08e6bf514f51a"
window.GIPHY_FE_MEDIUM_BLOG_API_KEY = "i3dev0tcpgvcuaocfmdslony2q9er7tvfndxcszm"
window.GIPHY_FE_EMBED_KEY = "eDs1NYmCVgdHvI1x0nitWd5ClhDWMpRE"
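Pulling these assignments out of a saved page can be done with a regular expression. The sketch below assumes the keys always appear as `window.GIPHY_FE_... = "..."` as in the listing above; the function name is hypothetical.

```python
import re
from typing import Dict

# Matches lines such as: window.GIPHY_FE_WEB_API_KEY = "value"
_KEY_ASSIGNMENT = re.compile(r'window\.(GIPHY_FE_[A-Z_]+)\s*=\s*"([^"]+)"')


def extract_giphy_keys(html: str) -> Dict[str, str]:
    """Map each GIPHY_FE_* variable name found in the HTML to its value."""
    return dict(_KEY_ASSIGNMENT.findall(html))
```

Usage would be along the lines of `extract_giphy_keys(open("samoyed.html", encoding="utf-8").read())`.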

search for ‘ear flops’ to locate the tags in ‘samoyed.html’


2022-09-10
Gfw Circumvention, Download Youtube Videos, Scrape Banned Websites

binder as colab alternative

apart from kaggle, you can also use github actions, devops and more, as long as we can get the results back in time through code.

github integrated ci platforms

cirrus graphql spec with artifact info

github actions api: download artifact

circleci: artifact

azure pipelines: artifacts
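For the github actions case, the REST API exposes an artifact download endpoint of the form `/repos/{owner}/{repo}/actions/artifacts/{artifact_id}/{archive_format}`. A sketch of building that request follows; the helper names are hypothetical, and the token is assumed to be a personal access token with access to the repository.

```python
import urllib.request


def artifact_download_url(owner: str, repo: str, artifact_id: int, archive_format: str = "zip") -> str:
    """Build the GitHub REST API URL for downloading one artifact."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/artifacts/{artifact_id}/{archive_format}"
    )


def build_artifact_request(owner: str, repo: str, artifact_id: int, token: str) -> urllib.request.Request:
    """Create an authenticated request for the artifact archive."""
    return urllib.request.Request(
        artifact_download_url(owner, repo, artifact_id),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
```

Passing the request to `urllib.request.urlopen` should follow the redirect to the archive itself; that part is left out because it needs network access.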
