2022-09-13

Audio Watermark Removal

Audio Source Separation Without Any Training Data.

DAP can also be applied to audio watermark removal with co-separation. Given 3 sounds carrying audio watermarks, the co-separation model can generate the 3 individual music sounds and the corresponding watermark.

An audio watermark may exist in subsonic or ultrasonic frequencies. Use ffmpeg to filter those bands out.

Alternatively, process the audio with an equalizer.
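A minimal sketch of the band-limiting idea, assuming the watermark sits below about 20 Hz or above about 16 kHz. The cutoff values and the helper name `band_limit_cmd` are illustrative assumptions, not measurements; `highpass` and `lowpass` are standard ffmpeg audio filters. The function only builds the argument list, so it can be inspected before anything is run.

```python
import shlex


def band_limit_cmd(src, dst, low_hz=20, high_hz=16000):
    """Build an ffmpeg command that keeps only the low_hz..high_hz band.

    The default cutoffs are assumptions; tune them after inspecting a
    spectrogram of the watermarked audio.
    """
    if not 0 < low_hz < high_hz:
        raise ValueError("expected 0 < low_hz < high_hz")
    # highpass attenuates content below low_hz,
    # lowpass attenuates content above high_hz.
    audio_filter = f"highpass=f={low_hz},lowpass=f={high_hz}"
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(shlex.join(band_limit_cmd("input.wav", "output.wav")))
```

The list can be passed to `subprocess.run` once the cutoffs look right for the material.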

2022-09-12

Use Pyscenedetect Dynamically In Program

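For videos that are simply concatenated from clips, this algorithm is as precise as a scalpel.

The first and last one or two frames of a scene may look slightly off, but this can be fixed by adjusting start and end.

Beware of videos with frequent cuts: the cuts may all belong to the same short segment, but using them without shuffling their order may trigger copyright detection.

Even after usable clips have been selected, sample them at intervals when producing from the same source video, for example at least 5 seconds between two clips. Do not simply extract every clip in one pass; this avoids repeated content and copyright problems.

Of course, videos with gradual transitions may need a different detection method.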

from scenedetect import open_video, SceneManager
from scenedetect.detectors import ContentDetector
from scenedetect.video_splitter import split_video_ffmpeg

def split_video_into_scenes(video_path, threshold=27.0):
    # Open our video, create a scene manager, and add a detector.
    video = open_video(video_path)
    scene_manager = SceneManager()
    scene_manager.add_detector(
        ContentDetector(threshold=threshold))
    # Run detection, then split the source file at the detected cuts.
    scene_manager.detect_scenes(video, show_progress=True)
    scene_list = scene_manager.get_scene_list()
    split_video_ffmpeg(video_path, scene_list, show_progress=True)
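The notes on trimming scene edges and spacing clips apart can be sketched as a small filter over the detected scenes. This is an illustration only: the helper name `pick_spaced_scenes`, the `trim` value, and the choice to measure the gap from the end of the last kept clip are all assumptions. Scenes are taken as plain `(start_seconds, end_seconds)` pairs sorted by start time.

```python
def pick_spaced_scenes(scenes, min_gap=5.0, trim=0.1):
    """Keep scenes whose start is at least min_gap seconds after the
    end of the previously kept scene.

    scenes: iterable of (start_sec, end_sec), sorted by start.
    trim:   seconds shaved off both ends to drop the odd-looking
            frames right next to a cut (hypothetical default).
    """
    picked = []
    last_end = None
    for start, end in scenes:
        start, end = start + trim, end - trim
        if end <= start:
            # Scene is too short to survive trimming.
            continue
        if last_end is not None and start - last_end < min_gap:
            # Too close to the previously kept clip; skip it.
            continue
        picked.append((start, end))
        last_end = end
    return picked


if __name__ == "__main__":
    demo = [(0.0, 2.0), (3.0, 6.0), (12.0, 15.0)]
    print(pick_spaced_scenes(demo, min_gap=5.0, trim=0.5))
```

With PySceneDetect, each entry of `scene_list` is a pair of `FrameTimecode` objects, so the input can be built with `[(s.get_seconds(), e.get_seconds()) for s, e in scene_list]`.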

ffmpeg one liners

For all snippets, check documentation for details and settings.

speed up ffmpeg encoding

ffmpeg speedup cli flags

ffmpeg -threads 4 -crf 28 -preset ultrafast

encode video from a V4L2 device, using specified settings.

x265 worked somewhat better here and produced fewer skips (although it uses 10x the CPU compared to x264)

ffmpeg -f video4linux2 -framerate 30 -input_format mjpeg -video_size 1920x1080 -i /dev/video6 -c:v libx265 -preset ultrafast -c:a none -crf 20 out.mp4

Convert a raw YUYV422 frame from my USB “microscope” to PNG:

other valid pixel formats are e.g. rgb24 or yuv420p

ffmpeg -f rawvideo -video_size 2592x1944 -pixel_format yuyv422 -i input_yuyv422_2592x1944.dat -f image2 output.png

Copy a portion of a video by stream copy, without re-encoding. The output might need to use the same container as the input

ffmpeg -i $input -ss $seek_to_seconds -t $output_length -c:v copy -c:a copy $output

Export and show h.264 MVs

ffplay -flags2 +export_mvs input.mkv -vf codecview=mv=pf+bf+bb

Show motion vector estimate on any input:

ffplay $input -vf mestimate=epzs:mb_size=16:search_param=32,codecview=mv=pf+bf+bb

Select one frame in every 10, scale the presentation timestamps by 0.1 (0.1*PTS, i.e. 10x speed), deshake, and drop the audio

ffmpeg -i MOV_3147.mp4 -vf 'select=not(mod(n,10))',setpts=0.1*PTS,deshake=edge=blank:rx=64:ry=64:blocksize=4:contrast=31 -tune grain -crf 17 -an wolken-2-deshake.mkv
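The relation between the frame step and the timestamp factor can be made explicit: keeping every n-th frame and dividing the timestamps by n gives an n-times speed-up at the original frame rate. A small sketch, with the hypothetical helper name `decimate_speedup_filter`; it writes `PTS/n`, which is the same expression as `0.1*PTS` for n = 10.

```python
def decimate_speedup_filter(n):
    """Return a -vf fragment that keeps every n-th frame and compresses
    timestamps by the same factor, so playback is n times faster."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    # select keeps frames whose index is a multiple of n;
    # setpts rescales presentation timestamps to close the gaps.
    return f"select='not(mod(n,{n}))',setpts=PTS/{n}"


if __name__ == "__main__":
    # Other filters (deshake, vidstab...) can be appended after a comma.
    vf = decimate_speedup_filter(10) + ",deshake"
    print(["ffmpeg", "-i", "input.mp4", "-vf", vf, "-an", "output.mkv"])
```

Passing the filter as one element of an argument list (rather than through a shell) avoids the need for shell quoting; the inner single quotes are consumed by ffmpeg's own filter parser.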

Extreme high quality deshake:

ffmpeg -i input.mkv -vf 'select=not(mod(n,20))',setpts=0.05*PTS,mestimate=hexbs,vidstabdetect=shakiness=10:result=transforms.trf

ffmpeg -i input.mkv -vf 'select=not(mod(n,20))',setpts=0.05*PTS,mestimate=hexbs,vidstabtransform=crop=black:smoothing=0:optzoom=0

or for pass 2

ffmpeg -i MOV_3147.mp4 -vf 'select=not(mod(n,20))',setpts=0.05*PTS,vidstabtransform=crop=black:smoothing=180:optzoom=0:interpol=bicubic -an -vcodec libx265 -crf 16 -tune grain wolken-2-deshake.mkv

Tonemap an ITU.2020 (HDR, high gamut, usually 4K) video to ITU.709 (1080p)

Also see: https://stevens.li/guides/video/converting-hdr-to-sdr-with-ffmpeg/

ffmpeg -i file.mkv -vf zscale=t=linear:npl=100,format=gbrpf32le,zscale=p=bt709,tonemap=tonemap=hable,zscale=t=bt709:m=bt709:r=tv,format=yuv420p -crf 20 -acodec copy output.mkv

Extract all frames as .jpg

ffmpeg -r 1 -i file.mp4 -r 1 frames_%05d.jpg

Create video from all the frames (missing any codec spec):

ffmpeg -r 30 -i frames_%05d.jpg output.mp4

XXX the following were copied from

https://hbish.com/ffmpeg-one-liners/

Get information for an audio/video file

ffmpeg -i file.mp3

Convert video into images

ffmpeg -i video.avi image_output%d.jpg

Split audio files

Generate a section of the audio from the 30 second mark (start, -ss) for 15 seconds (duration, -t)

ffmpeg -f mp3 -i input.mp3 -ss 00:00:30 -t 00:00:15 output.mp3
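Since `-ss` (start) and `-t` (duration) are easy to swap, a thin wrapper with named parameters can help. This is a sketch under assumptions: `to_timestamp` and `audio_section_cmd` are hypothetical helper names, and only whole seconds are handled.

```python
def to_timestamp(seconds):
    """Format a whole number of seconds as HH:MM:SS."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def audio_section_cmd(src, dst, start_sec, duration_sec):
    """Build an ffmpeg command that extracts duration_sec seconds of
    audio beginning at start_sec."""
    return [
        "ffmpeg", "-i", src,
        "-ss", to_timestamp(start_sec),   # where the section begins
        "-t", to_timestamp(duration_sec), # how long the section lasts
        dst,
    ]


if __name__ == "__main__":
    print(audio_section_cmd("input.mp3", "output.mp3", 30, 15))
```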

Convert avi to animated gif

ffmpeg -i video.avi output.gif

Add audio to a video file

ffmpeg -i music.mp3 -i video.avi output.mpg

Extract audio from video file

Useful for extracting music from youtube videos

avi to mp3

ffmpeg -i video.avi -vn -ar 44100 -ac 2 -ab 192k -f mp3 output.mp3

flv to mp3

ffmpeg -i video.flv -ar 44100 -ac 2 -ab 192k -f mp3 output.mp3

motion vector estimation, motion vector export, ffmpeg advanced usage

use cases

to detect hard-coded subtitles, crop the region and detect sudden changes

can also use pyscenedetect to do the job
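One way to sketch the subtitle idea is to crop to the strip where subtitles usually sit and run ffmpeg's scene score on that strip only. The assumptions here are loud ones: subtitles are taken to occupy the bottom quarter of the frame, the 0.2 threshold is borrowed from a generic scene-change setting, and `subtitle_change_cmd` is a hypothetical helper name. `crop`, `select`, and `showinfo` are standard ffmpeg filters.

```python
def subtitle_change_cmd(src, region_frac=0.25, threshold=0.2):
    """Build an ffmpeg command that reports sudden changes inside the
    bottom region_frac of the frame (assumed subtitle area)."""
    if not 0 < region_frac <= 1:
        raise ValueError("region_frac must be in (0, 1]")
    # crop=w:h:x:y -> full width, bottom strip; 'oh' is the output height.
    crop = f"crop=iw:ih*{region_frac}:0:ih-oh"
    # select passes only frames whose scene score exceeds the threshold;
    # showinfo logs their timestamps to stderr.
    detect = f"select='gt(scene,{threshold})',showinfo"
    return [
        "ffmpeg", "-hide_banner", "-i", src, "-an",
        "-vf", f"{crop},{detect}",
        "-f", "null", "-",
    ]


if __name__ == "__main__":
    print(subtitle_change_cmd("input.mp4"))
```

Frames that pass the filter are candidates for a subtitle appearing, disappearing, or changing; a real pipeline would still need to confirm them, for instance with OCR.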

pyav

docs

pip3 install av

remove/detect silence

… silencedetect A->A Detect silence.

… silenceremove A->A Remove silence.
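`silencedetect` writes its findings to ffmpeg's log rather than to a file, so using it from a program means parsing stderr. The sketch below assumes the usual `silence_start: <t>` and `silence_end: <t> | silence_duration: <d>` log lines; the sample log, the noise floor of -30dB, the 0.5 s minimum duration, and the helper names are illustrative assumptions.

```python
import re

_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def silencedetect_cmd(src, noise="-30dB", min_duration=0.5):
    """Build an ffmpeg command that only analyses audio (no output file)."""
    audio_filter = f"silencedetect=noise={noise}:d={min_duration}"
    return ["ffmpeg", "-hide_banner", "-i", src, "-af", audio_filter,
            "-f", "null", "-"]


def parse_silences(log_text):
    """Return [(start_sec, end_sec), ...] from silencedetect log output."""
    silences = []
    start = None
    for line in log_text.splitlines():
        match = _START.search(line)
        if match:
            start = float(match.group(1))
            continue
        match = _END.search(line)
        if match and start is not None:
            silences.append((start, float(match.group(1))))
            start = None
    return silences


if __name__ == "__main__":
    # Illustrative log text, not captured from a real run.
    sample = (
        "[silencedetect @ 0x1] silence_start: 1.5\n"
        "[silencedetect @ 0x1] silence_end: 4.0 | silence_duration: 2.5\n"
    )
    print(parse_silences(sample))
```

In practice the log text would come from `subprocess.run(cmd, stderr=subprocess.PIPE, text=True).stderr`.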

frame interpolate

ffmpeg -y -i "/root/Desktop/works/pyjom/tests/random_giphy_gifs/samoyed.gif" \
-vf "minterpolate,scale=w=iw*2:h=ih*2:flags=lanczos,hqdn3d" \
-r 60 ffmpeg_samoyed.mp4

motion estimation

to get mosaic motion vectors and visualize:

ffmpeg -i "/root/Desktop/works/pyjom/tests/random_giphy_gifs/samoyed.gif" \
-vf "mestimate=epzs:mb_size=16:search_param=7, codecview=mv=pf+bf+bb"  \
mestimate_output.mp4 -y

get help

on specific filter:

ffmpeg -h filter=showspectrumpic

on all filters:

ffmpeg -filters

crop detection, picture in picture (PIP) detection

ffmpeg -i "/root/Desktop/works/pyjom/samples/video/LiEIfnsvn.mp4" \
-vf "mestimate,cropdetect=mode=mvedges,metadata=mode=print" \
-f null -

scene change detection

ffmpeg -hide_banner -i "$file" -an \
-filter:v \
"select='gt(scene,0.2)',showinfo" \
-f null \
- 2>&1
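The command above only prints `showinfo` lines for the frames that pass the scene threshold; turning them into cut points takes a little parsing. A sketch, assuming each reported frame has a `pts_time:<seconds>` field in its log line. The sample log is abbreviated and illustrative, and both helper names are hypothetical.

```python
import re

_PTS_TIME = re.compile(r"pts_time:\s*(\d+(?:\.\d+)?)")


def parse_scene_times(log_text):
    """Return the pts_time (seconds) of every showinfo frame line."""
    times = []
    for line in log_text.splitlines():
        if "showinfo" not in line:
            continue
        match = _PTS_TIME.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def cuts_to_segments(cut_times, total_duration):
    """Turn cut timestamps into consecutive (start, end) segments."""
    bounds = [0.0] + sorted(cut_times) + [float(total_duration)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    # Abbreviated, illustrative log lines.
    sample = (
        "[Parsed_showinfo_1 @ 0x1] n:   0 pts:  75075 pts_time:3.003 fmt:yuv420p\n"
        "[Parsed_showinfo_1 @ 0x1] n:   1 pts: 262500 pts_time:10.5 fmt:yuv420p\n"
    )
    cuts = parse_scene_times(sample)
    print(cuts_to_segments(cuts, total_duration=12.0))
```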

extract motion vectors

ffmpeg can estimate motion vectors (the mestimate filter), but the command-line tool cannot export them; they are only available to other filters internally.

The video stream in an mp4 file already carries motion vector information, so we may not need a GPU to compute this 'optical flow'-like data.

extract by using ffmpeg apis

mv-extractor Extract frames and motion vectors from H.264 and MPEG-4 encoded video.

extract from mp4 file

mpegflow for easy extraction of motion vectors stored in video files

mv-tractus: A simple tool to extract motion vectors from h264 encoded videos.

take screenshot at time:

ffmpeg -ss 01:10:35 -i invideo.mp4 -vframes 1 -q:v 3 screenshot.jpg

video denoise filters:

dctdnoiz fftdnoiz hqdn3d nlmeans owdenoise removegrain vaguedenoiser nlmeans_opencl yaepblur

super-resolution, resampling:

deeplearning model, tensorflow

env LD_LIBRARY_PATH=/root/anaconda3/pkgs/cudatoolkit-10.0.130-0/lib/:/root/anaconda3/pkgs/cudnn-7.6.5-cuda10.0_0/lib/:$LD_LIBRARY_PATH \
ffmpeg -i "/root/Desktop/works/pyjom/samples/video/LiEIfnsvn.mp4" \
-y -vf \
"sr=dnn_backend=tensorflow:model=./sr/espcn.pb,yaepblur" \
supertest.mp4

use standard scale method:

ffmpeg -y -i "/root/Desktop/works/pyjom/tests/random_giphy_gifs/samoyed.gif" \
-vf "minterpolate,scale=w=iw*2:h=ih*2:flags=lanczos,hqdn3d" \
-r 60 ffmpeg_samoyed.mp4

options:

‘fast_bilinear’

Select fast bilinear scaling algorithm.

‘bilinear’

Select bilinear scaling algorithm.

‘bicubic’

Select bicubic scaling algorithm.

‘experimental’

Select experimental scaling algorithm.

‘neighbor’

Select nearest neighbor rescaling algorithm.

‘area’

Select averaging area rescaling algorithm.

‘bicublin’

Select bicubic scaling algorithm for the luma component, bilinear for chroma components.

‘gauss’

Select Gaussian rescaling algorithm.

‘sinc’

Select sinc rescaling algorithm.

‘lanczos’

Select Lanczos rescaling algorithm. The default width (alpha) is 3 and can be changed by setting param0.

‘spline’

Select natural bicubic spline rescaling algorithm.

‘print_info’

Enable printing/debug logging.

‘accurate_rnd’

Enable accurate rounding.

‘full_chroma_int’

Enable full chroma interpolation.

‘full_chroma_inp’

Select full chroma input.

‘bitexact’

Enable bitexact output.
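The algorithm names and the extra switches listed above all go into the `flags` option of the `scale` filter, joined with `+`. A minimal sketch that builds such a filter string and rejects names not in the list; the helper name `upscale_filter` and the split into "algorithms" and "extras" are assumptions made for illustration.

```python
SCALING_ALGORITHMS = {
    "fast_bilinear", "bilinear", "bicubic", "experimental", "neighbor",
    "area", "bicublin", "gauss", "sinc", "lanczos", "spline",
}
EXTRA_FLAGS = {
    "print_info", "accurate_rnd", "full_chroma_int", "full_chroma_inp",
    "bitexact",
}


def upscale_filter(factor=2, algorithm="lanczos", extras=()):
    """Return a scale filter string that multiplies width and height by
    factor, using the given algorithm plus optional extra flags."""
    if algorithm not in SCALING_ALGORITHMS:
        raise ValueError(f"unknown scaling algorithm: {algorithm}")
    unknown = [flag for flag in extras if flag not in EXTRA_FLAGS]
    if unknown:
        raise ValueError(f"unknown extra flags: {unknown}")
    flags = "+".join([algorithm, *extras])
    return f"scale=w=iw*{factor}:h=ih*{factor}:flags={flags}"


if __name__ == "__main__":
    print(upscale_filter())
    print(upscale_filter(3, "bicubic", extras=("accurate_rnd",)))
```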

ffmpeg python wrapper

The best-known Python wrapper that turns code into ffmpeg CLI arguments:

https://github.com/kkroening/ffmpeg-python

FFmpeg

2022-09-13 Audio Watermark Removal

2022-09-12 Use Pyscenedetect Dynamically In Program

2022-09-12 Exploring FFmpeg's Advanced Encoding, Conversion, And Audio Functionality