Project structure of: PaddlePaddle/PaddleVideo
__init__.py - Python module, licenses, imports PaddleVideo class.
README.md - Video action detection for abnormal behavior. SlowFast+FasterRCNN.

get_image_label.py - Trains and validates object detection models using video frames.
README.md - Detect UAVs in restricted zones using PaddleDetection.
action.py - Basketball action detection Python script
logger.py - Custom logger class for news stripper app
feature_extractor.py - Audio feature extraction for basketball actions.
model_config.py - ModelAudio: Extracts audio features, slices data, appends to list.
vgg_params.py - Global VGGish model parameters. Audio feature extraction, PCA, embedding. Adjustable settings.

audio_infer.py - Audio inference model for predicting basketball actions.
bmn_infer.py - Basketball action detection via BMN inference
lstm_infer.py - Basketball LSTM action detection model with GPU optimization.
pptsm_infer.py - PaddleVideo-based action detection with PPTSM inference

__init__.py - Registers readers for different file formats. Alphabetical order.
audio_reader.py - AudioReader class for YouTube-8M dataset management.
bmninf_reader.py - Reads and processes BMN model data for video analysis.
feature_reader.py - Python FeatureReader for YouTube-8M dataset.
reader_utils.py - Video input data reader classes and singleton reader_zoo.
tsminf_reader.py - TSMINF reader: multiprocessing, jpg format, ML transformations

config_utils.py - Loads, processes, prints config files with nested dictionaries.
preprocess.py - Python FFmpeg utils for frames, audio, MP4 download
process_result.py - Processes video results, applies NMS

eval.py - Best IOU and score threshold for evaluating basketball actions found.
predict.py - Basketball Action Prediction, JSON Output.

README.md - Basketball action detection app, F1-score 80.14%, PaddlePaddle 2.0
__init__.py - EIVideo init.py: Sets root path, defines constants, constructs full paths.
api.py - EIVideo API: Retrieves, converts, and annotates images/videos.
main.py - Trains PaddleVideo model with distributed training and mixed precision.
__init__.py - EIVideo/paddlevideo initialization file
__init__.py - Imports, defines, exports, licenses
builder.py - PaddleVideo dataset loader, signal handlers for SIGINT/TERM.
__init__.py - Image preprocessing pipelines for PaddleVideo's EIVideo application.
compose.py - Flexible composition of pipeline components.
custom_transforms_f.py - Paddle Video's image preprocessing classes for resizing, aspect ratio adjustment, and custom cropping transforms.

registry.py - Organizes PaddleVideo functionalities in 4 registries.

__init__.py - Apache licensed PaddleVideo library init file for VOSMetric and build_metric functions.
base.py - Abstract base class for PaddleVideo metrics.
build.py - Apache License v2.0, Python build metric tool
registry.py - Registry initializer for metric management.
vos_metric.py - VOS Metric: Video Object Segmentation.

__init__.py - Imports and registers PaddleVideo modules, includes popular models.
__init__.py - DeepLab import for deeplab_manet module, included in the __all__ list.
aspp_manet.py - ASPP-MANET backbone model initialization in Paddle
decoder_manet.py - Paddle Decoder class for Manet architecture
deeplab_manet.py - Introduces FrozenBatchNorm2d, DeepLab network backbone with frozen BatchNorm layers.
resnet_manet.py - ResNet-MANET model with BatchNorm, ReLU, residual blocks.

builder.py - Builds computer vision models with configuration.
__init__.py - PaddleVideo framework: BaseSegment, Manet classes.
__init__.py - PaddlePaddle components for video segmentation
base.py - Base class for semi-Video Object Segmentation in PaddlePaddle
manet_stage1.py - Manet Stage 1 Video Segmentation

__init__.py - Module init, copyright, license, imports IntVOS.
IntVOS.py - Compute L2 distances, apply attention, feature extraction, and pooling.

registry.py - Registry classes for video pipeline components in PaddleVideo's EIVideo.
weight_init.py - Initialize weights for PaddlePaddle layer with custom options

__init__.py - Imports "test_model" from "test.py" and adds it to __all__
test.py - Test model function for gradient-free testing with multi-card support.

__init__.py - PaddleVideo utility functions and classes import, logger/profiler setup.
build_utils.py - Builds object from config and registry.
config.py - EIVideo: Config file parsing, printing, visualizing, and checking.
dist_utils.py - Distributed computing utilities for PaddleVideo's EIVideo module.
logger.py - Colorful, configurable, non-propagating logger for PaddleVideo.
manet_utils.py - OpenCV, PaddleVideo utilities with PyTorch init.
precise_bn.py - PreciseBN: recompute BN stats for improved accuracy
profiler.py - Profiler initialization and stop for PaddlePaddle's operator-level timing.
record.py - Logs video processing metrics for clarity.
registry.py - Registry class for mapping names to objects
save_load.py - Adapts ViT model, loads/saves PaddlePaddle models.

version.py - PaddleVideo version: 0.0.1, Apache License 2.0
README.MD - Chinese CLI video annotation tool guide
setup.py - Source citation required for code
version.py - EIVideo version 0.1a by Acer Zhang (credit requested)

__init__.py - QEIVideo Root & Version
build_gui.py - Video processing GUI builder with PyQt5.
__init__.py - Author, date, copyright: EIVideo's QEIVideo gui init.
demo.py - DrawFrame class QWidget for path drawing and mouse events.
ui_main_window.py - EIVideo: UI-driven video/painting app with "Hi" message.

start.py - Initialize QApplication, create GUI, run event loop.
__init__.py - EIVideo QEIVideo gui module init comment

__init__.py - QEIVideo gui init, author, date, copyright.
demo.py - PyQt5 video player UI with interactive buttons and tabs.

version.py - EIVideo app version info by Acer Zhang 01/11/2022.
PaintBoard.py - PaintBoard: Widget for drawing, clearing, changing pen attributes, and retrieving content.
README.md - Windows video annotation tool with Baidu AI, maintained by QPT-Family on GitHub.
cmd - Updating EIVideo on GitHub, managing branches and code.
demo.ui - Qt application UI video demo with interactive elements.
README.md - Fight Recognition model guide for PaddleVideo.

README.md - OpenPose for figure skating action data processing

download.sh - Download, extract, and delete 4 tar files related to FootballAction.

dataset_url.list - 13 EuroCup2016 football videos from BCEBOS
download_dataset.sh - Downloads 12 EuroCup2016 videos using wget.
url.list - Unique EuroCup2016 URLs for MP4 videos.
url_val.list - List of hashed MP4 URLs for EuroCup2016 videos.
get_frames_pcm.py - Python script extracts frames and audio from MP4 videos.
get_instance_for_bmn.py - Generates BMN-formatted output from ground truth data.
get_instance_for_lstm.py - Calculates IoU, checks hits, splits datasets for handling football actions.
get_instance_for_pptsm.py - Processes video data, extracts action instances, creates dataset.

extract_bmn.py - Classify and extract features from videos using pre-trained model.
extract_feat.py - Baidu models extract audio, classify videos.

action.py - Action detection system with ML/DL for Baidu Cloud
logger.py - Custom logger class for news stripper app
feature_extractor.py - Extracts MFCC and STFT audio features for football action detection.
model_config.py - Extracts audio features, slices data, returns feature list.
vgg_params.py - Global VGGish model parameters for audio feature extraction

audio_infer.py - Audio inference model for predicting football actions.
bmn_infer.py - Paddle BMN Infer action detection model, averaging predictions.
lstm_infer.py - LSTM model for football action prediction
pptsm_infer.py - PPTSM model inference for football actions.

__init__.py - Imports and registers readers for map files.
audio_reader.py - AudioReader class for YouTube-8M dataset management.
bmninf_reader.py - BMNINF reader for football action detection
feature_reader.py - Attention-based LSTM feature reader for FootballAction
reader_utils.py - Reader registration and lookup for video input data.
tsminf_reader.py - TSMINF Reader: JPG video dataset, threading, action detection.

config_utils.py - Loads and processes config files for organization.
preprocess.py - Four FFmpeg functions for video, audio handling
process_result.py - Processes video results with NMS

eval.py - Evaluates precision, recall, and F1 scores for football action prediction models.
predict.py - Loads model, reads videos, predicts actions, stores results.

README.md - Enhanced FootballAction model in PaddleVideo.
config.py - Ma-Net config file: Imports, parses, trains, tests.
custom_transforms_f.py - Data augmentation transforms for PaddlePaddle
DAVIS2017.md - Download and organize DAVIS2017 dataset.
DAVIS2017_cn.md - DAVIS2017 dataset prep for Ma-Net
davis_2017_f.py - DAVIS 2017 dataset loader, Ma-Net preprocessing
helpers.py - Helps with image data conversion, masking, and normalization.
samplers.py - Randomly samples identities and instances with replacement.

aspp.py - ASPP module layer for Ma-Net CNN
__init__.py - Backbone network builder for specified model and stride.
drn.py - Deep Residual Network (DRN) model & MA-Net architecture in PaddlePaddle
mobilenet.py - MobileNetV2 model for Ma-Net application.
resnet.py - ResNet architecture with batch normalization, ReLU, output strides, residual connections.
xception.py - AlignedXception network for image classification

decoder.py - Decoder neural network layer for class prediction
deeplab.py - Deeplab: BatchNorm, ASPP, decoder, freeze.
IntVOS.py - Video object segmentation with PaddlePaddle and neural networks.
loss.py - Custom loss function for image classification tasks with optional hard example mining.

README.md - MA-Net PaddleVideo model testing and training script
README_cn.md - Chinese Ma-Net video segmentation model README.
run.sh - Train, test DeepLabV3_coco on DAVIS dataset.
test.py - DAVIS2017 image processing with PaddlePaddle, 8-turn interactive classification.
train_stage1.py - Ma-Net video object detection, training and visualization.
train_stage2.py - Trains stage 2 Ma-Net models, applies loss, evaluates performance.
api.py - Tensor handling utilities for PyTorch and PaddlePaddle
mask_damaging.py - Damages and scales mask with rotations and translations.
meters.py - Computes and stores average values.
utils.py - Converts labels to RGB color map.
download.sh - Downloads ernie model, checkpoints, and test dataset.
eval_and_save_model.sh - Evaluates and saves AttentionLstmErnie model in specified directories.
inference.sh - Inference script for AttentionLstmErnie model.
README.md - Multimodal video classification with PaddlePaddle 2.0
accuracy_metrics.py - Accuracy metrics calculator for multimodal video tagging models.
config.py - Merges and prints config sections.
__init__.py - MultimodalVideoTag: ATTENTIONLSTMERNIE reader imported.
ernie_task_reader.py - ERNIE task reader: preprocesses text, formats sequences for ERNIE models.
feature_reader.py - Multimodal LSTM, attention, NextVlad feature reader.
reader_utils.py - ReaderZoo manages reader instances with singleton pattern.
tokenization.py - Tokenization and Unicode conversion for text

eval_and_save_model.py - Multimodal video tagging with PaddlePaddle and AttentionLstmErnie.
inference.py - Paddle Video Inference: Multimodal Tagging, GPU Support
attention_lstm_ernie.py - ERNIE-based LSTM attention model for multimodal video tagging
ernie.py - ERNIE model, multimodal video tagging, Paddle
transformer_encoder.py - Transformer encoder layer for NLP tasks

train.py - Video training: PaddlePaddle, feeds, outputs, loss, optimizer, build, train, save.
utils.py - Multimodal Video Tagging Utilities

train.sh - Train Attention LSTM Ernie model with GPU optimization.
Readme.md - PP-Care video understanding app with TSM and ResNet50

prepare_dataset.py - Prepare PaddleVideo and PPHuman dataset keypoints.

README.md - Trains PaddleVideo JSON for PP-Human inference
README.md - PaddleVideo applications: action detection, recognition, classification, and analysis.
__init__.py - Imports base model and trainer modules.
base_dataset.py - Base class for video feature datasets.
base_model.py - Base class for PaddleVideo models
base_trainer.py - T2VLAD trainer: multi-epoch, metrics tracking, model saving.

download_features.sh - Download, extract MSRVTT datasets

data_loaders.py - Efficient data loader for training with LRU cache, PP library, MSRVTT dataset.
MSRVTT_dataset.py - MSR-Vtt dataset loader and checker

__init__.py - Imports all logger functions and classes in PaddleVideo's T2VLAD app.
log_parser.py - Computes performance metrics for epochs.
logger.py - Configures logger based on JSON file or uses basic setup.

loss.py - Contrastive loss, max margin ranking, T2VLAD models
metric.py - Calculates retrieval metrics and offers visualization options.
model.py - Video analysis CENet model with T2VLAD and BERT.
net_vlad.py - T2VLAD: VLAD representation network.
text.py - TextEmbedding: Word2Vec for video descriptions and queries. CPU-only, GPT or Word2Vec.

parse_config.py - ConfigParser: argument parsers, slave mode, directories, config, experiment settings.
README.md - Trains T2VLAD text video retrieval model using PaddleVideo on MSR-VTT dataset.
README_en.md - T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval.
test.py - Compress predictions using PaddleVideo library.
train.py - Trains video analysis model, handles args, saves checkpoints
__init__.py - Imports all from "trainer" module.
trainer.py - Trains video retrieval model, efficient sample copies, logs progress, computes Mean Average Precision.

__init__.py - Imports all util module functions.
util.py - Data processing, categorizing, feature adjustment, Tensor format utilities.
README.md - Table Tennis Action Recognition with VideoSwinTransformer

submission_format_transfer.py - Converts JSON timestamps, formats for table tennis analysis.

extract_bmn_for_tabletennis.py - Classify videos, extract features, predict bounding boxes.

fix_bad_label.py - Fixes bad labels in Table Tennis application
get_instance_for_bmn.py - Generates BMN model ground truth data for table tennis.
gts_format_transfer.py - JSON file format transfer tool
action.py - Baidu Cloud action detection Python script
logger.py - Custom logger for action detection in Table Tennis app
feature_extractor.py - Extracts audio features for Table Tennis prediction using VGG-16.
model_config.py - Audio feature extraction for Table Tennis predictions
vgg_params.py - Global VGGish model parameters defined, with customizable audio features extraction.

audio_infer.py - Audio inference model for PaddleVideo predictions
bmn_infer.py - GPU-optimized action detection inference model
lstm_infer.py - Table Tennis LSTM Action Detection Model
pptsm_infer.py - InferModel class for PPTSM model inference with GPU.

__init__.py - Imports reader classes, registers alphabetically.
bmninf_reader.py - BMNINFReader: TableTennis Action Detection Dataset.
feature_reader.py - Table tennis action detection from YouTube-8M dataset using LSTM, attention cluster, and nextVlad models.
reader_utils.py - ReaderZoo class for PaddleVideo TableTennis error handling and reader management.

config_utils.py - PaddleVideo TableTennis config parsing utils
preprocess.py - Video, audio extraction and sorting utility.
process_result.py - Calculates video results with suppression, removes overlaps, and processes video properties.

eval.py - Optimize F1 scores for table tennis action prediction.
predict.py - Video prediction setup for TableTennis using PaddleVideo

val_split.py - Splits JSON table tennis gts into training and validation sets
main.py - Trains PaddleVideo models, supports dist. test.
__init__.py - PaddleVideo library license and imports.
__init__.py - Video dataset loading and processing module for PaddleVideo.
builder.py - Python file constructs video pipelines using PaddleVideo & PaddlePaddle.
__init__.py - Video and frame datasets for PaddleVideo.
base.py - Video dataset class for loading and preparing data.
frame_rec.py - PaddleVideo's FrameRecDataset for raw frame loading and transformation.
video.py - PaddleVideo: Efficient Video Dataset Loading and Processing

__init__.py - PaddleVideo library functions for video analysis tasks.
augmentations.py - Scaling and multi-cropping for PaddleVideo
compose.py - Composes video pipeline elements, handles lists, old config workaround.
decode.py - PaddleVideo's VideoDecoder: Decodes MP4, RGB frames, audio, and masks.
mix.py - Mixup operator for video quality assessment.
sample.py - Sampler class for PIL image data sampling in video files.

registry.py - Manages registries for pipelines and datasets in PaddleVideo.

__init__.py - Video Quality Assessment app initialization
base.py - BaseMetric: Foundation for video quality metrics, subclasses required.
build.py - Build and import metrics for PaddleVideo library.
quality_metric.py - Calculates PLCC & SROCC using numpy and scipy stats.
registry.py - Video quality assessment metrics registry in PaddleVideo library.

__init__.py - Video quality assessment models registration and export.
__init__.py - Backbone models: ResNet, ResNetTweaksTSM imported and defined.
resnet.py - ResNet backbone using ConvBNLayer and blocks.
resnet_tweaks_tsm.py - TSM ResNet model with configurable layers

builder.py - Model builder for computer vision components
__init__.py - PaddleVideo: BaseRecognizer & Recognizer2D classes defined.
__init__.py - Imports, adds to __all__, and disclaims.
base.py - Base class for model recognizers in PaddleVideo
recognizer2d.py - Trains 2D models using PaddleVideo's Recognizer2D class

__init__.py - Head models for Video Quality Assessment in PaddleVideo.
base.py - PaddleVideo BaseHead model for Video Quality Assessment
tsm_rec_head.py - TSMRecHead: TSN-based classifier for TSNs
tsn_head.py - TSN head for video quality assessment
__init__.py - Implements loss functions for PaddleVideo.
base.py - Base loss function for PaddleVideo, requires subclass for _forward method.
l1_loss.py - L1 Loss: Image/Video Quality Assessment
smooth_l1_loss.py - Smooth L1 Loss Function

registry.py - Registers various models in PaddleVideo's VQA app.
weight_init.py - Custom weight initialization in PaddlePaddle

__init__.py - PaddleVideo: Building Video Quality Optimizer and LR.
custom_lr.py - Learning rate schedulers for PaddleVideo optimization.
lr.py - Creates learning rate scheduler for PaddleVideo
optimizer.py - Constructs optimizer and learning rate scheduler for parameter optimization.

__init__.py - Import, define, train, test functions for PaddleVideo's Video Quality Assessment module.
test.py - Tests Paddle model with parallel processing
train.py - Train video quality model with PaddleVideo.

__init__.py - PaddleVideo utils module with Registry, build, and more.
build_utils.py - Module Builder from Config
config.py - Config and AttrDict functions for PaddleVideo.
dist_utils.py - Dist utils for PaddleVideo's distributed video quality assessment.
logger.py - Logger class for PaddleVideo's VQA app, enabling distributed logging.
precise_bn.py - Precise batch normalization updates for PaddleVideo.
record.py - Records metrics for Video Quality Assessment in PaddleVideo.
registry.py - Registry: Objects registration and retrieval via names.
save_load.py - Save, load functions for model weights using Paddle.

version.py - PaddleVideo library version "0.0.1" under Apache License 2.0.

README.md - Video quality assessment model using ppTSM network on KonVid-150k dataset.
run.sh - TSM model training and testing script
save_model.sh - Saves best model from TSM_pptsm.yaml with 32 segments.
setup.py - Video understanding with PaddlePaddle toolkits setup.
eval.py - Prepares environment, imports libraries, defines functions, and runs tests.
FineTune.md - Fine-tune VideoTag model with custom data, AttentionLSTM, TSN
__init__.py - Video metrics import from metrics_util
accuracy_metrics.py - Calculates video accuracy metrics in PaddleVideo's VideoTag app.

metrics_util.py - Evaluates metrics in video analysis tasks.
average_precision_calculator.py - Calculates interpolated average precision for classification tasks.
eval_util.py - Video classification model evaluation metrics in PaddleVideo
mean_average_precision_calculator.py - Mean Average Precision Calculator for Youtube-8m Dataset

__init__.py - Registered models AttentionLSTM and TSN for easy retrieval.
__init__.py - Imports functions and classes from "attention_lstm.py" for easy access.
attention_lstm.py - AttentionLSTM model for video tagging
lstm_attention.py - LSTM Attention Model: Dynamic LSTM, Dropout

model.py - VideoTag app's Python model module for Paddle.
__init__.py - Imports all "tsn" subdirectory components.
tsn.py - TSN model class with segmentation and training parameters.
tsn_res_model.py - TSN ResNet model for PaddlePaddle

utils.py - Decompress, download functions and AttrDict class.

predict.py - Predicts video tags using PaddleVideo's models.
__init__.py - VideoTag reader classes imported, registered. Alphabetical order.
feature_reader.py - YouTube-8M LSTM DataReader in Python.
kinetics_reader.py - KineticsReader: Efficient Kinetics dataset reader with data augmentation.
reader_utils.py - Register and retrieve readers in Zoo

README.md - VideoTag: Large-scale video classification via image modeling and sequence learning.
Run.md - VideoTag app installation guide and usage.
Test.md - Test VideoTag model on custom data using Python script.
train.py - Video tagging model training in Python
tsn_extractor.py - Inference script for videos with Infer model.
config_utils.py - Config utils for PaddleVideo's VideoTag app
train_utils.py - PaddlePaddle training utility with dataloader.
utility.py - Ensures PaddlePaddle version 1.6.0 installed, handles GPU usage.

videotag_test.py - Video tagging model performance testing with PaddlePaddle and PaddleVideo.
README.md - Run benchmark script for TimeSformer model in PaddleVideo.
run_all.sh - Benchmark TimeSformer on PaddleVideo with UCF101 dataset.
run_benchmark.sh - TimeSformer video classification benchmark tests

prepare_asrf_data.py - Prepares ASRF data for classification.
transform_segmentation_label.py - Process video data for labeling and organization.

download_dataset.sh - Download and unzip NTU-RGB-D skeleton data, then statistics.
get_raw_denoised_data.py - Processes and denoises NTU RGB-D dataset data.
get_raw_skes_data.py - Extracts body info from skeleton data using NTU RGB-D dataset.
seq_transformation.py - Python script for NTU RGB+D data processing, splitting, and encoding.
auto-log.cmake - Autolog external project finder and inclusion with CMake

postprocess_op.h - Softmax Inplace Postprocessing Class
preprocess_op.h - Image preprocessing operations for PaddleVideo.
utility.h - Utility functions for PaddleVideo operations.
video_rec.h - Video recognition class with OpenCV, PaddlePaddle integration

readme.md - Deploys PaddleVideo models, C++, displays results, requires libcudnn.so fix.
readme_en.md - Deploy PaddleVideo models on Linux with Docker support.
main.cpp - Process video frames using OpenCV and PaddleVideo.
postprocess_op.cpp - Softmax postprocessing for PaddleVideo.
preprocess_op.cpp - Normalizes and scales images for inference in PaddleVideo library.
utility.cpp - Utility functions: ReadDict, frame capture, and conversion.
video_rec.cpp - Video AI inference, processing time measurement, GPU optimized.

build.sh - Setup and compile script for OpenCV, PaddlePaddle, CUDA, cuDNN, TensorRT.
paddle_env_install.sh - Setup PaddleVideo C++ serving environment with TensorRT.
preprocess_ops.py - Composes image processing steps, preprocesses video frames.
readme.md - Deploy Paddle Serving in Docker with GPU/CPU options.
readme_en.md - Accelerates PaddleServing installation with Docker, Linux, and GPU support.
run_cpp_serving.sh - Runs PaddleVideo server with PP-TSM/TSN models on different ports.
serving_client.py - Video client for Paddle Serving using PaddleVideo

predict_onnx.py - Perform video object detection with ONNX predictor.
readme.md - Converts PaddlePaddle to ONNX for inference
readme_en.md - Convert Paddle2ONNX PP-TSN model for video prediction.

pipeline_http_client.py - Serves PaddleVideo models, parses args, sends video data via HTTP.
pipeline_rpc_client.py - Web serving client for PaddleVideo models
readme.md - Deploy PaddlePaddle model for serving with PaddleServing.
readme_en.md - Deploy PaddleServing for HTTP deep learning prediction.
recognition_web_service.py - Python web service for image recognition using PaddlePaddle
utils.py - Converts video frames to numpy array, base64 strings.

quant_post_static.py - Introduces quantization in PaddleVideo for GPU utilization.
readme.md - PaddleVideo model compression with PaddleSlim.
readme_en.md - Model compression library for PaddleVideo with quantization, pruning, and distillation.
benchmark.md - PaddleVideo: Speed benchmark, Slowfast 2x faster, Action segmentation on Breakfast dataset.
ActivityNet.md - ActivityNet dataset prep for PaddleVideo
AVA.md - AVA dataset: download, cut, frame extraction, organization.
fsd.md - Figure Skating Dataset: 30 fps, Open Pose key points, train/test data available.
k400.md - Download and extract Kinetics-400 dataset frames.
msrvtt.md - MSR-VTT: 10K videos, ActBERT model, multi-modal transformers.
ntu-rgbd.md - Prepares NTU RGB+D dataset for CTR-GCN
Oxford_RobotCar.md - Oxford-RobotCar: Day-Night Depth Estimation Dataset
README.md - Comprehensive action recognition datasets table.
SegmentationDataset.md - Video action segmentation dataset using breakfast, 50salads, and gtea datasets with pre-training model features.
ucf101.md - UCF101 dataset organization tool
ucf24.md - UCF24 dataset preparation with PaddleVideo
youtube8m.md - Massive YouTube video classification dataset.

install.md - Install PaddlePaddle and PaddleVideo, requirements, setup, usage.
SlowFast_FasterRCNN_en.md - Train SlowFast Faster RCNN model for action detection
adds.md - Estimates depth using day/night images with ADDS-DepthNet code.

bmn.md - BMN model: temporal action proposal generation with commands.
yowo.md - YOWO: Single-stage, channel fusion, attention, UCF101-24 pre-trained.

actbert.md - ActBERT: Multimodal, global action, TNT block, SOTA video-language.

transnetv2.md - TransNetV2: Deep Learning Video Segmentation Model
README.md - Model Zoo: Comprehensive Action Recognition & Segmentation Models.
agcn.md - AGCN: Improved ST-GCN for Video Recognition
agcn2s.md - AGCN2s: Enhanced ST-GCN for motion recognition.
attention_lstm.md - AttentionLSTM: LSTMs, Attention layer, YouTube-8M, PaddleVideo
ctrgcn.md - PaddlePaddle, CTR-GCN, NTU-RGB+D, bone-based behavior recognition, 99.9988% accuracy
movinet.md - MoViNet: Lightweight Video Recognition Model
posec3d.md - Trains PoseC3D model, tests without GPU.
pp-timesformer.md - PP-TimeSformer: Enhanced TimeSformer for video recognition, multi-GPU support.
pp-tsm.md - PP-TSM: Action Recognition, UCF101/Kinetics, PaddlePaddle, ResNet101
pp-tsn.md - Enhanced TSN model for video recognition
slowfast.md - SlowFast model: Video Recognition with Multigrid Training.
stgcn.md - ST-GCN model training, testing, and inference.
timesformer.md - TimeSformer: Efficient Video Classifier
tokenshift_transformer.md - Token Shift Transformer: High Accuracy Video Classification
tsm.md - Trains TSM model on UCF-101, Kinetics-400 datasets. ResNet-50, PaddlePaddle, AMP, Momentum optimization, L2_Decay, three sampling methods.
tsn.md - TSN: 2D-CNN video classification with ResNet-50 and Kinetics-400.
tsn_dali.md - Improve TSN training speed with DALI for action recognition.
videoswin.md - SOTA Video-Swin-Transformer model with multi-scale, efficient attention.

asrf.md - Improved video action segmentation model using PaddlePaddle.
cfbi.md - CFBI video segmentation model, ECCV 2020.
mstcn.md - Trains, evaluates, and compares MS-TCN video action segmentation models.
quick_start.md - PaddleVideo: Install, Use, Action Recognition
tools.md - Tools for PaddleVideo: model parameters, FLOPs calculation, and exported model testing. Python 3.7 required.
accelerate.md - English/Chinese tutorial links provided.
Action Recognition Datasets - Action Recognition Datasets: Essential resources for training and evaluation.
Action Recognition Papers - Efficient Action Recognition Methods
config.md - Inversion-of-Control, Dependency Injection with Config File
customized_usage.md - Customize PaddleVideo framework elements.
demos - Demo: Action Recognition with Various Algorithms
deployment.md - Convert dygraph models for deployment with PaddleInference.
modular_design.md - Modular Design Tutorial Translation
pp-tsm.md - Optimized video recognition model for high performance and fast inference.
Spatio-Temporal Action Detection Papers - Comprehensive list of Spatio-Temporal Action Detection papers, 2015-2017.
summarize.md - Video Classification: RGB, Skeleton, Deep Learning
Temporal Action Detection Papers - Action detection and localization in videos through research papers.
TSM.md - TSM: Efficient Video Understanding, Balanced Performance

usage.md - PaddleVideo Linux setup, multi-card training, log format, fine-tuning, benchmarking
main.py - Parallelized PaddleVideo model training with distributed environments.
MANIFEST.in - Includes essential files for PaddleVideo package distribution.
__init__.py - PaddleVideo library initialization.
__init__.py - PaddleVideo loader/init functions for datasets, dataloaders, pipelines.
builder.py - Constructs PaddleVideo pipeline, builds loader for distributed training.
dali_loader.py - Dali-powered Paddle video loader
__init__.py - Imports PaddleVideo dataset classes for video understanding.
actbert_dataset.py - ActBERT dataset setup for PaddlePaddle video processing
asrf_dataset.py - Action segmentation dataset loader for PaddleVideo
ava_dataset.py - AVA Dataset Class for PaddleVideo
base.py - BaseDataset class for PaddlePaddle with loading, preparing, retrieving. Supports list results.
bmn_dataset.py - BMNDataset: Action localization video datasets loader.
davis_dataset.py - VOS Test class for Davis 2017 dataset in PaddleVideo.
feature.py - PaddleVideo library's FeatureDataset initialization and methods.
frame.py - PaddleVideo library: load, transform, process video data.
MRI.py - MRI dataset loader for action recognition
MRI_SlowFast.py - MRI_SlowFast: Imports, classes, processes video data for training/validation.
ms_tcn_dataset.py - MS-TCN dataset class and loader
msrvtt.py - Prepares MSRVTT dataset for training/testing.
oxford.py - Oxford dataset class for PaddleVideo.
skeleton.py - Skeleton Dataset Loader for Action Recognition
slowfast_video.py - Video dataset for action recognition using PaddleVideo's SFVideoDataset.
ucf101_skeleton.py - UCF101 Skeleton Dataset PaddleVideo Class
ucf24_dataset.py - Ucf24Dataset: PaddleVideo's UCF24 dataset loader.
video.py - VideoDataset: Raw video loader with index, handling corruption.
__init__.py - PaddleVideo pipeline components and modules import.
anet_pipeline.py - PaddleVideo: Feature extraction, IoU calculation, and data preparation.
augmentations.py - PaddleVideo loader with resize, augmentation.
augmentations_ava.py - AVA dataset augmentation and resizing in PaddleVideo.
compose.py - Unifies pipeline components, flexible transformations.
decode.py - TimeSformer model for video decoding in PaddleVideo
decode_image.py - Decode image pipeline for PaddleVideo.
decode_sampler.py - Video decoding pipeline with PIL conversion.
decode_sampler_MRI.py - Decodes and samples MRI frames for SFMRI.
mix.py - Video Mix Operator: Data Augmentation for Image Classification
multimodal.py - Introduces FeaturePadding class for multimodal data preprocessing.
sample.py - Python code defines PaddleVideo pipeline for efficient frame sampling.
sample_ava.py - SampleFrames class for PaddleVideo loader pipelines in OpenCV.
sample_ucf24.py - SamplerUCF24: Pipeline for sampling frames from videos.
segmentation.py - Multi-scale segmentation for PaddleVideo.
segmentation_pipline.py - Segmentation sampler for PaddleVideo pipelines.
skeleton_pipeline.py - Data processing classes for efficient PaddleVideo.

registry.py - Custom pipeline and dataset registries for PaddleVideo.
__init__.pyImports various metrics for video analysis.__init__.pyImports ANETproposal as public API.anet_prop.pyEfficiently calculates metrics for ActivityNet proposals.
metrics.pyCalculates precision, recall, and CorLoc metrics for object detection.np_box_list.pyManages bounding boxes, checks validity.np_box_ops.pyNumpy bounding box ops for IoU calculationobject_detection_evaluation.pyObject detection evaluation module for AVA dataset with mAP metrics.per_image_evaluation.pyPython code for AVA performance metrics evaluation.README.mdAction recognition metrics evaluation toolstandard_fields.pyStandardizes object detection metrics.
ava_metric.pyAVA metric computation for PaddleVideo.ava_utils.pyAVA metric handling for video sequencesbase.pyPaddleVideo Base Metrics Classbmn_metric.pyCalculates BMN metric for object detection.build.pyPython script, Apache License 2.0, builds metrics from configcenter_crop_metric.pyCenter crop metric for PaddleVideo.center_crop_metric_MRI.pyCalculates top-1/5 accuracy metrics for image classification tasks with multi-GPU support.depth_metric.pyDepthMetric: Process, accumulate, all-reduce, and average metrics.msrvtt_metric.pyMS-RNN/VTT model metrics computation and loggingmulti_crop_metric.pyMultiCrop metric for PaddleVideo accuracyrecall.pyCalculate recall for object detection using IoU and image proposals.registry.pyManage diverse metrics with Registry class in utils module.segmentation_metric.pyLabel change detection with precision, recall, F1 score in video segmentation.skeleton_metric.pySkeleton metric: Skeleton-based model performance evaluation.transnetv2_metric.pyTransNetV2 metrics: precision, recall, F1-score calculation.ucf24_utils.pyUcf24Metrics: UCF101 dataset utilities for precision/recall, mAP.vos_metric.pyVOSMetric: Video Object Segmentation Metricsaverage_precision_calculator.pyAverage precision calculator for video detection tasks.eval_util.pyEvaluates PaddleVideo performance in Youtube8m using Hit@1 and precision.mean_average_precision_calculator.pyCalculates mAP for ranked lists with interpolated precisions.
yowo_metric.py: YOWOMetric class for PaddleVideo metrics calculation and logging.

__init__.py: PaddleVideo model registry and function initialization.
__init__.py: Imports and exports MaxIoUAssignerAVA.
max_iou_assigner_ava.py: Max IoU assigner for AVA.

__init__.py: Backbone models for video analysis in PaddleVideo.
actbert.py: BERT embeddings for multimodal video action recognition with the ActBERT backbone.
adds.py: PaddleVideo backbone with ResNet V1.5, DepthDecoder, and PoseDecoder for depth and pose estimation.
agcn.py: AGCN: adaptive GCN backbone for graph convolution tasks.
agcn2s.py: Temporal convolutional networks with GCN units for NTU RGB+D.
asrf.py: ASRF backbone for PaddleVideo tasks.
bmn.py: BMN model: masks and ConvLayers for the BMN backbone in Paddle.ai.
cfbi.py: FPN-based CFBI model for feature extraction.
ctrgcn.py: CTRGCN backbone for video models with batch norm and NTUGraph.
darknet.py: Darknet backbone with ConvBNLayer, concatenation, and convolutions.
deeplab.py: Deeplab: PaddlePaddle network for semantic segmentation.
movinet.py: MoViNet model with MobileNetV2 layers for video analysis.
ms_tcn.py: MSTCN backbone, SingleStageModel class, Kaiming initialization.
pptsm_mv2.py: MobileNetV2 backbone for PaddlePaddle.
pptsm_mv3.py: PPTSM-Mv3 backbone for PaddleVideo.
pptsm_v2.py: PPTSM_v2 video backbone.
resnet.py: ResNet backbone model with dynamic blocks.
resnet3d.py: 3D ResNet model for PaddleVideo.
resnet3d_slowonly.py: SlowOnly 3D ResNet backbone with SlowFast pathway.
resnet_slowfast.py: ResNetSlowFast for video recognition with slow-fast pathways.
resnet_slowfast_MRI.py: Initializes the ResNet SlowFast MRI model for video analysis.
resnet_tsm.py: ResNet-TSM backbone: deprecated, needs updates.
resnet_tsm_MRI.py: ResNet-TSM MRI model in PaddleVideo.
resnet_tsn_MRI.py: ResNet-TSN model in PaddlePaddle: weight initialization and results output.
resnet_tweaks_tsm.py: Tweaked ResNet backbone for TSM (Temporal Shift Module).
resnet_tweaks_tsn.py: ResNet TSN backbones in the PaddleVideo library.
resnext101.py: Defines the ResNeXt-101 PaddlePaddle model.
stgcn.py: STGCN model for spatio-temporal data processing.
swin_transformer.py: Swin Transformer backbone for image processing in PaddleVideo.
toshift_vit.py: Vision Transformer model with token shift.
transnetv2.py: OctConv3D 3D convolutional layer for shot transition detection in TransNetV2.
vit.py: VisionTransformer-based video processing in PaddleVideo.
vit_tweaks.py: Vit_tweaks: VisionTransformer enhancements.
yowo.py: YOWO: YOLO-style backbone for video classification.

bbox_utils.py: Bounding box utilities for YOLO.
builder.py: Dynamic video object detection model builder.
__init__.py: PaddleVideo framework: base classes for modeling.
__init__.py: PaddleVideo detectors: Base, FastRCNN, TwoStage.
base.py: BaseDetector: parent class for detectors, with common features and an abstract train_step.
fast_rcnn.py: FastRCNN: two-stage object detection.
two_stage.py: Two-stage detector implementation for the SlowFast model.

__init__.py: Imports and exports the DepthEstimator and BaseEstimator classes.
base.py: BaseEstimator class for PaddleVideo modeling.
depth_estimator.py: DepthEstimator: depth estimation model with forward_net and loss.

__init__.py: PaddleVideo localizers for video tasks using various techniques.
base.py: Base class for PaddlePaddle localization models.
bmn_localizer.py: BMN localizer model for PaddleVideo.
yowo_localizer.py: YOWOLocalizer: NMS, matching, precision, recall, F-score.
yowo_utils.py: YOLOv2 post-processing with PaddlePaddle.

__init__.py: Multimodal model framework for PaddleVideo.
actbert.py: Multimodal ActBert model for text, video, and action prediction.
base.py: Multimodal model base class for PaddleVideo.

__init__.py: PaddleVideo partitioner initialization file.
base.py: Base partitioner class for PaddleVideo's modeling framework.
transnetv2_partitioner.py: TransNetV2 partitioner for PaddleVideo.

__init__.py: Imports recognizers for video recognition tasks in PaddleVideo.
base.py: Base recognizer model initialization and operation modes.
recognizer1d.py: 1D recognizer model in PaddleVideo.
recognizer2d.py: 2D video model for PaddleVideo analysis.
recognizer3d.py: Recognizer3D: 3D model framework for training, validation, and testing.
recognizer3dMRI.py: 3D recognizer model for PaddleVideo training, validation, and testing.
recognizer_gcn.py: GCN recognizer model for PaddleVideo.
recognizer_movinet_frame.py: MoViNetRecognizerFrame: framework for model training, testing, and inference.
recognizer_transformer.py: Transformer-based recognizer model implementation.
recognizer_transformer_MRI.py: RecognizerTransformer_MRI: multi-view MRI classifier.
recognizerDistillation.py: Recognizer distillation in the PaddleVideo framework.
recognizerMRI.py: 2D image classifier model using RecognizerMRI.

__init__.py: Segment models in PaddleVideo: BaseSegment and CFBI.
base.py: Abstract base class for semi-supervised video object segmentation.
cfbi.py: CFBI model for image segmentation and video processing with instance-level attention.
utils.py: PaddleVideo segment matching, ASPP models, padding, distances, and data preparation.

__init__.py: PaddleVideo segmenter module with 3 classes for feature extraction, segmentation, and separation.
asrf.py: ASRF segmenter: ASRF backbone, head network, validation.
base.py: BaseSegmenter: foundation for PaddleVideo segmenters with train, valid, test, and inference modes.
ms_tcn.py: MS-TCN video segmenter model.
utils.py: Gaussian smoothing and Kaiming uniform layer initialization.

__init__.py: Imports various head classes for video tasks.
adds_head.py: AddsHead: detection, loss, metrics, multi-GPU support.
agcn2s_head.py: AGCN2sHead: PaddleVideo model head with channels, classes, people, and dropout.
asrf_head.py: ASRF head: action recognition, convolutional layers, precision, recall, F1.
attention_lstm_head.py: Attention LSTM head for PaddleVideo.
base.py: Base PaddleVideo head: classification, loss/accuracy, label smoothing.
bbox_head.py: Object detection bbox head with classification targets, dropout, and recall/precision calculation.
cfbi_head.py: CollaborativeEnsemblerMS neural network architecture.
ctrgcn_head.py: CTRGCN head: neural network head for the CTR-GCN model in PaddleVideo.
i3d_head.py: I3D head with pooling, dropout, and options.
movinet_head.py: MoViNetHead: BaseHead extension that returns its input unmodified.
ms_tcn_head.py: MS-TCN head model: F1 scoring and loss calculation.
pptimesformer_head.py: PaddlePaddle TimeSformer head.
pptsm_head.py: ppTSMHead class for TSN models.
pptsn_head.py: ppTSN head for classification tasks with dropout.
roi_extractor.py: RoIAlign extractor for PaddlePaddle.
roi_head.py: ROI alignment head with NMS for bounding boxes.
single_straight3d.py: 3D ROI extractor class for feature extraction.
slowfast_head.py: SlowFast head: 3D projection class for PaddleVideo.
stgcn_head.py: STGCNHead: convolutional layer and forward pass in PaddlePaddle.
timesformer_head.py: TimeSformer head: model head for TimeSformer.
token_shift_head.py: TokenShiftHead: linear classification head in Paddle.
transnetv2_head.py: TransNetV2Head: CV model head with loss calculation.
tsm_head.py: TSM head: TSNHead extension for classification tasks.
tsn_head.py: TSN head: adaptive pooling, dropout, class scores.

__init__.py: Imports diverse losses for PaddleVideo tasks.
actbert_loss.py: ActBert loss function code.
asrf_loss.py: Custom loss functions for video modeling.
base.py: Base loss function class for PaddlePaddle.
bmn_loss.py: BMN loss function for PaddleVideo.
cross_entropy_loss.py: Custom PaddlePaddle loss for classification tasks.
depth_loss.py: Calculates depth losses for disparity tasks.
distillation_loss.py: Distillation loss: KL divergence, CrossEntropy, weighted average.
transnetv2_loss.py: TransNetV2 loss calculation class.
yowo_loss.py: Optimizes hard examples with YOLOv3-style losses for bounding box location, confidence, and classification on GPU.

registry.py: Model registration for efficient organization.
__init__.py: Licensing, RandomSampler import.
random_sampler.py: RandomSampler: balanced positive-negative bbox sampling.

weight_init.py: Initializes layer weights in PaddlePaddle.

__init__.py: Imports optimizer and learning rate functions under the Apache 2.0 license.
custom_lr.py: Custom warmup and decay schedulers for PaddleVideo.
lr.py: Learning rate scheduler construction.
optimizer.py: Initializes optimizer configurations, handles decay, and creates the optimizer.

__init__.py: PaddleVideo tasks module initialization file.
test.py: Tests PaddlePaddle models in parallel.
train.py: Distributed PaddlePaddle model training with AMP, logging, evaluation, and saving.
train_dali.py: Trains the TSN model with DALI, runs optimization steps, and logs metrics.
train_multigrid.py: Trains PaddleVideo models with a multigrid configuration.

__init__.py: PaddleVideo utils initialization.
build_utils.py: Build utility function for object creation from config and registry.
config.py: Configures and loads YAML files with AttrDict, logger, and visualization support.
dist_utils.py: Distributed computing utilities for PaddleVideo.
logger.py: PaddleVideo's colorful logger setup with non-propagation.
__init__.py: Imports useful functions from the PaddleVideo library.
batchnorm_helper.py: BatchNorm3D helper for PaddlePaddle.
interval_helper.py: Determines the evaluation epoch using the multigrid schedule.
multigrid.py: Manages multigrid training schedules.
save_load_helper.py: Ensures state dict consistency and loads pre-trained weights.
short_sampler.py: DistributedShortSampler: efficient distributed data loading.

precise_bn.py: Precise Batch Normalization for faster training.
profiler.py: PaddleVideo Profiler: performance analysis and optimization tool.
record.py: Records metrics, calculates means, and logs progress in training processes.
registry.py: Registry for mapping names to objects with registration and retrieval methods.
save_load.py: Save/load functions for PaddlePaddle models.

version.py: PaddleVideo version 0.0.1, Apache License 2.0.

README.md: Advanced video processing library with deployment support.
README_en.md: Deep learning library for video processing.
run.sh: Trains deep learning models for computer vision tasks using PaddlePaddle.
setup.py: PaddleVideo setup: PaddlePaddle-powered video tool, Python 3.7.
benchmark_train.sh: Benchmarks PaddleVideo models, adjusting batch sizes and precisions for performance.
common_func.sh: Common functions for parsing, setting, and status checking.
compare_results.py: Compares log results with ground truth for testing.
extract_loss.py: Extracts and processes expressions for calculations.
prepare.sh: Prepares PaddlePaddle's video models, handles data, and installs packages.
README.md: TIPC-supported PaddleVideo tutorials and features.
test_inference_cpp.sh: PaddleVideo model inference tests with MKLDNN or floating-point precision.
test_paddle2onnx.sh: Converts PaddlePaddle models to ONNX.
test_ptq_inference_python.sh: Script for PaddleVideo model inference tests on GPUs/CPUs with varied batch sizes.
test_serving_infer_cpp.sh: Bash script that initializes the model, serves it via Python/C++, and tests the web function with IFS separation.
test_serving_infer_python.sh: Configures model serving, sets up the API server, transfers the model, and handles cleanup tasks.
test_train_dy2static_python.sh: Trains and analyzes two models, compares losses, and logs differences.
test_train_inference_python.sh: PaddleVideo model training and evaluation.
test_train_inference_python_npu.sh: Updates config for NPU, disables MKLDNN on non-x86_64, sets Python to 3.9, changes exec script to "npu".
test_train_inference_python_xpu.sh: XPU PaddleVideo script with Python 3.9 NPU, logs start.

__init__.py: Imports and defines package contents.
ava_predict.py: AVA model prediction with PaddleVideo and OpenCV.
export_model.py: Export functions for PaddleVideo models.
predict.py: Paddle Video Tool: predict with TensorRT YOWO.
summary.py: PaddleVideo model construction and evaluation.
utils.py: PaddleVideo tools for inference, detection, and pose estimation.
wheel.py: Wheel.py: downloads, initializes, and iterates model results for video label prediction.