This is a video stabilization model: given a shaky input video, it performs end-to-end stabilization (video de-shaking) and returns the stabilized result.
The examples below show the model's effect; the demo's test videos come from the DVS open-source dataset.
Original video (left) | Stabilized video (right)
With the ModelScope framework, the model can be applied to an input video through a simple Pipeline call. Inference is currently supported on GPU only. Example code:
import cv2
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
test_video = 'https://modelscope.oss-cn-beijing.aliyuncs.com/test/videos/video_stabilization_test_video.avi'
video_stabilization = pipeline(Tasks.video_stabilization,
                               model='damo/cv_dut-raft_video-stabilization_base')
out_video_path = video_stabilization(test_video)[OutputKeys.OUTPUT_VIDEO]
print('Pipeline: the output video path is {}'.format(out_video_path))
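Conceptually, stabilization pipelines like this one estimate the camera's motion trajectory and smooth it, then warp each frame by the difference between the shaky and smoothed paths. The toy sketch below illustrates only that trajectory-smoothing idea with a moving average; it is not the model's actual algorithm (DUT predicts trajectories with deep networks and RAFT optical flow):

```python
import random

random.seed(0)
n, win = 120, 9
half = win // 2

# A smooth intended camera pan plus per-frame hand-shake jitter.
intended = [0.1 * t for t in range(n)]
shaky = [x + random.gauss(0, 0.8) for x in intended]

# Moving-average smoothing of the shaky trajectory (window truncated at the ends).
stable = []
for t in range(n):
    window = shaky[max(0, t - half):t + half + 1]
    stable.append(sum(window) / len(window))

# Mean absolute deviation from the intended path, before and after smoothing.
def err(path):
    return sum(abs(a - b) for a, b in zip(path, intended)) / n

print(f'error before: {err(shaky):.3f}  after: {err(stable):.3f}')
```

The per-frame correction a stabilizer applies corresponds to `stable[t] - shaky[t]`, realized as an image warp rather than a scalar shift.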
Evaluation with the NUS video stabilization dataset:
import os
import tempfile
from modelscope.hub.snapshot_download import snapshot_download
from modelscope.utils.config import Config
from modelscope.utils.constant import DownloadMode, ModelFile
from modelscope.trainers import build_trainer
from modelscope.msdatasets import MsDataset
from modelscope.msdatasets.task_datasets.video_stabilization import \
    VideoStabilizationDataset
tmp_dir = tempfile.TemporaryDirectory().name
if not os.path.exists(tmp_dir):
    os.makedirs(tmp_dir)
model_id = 'damo/cv_dut-raft_video-stabilization_base'
cache_path = snapshot_download(model_id)
config = Config.from_file(os.path.join(cache_path, ModelFile.CONFIGURATION))
dataset_val = MsDataset.load(
    'NUS_video-stabilization',
    namespace='zcmaas',
    subset_name='Regular',
    split='train',
    download_mode=DownloadMode.REUSE_DATASET_IF_EXISTS)._hf_ds
eval_dataset = VideoStabilizationDataset(dataset_val, config.dataset)
kwargs = dict(
    model=model_id,
    train_dataset=None,
    eval_dataset=eval_dataset,
    work_dir=tmp_dir)
trainer = build_trainer(default_args=kwargs)
metric_values = trainer.evaluate()
print(metric_values)
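One caveat in the snippet above: `tempfile.TemporaryDirectory().name` keeps only the path string, so the directory object is garbage-collected (and its directory deleted) almost immediately, which is why the `os.makedirs` guard is needed. A sketch of the more robust standard-library pattern (the trainer calls themselves are omitted):

```python
import os
import shutil
import tempfile

# mkdtemp() creates a directory that persists until explicitly removed,
# so no exists/makedirs guard is needed before passing it as work_dir.
tmp_dir = tempfile.mkdtemp()
print(os.path.isdir(tmp_dir))   # True

# ... build_trainer(...) / trainer.evaluate() would write logs and
# checkpoints under tmp_dir here ...

# Clean up once the metrics have been read.
shutil.rmtree(tmp_dir, ignore_errors=True)
print(os.path.exists(tmp_dir))  # False
```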
The model was trained on the public DeepStab dataset.
The training and inference code of this model draws on several open-source projects; see the works cited below.
If you find this model helpful, please consider citing the relevant papers:
@article{xu2022dut,
title={{DUT}: Learning video stabilization by simply watching unstable videos},
author={Xu, Yufei and Zhang, Jing and Maybank, Stephen J and Tao, Dacheng},
journal={IEEE Transactions on Image Processing},
volume={31},
pages={4306--4320},
year={2022},
publisher={IEEE}
}
@inproceedings{teed2020raft,
title={{RAFT}: Recurrent all-pairs field transforms for optical flow},
author={Teed, Zachary and Deng, Jia},
booktitle={European conference on computer vision},
pages={402--419},
year={2020},
organization={Springer}
}
@article{Choi_TOG20,
author = {Choi, Jinsoo and Kweon, In So},
title = {Deep Iterative Frame Interpolation for Full-Frame Video Stabilization},
year = {2020},
issue_date = {February 2020},
publisher = {Association for Computing Machinery},
volume = {39},
number = {1},
issn = {0730-0301},
url = {https://doi.org/10.1145/3363550},
journal = {ACM Transactions on Graphics},
articleno = {4},
numpages = {9},
}