Algorithm Overview

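Task Overview

Monocular depth estimation takes a single RGB image as input, estimates the depth of every pixel from cues in the image such as structure, corners, and relative positions, and outputs a dense depth map.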

Model Description

This model comes from the paper "From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation".
[Paper] |
[Project page]

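The model mainly explores how to make full use of multi-scale information to improve the quality of dense depth maps. For example, it adds an ASPP module to the main architecture, adds ample skip connections between branches at different scales to fuse multi-scale information, and designs a Local Planar Guidance module within each branch to exploit local information.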
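To make the Local Planar Guidance idea more concrete, here is a minimal sketch of how one set of plane coefficients could be expanded into a k x k patch of depth cues. It assumes the ray-plane intersection form described in the paper, depth = n4 / (n1*u + n2*v + n3), where (n1, n2, n3) is a unit plane normal and n4 is the plane's distance. The function name `local_planar_guidance` and the exact coordinate normalization are illustrative assumptions, not the model's actual implementation, which operates on batched tensors.

```python
def local_planar_guidance(n1, n2, n3, n4, k):
    """Expand one plane (n1, n2, n3, n4) into a k x k patch of depth cues.

    Hypothetical helper for illustration only.
    """
    patch = []
    for row in range(k):
        # Normalized patch coordinates centered on the patch.
        # This particular normalization is an assumption.
        v = (row - (k - 1) / 2) / k
        line = []
        for col in range(k):
            u = (col - (k - 1) / 2) / k
            # Depth where the viewing ray through (u, v) meets the plane.
            line.append(n4 / (n1 * u + n2 * v + n3))
        patch.append(line)
    return patch


# A fronto-parallel plane (normal along the optical axis) gives constant depth.
flat = local_planar_guidance(0.0, 0.0, 1.0, 5.0, k=4)

# A plane tilted about the vertical axis gives depth that varies along each row.
tilted = local_planar_guidance(0.6, 0.0, 0.8, 5.0, k=2)
```

In this sketch a coarse feature location only has to predict four numbers, and the patch inherits a locally planar depth structure from them, which is the intuition behind using the module at several scales.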

Model Architecture

Inference Example

GPU only.

import cv2

from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Sample KITTI image hosted by ModelScope
image = 'https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/image_depth_estimation_kitti_007517.png'

# Build the depth-estimation pipeline for this model
estimator = pipeline(task=Tasks.image_depth_estimation, model='damo/cv_densenet161_image-depth-estimation_bts')
result = estimator(input=image)

# Take the color-mapped depth visualization and save it to disk
depth_vis = result[OutputKeys.DEPTHS_COLOR]
cv2.imwrite('result.jpg', depth_vis)

Model Performance

KITTI dataset

Base Network	AbsRel	SqRel	RMSE	RMSElog	SILog
DenseNet161	0.06	0.29	3.03	0.1	9.83
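For readers unfamiliar with the column names in the table above, the following sketch shows how these metrics are commonly computed in monocular depth evaluation. It assumes the usual definitions (including the x100 scaling of SILog) and that `gt` and `pred` are paired lists of valid, strictly positive depths in meters. The function name `depth_metrics` is illustrative, and this is not necessarily the exact evaluation script behind the reported numbers.

```python
import math


def depth_metrics(gt, pred):
    """Compute common depth-error metrics over paired valid depths."""
    n = len(gt)
    pairs = list(zip(gt, pred))

    # Mean absolute and squared errors, each relative to the ground truth.
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n

    # Root mean squared error in linear and log space.
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    log_diff = [math.log(p) - math.log(g) for g, p in pairs]
    rmse_log = math.sqrt(sum(d ** 2 for d in log_diff) / n)

    # Scale-invariant log error: the spread of the log differences, x100.
    mean_d = sum(log_diff) / n
    variance = sum(d ** 2 for d in log_diff) / n - mean_d ** 2
    silog = math.sqrt(max(variance, 0.0)) * 100

    return {
        'AbsRel': abs_rel,
        'SqRel': sq_rel,
        'RMSE': rmse,
        'RMSElog': rmse_log,
        'SILog': silog,
    }


# Doubling every depth keeps SILog near zero, while AbsRel becomes 1.0.
scaled = depth_metrics([1.0, 2.0, 4.0], [2.0, 4.0, 8.0])
```

Lower is better for all five metrics. In practice these sums would be vectorized over the valid pixels of each image with an array library.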

Citation

@article{lee2019big,
  title={From big to small: Multi-scale local planar guidance for monocular depth estimation},
  author={Lee, Jin Han and Han, Myung-Kyu and Ko, Dong Wook and Suh, Il Hong},
  journal={arXiv preprint arXiv:1907.10326},
  year={2019}
}
@misc{ErenBalatkan/Bts-PyTorch,
      title={https://github.com/ErenBalatkan/Bts-PyTorch}
}