
Chinese Library Classification: TP391.4
University Code: 10213
U.D.C.: 681.39
Security Classification: Public

Dissertation for the Master Degree in Engineering

IMAGE INTERPOLATION FOR COMPRESSED VIDEO

Candidate: Gao Xinwei
Supervisor: Prof. Zhao Debin
Academic Degree Applied for: Master of Engineering
Speciality: Computer Science and Technology
Affiliation: School of Computer Science and Technology
Date of Defence: June, 2011
Degree-Conferring Institution: Harbin Institute of Technology

Harbin Institute of Technology, Dissertation for the Master Degree in Engineering - I -

Abstract (Chinese version)

Image interpolation is one of the most fundamental topics in image research, and many interpolation methods have been proposed in the literature for uncompressed images.
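However, a large number of video sequences are stored in compressed formats, or must be transmitted in compressed form because of bandwidth limits. Interpolation methods designed for uncompressed video often perform poorly when applied directly to compressed video. On the one hand, these methods do not use the information already available in the video bitstream; on the other hand, they do not account for the quantization error of compressed video, which can be significant in some cases. To address these problems, we study image interpolation for H.264/AVC and MPEG-2 compressed video.

We propose a mode-guided intra-frame interpolation method for H.264/AVC compressed video. The intra directional prediction mode information is taken into account when designing the interpolation filters. For each intra directional prediction mode, we train a corresponding set of optimal interpolation filters on a training set of classic video sequences, so that each filter adapts automatically to the intra directional prediction mode it corresponds to. Furthermore, the quantization parameter is used as context in the design and selection of the interpolation filters. Experimental results show that, compared with the traditional Bicubic, Bilinear, LAZA and NEDI methods, the proposed model improves interpolation performance while keeping computational complexity low.

We further propose a mode-guided inter-frame interpolation method for H.264/AVC compressed video. For each inter frame (P frame or B frame), the inter prediction mode is used to obtain motion information such as the motion vector.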
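For each pixel to be interpolated in an inter frame, the interpolation filter is copied from that of its corresponding reference pixel, which is located using the motion information. For a bidirectionally predicted pixel, the result is the mean of the two pixel values obtained by filtering in each direction. This design does not disrupt the structure of the compressed video. Experimental results show that, compared with the traditional Bicubic, Bilinear, LAZA and NEDI methods, the proposed model improves interpolation performance while keeping computational complexity low.

Drawing on the success of the H.264/AVC intra prediction modes and of edge-directed image interpolation, we propose a directional interpolation method for MPEG-2 compressed video. In this model, the regular 8×8 blocks in intra frames are classified into nine directions in the transform domain, and the interpolation of each block is then performed along that block's direction. For each of the defined directions, we train a set of optimal Wiener interpolation filters on a training set of classic video sequences and use these filters for interpolation. Similarly, for each regular block in an inter frame, the corresponding block along its direction serves as the interpolation reference for that block. Experimental results show that, compared with the traditional linear models Bicubic and Bilinear and the direction-guided models LAZA and NEDI, the proposed model improves interpolation performance while keeping computational complexity low enough for practical applications.

Keywords: image interpolation; compressed video; mode guidance; Wiener filter; low complexity

Abstract

Image interpolation is one of the most elementary imaging research topics. A number of image interpolation methods have been developed for uncompressed images in the literature. However, many videos are already stored in a compressed format or have to be transmitted in a compressed format because of bandwidth limits. The image interpolation methods developed for uncompressed images may not be effective when applied directly to compressed videos: on the one hand, they do not utilize the information available in the coded bitstreams; on the other hand, they do not consider the quantization error, which may be dominant in some cases. To address these problems, we study image interpolation for H.264/AVC and MPEG-2 compressed video.

A mode-dependent intra frame interpolation method is proposed for H.264/AVC compressed video. The intra prediction mode information is taken into account in the interpolation filter design. For each intra prediction mode, an optimal Wiener filter is trained on representative video sequences, so the trained filter is adapted to that intra prediction mode. Furthermore, the quantization parameter is also used as context information for filter selection. Extensive experiments demonstrate that the proposed method achieves better performance than traditional methods such as Bicubic, Bilinear, LAZA and NEDI, while keeping computational complexity low.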
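The per-mode filter training described above can be illustrated with a least-squares fit, which gives the empirical Wiener solution for a linear filter. This is only a minimal sketch under assumed conventions, not the thesis implementation. The function names `train_wiener_filters` and `interpolate_pixel`, the flattened-neighbourhood data layout, the default of nine modes, and the fallback filter are all hypothetical choices made for illustration.

```python
import numpy as np


def train_wiener_filters(patches, targets, modes, num_modes=9):
    """Fit one linear interpolation filter per mode by least squares.

    patches: (N, K) array; each row is a flattened low-resolution
             neighbourhood (the layout is an assumption of this sketch).
    targets: (N,) array; the true high-resolution pixel for each row.
    modes:   (N,) int array; the prediction-mode label of each row.
    Returns a dict mapping mode index -> (K,) filter coefficients.
    """
    filters = {}
    for m in range(num_modes):
        sel = modes == m
        if not np.any(sel):
            continue  # no training samples were observed for this mode
        # Minimise ||X w - y||^2; this is the empirical Wiener-Hopf solution.
        w, *_ = np.linalg.lstsq(patches[sel], targets[sel], rcond=None)
        filters[m] = w
    return filters


def interpolate_pixel(patch, mode, filters, fallback):
    """Apply the filter trained for `mode`, or `fallback` if none exists."""
    w = filters.get(mode, fallback)
    return float(np.dot(patch, w))
```

In this sketch the quantization parameter could be folded into the label, for example by training a separate filter set per range of quantization parameters, but how the thesis combines the two is not specified in the abstract.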
A mode-dependent inter frame interpolation method is proposed for H.264/AVC compressed video. For the inter frames (P frames and B frames), the inter prediction mode information is taken into account by utilizing the motion information, i.e. the motion vector. For each pixel to be interpolated in an inter frame, the filters of its corresponding pixels in the reference frames are used. Extensive experiments demonstrate that the proposed method achieves better performance than traditional methods such as Bicubic, Bilinear, LAZA and NEDI, while keeping computational complexity low.

Inspired by the success of intra prediction in H.264/AVC and of edge-directed image interpolation methods (such as LAZA and NEDI), we propose a directional frame interpolation method for MPEG-2 compressed video. In the proposed method, 8×8 intra blocks in I frames are first classified into nine block directions in the transform domain. The interpolation of each block is then performed along its block direction. For each block direction, an optimal Wiener filter is trained on representative video sequences and then used for interpolation. Similarly, for each pixel in an inter block in a P or B frame, the interpolation is performed along the block direction of its corresponding reference pixels. The experimental results demonstrate that the proposed method achieves better performance than traditional linear methods such as Bicubic and Bilinear and edge-directed methods such as LAZA and NEDI, while keeping computational complexity low enough to meet the requirements of practical applications.

Keywords: Imag
