Robust Speech Processing Based on Auditory Model and Statistical Learning (基于听觉感知模型和统计学习的语音鲁棒处理.pdf)

115 pages

Uploaded: 2018-07-01

Format: PDF

File size: 1.64 MB

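Shanghai Jiao Tong University Doctoral Dissertation

Robust Speech Processing Based on Auditory Model and Statistical Learning

Name: Zhang Wenjun (张文军)
Degree applied for: Doctorate
Major: Control Theory and Control Engineering
Advisor: Xie Jianying (谢剑英)
Date: 2003-09-10

Abstract (translated from the Chinese 摘要)

Before speech technology can be widely deployed in diverse environments, it still faces a number of challenges, for example how to make speech processing robust to environmental noise and channel distortion. Research on robust speech processing is the foundation of, and the key to, speech research that includes speech recognition, speech synthesis, language identification and speaker recognition, and it is also an important part of building speech corpora. The recognition rate of current speech processing systems and the naturalness of synthesized speech are still unsatisfactory. The root cause is that natural speech has not been studied deeply enough to summarize, describe and model its regularities accurately. Progress in speech processing therefore depends on collecting, organizing and releasing corpora of speech recorded in real-world environments. The main aim of this thesis is to study robust speech processing, to improve the robustness of speech segmentation in noisy environments, and on that basis to implement tools that assist speech corpus construction.

The thesis first studies time-frequency analysis of speech signals based on the human auditory perception model. It constructs a non-uniform perfect-reconstruction (PR) filter bank that satisfies the auditory model, develops a sub-band speech de-noising algorithm based on maximum likelihood (ML) estimation, and implements a robust sub-band speech endpoint detection algorithm that uses adaptive smoothing based on minimum description length (MDL). Second, it discusses the shortcomings of speech segmentation based on hidden Markov models (HMM), points out the influence of prosodic factors on segmentation, and proposes a Bayesian framework for robust speech segmentation. Finally, it describes the main ideas of annotation graphs, proposes an XML-based speech annotation architecture, and uses the Extensible Markup Language (XML), Visual Basic and SQL to implement a prototype of the corpus construction tools. The prototype was used to annotate an isolated-digit speech corpus, a connected-digit-string speech corpus, and a special speech corpus for speaker recognition.

The main contributions of the thesis are:

• Based on the human auditory perception model, and starting from the time-domain conditions for perfect-reconstruction filter banks, a non-uniform PR filter bank that satisfies the auditory model is realized using the Bark transformation and all-pass systems.

• Following the basic principle of wavelet de-noising, an adaptive mechanism built on maximum likelihood spectral estimation adjusts the threshold of each sub-band, giving a sub-band speech de-noising method suited to slowly varying non-stationary noise. The probability densities are computed using the idea of computing probability density under an orthogonal basis.

• Following a heuristic edge-focusing idea, the low-energy regions of the speech are first found with a double-threshold method, the edges in each sub-band are then located with an MDL-based adaptive smoothing algorithm, and a fuzzy decision model finally combines the results of the different sub-bands, giving robust sub-band speech endpoint detection.

• Using Bayesian decision theory, the influence of prosodic factors on speech segmentation and the shortcomings of HMM-based segmentation are analyzed, a Bayesian framework for robust speech segmentation is proposed, and the segmentation model used within that framework is implemented.

• On the basis of the XML-based speech annotation architecture and the theoretical framework of annotation graphs, a prototype of the corpus construction tools is built and used to produce an isolated-digit speech corpus, a connected-digit-string speech corpus, and a special speech corpus for speaker recognition.

Extensive simulations and experiments were carried out, and the improved algorithms were compared with the original ones. The results show that the proposed algorithms are of value.

Keywords: sub-band speech, minimum description length (MDL), auditory model, Bayesian methods, speech segmentation, annotation graph, wavelet de-noising, endpoint detection

Robust Speech Processing Based on Auditory Model and Statistical Learning

Abstract

Before speech technology can be applied widely in adverse environments, it still faces a variety of challenges, for example how to improve its robustness under noise and channel distortion. The study of robust speech processing plays a key role in speech recognition, speech synthesis and speaker recognition, and it is also an important basis for producing speech corpora.

The fundamental reason why present speech applications are not yet satisfactory is that natural speech has not been studied thoroughly enough to induce, describe and simulate its rules accurately, so the development of speech technology depends on the collection, organization and release of speech corpora recorded in real-life environments. This study aims to investigate robust speech processing in order to improve the robustness of speech segmentation in adverse environments, and to implement tools that assist speech corpus production based on annotation graphs. In this thesis, we first study the time-frequency analysis of speech based on the auditory model, construct non-uniform PR filter banks based on the auditory model, realize sub-band speech de-noising based on ML, and accomplish fuzzy sub-band speech endpoint detection based on adaptive smoothing using MDL (minimum description length). Secondly, we discuss speech segmentation based on the hidden Markov model, study the effect of prosody on this problem, and put forward a Bayesian framework for robust speech segmentation. Thirdly, we describe the principle of annotation graphs, put forward an XML-based speech labeling architecture, and realize a prototype of the tools for assisting speech corpus production using XML, Visual Basic and SQL. Finally, we label several speech corpora in practice, for example an isolated-digit speech corpus.

The contributions and innovations of this thesis include:

• Starting from the time-domain conditions for PR filter banks, we analyze the human auditory model and then realize non-uniform PR filter banks based on the auditory model using the Bark transformation.

• Using the principle of wavelet de-noising, we introduce an adaptive mechanism based on ML to tune the threshold in each sub-band, and obtain a sub-band speech de-noising method suited to slowly varying non-stationary noise.

• Based on the heuristic "edge-focus" idea, we first identify the low-energy areas using the "double-threshold" method, and then apply adaptive smoothing using MDL in each sub-band to locate the actual endpoints. To combine the results of the different sub-bands, we use a fuzzy decision model to achieve robust sub-band speech endpoint detection.

• We analyze the effect of prosody and the limitations of HMM-based speech segmentation, build a Bayesian framework for robust speech segmentation using Bayesian decision theory, and realize the segmentation model used by Bayesian speech segmentation.

• After establishing the XML-based speech labeling architecture using the principle of annotation graphs, we build a prototype of the tools for assisting speech corpus production, and then label several speech corpora, for example an isolated-digit speech corpus.

Many simulations and experiments show that these algorithms are effective.

Keywords: Sub-band Speech, MDL (Minimum Description Length), Auditory Model, Bayesian Decision, Speech Segmentation, Wavelet De-noising, Endpoint Detection, Annotation Graph

Shanghai Jiao Tong University Declaration of Originality of the Dissertation

I solemnly declare that the dissertation submitted is the result of research that I carried out independently under the guidance of my advisor. Except for the content already cited and marked in the text, this dissertation does not contain any work that has been published or written by any other individual or group ...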
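The "double-threshold" step named in the abstract can be illustrated with a minimal Python sketch. This is a generic textbook form of double-threshold energy detection, not the thesis's implementation: the function names, the non-overlapping framing, and the threshold values are all assumptions made for illustration, and the MDL-based smoothing and fuzzy sub-band fusion described in the abstract are not modeled here.

```python
def frame_energies(samples, frame_len):
    """Short-time energy of consecutive, non-overlapping frames.

    Trailing samples that do not fill a whole frame are ignored.
    """
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def double_threshold_segments(energies, low, high):
    """Return candidate speech segments as (start, end) frame indices.

    A segment is seeded by a frame whose energy exceeds `high`, then
    extended in both directions while the energy stays above `low`.
    `end` is exclusive. Regions that exceed `low` but never reach
    `high` are treated as noise and dropped.
    """
    segments = []
    n = len(energies)
    i = 0
    while i < n:
        if energies[i] > high:
            # Extend backwards over frames that are above the low threshold.
            start = i
            while start > 0 and energies[start - 1] > low:
                start -= 1
            # Extend forwards over frames that are above the low threshold.
            end = i + 1
            while end < n and energies[end] > low:
                end += 1
            segments.append((start, end))
            i = end
        else:
            i += 1
    return segments


# Hypothetical per-frame energies: two bursts plus one weak bump (frame 7).
example = [0.1, 0.5, 3.0, 4.0, 0.6, 0.1, 0.1, 0.7, 0.2, 5.0, 0.4, 0.1]
segments = double_threshold_segments(example, low=0.3, high=2.0)
```

The frames outside the returned segments are the low-energy regions. In a pipeline like the one the abstract describes, a finer boundary search would then be run near each segment edge, and the weak bump at frame 7 is discarded because it never crosses the high threshold.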
