Electronic document marketplace

Introduction to Algorithms Courseware, Lecture 10: Huffman Coding

43 pages
  • Seller [uploader]: E****
  • Document ID: 91095368
  • Upload date: 2019-06-21
  • Document format: PPT
  • Document size: 439.50 KB
  • Text preview
Chapter 4: Greedy Algorithms
Huffman coding; single-source shortest paths

Huffman codes

Data Compression via Huffman Coding

Motivation: network bandwidth and disk space are limited. Huffman codes are used for data compression, reducing the time needed to transmit large files and the space required to store them on disk or tape. Huffman codes are a widely used and very effective technique for compressing data; savings of 20% to 90% are typical, depending on the characteristics of the data.

Problem: Suppose that you have a file of 100K characters. To keep the example simple, suppose that each character is one of the 6 letters a through f. Since there are just 6 distinct characters, 3 bits suffice to represent each one, so the file requires 300K bits to store. Can we do better? Suppose that we have more information about the file: the frequency with which each character appears.

Solution: The idea is to use a variable-length code instead of a fixed-length code (3 bits for each character), with fewer bits for the common characters and more bits for the rare characters.

Example of Data Compression

For example, suppose that the characters appear with the following frequencies and are given the following codes:

Then the variable-length coded version takes not 300K bits but 45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4 = 224K bits to store, a saving of about 25%. In fact this is the optimal way to encode the 6 characters present, as we shall see.

Characteristic of Optimal Code

A prefix code is represented as a binary tree whose leaves are the given characters. In an optimal code, each non-leaf node has two children.

Prefix-free Code

In a prefix code, no codeword is a prefix of another codeword. This makes encoding and decoding easy. To encode, we need only concatenate the codewords of consecutive characters in the message. To decode, we have to decide where each codeword begins and ends, which is easy because no codeword is a prefix of any other. The string 110001001101 parses uniquely as 1100-0-100-1101, which decodes to FACE.
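As a rough check of the arithmetic and the decoding rule above, here is a minimal Python sketch. The codewords for a, c, e and f follow from the FACE example in the text. The codewords for b and d, and the assignment of the frequencies 45, 13, 12, 16, 9, 5 (in thousands) to the letters a through f, are assumptions taken from the standard textbook version of this example, because the table itself is missing from this preview. The names `CODE`, `FREQ_K`, `total_bits_k`, `encode` and `decode` are illustrative only.

```python
# Codewords for a, c, e, f match the FACE example in the text;
# b and d are ASSUMED from the standard textbook table (not shown in this preview).
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

# Frequencies in thousands of occurrences; the letter assignment is an assumption.
FREQ_K = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}


def total_bits_k(freq_k, code):
    """Total encoded size in thousands of bits: sum of frequency * codeword length."""
    return sum(freq_k[ch] * len(code[ch]) for ch in freq_k)


def encode(text, code):
    """Encode by concatenating the codewords of consecutive characters."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Decode a bit string by emitting a character as soon as a codeword is matched.

    This greedy scan is only correct because the code is prefix-free.
    """
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


if __name__ == "__main__":
    print(total_bits_k(FREQ_K, CODE))    # 224, versus 100 * 3 = 300 for the fixed-length code
    print(decode("110001001101", CODE))  # face
```

The decoder never needs to look ahead: the first codeword that matches the bits read so far is the only possible one, which is exactly the prefix-free property.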

Example of Huffman codes [figures]

Algorithm of Huffman Coding

The greedy algorithm for computing the optimal Huffman coding tree T is as follows. It starts with a forest of one-node trees, one for each character c ∈ C, and merges them greedily using a priority queue Q keyed on frequency, always extracting the two trees of smallest frequency.

Huffman coding example, step I
Assume that the relative frequencies are A: 40, B: 20, C: 10, D: 10, R: 20 (I chose simpler numbers than the real frequencies). The smallest values are 10 and 10 (C and D), so connect those.

Huffman coding example, step II
C and D have already been used, and the new node above them (call it C+D) has value 20. The smallest values are now B, C+D, and R, all of which have value 20. Connect any two of these; here B and C+D are connected, giving a node B+C+D with value 40.

Huffman coding example, step III
The smallest value is R (20), while A and B+C+D both have value 40. Connect R to either of the others; the codes below correspond to connecting R to B+C+D.

Huffman coding example, step IV
Connect the final two nodes.

Huffman coding example, step V
Assign 0 to left branches and 1 to right branches. Each encoding is a path from the root:
A = 0, B = 100, C = 1010, D = 1011, R = 11
Each path terminates at a leaf. Do you see why encoded strings are decodable?

Correctness of Huffman coding

Definition: the cost of the tree T.
Given a tree T corresponding to a prefix code, it is a simple matter to compute the number of bits required to encode a file. For each character c in the alphabet C, let f(c) denote the frequency of c in the file and let d_T(c) denote the depth of c's leaf in the tree. Note that d_T(c) is also the length of the codeword for character c. The number of bits required to encode a file is thus

B(T) = \sum_{c \in C} f(c) \, d_T(c)    (1)

which we define as the cost of the tree T.

Lemma 1
Let C be an alphabet in which each character c ∈ C has frequency f(c). Let x and y be two characters in C having the lowest frequencies. Then there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit.

Proof
The idea of the proof is to take the tree T representing an arbitrary optimal prefix code and modify it to make a tree representing another optimal prefix code such that the characters x and y appear as sibling leaves of maximum depth in the new tree. If we can do this, then their codewords will have the same length and differ only in the last bit.

Let a and b be two characters that are sibling leaves of maximum depth in T. Without loss of generality, we assume that f(a) ≤ f(b) and f(x) ≤ f(y). Since f(x) and f(y) are the two lowest leaf frequencies, in order, and f(a) and f(b) are two arbitrary frequencies, in order, we have f(x) ≤ f(a) and f(y) ≤ f(b). As shown in the figure, we exchange the positions in T of a and x to produce a tree T′, and then we exchange the positions in T′ of b and y to produce a tree T″. By equation (1), the difference in cost between
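The greedy construction and the cost function of equation (1) described above can be sketched in Python as follows. This is a minimal illustration and not the slides' own pseudocode: it uses the standard-library `heapq` module as the priority queue Q, represents an internal node as a `(left, right)` tuple, and assumes a non-empty alphabet whose symbols are strings. The names `huffman_code` and `cost` are hypothetical. Ties between equal frequencies may be broken differently than in the worked example, so the individual codewords can differ from A = 0, B = 100, and so on, but the total cost is the same for every Huffman tree.

```python
import heapq
from itertools import count


def huffman_code(freq):
    """Build a prefix code {symbol: codeword} by repeatedly merging the two
    lowest-frequency trees. Assumes freq is non-empty and symbols are strings."""
    tie = count()  # unique tie-breaker so heapq never has to compare tree nodes
    # Heap entries are (frequency, tie_id, tree); a tree is a symbol or a (left, right) pair.
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # smallest frequency
        f2, _, right = heapq.heappop(heap)   # second smallest
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))

    _, _, root = heap[0]
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the path from the root
            codes[node] = prefix or "0"      # single-symbol alphabet still gets one bit

    walk(root, "")
    return codes


def cost(freq, codes):
    """B(T) = sum over c of f(c) * d_T(c), with d_T(c) = len(codes[c])."""
    return sum(freq[c] * len(codes[c]) for c in freq)


if __name__ == "__main__":
    freq = {"A": 40, "B": 20, "C": 10, "D": 10, "R": 20}
    codes = huffman_code(freq)
    print(codes)
    print(cost(freq, codes))  # 220, the same total as A=0, B=100, C=1010, D=1011, R=11
```

For the worked example, the codes from step V give 40*1 + 20*3 + 10*4 + 10*4 + 20*2 = 220, which matches the cost returned by the sketch regardless of how ties are resolved.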

"Introduction to Algorithms Courseware, Lecture 10: Huffman Coding" was shared by member E**** and can be read online on 金锄头文库.

    Public security filing: 川公网安备 51140202000112号 | Business license: 蜀ICP备13022795号
    ©2008-2016 by Sichuan Goldhoe Inc. All Rights Reserved.