Computer Architecture Review


1. Pipeline characteristics
   - Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload.
   - The pipeline rate is limited by the slowest pipeline stage.
   - Multiple tasks operate simultaneously.
   - Potential speedup = number of pipe stages.
   - Unbalanced pipe-stage lengths reduce the speedup.
   - Time to "fill" the pipeline and time to "drain" it reduce the speedup.
2. RISC MIPS
   - The 5 steps of the MIPS datapath: IF, ID, EX, MEM, WB.
3. Three hazards
   - Structural: two instructions need the same hardware resource in the same cycle.
   - Data: an instruction needs the result of an earlier instruction that has not yet been produced.
   - Control: branches and jumps change the instruction flow.
4. One memory port / cache aliasing
   - Two different cache entries can hold data for the same physical address.
   - On an update, every cache entry with that physical address must be updated, or memory becomes inconsistent.
5. TLBs
   - Translation is sped up with a special cache of recently used page table entries; it goes by many names, but the most frequent is Translation Lookaside Buffer (TLB).
   - Entry fields: Virtual Address, Physical Address, Dirty, Ref, Valid, Access.
6. P408: speedup and hit-rate performance calculations.
7. SPEC: System Performance Evaluation Cooperative (later renamed the Standard Performance Evaluation Corporation).
8. Moore's Law: the number of transistors in a dense integrated circuit doubles approximately every 18 months, and performance doubles with it.
9. Performance summary: you need good benchmarks and good ways to summarize performance.
10. AMAT = Average Memory Access Time.
    Example: a processor runs at 200 MHz (5 ns per cycle) with an ideal (no-miss) CPI of 1.1 and an instruction mix of 50% arith/logic, 30% ld/st, 20% control. Suppose 10% of data memory operations and 1% of instruction fetches incur a 50-cycle miss penalty.
    CPI = ideal CPI + average stalls per instruction
        = 1.1 + 0.30 (DataMops/ins) x 0.10 (miss/DataMop) x 50 (cycles/miss)
              + 1 (InstMop/ins) x 0.01 (miss/InstMop) x 50 (cycles/miss)
        = 1.1 + 1.5 + 0.5 = 3.1
    so 2.0 of the 3.1 cycles per instruction (about 65%) are memory stalls.
    AMAT = (1/1.3) x (1 + 0.01 x 50) + (0.3/1.3) x (1 + 0.10 x 50) = 2.54
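The CPI and AMAT arithmetic in item 10 can be checked with a short script; all rates and penalties below are the example's values, not general constants:

```python
# Verify the CPI / AMAT worked example from the notes above.
ideal_cpi = 1.1        # CPI with no cache misses
ld_st_frac = 0.30      # data memory ops per instruction
data_miss_rate = 0.10  # fraction of data ops that miss
inst_miss_rate = 0.01  # fraction of instruction fetches that miss
miss_penalty = 50      # cycles per miss

# CPI = ideal CPI + average memory-stall cycles per instruction
cpi = (ideal_cpi
       + ld_st_frac * data_miss_rate * miss_penalty   # data-miss stalls: 1.5
       + 1.0 * inst_miss_rate * miss_penalty)         # inst-miss stalls: 0.5
print(round(cpi, 2))  # 3.1

# AMAT weights each access type by its share of the 1.3 accesses/instruction.
accesses = 1.0 + ld_st_frac
amat = ((1.0 / accesses) * (1 + inst_miss_rate * miss_penalty)
        + (ld_st_frac / accesses) * (1 + data_miss_rate * miss_penalty))
print(round(amat, 2))  # 2.54
```

Note that CPI charges stalls per instruction, while AMAT averages per memory access, which is why the two formulas weight the miss rates differently.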
11. Von Neumann (unified) vs Harvard (split) cache performance
    - 16 KB split I and D caches: instruction miss rate = 0.64%, data miss rate = 6.47%.
    - 32 KB unified cache: aggregate miss rate = 1.99%.
    - Assume 33% of instructions are data ops, so 75% of accesses are instruction fetches (1.0/1.33); hit time = 1, miss time = 50 cycles.
    - A data hit incurs 1 extra stall in the unified cache (it has only one port).
    AMAT_Harvard = 75% x (1 + 0.64% x 50) + 25% x (1 + 6.47% x 50) = 2.05
    AMAT_Unified = 75% x (1 + 1.99% x 50) + 25% x (1 + 1 + 1.99% x 50) = 2.24
12. Write policies
    - Write-through (needs a valid bit) vs write-back (needs dirty and valid bits).
    - Write allocate vs no-allocate: on a write miss, either fetch the block into the cache and then write it, or write directly to the next lower level of the memory hierarchy.
13. Improving cache performance (P426)
    Reduce the miss rate (the 3 Cs; a direct-mapped cache of size X misses about as often as a 2-way set-associative cache of size X/2):
    - Larger block size: spatial locality lowers compulsory misses, but may raise conflict misses (and capacity misses in a small cache) and increases the miss penalty.
    - Higher associativity (the 2:1 cache rule); it lengthens the hit time, so check the effect on AMAT.
    - "Victim cache": a small buffer that holds data recently discarded from the cache.
    - "Pseudo-associativity".
    - Hardware prefetching of instructions and data.
    - Software prefetching of data; prefetching comes in two flavors:
      - Binding prefetch: loads directly into a register; the address and register must be correct.
      - Non-binding prefetch: loads into the cache; can be incorrect, which frees HW/SW to guess.
    - Compiler optimizations: merging arrays, loop interchange, loop fusion, blocking.
    Reduce the miss penalty:
    - Give read misses priority over writes; the simple scheme makes the read miss wait until the write buffer is empty.
    - Early restart and critical word first: don't wait for the full block to load before restarting the CPU.
    - Non-blocking caches to reduce stalls on misses.
    - Add a second-level cache (which of these techniques apply to an L2 cache? e.g. reducing conflict misses via higher associativity).
    Reduce the time to hit in the cache.
14. Main memory is DRAM; caches are SRAM.
15. Tomasulo's algorithm: register renaming via reservation stations.
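The split-vs-unified comparison in item 11 can likewise be reproduced in a few lines, using the miss rates and the unified cache's extra data-hit stall exactly as given above:

```python
# Compare split (Harvard) vs unified cache AMAT, values from the notes above.
inst_frac, data_frac = 0.75, 0.25   # 75% instruction fetches, 25% data accesses
hit_time, miss_penalty = 1, 50      # cycles

# Split 16KB I + 16KB D caches: each access type sees its own miss rate.
amat_harvard = (inst_frac * (hit_time + 0.0064 * miss_penalty)
                + data_frac * (hit_time + 0.0647 * miss_penalty))

# Unified 32KB cache: one aggregate miss rate, plus one extra stall cycle on
# data accesses because the single port is shared with instruction fetches.
amat_unified = (inst_frac * (hit_time + 0.0199 * miss_penalty)
                + data_frac * (hit_time + 1 + 0.0199 * miss_penalty))

print("split AMAT:  ", round(amat_harvard, 3))   # lower is better
print("unified AMAT:", round(amat_unified, 3))
```

The split cache wins here even though its data miss rate is much worse, because the port conflict taxes every data access in the unified design.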
