Computer Organization and Architecture Designing for Performance Ninth Edition William Stallings International Edition contributions by R. Mohan National Institute of Technology, Tiruchirappalli PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo
Contents Online Resources 11 Preface 13 About the Author 21 Chapter 0 Reader's and Instructor's Guide 23 0.1 Outline of the Book 24 0.2 A Roadmap for Readers and Instructors 24 0.3 Why Study Computer Organization and Architecture? 25 0.4 Internet and Web Resources 27 PART ONE OVERVIEW 28 Chapter 1 Introduction 28 1.1 Organization and Architecture 29 1.2 Structure and Function 30 1.3 Key Terms and Review Questions 36 Chapter 2 Computer Evolution and Performance 37 2.1 A Brief History ofcomputers 38 2.2 Designing for Performance 59 2.3 Multicore, MICs, and GPGPUs 65 2.4 The Evolution of the Intel x86 Architecture 66 2.5 Embedded Systems and the ARM 67 2.6 Performance Assessment 71 2.7 Recommended Reading 81 2.8 Key Terms, Review Questions, and Problems 82 PART TWO THE COMPUTER SYSTEM 87 Chapter 3 A Top-Level View of Computer Function and Interconnection 87 3.1 Computer Components 88 3.2 Computer Function 90 3.3 Interconnection Structures 106 3.4 Bus Interconnection 107 3.5 Point-to-Point Interconnect 115 3.6 PCI Express 120 3.7 Recommended Reading 130 3.8 Key Terms, Review Questions, and Problems 130 Chapter 4 Cache Memory 134 4.1 Computer Memory System Overview 135 4.2 Cache Memory Principles 142 4.3 Elements of Cache Design 145
6 CONTENTS 4.4 Pentium 4 Cache Organization 163 4.5 ARM Cache Organization 166 4.6 Recommended Reading 168 4.7 Key Terms, Review Questions, and Problems 169 Appendix 4A Performance Characteristics of Two-Level Memories 174 Chapter 5 Internal Memory 181 5.1 Semiconductor Main Memory 182 5.2 Error Correction 192 5.3 Advanced DRAM Organization 196 5.4 Recommended Reading 202 5.5 Key Terms, Review Questions, and Problems 203 Chapter 6 External Memory 207 6.1 Magnetic Disk 208 6.2 RAID 217 6.3 Solid State Drives 227 6.4 Optical Memory 232 6.5 Magnetic Tape 237 6.6 Recommended Reading 239 6.7 Key Terms, Review Questions, and Problems 240 Chapter 7 Input/Output 243 7.1 External Devices 245 7.2 I/O Modules 248 7.3 Programmed I/O 250 7.4 Interrupt-Driven I/O 254 7.5 Direct Memory Access 262 7.6 I/O Channels and Processors 268 7.7 The External Interface:Thunderbolt and Infiniband 270 7.8 IBM zenterprise 196 I/O Structure 278 7.9 Recommended Reading 282 7.10 KeyTerms, Review Questions, and Problems 282 Chapter 8 Operating System Support 287 8.1 Operating System Overview 288 8.2 Scheduling 299 8.3 Memory Management 305 8.4 Pentium Memory Management 316 8.5 ARM Memory Management 321 8.6 Recommended Reading 326 8.7 Key Terms, Review Questions, and Problems 326 PART THREE ARITHMETIC AND LOGIC 331 Chapter 9 Number Systems 331 9.1 The Decimal System 332 9.2 Positional Number Systems 333 9.3 The Binary System 334 9.4 Converting Between Binary and Decimal 334
CONTENTS 7 9.5 Hexadecimal Notation 337 9.6 Recommended Reading 339 9.7 Key Terms and Problems 339 Chapter 10 Computer Arithmetic 341 10.1 The Arithmetic and Logic Unit 342 10.2 Integer Representation 343 10.3 Integer Arithmetic 348 10.4 Floating-Point Representation 363 10.5 Floating-Point Arithmetic 371 10.6 Recommended Reading 380 10.7 Key Terms, Review Questions, and Problems 381 Chapter 11 Digital Logic 386 11.1 Boolean Algebra 387 11.2 Gates 390 11.3 Combinational Circuits 392 11.4 Sequential Circuits 410 11.5 Programmable Logic Devices 419 11.6 Recommended Reading 423 11.7 Key Terms and Problems 423 PART FOUR THE CENTRAL PROCESSING UNIT 427 Chapter 12 Instruction Sets: Characteristics and Functions 427 12.1 Machine Instruction Characteristics 428 12.2 Types of Operands 435 12.3 Intel x86 and ARM Data Types 437 12.4 Types of Operations 440 12.5 Intel x86 and ARM Operation Types 453 12.6 Recommended Reading 463 12.7 Key Terms, Review Questions, and Problems 463 Appendix 12A Little-, Big-, and Bi-Endian 469 Chapter 13 Instruction Sets: Addressing 13.1 Addressing Modes 474 13.2 x86 and ARM Addressing Modes 481 13.3 Instruction Formats 486 13.4 x86 and ARM Instruction Formats 495 13.5 Assembly Language 499 13.6 Recommended Reading 501 13.7 KeyTerms, Review Questions, Modes and Formats 473 and Problems 501 Chapter 14 Processor Structure and Function 505 14.1 Processor Organization 506 14.2 Register Organization 508 14.3 Instruction Cycle 513 14.4 Instruction Pipelining 517 14.5 The x86 Processor Family 534
8 CONTENTS 14.6 The ARM Processor 542 14.7 Recommended Reading 548 14.8 Key Terms, Review Questions, and Problems 549 Chapter 15 Reduced Instruction Set Computers 553 15.1 Instruction Execution Characteristics 555 15.2 The Use of a Large Register File 560 15.3 Compiler-Based Register Optimization 565 15.4 Reduced Instruction Set Architecture 567 15.5 RISC Pipelining 573 15.6 MIPS R4000 578 15.7 SPARC 584 15.8 RISCVersus CISC Controversy 590 15.9 Recommended Reading 591 15.10 KeyTerms, Review Questions, and Problems 591 Chapter 16 Instruction-Level Parallelism and Superscalar Processors 595 16.1 Overview 596 16.2 Design Issues 601 16.3 Pentium 4 611 16.4 ARM Cortex-A8 617 16.5 Recommended Reading 625 16.6 KeyTerms, Review Questions, and Problems 627 PART FIVE PARALLEL ORGANIZATION 633 Chapter 17 Parallel Processing 633 17.1 Multiple Processor Organizations 635 17.2 Symmetric Multiprocessors 637 17.3 Cache Coherence and the MESI Protocol 641 17.4 Multithreading and Chip Multiprocessors 648 17.5 Clusters 655 17.6 Nonuniform Memory Access 662 17.7 Vector Computation 666 17.8 Recommended Reading 678 17.9 KeyTerms, Review Questions, and Problems 679 Chapter 18 Multicore Computers 686 18.1 Hardware Performance Issues 687 18.2 Software Performance Issues 691 18.3 Multicore Organization 696 18.4 Intel x86 Multicore Organization 698 18.5 ARM 11 MPCore 701 18.6 IBM zenterprise 196 Mainframe 706 18.7 Recommended Reading 709 18.8 KeyTerms, Review Questions, and Problems 709
CONTENTS 9 Appendix A Projects for Teaching Computer Organization and Architecture 713 A.l Interactive Simulations 714 A.2 Research Projects 716 A.3 Simulation Projects 716 A.4 Assembly Language Projects 717 A.5 Reading/Report Assignments 718 A.6 Writing Assignments 718 A.7 Test Bank 718 Appendix B Assembly Language and Related Topics 719 B.l Assembly Language 720 B.2 Assemblers 728 B.3 Loading and Linking 732 B.4 Recommended Reading 740 B.5 KeyTerms, Review Questions, and Problems 741 ONLINE CHAPTERS1 PART SIX THE CONTROL UNIT 19-1 Chapter 19 Control Unit Operation 19-1 19.1 Micro-operations 19-3 19.2 Control of the Processor 19-13 19.3 Hardwired Implementation 19-30 19.4 Recommended Reading 19-35 19.5 KeyTerms, Review Questions, and Problems 19-35 Chapter 20 Microprogrammed Control 20-1 20.1 Basic Concepts 20-3 20.2 Microinstruction Sequencing 20-16 20.3 Microinstruction Execution 20-26 20.4 Tl 8800 20-45 20.5 Recommended Reading 20-59 20.6 KeyTerms, Review Questions, and Problems 20-60 ONLINE APPENDICES Appendix C Hash Tables Appendix D Victim Cache Strategies D.l Victim Cache D.2 Selective Victim Cache 'Online chapters, appendices, and other documents are Premium Content, available via the access card at the front of this book.
10 CONTENTS Appendix E Appendix F Interleaved Memory The International Reference Alphabet Appendix G Virtual Memory Page Replacement Algorithms G.l Optimal G.2 Least Recently Used G.3 First-In-First-Out G. 4 Other Page Replacement Algorithms Appendix H H. l Recursion Recursive Procedures H.2 Activation Tree Representation H.3 Stack Processing H. 4 Recursion and Iteration Appendix I I. 1 Pipeline 1.2 Reorder Buffers Additional Instruction Pipeline Topics Reservation Tables 1.3 Tomasul o s Algorithm 1.4 Scoreboarding Appendix J J.l Linear Tape Open Technology LTO Generations J.2 LTO Format J.3 LTO Operation Appendix K DDR SRAM Appendix L L.l Protocols and Protocol Architectures Introduction L.2 The TCP/IP Protocol Architecture L.3 The Role of an Internet Protocol L.4 IPv4 L.5 IPv6 L.6 The OSI Protocol Architecture Appendix M Appendix N Scrambling Timing Diagrams Appendix O Stacks 0.1 Stack Structure 0.2 Stack Implementation 0.3 Expression Evaluation Glossary 745 References 755 Index 767