In a one-instruction-set computer (OISC), the instruction set consists of only one instruction, and all other necessary instructions are then synthesized by composition. This approach is the complete opposite of a complex instruction set computer (CISC), which incorporates complex instructions as microprograms within the processor. Computer Architecture: A Minimalist Perspective examines computer architecture, computability theory, and the history of computers from the perspective of one-instruction-set computing, a novel approach in which the computer supports only one, simple instruction. This bold new paradigm offers significant promise in biological, chemical, optical, and molecular-scale computers. The book includes a complete implementation of a one-instruction computer and is designed to meet the needs of a professional audience of researchers, computer hardware engineers, software engineers, computational theorists, and systems engineers.
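One instruction commonly used in OISC designs is subleq ("subtract and branch if less than or equal to zero"). The following is an illustrative sketch of a subleq machine, not the book's own implementation, showing how an ordinary ADD can be synthesized by composing three subleq instructions through a scratch cell:

```python
def run_subleq(mem, pc=0, max_steps=1000):
    """Interpret subleq: each instruction is three cells a, b, c.
    Semantics: mem[b] -= mem[a]; if the result is <= 0, jump to c,
    otherwise fall through to the next instruction. A negative jump
    target halts the machine."""
    for _ in range(max_steps):
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        if mem[b] <= 0:
            if c < 0:
                return mem  # halt
            pc = c
        else:
            pc += 3
    return mem

# ADD synthesized from subleq: B += A via a scratch cell Z (initially 0).
# Layout: three instructions at cells 0..8, then data A=7 at cell 9,
# B=5 at cell 10, Z=0 at cell 11.
prog = [9, 11, 3,    # Z -= A        (Z becomes -7, <= 0, so jump to 3)
        11, 10, 6,   # B -= Z        (B becomes 5 - (-7) = 12, falls through)
        11, 11, -1,  # Z -= Z; halt  (Z cleared; negative target halts)
        7, 5, 0]
print(run_subleq(prog)[10])  # 12
```

The three-instruction ADD sequence above is the classic composition: negate the addend into a zeroed cell, subtract that negative value from the target, then clear the scratch cell.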
For this, many aspects must be considered, including instruction set design, functional organization, logic design, and implementation. The implementation involves integrated circuit design, packaging, power, and cooling. Optimizing the design requires familiarity with topics ranging from compilers and operating systems to logic design and packaging.
An instruction set architecture (ISA) is the interface between the computer's software and hardware; it can also be viewed as the programmer's view of the machine.
A processor only understands instructions encoded in some numerical fashion, usually as binary numbers. Software tools such as compilers translate high-level languages into instructions that the processor can understand. Besides instructions, the ISA defines the items in the computer that are available to a program, such as registers and memory addressing modes.
Instructions locate these available items with register indexes (or names) and memory addressing modes. The ISA of a computer is usually described in a small instruction manual, which explains how the instructions are encoded. It may also define short, vaguely mnemonic names for the instructions, which can be recognized by a software development tool called an assembler.
An assembler is a computer program that translates a human-readable form of the ISA into a computer-readable form.
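A toy assembler can make this translation concrete. The ISA, mnemonics, and field layout below are invented purely for illustration; the point is only that an assembler maps human-readable mnemonics and register names to a numeric encoding:

```python
# Toy assembler for a hypothetical ISA (all names and encodings invented):
# each instruction is a 16-bit word with the opcode in bits 15-8,
# the first register in bits 7-4, and the second register in bits 3-0.
OPCODES = {"ADD": 0x01, "SUB": 0x02, "LOAD": 0x03, "STORE": 0x04}

def assemble(line):
    """Encode a line of the form 'MNEMONIC Ra, Rb' as a 16-bit word."""
    mnemonic, operands = line.split(None, 1)
    ra, rb = [int(r.strip().lstrip("Rr")) for r in operands.split(",")]
    return (OPCODES[mnemonic] << 8) | (ra << 4) | rb

word = assemble("ADD R1, R2")
print(f"{word:#06x}")  # 0x0112
```

A disassembler would perform the inverse mapping, extracting the opcode and register fields from the numeric word and looking up the mnemonic.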
Disassemblers are also widely available, usually in debuggers and software tools for isolating and correcting malfunctions in binary computer programs. ISAs vary in quality and completeness. A good ISA compromises between programmer convenience (how easy the code is to understand), size of the code (how much code is required to do a specific action), cost of the computer to interpret the instructions (more complexity means more hardware is needed to decode and execute them), and speed of the computer (more complex decoding hardware means longer decode time).
Memory organization defines how instructions interact with memory, and how memory interacts with itself. During design, emulation software (emulators) can run programs written in a proposed instruction set. Modern emulators can measure size, cost, and speed to determine whether a particular ISA is meeting its goals. Computer organization helps optimize performance-based products.
For example, software engineers need to know the processing power of processors. They may need to optimize software in order to gain the most performance for the lowest price.
This can require quite detailed analysis of the computer's organization. For example, in an SD card, the designers might need to arrange the card so that the most data can be processed in the fastest possible way. Computer organization also helps plan the selection of a processor for a particular project.
Multimedia projects may need very rapid data access, while virtual machines may need fast interrupts. Sometimes certain tasks need additional components as well. For example, a computer capable of running a virtual machine needs virtual memory hardware so that the memory of different virtual computers can be kept separated.
Computer organization and features also affect power consumption and processor cost.
Once an instruction set and microarchitecture are designed, a practical machine must be developed. This design process is called the implementation. Implementation is usually not considered architectural design, but rather hardware design engineering. Implementation can be broken down into several steps:
Logic implementation designs the circuits required at the logic-gate level.
Circuit implementation does transistor-level designs of basic elements (gates, multiplexers, latches, etc.).
Physical implementation draws the physical circuits: the different circuit components are placed in a chip floorplan or on a board, and the wires connecting them are created.
Design validation tests the computer as a whole to see whether it works in all situations and at all timings.
Once the design validation process starts, the design at the logic level is tested using logic emulators. However, this is usually too slow to run realistic tests. Most hobby projects stop at this stage. The final step is to test prototype integrated circuits, which may require several redesigns to fix problems.
Design goals. The exact form of a computer system depends on the constraints and goals. Computer architectures usually trade off standards, power versus performance, cost, memory capacity, latency (the amount of time it takes for information from one node to travel to the source), and throughput. Sometimes other considerations, such as features, size, weight, reliability, and expandability, are also factors.
The most common scheme does an in-depth power analysis and figures out how to keep power consumption low while maintaining adequate performance. Performance. Modern computer performance is often described in IPC (instructions per cycle). When the scheduler selects a process for execution, its state changes from ready-to-run to running.
Finally, a process in the waiting state can return to the ready-to-run state once the event it is waiting for has occurred.
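The process life cycle described above can be sketched as a small state machine. The state names and the `Process` class below are illustrative, not any particular kernel's API:

```python
# Legal transitions of the process life cycle sketched in the text:
# ready-to-run <-> running, running -> waiting, waiting -> ready-to-run.
ALLOWED = {
    ("ready", "running"),    # scheduler dispatches the process
    ("running", "ready"),    # pre-emption, e.g. the time slice expires
    ("running", "waiting"),  # the process blocks on an event
    ("waiting", "ready"),    # the awaited event occurs
}

class Process:
    def __init__(self, pid):
        self.pid, self.state = pid, "ready"

    def move_to(self, new_state):
        if (self.state, new_state) not in ALLOWED:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

p = Process(1)
p.move_to("running")   # the scheduler selects it
p.move_to("waiting")   # it blocks, e.g. on I/O
p.move_to("ready")     # the event it waited for occurs
```

Note that a waiting process never goes directly back to running: it must first re-enter the ready-to-run state and be selected by the scheduler again.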
A thread, like a process, is a sequence of instructions. Threads are created within, and belong to, processes. All the threads created within one process share the resources of the process, in particular the address space.
Scheduling is performed on a per-thread basis. Threads have a similar life cycle to processes and are managed in much the same way. Initially, each process is created with a single thread. However, threads are usually allowed to create new ones using particular system calls, so a thread tree typically develops within each process. Concurrent execution is the temporal behavior of the N-client 1-server model, where only one client is served at any given moment.
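The key property stated above, that all threads of a process share its resources and in particular its address space, can be demonstrated with a short sketch. Both worker threads below update the same list, which lives once in the process's address space; a lock serializes their access:

```python
import threading

shared = []                 # one object in the process's address space,
lock = threading.Lock()     # visible to every thread of the process

def worker(tag, n):
    for i in range(n):
        with lock:          # serialize updates to the shared structure
            shared.append((tag, i))

t1 = threading.Thread(target=worker, args=("a", 3))
t2 = threading.Thread(target=worker, args=("b", 3))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(shared))  # 6: both threads wrote into the same list
```

Had the two workers been separate processes instead of threads, each would have had its own private `shared` list, and explicit inter-process communication would be needed to combine the results.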
This model has a dual nature: it is sequential on a small time scale but simultaneous on a rather large time scale. In this situation the key problem is how the competing clients (processes or threads, say) should be scheduled for service (execution) by the single server (the processor).
The scheduling policy may be viewed as covering two aspects. The first deals with whether servicing a client can be interrupted and, if so, on what occasions (the pre-emption rule). The other states how one of the competing clients is selected for service (the selection rule). These two rules are the main aspects of the scheduling policy.
The pre-emption rule may either specify time-sharing, which restricts continuous service for each client to the duration of a time slice, or it can be priority-based, interrupting the servicing of a client whenever a higher-priority client requests service.
The selection rule is typically based on certain parameters, such as priority, time of arrival, and so on. This rule specifies an algorithm to determine a numeric value, which we will call the rank, from the given parameters. During selection the ranks of all competing clients are computed and the client with the highest rank is scheduled for service.
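A selection rule of the kind described above can be sketched as follows. The particular rank formula, combining priority with waiting time, is an invented example; the point is that a numeric rank is computed from each client's parameters and the client with the highest rank wins:

```python
def rank(client, now):
    # Invented example formula: higher priority dominates; among equal
    # priorities, the longer a client has waited the higher its rank,
    # which helps avoid starvation.
    return client["priority"] * 1000 + (now - client["arrival"])

def select(clients, now):
    """Compute the rank of every competing client and schedule the
    client with the highest rank for service."""
    return max(clients, key=lambda c: rank(c, now))

ready = [
    {"name": "editor",  "priority": 2, "arrival": 10},
    {"name": "backup",  "priority": 1, "arrival": 0},
    {"name": "browser", "priority": 2, "arrival": 5},
]
print(select(ready, now=20)["name"])  # browser
```

Here "browser" is selected: it has the same priority as "editor" but has waited longer, so its rank (2015) exceeds both editor's (2010) and the low-priority backup's (1020).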
Parallel execution is associated with the N-client N-server model. Having more than one server allows more than one client to be serviced at the same time; this is called parallel execution. As far as the temporal harmonization of the execution is concerned, two different schemes can be distinguished. In the lock-step or synchronous scheme, each server starts service at the same moment, as in SIMD architectures. In the asynchronous scheme, the servers do not work in concert, as in MIMD architectures.
Architectures, compilers and operating systems have been striving for more than two decades to extract and utilize as much parallelism as possible in order to speed up computation.
In our subject area the notion of parallelism is used in two different contexts. Either it designates available parallelism in programs or it refers to parallelism occurring during execution, called utilized parallelism. Types of available parallelism Problem solutions may contain two different kinds of available parallelism, called functional parallelism and data parallelism. We term functional parallelism that kind of parallelism which arises from the logic of a problem solution.
There is a further kind of available parallelism, called data parallelism. It comes from using data structures that allow parallel operations on their elements, such as vectors or matrices, in problem solutions. From another point of view, parallelism can be considered as being either regular or irregular. Data parallelism is regular, whereas functional parallelism, with the exception of loop-level parallelism, is usually irregular.
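The regularity of data parallelism can be seen in a tiny sketch: the same operation is applied independently to every element of a vector, so each element-wise addition below could, in principle, be executed by a separate processing element in lock-step (SIMD style):

```python
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# All four additions are mutually independent: no result depends on
# any other, which is exactly what makes data parallelism regular.
c = [x + y for x, y in zip(a, b)]
print(c)  # [11, 22, 33, 44]
```

A vector machine or SIMD unit would perform these four additions in a single step; written sequentially, the loop merely makes the available parallelism explicit.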
Levels of available functional parallelism. Programs written in imperative languages may embody functional parallelism at different levels, that is, at different sizes of granularity.
[Figure: Available and utilized levels of functional parallelism.]
Available instruction-level parallelism means that particular instructions of a program may be executed in parallel. To this end, instructions can be either assembly (machine-level) or high-level language instructions.
Usually, instruction-level parallelism is understood at the machine-language (assembly-language) level. Parallelism may also be available at the loop level, where consecutive loop iterations are candidates for parallel execution.
However, data dependencies between subsequent loop iterations, called recurrences, may restrict their parallel execution. Next, there is parallelism available at the procedure level, in the form of procedures that can execute in parallel. The extent of parallelism exposed at this level depends mainly on the kind of problem solution considered.
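The loop-level cases above can be contrasted in a short sketch: the first loop's iterations are mutually independent and could run in parallel, while the second loop contains a recurrence, so each iteration must wait for the previous one:

```python
n = 6
a = [0] * n
b = [0] * n

# No dependence between iterations: each a[i] depends only on i,
# so all n iterations are candidates for parallel execution.
for i in range(n):
    a[i] = i * i

# Recurrence: b[i] reads b[i - 1], the previous iteration's result,
# which serializes the loop.
b[0] = 1
for i in range(1, n):
    b[i] = b[i - 1] + a[i]

print(a)  # [0, 1, 4, 9, 16, 25]
print(b)  # [1, 2, 6, 15, 31, 56]
```

Compilers that parallelize loops must therefore prove the absence of such cross-iteration dependencies (or transform them away) before distributing iterations across processors.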
In addition, different users' programs are obviously independent of each other. Thus, parallelism is also available at the user level, which we consider to be coarse-grained parallelism.
Multiple, independent users are a key source of parallelism occurring in computing scenarios. Utilization of functional parallelism Available parallelism can be utilized by architectures, compilers and operating systems conjointly for speeding up computation. Let us first discuss the utilization of functional parallelism.