CDA-5155 Computer Architecture Principles Fall 2000
Multiprocessor Architectures

Review
  Protocols: reliable and heterogeneous networking
  Interconnect technologies/topologies
    Length, latency, diameter, blocking, deadlock, bisection BW, overheads, routing, congestion, connectionless?
  CPU interface to memory hierarchy vs. network (SPEC)
  Standardization key for LAN, WAN
  Internetworking protocols used as LAN protocols
  IC revolutionizing networks and processors
  Switch is a specialized computer
  Amdahl: High BW networks with high overheads

Overview
  High performance computing
  Parallelism
  Taxonomy of multiprocessors
  Programming models
  Performance
  ASCI: Accelerated Strategic Computing Initiative

High Performance Computing
  Hardware and software
  El dorado - Attack of the killer micros
  Microprocessor: the most cost-effective processor
  Dynamic supercomputer market
  Timesharing workloads
  Multiprocessor vs. high performance uniprocessor

Performance and application domains
  Throughput (multiprocessing workloads)
    Timesharing, file, database, and web servers
  Response time (parallel applications)
    Single complex problem
    Computation/communication = f(#processors, data size)

Parallelism
  Two or more things that happen at the same time
  Granularity: size of computations performed at the same time between synchronizations
    Carry lookahead adder
    Pipelined processor
    Two-way superscalar processor
    Multiprocessor
    COW (cluster of workstations)
  Levels of parallelism
    Bit level
    Instruction level
    Thread level
  Challenges (Amdahl's law)
    Limited amount of parallelism in programs
    High cost of communication

Parallel Computers
  Parallel computer: a collection of processing elements that cooperate and communicate to solve large problems fast.
  Questions about parallel computers:
    How large a collection?
    How powerful are processing elements?
    How do they cooperate and communicate?
    How are data transmitted?
    What type of interconnection?
    What are HW and SW primitives for programmer?
    Does it translate into performance?
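
The Amdahl's law challenge listed under Parallelism (limited parallelism in programs) can be made concrete with a short calculation. The sketch below is illustrative only: the function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for the example, not values from the slides.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup when only part of a program can run in parallel.

    parallel_fraction: share of the original run time that parallelizes (0..1).
    processors: number of processing elements (>= 1).
    Communication cost is ignored, so this is an upper bound.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed example: 95% of the work is parallelizable.
    for n in (1, 8, 64, 1024):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

Even with an unlimited number of processors, the speedup for this assumed workload stays below 1 / 0.05 = 20, which is why the serial fraction dominates at large processor counts.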
Taxonomy of Parallel Computers
  Flynn: instruction (I) and data (D) streams

Shared Memory Model
  Each processor can name every physical location in the machine via Load and Store
  Data size: byte, word, ... or cache blocks
  Process: a virtual address space (>= 1 thread of control)
  Multiple processes can overlap (share), but ALL threads share a process address space
  Writes to shared address space by one thread are visible to reads of other threads
  Usual model: share code, private stack, some shared heap, some private heap
  Performance
    Latency, BW, scalability when communicating?

Message Passing Model
  Nodes: whole computers (CPU, RAM, I/O)
  Communication: explicit I/O operations
    Send (local buffer, remote process)
    Recv (local buffer, remote process)
  Synchronization
    When send completes
    When buffer free
    When request accepted
    Necessary even for 1 processor

[Figure: Shared Memory. Three panels, each showing machine1 and machine2 with Application, Language run-time system, and Operating system layers.]
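
The Send and Recv operations of the Message Passing Model can be sketched in a few lines. This is a simulation under stated assumptions, not a real network: the two "machines" are threads inside one process, and the `Node` class, the `inbox` queue, and the helper names `send`, `recv`, and `run_demo` are hypothetical names introduced for illustration. Unlike the slide's Recv (local buffer, remote process), this simplified `recv` takes only the receiving node.

```python
import queue
import threading


class Node:
    """Hypothetical stand-in for one machine: a name plus a private inbox."""

    def __init__(self, name):
        self.name = name
        # Bounded inbox: a send blocks while the receiver's buffer is full.
        self.inbox = queue.Queue(maxsize=1)


def send(local_buffer, remote):
    """Copy local_buffer into the remote node's inbox (no memory is shared)."""
    remote.inbox.put(list(local_buffer))


def recv(node):
    """Block until a message arrives in this node's inbox, then return it."""
    return node.inbox.get()


def run_demo():
    machine1, machine2 = Node("machine1"), Node("machine2")

    def machine2_main():
        numbers = recv(machine2)          # wait for the request
        send([sum(numbers)], machine1)    # reply with the result

    worker = threading.Thread(target=machine2_main)
    worker.start()
    send([1, 2, 3, 4], machine2)          # explicit communication
    reply = recv(machine1)                # synchronize on the reply
    worker.join()
    return reply[0]


if __name__ == "__main__":
    print(run_demo())
```

Here the send returns once the message has been copied into the receiver's buffer, which corresponds to the "when buffer free" synchronization point; other designs could instead wait until the request is accepted.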
2 load/store pipes

[Figure: taxonomy of multiprocessors]
  Distributed Memory
  SIMD
  Shared Memory
    UMA
      Bus-Based SMP
      Crossbar-Based SMP (Sun Enterprise 10000)
    NUMA
      Bus-Based NUMA
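
For the Shared Memory branch of the taxonomy, the key property of the model (writes to the shared address space by one thread are visible to reads of other threads) can be illustrated with threads in a single process. This is a minimal sketch: the function name `parallel_count` and its parameters are assumptions made for the example, and the lock stands in for whatever synchronization primitive a real machine would provide.

```python
import threading


def parallel_count(num_threads=4, increments=10000):
    """Have several threads update one counter in a shared address space."""
    shared = {"counter": 0}      # lives on the heap, visible to every thread
    lock = threading.Lock()      # synchronization around the shared update

    def worker():
        for _ in range(increments):
            # The read-modify-write below is not atomic, so it is guarded.
            with lock:
                shared["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]


if __name__ == "__main__":
    print(parallel_count())
```

Communication here is implicit: no thread sends anything, each simply loads and stores the same location. The cost is that correctness depends on synchronization, which the message passing model gets as a side effect of its explicit Send and Recv.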
ASCI Program
  Accelerated Strategic Computing Initiative
  Big impulse to the HPC industry
  Architecture: clusters of RISC-based SMP nodes
  Goals (1995-2004)
    1 Teraflops: Intel/Sandia ASCI Red
    3 Teraflops: SGI/LLNL ASCI Blue
    10 Teraflops: IBM/LLNL ASCI White
    30 Teraflops: ?
    100 Teraflops: ?

Intel/Sandia ASCI Red
  160 m2
  200-MHz Pentium Pro
  Nodes: service, compute, I/O, and system
  Six-link router chip (dimensional, wormhole routing)
  Link BW: 400 MB/sec (full duplex)

Top 500 HPC

Computer                           Mnftr      Rmax (GFlops)  Site       Country  Year  Area                # Proc  Rpeak (GFlops)
ASCI White, SP Power3 375 MHz      IBM        4938           LLNL       USA      2000  Research Energy     8192    12288
ASCI Red                           Intel      2379           SNL        USA      1999  Research            9632    3207
ASCI BluePacific SST, IBM SP 604e  IBM        2144           LLNL       USA      1999  Research Energy     5808    3868
ASCI Blue Mountain                 SGI        1608           LANL       USA      1998  Research            6144    3072
SP Power3 375 MHz                  IBM        1417           NAVOCEANO  USA      2000  Research Aerospace  1336    2004
SP Power3 375 MHz                  IBM        1179           NCP        USA      2000  Research Weather    1104    1656
SR8000F1/112                       Hitachi    1035           LRM        Germany  2000  Academic            112     1344
SP Power3 375 MHz 8 way            IBM        929            UCSD       USA      2000  Research            1152    1728
SR8000F1/100                       Hitachi    917            HEARO      Japan    2000  Research            100     1200
T3E1200                            Cray Inc.  892            Govern't   USA      1998  Classified          1084    1300.8

[Charts: Architectures, CPU, Processor Type, Customer]
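
One way to read the Top 500 figures is the ratio of sustained to peak performance, Rmax / Rpeak. The sketch below computes it for the two leading entries. The pairing of each Rmax with its Rpeak is an assumption taken from reading the flattened table, and the names `SYSTEMS` and `efficiency` are hypothetical.

```python
# (computer, Rmax in GFlops, Rpeak in GFlops); pairings assumed from the table.
SYSTEMS = [
    ("ASCI White", 4938.0, 12288.0),
    ("ASCI Red", 2379.0, 3207.0),
]


def efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of theoretical peak actually sustained on the benchmark."""
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    return rmax / rpeak


if __name__ == "__main__":
    for name, rmax, rpeak in SYSTEMS:
        print(f"{name}: {efficiency(rmax, rpeak):.1%} of peak")
```

Under these assumed pairings, ASCI White sustains roughly 40% of its peak and ASCI Red roughly 74%, a reminder that peak GFlops alone says little about delivered performance.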