
Parallel Computer Architecture - Introduction



In the last 50 years, there have been huge developments in the performance and capability of computer systems. This has been possible with the help of Very Large Scale Integration (VLSI) technology. VLSI technology allows a large number of components to be accommodated on a single chip and clock rates to increase. Therefore, more operations can be performed at a time, in parallel.

Parallel processing is also associated with data locality and data communication. Parallel Computer Architecture is the method of organizing all the resources to maximize performance and programmability within the limits given by technology and cost at any given point of time.

Why Parallel Architecture?

Parallel computer architecture adds a new dimension to the development of computer systems by using more and more processors. In principle, the performance achieved by utilizing a large number of processors is higher than the performance of a single processor at a given point of time.

Application Trends

With the advancement of hardware capacity, the demand for well-performing applications also increased, which in turn placed a demand on the development of computer architecture.

Before the microprocessor era, high-performance computer systems were obtained with exotic circuit technology and machine organization, which made them expensive. Now, high-performance computer systems are obtained by using multiple processors, and the most important and demanding applications are written as parallel programs. Thus, for higher performance, both parallel architectures and parallel applications need to be developed.

To increase the performance of an application, speedup is the key factor to be considered. Speedup on p processors is defined as −

$$\text{Speedup}(p\ \text{processors}) \equiv \frac{\text{Performance}(p\ \text{processors})}{\text{Performance}(1\ \text{processor})}$$

For a single fixed problem,

$$\text{Performance of a computer system} = \frac{1}{\text{Time needed to complete the problem}}$$

$$\text{Speedup}_{\text{fixed problem}}(p\ \text{processors}) = \frac{\text{Time}(1\ \text{processor})}{\text{Time}(p\ \text{processors})}$$
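
For example, if the same fixed problem takes 120 seconds on 1 processor and 20 seconds on 8 processors, the speedup is 120/20 = 6. A minimal sketch of this calculation in Python, using made-up timings, is given below −

    def speedup(time_1_processor, time_p_processors):
        # Fixed-problem speedup: ratio of single-processor time
        # to p-processor time for the same problem.
        return time_1_processor / time_p_processors

    # Hypothetical measurements: 120 s on 1 processor, 20 s on 8.
    print(speedup(120.0, 20.0))  # 6.0, i.e. 6x faster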

Scientific and Engineering Computing

Parallel architecture has become indispensable in scientific computing (like physics, chemistry, biology, astronomy, etc.) and engineering applications (like reservoir modeling, airflow analysis, combustion efficiency, etc.). In almost all applications, there is a huge demand for visualization of computational output, resulting in a demand for the development of parallel computing to increase computational speed.

Commercial Computing

In commercial computing (like video, graphics, databases, OLTP, etc.), high-speed computers are also needed to process huge amounts of data within a specified time. Desktops use multithreaded programs that are almost like parallel programs. This in turn demands the development of parallel architecture.

Technology Trends

With the development of technology and architecture, there is a strong demand for the development of high-performing applications. Experiments show that parallel computers can work much faster than the most highly developed single processor. Moreover, parallel computers can be developed within the limits of technology and cost.

The primary technology used here is VLSI technology. Therefore, nowadays more and more transistors, gates and circuits can be fitted in the same area. With the reduction of the basic VLSI feature size, the clock rate improves in proportion to it, while the number of transistors grows as the square. The use of many transistors at once (parallelism) can therefore be expected to perform much better than increasing the clock rate.
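
As a rough illustration of this scaling argument (the shrink factor below is an assumption for demonstration, not measured data), halving the feature size roughly doubles the clock rate but quadruples the number of transistors on the same die area −

    # Illustrative VLSI scaling: shrinking the feature size by a
    # factor s raises the clock rate roughly in proportion (~s),
    # while the transistor count on a fixed die area grows as the
    # square (~s^2).
    s = 2.0                   # hypothetical 2x feature-size reduction

    clock_gain = s            # ~2x higher clock rate
    transistor_gain = s ** 2  # ~4x more transistors in the same area

    print(clock_gain, transistor_gain)  # 2.0 4.0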

Technology trends suggest that the basic single-chip building block will give increasingly large capacity. Therefore, the possibility of placing multiple processors on a single chip increases.

Architectural Trends

Development in technology decides what is feasible; architecture converts the potential of the technology into performance and capability. Parallelism and locality are two methods by which larger volumes of resources and more transistors enhance performance. However, these two methods compete for the same resources. When multiple operations are executed in parallel, the number of cycles needed to execute the program is reduced.

However, resources are needed to support each of the concurrent activities. Resources are also needed to allocate local storage. The best performance is achieved by an intermediate action plan that uses resources to utilize a degree of parallelism and a degree of locality.

Generally, the history of computer architecture has been divided into four generations having the following basic technologies −

    Vacuum tubes

    Transistors

    Integrated circuits

    VLSI

Till 1985, the period was dominated by growth in bit-level parallelism: 4-bit microprocessors were followed by 8-bit, 16-bit, and so on. To reduce the number of cycles needed to perform a full 32-bit operation, the width of the data path was doubled. Later on, 64-bit operations were introduced.

Growth in instruction-level parallelism dominated the mid-80s to mid-90s. The RISC approach showed that it was simple to pipeline the steps of instruction processing so that, on average, an instruction is executed in almost every cycle. Growth in compiler technology has made instruction pipelines more productive.
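
To see why a pipeline approaches one instruction per cycle, compare the cycle counts of a non-pipelined and an ideally pipelined processor. The following sketch is a simplified model that ignores stalls and hazards −

    def cycles_unpipelined(n_instructions, n_stages):
        # Each instruction occupies the processor for all stages.
        return n_instructions * n_stages

    def cycles_pipelined(n_instructions, n_stages):
        # After the pipeline fills (n_stages cycles), one instruction
        # completes every cycle in the ideal, stall-free case.
        return n_stages + (n_instructions - 1)

    n, k = 1000, 5
    print(cycles_unpipelined(n, k))  # 5000 cycles
    print(cycles_pipelined(n, k))    # 1004 cycles, ~1 per cycle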

In the mid-80s, microprocessor-based computers consisted of −

    An integer processing unit

    A floating-point unit

    A cache controller

    SRAMs for the cache data

    Tag storage

As chip capacity increased, all these components were merged into a single chip. Thus, a single chip consisted of separate hardware for integer arithmetic, floating-point operations, memory operations and branch operations. Other than pipelining individual instructions, such a processor fetches multiple instructions at a time and sends them in parallel to different functional units whenever possible. This type of instruction-level parallelism is called superscalar execution.
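
As a rough model (an illustrative sketch, not a description of any particular processor), superscalar execution can be characterized by an issue width w: in the ideal case, up to w independent instructions are dispatched to the functional units each cycle −

    import math

    def cycles_superscalar(n_instructions, issue_width):
        # Idealized superscalar model: up to issue_width independent
        # instructions issue per cycle; dependences and stalls ignored.
        return math.ceil(n_instructions / issue_width)

    print(cycles_superscalar(1000, 1))  # 1000 cycles (scalar pipeline)
    print(cycles_superscalar(1000, 4))  # 250 cycles (4-wide superscalar)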
