RISC Pipeline in Computer Architecture

Instruction pipelining fits the RISC architecture particularly well. Because the instruction set is simple, an instruction pipeline can be implemented with a small number of suboperations, each executed in one clock cycle.

Because the instruction format has a fixed length, the operation can be decoded at the same time as the registers are selected. Since the arithmetic, logic, and shift operations are performed on registers, the operation needs no extra steps for fetching operands from memory or calculating an effective address. This is what makes pipelining effective in a RISC processor.

The instruction cycle can therefore be split into three segments: one segment fetches the instruction from program memory, another executes the instruction in the ALU, and a third may be used to store the result of the ALU operation in a destination register. The data transfer instructions in RISC are limited to Load and Store. To prevent conflicts between an instruction fetch and a data transfer, two separate buses with two memories are used, one memory for storing the instructions and the other for storing the data. The two memories can sometimes operate at the same speed as the CPU clock and are referred to as cache memories.

A major advantage of RISC is its ability to execute instructions at the rate of one per clock cycle. This rate is achieved through pipelining, and it is supported by the compiler that translates the high-level language program into a machine language program.

Example of a three-segment instruction pipeline:

Consider instructions that perform arithmetic, logic, or shift operations. Their instruction cycle can be divided into the following three steps, each implemented as one pipeline segment:

I – Instruction fetch
A – ALU Operation
E – Execute Instruction

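The I segment fetches the instruction from program memory. The instruction is decoded and an ALU operation is performed in the A segment: depending on the decoded instruction, the ALU performs a data manipulation operation, evaluates the effective address for a load or store, or calculates the branch address. Finally, the E segment directs the output of the ALU to its destination: a destination register, the data memory (as the effective address for loading or storing), or the program counter (as the branch address).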
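The timing of such a pipeline follows from one rule: instruction i enters the I segment in clock cycle i and advances one segment per cycle. As a minimal sketch of that rule (the names `pipeline_timing` and `format_table` are illustrative, not from any library, and the model assumes no conflicts or stalls), the timing layout can be generated in Python:

```python
def pipeline_timing(instructions, segments="IAE"):
    """Return (total_cycles, rows) for an ideal pipeline with no stalls.

    Each row is (name, cells), where cells[c] is the segment letter active
    in clock cycle c + 1, or "" if the instruction is not in the pipeline.
    """
    k = len(segments)
    # The last of n instructions enters in cycle n and leaves in cycle n + k - 1.
    total_cycles = len(instructions) + k - 1
    rows = []
    for i, name in enumerate(instructions):
        cells = [""] * total_cycles
        for s, seg in enumerate(segments):
            cells[i + s] = seg  # instruction i (0-based) starts in cycle i + 1
        rows.append((name, cells))
    return total_cycles, rows


def format_table(instructions, segments="IAE"):
    """Render the timing as tab-separated text, one row per instruction."""
    total_cycles, rows = pipeline_timing(instructions, segments)
    header = "Clock cycles\t" + "\t".join(str(c) for c in range(1, total_cycles + 1))
    lines = [header]
    for n, (name, cells) in enumerate(rows, start=1):
        lines.append(f"{n}. {name}\t" + "\t".join(cells).rstrip("\t"))
    return "\n".join(lines)


print(format_table(["Load R1", "Load R2", "Add R1 + R2", "Store R3"]))
```

Under this model, n instructions in a k-segment pipeline finish in n + k - 1 clock cycles, so four instructions in the three-segment pipeline need six cycles.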

Delayed Load:

Consider the following instructions:

LOAD: R1 ← M[address 1]
LOAD: R2 ← M[address 2]
ADD: R3 ← R1 + R2
STORE: M[address 3] ← R3

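The tables below show the pipeline timing for these instructions, first with a data conflict and then with the conflict removed by a delayed load. In the first table, the conflict occurs in clock cycle 4: the A segment of the Add instruction uses R2 while the E segment of the second Load is still transferring the memory data into R2. In the second table, a no-operation instruction is inserted after the second Load, so the Add uses R2 only after it has been loaded.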

Clock cycles	1	2	3	4	5	6
1. Load R1	I	A	E
2. Load R2		I	A	E
3. Add R1 + R2			I	A	E
4. Store R3				I	A	E

Pipeline timing with data conflict

Clock cycles	1	2	3	4	5	6	7
1. Load R1	I	A	E
2. Load R2		I	A	E
3. No-operation			I	A	E
4. Add R1 + R2				I	A	E
5. Store R3					I	A	E

Pipeline timing with delayed load

Both tables show the timing of the three-segment pipeline.
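The delayed-load fix can be sketched as a small compiler pass. The following Python is only an illustration under stated assumptions: the `Instr` record, the `NOP` constant, and `insert_delayed_loads` are hypothetical names, and the pass checks a single hazard, a LOAD whose destination register is read by the very next instruction in a three-segment pipeline.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this sketch."""
    op: str                       # e.g. "LOAD", "ADD", "STORE", "NOP"
    dest: Optional[str] = None    # register written, if any
    srcs: Tuple[str, ...] = ()    # registers read


NOP = Instr("NOP")


def insert_delayed_loads(program):
    """Insert a NOP after any LOAD whose destination is read by the next instruction.

    Assumption: the loaded value reaches the register in the LOAD's E segment,
    which is the same clock cycle as the next instruction's A segment.
    """
    out = []
    for i, instr in enumerate(program):
        out.append(instr)
        nxt = program[i + 1] if i + 1 < len(program) else None
        if instr.op == "LOAD" and nxt is not None and instr.dest in nxt.srcs:
            out.append(NOP)  # delay the dependent instruction by one cycle
    return out


program = [
    Instr("LOAD", "R1"),
    Instr("LOAD", "R2"),
    Instr("ADD", "R3", ("R1", "R2")),
    Instr("STORE", None, ("R3",)),
]
scheduled = insert_delayed_loads(program)
print([instr.op for instr in scheduled])
```

For the four-instruction program above, the pass places one no-operation between the second LOAD and the ADD. A real compiler would first try to move a useful, independent instruction into that slot and fall back to a no-operation only when none is available.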

Delayed Branch:

The method used in most RISC processors is to rely on the compiler to redefine the branches so that they take effect at the proper time in the pipeline. This method is referred to as a delayed branch.

An example of a delayed branch is shown below:

Clock cycles:	1	2	3	4	5	6	7	8	9	10
1. Load	I	A	E
2. Increment		I	A	E
3. Add			I	A	E
4. Subtract				I	A	E
5. Branch to X					I	A	E
6. No-operation						I	A	E
7. No-operation							I	A	E
8. Instruction in X								I	A	E

Using no-operation instructions

Clock cycles:	1	2	3	4	5	6	7	8
1. Load	I	A	E
2. Increment		I	A	E
3. Branch to X			I	A	E
4. Add				I	A	E
5. Subtract					I	A	E
6. Instruction in X						I	A	E

Rearranging the instructions

The program for this example consists of five instructions:

Load from memory to R1
Increment R2
Add R3 to R4
Subtract R5 from R6
Branch to address X

In the “Using no-operation instructions” table, the compiler inserts two no-op instructions after the branch. The branch address X is transferred to the PC in clock cycle 7. The fetching of the instruction at X is delayed by two clock cycles by the no-op instructions. The instruction at X starts the fetch phase at clock cycle 8, after the program counter PC has been updated.

The program in the “Rearranging the instructions” table is rearranged by placing the add and subtract instructions after the branch instruction instead of before it, as in the original program. Inspection of the pipeline timing shows that the PC is updated to the value of X in clock cycle 5, but the add and subtract instructions are still fetched from memory and executed in the proper sequence. The instruction at X is therefore fetched in clock cycle 6, with no clock cycles spent on no-operations.
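Both ways of filling the two branch delay slots can be sketched in Python. This is an illustration under stated assumptions, not a description of a real compiler: `fill_branch_delay` and `branch_timing` are hypothetical names, and the `independent` flag stands in for the analysis a compiler would need to prove that the moved instructions do not affect the branch.

```python
def fill_branch_delay(program, branch_index, delay_slots=2, independent=True):
    """Return a new instruction order with the branch delay slots filled.

    If the instructions just before the branch are independent of it
    (assumed here via the `independent` flag), move them after the branch.
    Otherwise pad the delay slots with no-operations.
    """
    before = program[:branch_index]
    branch = program[branch_index]
    after = program[branch_index + 1:]
    if independent and len(before) >= delay_slots:
        split = len(before) - delay_slots
        return before[:split] + [branch] + before[split:] + after
    return before + [branch] + ["No-operation"] * delay_slots + after


def branch_timing(order, branch_name="Branch to X"):
    """Clock cycles in a three-segment (I, A, E) pipeline with no stalls.

    The branch is fetched in cycle p (its 1-based position), the PC is
    updated in its E segment (cycle p + 2), and the target is fetched next.
    """
    p = order.index(branch_name) + 1
    return {"pc_updated": p + 2, "target_fetched": p + 3}


program = ["Load", "Increment", "Add", "Subtract", "Branch to X"]

with_nops = fill_branch_delay(program, 4, independent=False)
rearranged = fill_branch_delay(program, 4, independent=True)

print(with_nops, branch_timing(with_nops))
print(rearranged, branch_timing(rearranged))
```

With no-operations the target is fetched in clock cycle 8, and with the rearranged order it is fetched in clock cycle 6, which agrees with the two timing tables.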
