The Computer Science
TheCScience

RISC Pipeline in Computer Architecture

YASH PAL, March 4, 2026

RISC Pipeline in Computer Architecture: The RISC architecture is particularly well suited to instruction pipelining. Because the instruction set is simple, an instruction pipeline can be implemented with a small number of sub-operations, each executed in one clock cycle.

Because of the fixed-length instruction format, the decoding of the operation can occur at the same time as the register selection. Since the arithmetic, logic, and shift operations are performed on registers, no extra steps are needed to fetch operands or compute an effective address before the operation. Pipelining can therefore be used effectively in this scenario.

The instruction cycle can therefore be divided into three segments: one segment fetches the instruction from program memory, another executes the instruction in the ALU, and the third may be used to store the result of the ALU operation in a destination register. The data transfer instructions in RISC are limited to Load and Store instructions. To prevent conflicts between fetching an instruction and loading or storing an operand, two separate buses with two memories are used, one memory for storing the instructions and the other for storing the data. The two memories can sometimes operate at the same speed as the CPU clock and are referred to as cache memories.

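A major advantage of RISC pipelining is the ability to execute instructions at the rate of one per clock cycle: each instruction still passes through every segment, but once the pipeline is full, one instruction completes in each clock cycle. RISC pipelining is also supported by the compiler that translates the high-level language program into a machine language program. Rather than relying on extra hardware, RISC processors depend on the compiler to detect and reduce the delays caused by data conflicts and branches.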

Example of a three-segment instruction pipeline:

Suppose we want to execute instructions that perform arithmetic, logic, or shift operations. The instruction cycle can then be divided into the following three segments:

  • I – Instruction fetch
  • A – ALU Operation
  • E – Execute Instruction

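The I segment fetches the instruction from program memory. The instruction is decoded and an ALU operation is performed in the A segment. Depending on the decoded instruction, the ALU performs the data manipulation operation, evaluates the effective address for a load or store instruction, or calculates the branch address. Finally, in the E segment, the instruction is executed by directing the ALU output to its destination: a destination register, the data memory (as the address for a load or store), or the program counter.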
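The overlap between the three segments can be sketched in a few lines of Python. This is only an illustrative model, assuming an ideal pipeline with no conflicts in which one new instruction enters per clock cycle; the function names and the instruction labels are placeholders, not part of any real tool.

```python
SEGMENTS = ("I", "A", "E")  # instruction fetch, ALU operation, execute


def pipeline_timing(instructions):
    """Return a list of (name, {clock_cycle: segment}) rows.

    Assumption: an ideal pipeline, so instruction number i (counting
    from 1) enters segment I in clock cycle i and never stalls.
    """
    rows = []
    for index, name in enumerate(instructions):
        start = index + 1  # clock cycles are numbered from 1
        slots = {start + offset: segment for offset, segment in enumerate(SEGMENTS)}
        rows.append((name, slots))
    return rows


def total_cycles(instruction_count, segment_count=len(SEGMENTS)):
    """Cycles needed to finish n instructions in a k-segment pipeline: k + n - 1."""
    return segment_count + instruction_count - 1


def render(instructions):
    """Format the timing as a plain-text chart, one row per instruction."""
    rows = pipeline_timing(instructions)
    cycles = range(1, total_cycles(len(instructions)) + 1)
    width = max([len("Clock cycles")] + [len(name) for name, _ in rows])
    lines = ["Clock cycles".ljust(width) + "".join(f"{c:>3}" for c in cycles)]
    for name, slots in rows:
        cells = "".join(f"{slots.get(c, ''):>3}" for c in cycles)
        lines.append(name.ljust(width) + cells)
    return "\n".join(lines)


# Hypothetical four-instruction program used only to draw the chart.
program = ["Instruction 1", "Instruction 2", "Instruction 3", "Instruction 4"]
print(render(program))
```

With three segments, four instructions finish in 3 + 4 - 1 = 6 clock cycles, rather than the 12 cycles that would be needed if each instruction had to complete before the next one was fetched.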

Delayed Load:

Consider the following instructions:

  1. LOAD: R1 ← M[address 1]
  2. LOAD: R2 ← M[address 2]
  3. ADD: R3 ← R1 + R2
  4. STORE: M[address 3] ← R3

The tables below show the pipeline timing for these instructions, first with a data conflict and then with the conflict removed by a delayed load.

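| Clock cycles | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1. Load R1 | I | A | E |  |  |  |
| 2. Load R2 |  | I | A | E |  |  |
| 3. Add R1 + R2 |  |  | I | A | E |  |
| 4. Store R3 |  |  |  | I | A | E |

Pipeline timing with data conflict
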
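| Clock cycles | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1. Load R1 | I | A | E |  |  |  |  |
| 2. Load R2 |  | I | A | E |  |  |  |
| 3. No-operation |  |  | I | A | E |  |  |
| 4. Add R1 + R2 |  |  |  | I | A | E |  |
| 5. Store R3 |  |  |  |  | I | A | E |

Pipeline timing with delayed load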

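Both tables show three-segment pipeline timing. In the first table, the Add instruction uses R2 in its A segment in clock cycle 4, which is the same cycle in which the second Load is still transferring the operand into R2 in its E segment, so the addition would not see the correct value. In the second table, the compiler inserts a no-operation instruction after the second Load. This delays the Add by one clock cycle, so R2 is loaded before it is used. This technique is called delayed load.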
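The compiler's role in a delayed load can be sketched as a small pass over the instruction list. This is a simplified illustration, not a real compiler algorithm: it assumes the three-segment pipeline above, it only checks the instruction immediately following a LOAD, and the `Instr` type and `insert_delayed_loads` function are hypothetical names introduced here.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this sketch."""
    op: str                     # e.g. "LOAD", "ADD", "STORE", "NOP"
    dest: str                   # register or memory location written
    sources: Tuple[str, ...]    # registers the instruction reads


NOP = Instr("NOP", "", ())


def insert_delayed_loads(program: List[Instr]) -> List[Instr]:
    """Insert a no-operation after a LOAD whose register is read by the next instruction.

    Assumption: a LOAD writes its register in the E segment, while the
    instruction fetched right after it would need that register one
    segment too early, so a single NOP is enough to separate them.
    """
    result: List[Instr] = []
    for instr in program:
        previous = result[-1] if result else None
        if previous is not None and previous.op == "LOAD" and previous.dest in instr.sources:
            result.append(NOP)
        result.append(instr)
    return result


# The four instructions from the example above.
program = [
    Instr("LOAD", "R1", ()),                 # R1 <- M[address 1]
    Instr("LOAD", "R2", ()),                 # R2 <- M[address 2]
    Instr("ADD", "R3", ("R1", "R2")),        # R3 <- R1 + R2
    Instr("STORE", "M[address 3]", ("R3",)), # M[address 3] <- R3
]

scheduled = insert_delayed_loads(program)
print([instr.op for instr in scheduled])
```

For the example program this places one no-operation between the second LOAD and the ADD, matching the delayed-load timing table. A real compiler would first try to move a useful, independent instruction into that slot and fall back to a no-operation only when none is available.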

Delayed Branch:

The method used in most RISC processors is to rely on the compiler to redefine the branches so that they take effect at the proper time in the pipeline. This method is referred to as a delayed branch.

An example of a delayed branch is shown below:

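| Clock cycles | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Load | I | A | E |  |  |  |  |  |  |  |
| 2. Increment |  | I | A | E |  |  |  |  |  |  |
| 3. Add |  |  | I | A | E |  |  |  |  |  |
| 4. Subtract |  |  |  | I | A | E |  |  |  |  |
| 5. Branch to X |  |  |  |  | I | A | E |  |  |  |
| 6. No-operation |  |  |  |  |  | I | A | E |  |  |
| 7. No-operation |  |  |  |  |  |  | I | A | E |  |
| 8. Instruction in X |  |  |  |  |  |  |  | I | A | E |

Using no-operation instructions
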
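| Clock cycles | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1. Load | I | A | E |  |  |  |  |  |
| 2. Increment |  | I | A | E |  |  |  |  |
| 3. Branch to X |  |  | I | A | E |  |  |  |
| 4. Add |  |  |  | I | A | E |  |  |
| 5. Subtract |  |  |  |  | I | A | E |  |
| 6. Instruction in X |  |  |  |  |  | I | A | E |

Rearranging the instructions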

The program for this example consists of five instructions:

  • Load from memory to R1
  • Increment R2
  • Add R3 to R4
  • Subtract R5 from R6
  • Branch to address X

In the “Using no-operation instructions” table, the compiler inserts two no-op instructions after the branch. The branch address X is transferred to the PC in clock cycle 7. The no-op instructions delay the fetching of the instruction at X by two clock cycles, so the instruction at X starts its fetch phase in clock cycle 8, after the program counter (PC) has been updated.

In the “Rearranging the instructions” table, the program is rearranged by placing the add and subtract instructions after the branch instruction instead of before it, as in the original program. Inspection of the pipeline timing shows that the PC is updated to the value of X in clock cycle 5, but the add and subtract instructions are still fetched from memory and executed in the proper sequence. As a result, the instruction at X completes in clock cycle 8 instead of clock cycle 10.
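The two ways of handling the branch can be compared with a short sketch. It assumes the three-segment pipeline above, where two more instructions are fetched before the branch updates the PC, and it assumes the branch does not depend on the instructions being moved (which a real compiler would have to verify). The function names are hypothetical.

```python
SEGMENT_COUNT = 3   # I, A, E
BRANCH_DELAY = 2    # instructions fetched after the branch before the PC changes


def pad_with_noops(program, branch_index):
    """Insert BRANCH_DELAY no-operation instructions right after the branch."""
    head = program[:branch_index + 1]
    return head + ["No-operation"] * BRANCH_DELAY + program[branch_index + 1:]


def fill_delay_slots(program, branch_index):
    """Move the BRANCH_DELAY instructions that precede the branch to just after it.

    Assumption: the branch does not depend on the moved instructions.
    """
    if branch_index < BRANCH_DELAY:
        raise ValueError("not enough instructions before the branch to fill the delay slots")
    first_moved = branch_index - BRANCH_DELAY
    moved = program[first_moved:branch_index]
    return program[:first_moved] + [program[branch_index]] + moved + program[branch_index + 1:]


def cycles_to_complete(fetched_instructions):
    """Clock cycles for a conflict-free k-segment pipeline: k + n - 1."""
    return SEGMENT_COUNT + len(fetched_instructions) - 1


# The five-instruction program from the example; the branch is at index 4.
program = ["Load", "Increment", "Add", "Subtract", "Branch to X"]
target = ["Instruction in X"]

padded = pad_with_noops(program, 4) + target
rearranged = fill_delay_slots(program, 4) + target

print(padded, cycles_to_complete(padded))
print(rearranged, cycles_to_complete(rearranged))
```

Under these assumptions the padded sequence takes 10 clock cycles and the rearranged sequence takes 8, which agrees with the two timing tables: the delay slots are filled with useful work instead of no-operations.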
