Vector Processing in Computer Architecture

YASH PAL, December 27, 2025; March 4, 2026

Many scientific problems cannot be solved by conventional computers within a reasonable amount of time. To achieve the required level of performance, it is necessary to use the fastest and most reliable hardware and to apply innovative procedures.

Vector Processing

In computer architecture, the vector processing technique is used to give a computer system high computational capability. Computers with vector processing capabilities are required in specialized applications such as medical diagnosis, aerodynamics, artificial intelligence, and image processing.

A vector computer, or vector processor, is a machine designed to efficiently handle arithmetic operations on the elements of arrays, called vectors. Such machines are especially useful in high-performance scientific computing, where matrix and vector arithmetic are common. The Cray Y-MP and the Convex C3880 are two examples of vector processors.

Vector Arithmetic

A vector V of length n is represented as a row vector:

$$V = [V_1, V_2, V_3, \ldots, V_n]$$

It may be represented as a column vector if the data are listed in a column. A conventional sequential computer processes operands one at a time. In contrast, vector processing breaks a single computation into operations on subscripted variables, the elements of a vector. When a vector is mapped to a computer program, it is declared as a one-dimensional array. In Fortran, vector V is declared by the statement

    DIMENSION V(N)

where N is an integer variable holding the length of the vector.

Arithmetic operations can also be performed on vectors. Two vectors are added by adding the corresponding elements:

$$S = X + Y = (X_1 + Y_1, X_2 + Y_2, \ldots, X_N + Y_N)$$

In Fortran, vector addition could be performed by the following code:

    DO I = 1, N
       S(I) = X(I) + Y(I)
    END DO

where S is the vector holding the sum of vectors X and Y.
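For comparison with the Fortran loop, the same element-wise addition can be sketched in Python. This is only an illustration: the function name `vector_add` and the sample values are assumptions, not part of the original article, and plain Python lists are processed one element at a time rather than by vector hardware.

```python
def vector_add(x, y):
    """Return the element-wise sum of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # One addition per pair of corresponding elements, as in the DO loop.
    return [xi + yi for xi, yi in zip(x, y)]


# Sample data chosen for illustration only.
x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
s = vector_add(x, y)
print(s)
```

On a vector processor the whole loop would be replaced by a single vector instruction operating on all N element pairs.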
In the Fortran loop, S, X, and Y are declared as arrays of dimension N. This operation is sometimes called element-wise addition. Similarly, subtraction of two vectors is an element-wise operation.

Matrix multiplication is one of the most computationally intensive operations performed in computers. The multiplication of two n x n matrices consists of $n^2$ inner products or $n^3$ multiply-add operations. An n x m matrix of numbers has n rows and m columns and may be considered as a set of n row vectors or a set of m column vectors. For example, the multiplication of two 3 x 3 matrices A and B is written as

$$
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\times
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}
=
\begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}
$$

The product matrix C is a 3 x 3 matrix whose elements are related to the elements of A and B by the inner product:

$$c_{ij} = \sum_{k=1}^{3} a_{ik} \times b_{kj}$$

For example, the element in the first row and first column of C is obtained by letting i = 1 and j = 1:

$$c_{11} = a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31}$$

This requires three multiplications and, if the sum is accumulated starting from 0, three additions. The figure shows the pipeline to calculate an inner product.

[Figure: Pipeline for calculating an inner product]

Memory Interleaving

To allow faster access to vector elements stored in memory, the memory of a vector processor is often divided into memory banks. This arrangement is called interleaved memory. Interleaved memory banks associate successive memory addresses with successive banks cyclically: word 0 is stored in bank 0, word 1 in bank 1, word (n-1) in bank (n-1), word n in bank 0, word (n+1) in bank 1, and so on, where n is the number of memory banks. As with many other architectural features, n is usually a power of 2:

$$n = 2^k, \quad k = 1, 2, 3, \ldots$$

One memory access (load or store) of a data value in a memory bank takes several clock cycles to complete.
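The cyclic assignment of addresses to banks described above amounts to a modulo operation. The following is a minimal sketch, assuming a word-addressed memory; the names `bank_of` and `row_in_bank` and the choice of 4 banks are hypothetical and used only for illustration.

```python
NUM_BANKS = 4  # assumed for illustration; n = 2**k with k = 2


def bank_of(word_address, num_banks=NUM_BANKS):
    """Bank holding the word: successive addresses cycle through the banks."""
    return word_address % num_banks


def row_in_bank(word_address, num_banks=NUM_BANKS):
    """Position of the word inside its bank."""
    return word_address // num_banks


# Show where the first eight words land.
for address in range(8):
    print(address, bank_of(address), row_in_bank(address))
```

Because n is a power of 2, the bank number is simply the low-order k bits of the address and the position within the bank is the remaining high-order bits, so no division hardware is needed.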
Each memory bank allows only one data value to be read or stored in a single memory access, but more than one memory bank may be accessed at the same time. When the elements of a vector stored in an interleaved memory are read into a vector register, the reads are staggered across the memory banks so that one vector element is read from a bank per clock cycle. If one memory access takes n clock cycles and there are n banks, the accesses overlap so that, once the first access completes, one element arrives every clock cycle; in steady state this is n times faster than issuing the same number of accesses to a single bank.

Supercomputers

A commercial computer with vector instructions and pipelined floating-point arithmetic operations is referred to as a supercomputer. Supercomputers are very powerful, high-performance machines used mostly for scientific computations. To speed up operation, the components are packed tightly together to minimize the distance that electronic signals have to travel. Because the circuits are so close together, supercomputers also use special techniques for removing heat to prevent the circuits from burning up. Some application areas of supercomputers are:

- Radar and signal processing for the detection of space and underwater targets.
- Remote sensing for earth resource exploration.
- Computational wind tunnel experiments.
- 3D stop-action computer-assisted tomography.
- Weather forecasting.
- Medical diagnosis.

Vector Instruction Fields

Vector instructions are usually specified by the following fields:

Operation code | Base address 1 | Base address 2 | Base address destination | Vector length

Instruction format for vector processor

- Opcode (operation code): This field selects the functional unit, or reconfigures a multifunctional unit, to perform the specified operation.
- Base addresses: For a memory reference instruction, this field specifies the base addresses of the source operands and the result vector. If the operands and results are located in the vector register file, the designated vector registers must be specified instead.
- Address increment: This field specifies the spacing between two successive elements in main memory. Usually the elements are stored consecutively, so the increment is 1. Variable increments, however, offer greater flexibility to applications.
- Address offset: This field specifies the offset relative to the base address. The effective memory address is calculated from the base address and the offset. The offset can be either positive or negative.
- Vector length: This field determines when a vector instruction terminates. Vector length also affects processing efficiency, because long vectors must be subdivided into shorter segments.

Related Questions and Answers

Why is the vector processing technique used?

The vector processing technique is used to provide high computational capability to the computer system. Computers with vector processing capabilities are required in specialized applications.
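The address-related fields described under Vector Instruction Fields (base address, address offset, address increment, and vector length) can be combined to list the memory words that one vector instruction touches. This is a rough sketch only: the function `element_addresses` and its parameter names are hypothetical, and real machines differ in how they encode and combine these fields.

```python
def element_addresses(base, offset, increment, length):
    """Word addresses of the elements accessed by one vector instruction."""
    start = base + offset  # effective address of the first element
    # Each later element lies `increment` words after the previous one.
    return [start + i * increment for i in range(length)]


# Consecutive elements (increment 1), then every fourth word with a
# negative offset; all values are chosen for illustration only.
print(element_addresses(base=100, offset=0, increment=1, length=4))
print(element_addresses(base=100, offset=-2, increment=4, length=3))
```

An increment larger than 1 is what lets a single instruction walk down a column of a matrix stored row by row.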