Real Numbers Representation | Computer Architecture
YASH PAL, February 23, 2026

Real Numbers – Real numbers are numbers that can include a fractional part after the decimal point. There are two types of representation of real numbers: fixed point representation and floating point representation.

Fixed Point Representation

This representation has a fixed number of bits (or digits) for the integer part and for the fractional part. For example, in the decimal fixed-point format IIII.FFFF, the smallest nonzero value you can store is 0000.0001 and the largest is 9999.9999. A fixed-point number has three fields: the sign field, the integer field, and the fractional field.

The signed value can be stored in any of the following forms, for k bits:

Sign-magnitude representation: range from $-(2^{k-1}-1)$ to $2^{k-1}-1$.
1's complement representation: range from $-(2^{k-1}-1)$ to $2^{k-1}-1$.
2's complement representation: range from $-2^{k-1}$ to $2^{k-1}-1$.

2's complement representation is preferred in computer systems because zero has a single, unambiguous representation and arithmetic operations are simpler.

Example: Assume a number uses a 32-bit format that reserves 1 bit for the sign, 15 bits for the integer part, and 16 bits for the fractional part. Then -43.625 is represented as follows:

Sign bit | Integer part | Fractional part
1 | 000000000101011 | 1010000000000000

Here 0 in the sign bit represents (+) and 1 represents (-); 000000000101011 is the 15-bit binary value of decimal 43, and 1010000000000000 is the 16-bit binary value of the fraction 0.625 (0.101 in binary, padded with zeros).

The advantage of fixed-point representation is performance; the disadvantage is the relatively limited range of values it can represent. It is therefore usually inadequate for numerical analysis, because it does not offer enough range and accuracy. A number whose representation needs more than 32 bits has to be stored inexactly.
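The 1/15/16-bit layout used in the example can be sketched in Python. This is only an illustration under stated assumptions: the helper names encode_fixed and decode_fixed are hypothetical, the sign is stored as a separate bit (sign-magnitude), and values that need more than 16 fractional bits are rounded to the nearest representable value.

```python
def encode_fixed(value, int_bits=15, frac_bits=16):
    """Return (sign, integer, fraction) bit strings for a sign-magnitude fixed-point value.

    Hypothetical helper for illustration; field widths default to the 1/15/16 layout.
    """
    sign = "1" if value < 0 else "0"
    # Scaling by 2**frac_bits turns the magnitude into an integer count of 2**-frac_bits steps.
    scaled = round(abs(value) * (1 << frac_bits))
    if scaled >= 1 << (int_bits + frac_bits):
        raise OverflowError("magnitude does not fit in the integer field")
    bits = format(scaled, f"0{int_bits + frac_bits}b")
    return sign, bits[:int_bits], bits[int_bits:]


def decode_fixed(sign, integer, fraction):
    """Invert encode_fixed: rebuild the real value from the three bit fields."""
    magnitude = int(integer + fraction, 2) / (1 << len(fraction))
    return -magnitude if sign == "1" else magnitude


# encode_fixed(-43.625) -> ('1', '000000000101011', '1010000000000000')
```

Decoding the fields of -43.625 gives the value back exactly, while a value such as 0.1 comes back slightly changed, because it has no finite binary expansion and must be rounded to 16 fractional bits.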
Smallest

Sign bit | Integer part | Fractional part
0 | 000000000000000 | 0000000000000001

Largest

Sign bit | Integer part | Fractional part
0 | 111111111111111 | 1111111111111111

These are the smallest and the largest positive numbers that can be stored in the 32-bit format given above. The smallest positive number is $2^{-16} \approx 0.000015$, and the largest positive number is $(2^{15}-1) + (1-2^{-16}) = 2^{15} - 2^{-16} \approx 32768$. The gap between consecutive representable numbers is $2^{-16}$ over the whole range. The radix point itself is not stored: its position is fixed by the format, and it can be moved left or right only by choosing a different split between the integer field and the fractional field.

Floating Point Representation

This representation does not reserve a specific number of bits for the integer part or the fractional part. Instead, it reserves a certain number of bits for the digits of the number (called the mantissa or significand) and a certain number of bits to say where within those digits the radix point sits (called the exponent).

A floating-point number therefore has two parts. The first part is a signed fixed-point number called the mantissa, which may be a fraction or an integer. The second part, called the exponent, designates the position of the decimal (or binary) point.

A floating-point number is always interpreted as $m \times r^{e}$, where r is the radix. Only the mantissa m and the exponent e (including their signs) are physically represented in the register; the radix r is implied. A floating-point binary number is represented in the same manner, except that it uses base 2 for the exponent. A floating-point number is said to be normalized if the most significant digit of the mantissa is nonzero, which in binary means it is 1.

Sign bit | Exponent | Mantissa

With a biased exponent and the leading 1 left implicit, the actual number is $(-1)^{s} \times (1+m) \times 2^{e-\text{Bias}}$, where s is the sign bit, m is the stored mantissa (the fraction), e is the stored exponent value, and Bias is the bias number.

Note that signed integers and exponents can be represented in sign-magnitude, one's complement, or two's complement form. Floating-point representation is more flexible than fixed-point representation, because the radix point is not tied to a single position.
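To make the formula with the sign bit s, mantissa m, exponent e, and Bias concrete, the sketch below pulls the three fields out of a 32-bit float and plugs them back into it. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits, bias 127), which the text above does not name explicitly; the function names fields and value are hypothetical, and only normalized numbers are handled.

```python
import struct

BIAS = 127  # assumption: IEEE 754 single-precision bias


def fields(x):
    """Return (s, e, frac) for x stored as a 32-bit float: sign bit, stored exponent, 23-bit fraction."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    s = raw >> 31            # top bit
    e = (raw >> 23) & 0xFF   # next 8 bits
    frac = raw & 0x7FFFFF    # low 23 bits
    return s, e, frac


def value(s, e, frac):
    """Evaluate (-1)^s * (1 + m) * 2^(e - Bias) for a normalized number."""
    m = frac / (1 << 23)     # fraction as a value in [0, 1)
    return (-1) ** s * (1 + m) * 2.0 ** (e - BIAS)


# fields(-6.5)         -> (1, 129, 5242880)   since 6.5 = (1.101)_2 x 2^2 and 2 + 127 = 129
# value(1, 129, 5242880) -> -6.5
```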
Any nonzero number x can be written in the normalized form $\pm(1.b_1b_2b_3\ldots)_2 \times 2^{n}$.

Example: Suppose a number uses a 32-bit format with 1 sign bit, 8 bits for the signed exponent, and 23 bits for the fractional part. The leading bit 1 is not stored (it is always 1 for a normalized number) and is referred to as the "hidden bit". Then -53.5 is normalized as $-53.5 = (-110101.1)_2 = (-1.101011)_2 \times 2^{5}$, which is represented as follows:

Sign bit | Exponent part | Mantissa part
1 | 00000101 | 10101100000000000000000

Here 00000101 is the 8-bit binary value of the exponent +5. (A format with a biased exponent, such as IEEE 754 single precision with bias 127, would store 5 + 127 = 132 = 10000100 in this field instead.) Note that the 8-bit exponent field is used to store integer exponents $-126 \le n \le 127$.

The smallest normalized positive number that fits into 32 bits is $(1.00000000000000000000000)_2 \times 2^{-126} = 2^{-126} \approx 1.18 \times 10^{-38}$, and the largest normalized positive number that fits into 32 bits is $(1.11111111111111111111111)_2 \times 2^{127} = (2^{24}-1) \times 2^{104} \approx 3.40 \times 10^{38}$. These numbers are represented as below, where 10000010 is -126 in 8-bit two's complement:

Smallest

Sign bit | Exponent part | Mantissa
0 | 10000010 | 00000000000000000000000

Largest

Sign bit | Exponent part | Mantissa
0 | 01111111 | 11111111111111111111111
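The normalization steps in this example can be sketched as follows. The sketch follows the layout used in the example rather than IEEE 754: the exponent is stored as a signed (two's complement) integer, not a biased one, and the leading 1 is hidden. The function name normalize is hypothetical, and rounding to 23 fraction bits is an assumption for values that need more precision.

```python
import math


def normalize(x, exp_bits=8, frac_bits=23):
    """Return (sign, exponent, mantissa) bit strings for nonzero x.

    Hypothetical helper: two's complement exponent, hidden leading 1, as in the example layout.
    """
    if x == 0:
        raise ValueError("zero has no normalized form")
    sign = "1" if x < 0 else "0"
    f, e = math.frexp(abs(x))        # abs(x) == f * 2**e with 0.5 <= f < 1
    n = e - 1                        # so abs(x) == (2*f) * 2**n with 1 <= 2*f < 2
    frac = round((2 * f - 1) * (1 << frac_bits))  # bits after the hidden 1
    if frac == 1 << frac_bits:       # rounding carried into the hidden bit
        frac, n = 0, n + 1
    if not -126 <= n <= 127:
        raise OverflowError("exponent outside -126..127")
    exponent = format(n & ((1 << exp_bits) - 1), f"0{exp_bits}b")  # two's complement
    return sign, exponent, format(frac, f"0{frac_bits}b")


# normalize(-53.5) -> ('1', '00000101', '10101100000000000000000')
```

Calling normalize(2.0 ** -126) and normalize((2 ** 24 - 1) * 2.0 ** 104) yields the fields of the smallest and largest normalized positive numbers in this layout.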