Representation of Characters in a computer

Posted on July 4, 2021 by YASH PAL
To represent a character in a computer, we use two symbols: 0 and 1. All data to be stored and processed in a computer is transformed or coded into strings of these two symbols, with one symbol representing each state.
0 and 1 are known as bits, an abbreviation for binary digits. There are four unique combinations of two bits:
00 01 10 11

and there are eight unique combinations, or strings, of 3 bits:
000 001 010 011 100 101 110 111

Each unique string of bits may be used to represent, or code, a symbol. The English alphabet has 26 letters, so to code the 26 capital (upper-case) letters at least 26 unique strings of bits are needed. With 4 bits we have only 2 × 2 × 2 × 2 = 16 unique strings, which is not sufficient. With 5 bits we have 2 × 2 × 2 × 2 × 2 = 32 unique strings, which is sufficient. So we pick 26 of these combinations to represent the English letters in the computer, as you can see in the table below (a short code sketch after the table reproduces the same mapping).

Bit string   Letter   Bit string   Letter
00000        A        10000        Q
00001        B        10001        R
00010        C        10010        S
00011        D        10011        T
00100        E        10100        U
00101        F        10101        V
00110        G        10110        W
00111        H        10111        X
01000        I        11000        Y
01001        J        11001        Z
01010        K
01011        L
01100        M
01101        N
01110        O
01111        P
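As a quick check of this mapping, here is a minimal Python sketch (Python is used here only as an illustration, it is not part of the original article): each upper-case letter is given its position in the alphabet, 0 to 25, and that number is written with exactly 5 bits, which reproduces the table above.

def five_bit_code(letter):
    # A -> 0, B -> 1, ..., Z -> 25, then written as a 5-bit binary string
    index = ord(letter) - ord('A')
    return format(index, '05b')

for ch in "ABQZ":
    print(ch, five_bit_code(ch))   # prints: A 00000, B 00001, Q 10000, Z 11001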

But there is a problem: data processing using computers requires not only the 26 capital English letters but also the 26 lower-case English letters, 10 digits, and around 32 other characters such as punctuation marks, arithmetic operator symbols, and parentheses. Thus the total number of characters to be coded is 26 + 26 + 10 + 32 = 94.

With strings of 6 bits each, it is possible to code only 2 × 2 × 2 × 2 × 2 × 2 = 64 characters, so 6 bits are insufficient for coding these 94 characters. Strings of 7 bits each give 2 × 2 × 2 × 2 × 2 × 2 × 2 = 128 unique bit strings and can thus code up to 128 characters. So strings of 7 bits each are sufficient to code the 94 characters.
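The same arithmetic can be checked directly; this small Python snippet (an illustration, not from the original article) counts the characters and computes the minimum number of bits needed to code them.

import math

characters = 26 + 26 + 10 + 32                      # upper case + lower case + digits + other symbols = 94
bits_needed = math.ceil(math.log2(characters))      # smallest n with 2**n >= 94
print(characters, bits_needed, 2 ** bits_needed)    # prints: 94 7 128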

The most popular standard is known as ASCII (American Standard Code for Information Interchange). This standard uses 7 bits to code each character, as shown in the table below (a short code sketch after the table prints a few of these codes).

Least significant bits (rows)   Most significant bits b6 b5 b4 (columns)
b3 b2 b1 b0   000   001   010     011   100   101   110   111
0 0 0 0       NUL   DLE   SPACE   0     @     P     `     p
0 0 0 1       SOH   DC1   !       1     A     Q     a     q
0 0 1 0       STX   DC2   "       2     B     R     b     r
0 0 1 1       ETX   DC3   #       3     C     S     c     s
0 1 0 0       EOT   DC4   $       4     D     T     d     t
0 1 0 1       ENQ   NAK   %       5     E     U     e     u
0 1 1 0       ACK   SYN   &       6     F     V     f     v
0 1 1 1       BEL   ETB   '       7     G     W     g     w
1 0 0 0       BS    CAN   (       8     H     X     h     x
1 0 0 1       HT    EM    )       9     I     Y     i     y
1 0 1 0       LF    SUB   *       :     J     Z     j     z
1 0 1 1       VT    ESC   +       ;     K     [     k     {
1 1 0 0       FF    FS    ,       <     L     \     l     |
1 1 0 1       CR    GS    -       =     M     ]     m     }
1 1 1 0       SO    RS    .       >     N     ^     n     ~
1 1 1 1       SI    US    /       ?     O     _     o     DEL
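In Python (used here only as an illustration), ord() returns the ASCII code of a character as an integer and format(code, '07b') writes it as a 7-bit string, so the codes in the table above can be checked for any character:

for ch in ['A', 'a', '0', ' ', '?']:
    code = ord(ch)                          # ASCII code as an integer
    print(repr(ch), code, format(code, '07b'))
# 'A' 65 1000001
# 'a' 97 1100001
# '0' 48 0110000
# ' ' 32 0100000
# '?' 63 0111111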

For example, if we type RAMA J into the computer, the bit representation of this string is

1010010   1000001   1001101   1000001   0100000   1001010
R         A         M         A         SPACE     J

The blank between RAMA and J also needs a code; this code is essential so that a blank is left between RAMA and J when the string is printed.
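The same encoding can be produced mechanically; this short Python line (an illustration, not part of the original article) converts each character of the string, including the space, to its 7-bit ASCII code:

text = "RAMA J"
print(' '.join(format(ord(ch), '07b') for ch in text))
# 1010010 1000001 1001101 1000001 0100000 1001010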

A string of bits used to represent a character is known as a byte. Characters coded in ASCII need only 7 bits, but the need to accommodate characters of languages other than English was foreseen while designing ASCII, so 8 bits were specified to represent a character. Thus a byte is commonly understood as a string of 8 bits.

The International Organization for Standardization (ISO) later standardized an 8-bit code (ISO 8859-1, Latin-1) for the Latin script used in Europe, covering accented letters in addition to the English letters. It was widely used in Europe, and the ASCII code is a proper subset of it. In 1991 the Unicode Consortium proposed a 16-bit standard called Unicode. The primary idea of Unicode is to separate the coding of characters from their graphical representations, called glyphs.
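In a modern language, a string is handled as a sequence of Unicode code points, and the bytes actually stored are produced by a separate encoding step; a small Python illustration (assuming the UTF-8 encoding, which is not discussed in the original article):

text = "naïve"                            # contains a non-English letter
print([ord(ch) for ch in text])           # Unicode code points: [110, 97, 239, 118, 101]
print(text.encode('utf-8'))               # bytes as stored or transmitted: b'na\xc3\xafve'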

