ASCII: The American Standard Code for Information Interchange

In May of 1961, the American Standards Association's X3.2 subcommittee met to discuss standardising a new system for encoding characters in telecommunications and computing devices. The resulting 7-bit standard, first published in 1963, provided for at most 128 code points, assigned in an organised manner so that devices could communicate with one another unambiguously. The initial published standard was revised several times, however, and some of the changes made in those revisions are as important as the original standard when looking at ASCII's impact on computing in the present day. This page presents some key information about the initial standard and its early revisions, with enough context to be informative and interesting even if you know nothing whatsoever about the topic beforehand.

Each character of the ASCII is encoded in a byte containing 7 bits of data, each bit having a binary value of either 0 or 1. (Today, a byte is almost always understood to consist of 8 bits.) I don't want to get overly bogged down in how computers encode data in general here, but I will briefly acknowledge the concept of bit indexing, if only to explain that the bits of a byte are numbered, in effect, from right to left. In other words, the "first" bit, which is known as the least significant bit (LSb), is the one written at the end of the number. Unsurprisingly, the bit written first (the one on the far left) is known as the most significant bit (MSb). For example, in the ASCII character 1001100 (which happens to be the letter L), the bits are numbered as follows:

bit 7 (MSb)	bit 6	bit 5	bit 4	bit 3	bit 2	bit 1 (LSb)
1	0	0	1	1	0	0

(Note: Yes, if you're wondering, the ASCII indexes bits from 1, not 0. If you weren't wondering, don't worry about it.)
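
The numbering can be checked with a short Python sketch. The helper name `ascii_bits` is hypothetical, introduced here purely for illustration; it lists the bits of a 7-bit character from bit 7 (MSb) down to bit 1 (LSb), using the 1-based numbering described above.

```python
def ascii_bits(char):
    """Return (bit number, bit value) pairs for a 7-bit character, MSb first."""
    code = ord(char)
    if code > 0b1111111:
        raise ValueError("not a 7-bit character")
    # Bit n (counting from 1) has weight 2**(n - 1), so shift right by n - 1.
    return [(n, (code >> (n - 1)) & 1) for n in range(7, 0, -1)]


print(format(ord("L"), "07b"))  # the code written out MSb first: 1001100
print(ascii_bits("L"))
```

The second line of output pairs each bit number with its value, matching the small table above.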

The complete array of ASCII code points (each code point representing one character) is defined in a table which helps to visualise how the characters are organised. Remember the bit order when reading the tables, i.e. bits 5 to 7 are the ones on the left, and bits 1 to 4 are the ones on the right.
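
Looking up a code in such a table amounts to splitting it into its three high bits (bits 5 to 7, the column) and its four low bits (bits 1 to 4, the row). A minimal sketch in Python, where `table_position` is a hypothetical helper name rather than anything defined by the standard:

```python
def table_position(code):
    """Split a 7-bit code into (column, row) labels as binary strings."""
    if not 0 <= code <= 0b1111111:
        raise ValueError("not a 7-bit code")
    column = code >> 4      # bits 5 to 7
    row = code & 0b1111     # bits 1 to 4
    return format(column, "03b"), format(row, "04b")


print(table_position(0b1001100))  # the letter L: ('100', '1100')
```

So the letter L is found in column 100, row 1100.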

Not every character is a glyph: several are "control characters" which have special functions that I'll explain later.

American Standard Code for Information Interchange (ASA X3.4-1963)

The original publication of the ASCII defined the following 100 code points, leaving 28 undefined for future standardisation:

bits 5 to 7 → bits 1 to 4 ↓	000	001	010	011	100	101	110	111
0000	NULL	DC₀	ƀ	0	@^*	P		
0001	SOM	DC₁	!	1	A	Q		
0010	EOA	DC₂	"	2	B	R		
0011	EOM	DC₃	#	3	C	S		
0100	EOT	DC₄ (STOP)	$	4	D	T		
0101	WRU	ERR	%	5	E	U		
0110	RU	SYNC	&	6	F	V		
0111	BELL	LEM	'	7	G	W		
1000	FE₀	S₀	(	8	H	X		
1001	HT/SK	S₁	)	9	I	Y		
1010	LF	S₂	*	:^*	J	Z		
1011	V_TAB	S₃	+	;^*	K	[^*		
1100	FF	S₄	,	<	L	\^*		ACK
1101	CR	S₅	-	=	M	]^*		①
1110	SO	S₆	.	>	N	↑^*		ESC
1111	SI	S₇	/	?	O	←^*		DEL

All blank positions in this table represent codes that are theoretically available for use but not standardised.

^* The document also suggests some non-standard functions that could hypothetically be assigned to some code points:

"The five graphics immediately following the letter Z can be replaced by the additional letters required for complete expression of certain European alphabets. Further, the single position preceding the letter A can be used for those alphabets requiring 32 characters. In most cases, only three additional letters will be required.

For those applications requiring use of the sterling monetary system or duodecimal arithmetic, the digits 10 and 11 can replace the two graphics immediately following the digit 9."
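
The figures of 100 defined and 28 undefined code points follow from the layout of the table: columns 000 to 101 are completely filled, column 110 has no assignments at all, and column 111 holds only its last four positions (ACK, the unassigned control character, ESC and DEL). As a quick arithmetic check in Python (the constant names are illustrative only):

```python
ROWS_PER_COLUMN = 16      # rows 0000 to 1111
FULL_COLUMNS = 6          # columns 000 to 101 are completely filled
LAST_COLUMN_ENTRIES = 4   # ACK, the unassigned control, ESC and DEL in column 111

total = 2 ** 7            # 7 bits give 128 possible codes
defined = FULL_COLUMNS * ROWS_PER_COLUMN + LAST_COLUMN_ENTRIES
undefined = total - defined

print(total, defined, undefined)  # 128 100 28
```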

Control characters

The ASCII includes a number of “control characters”, which it organises (somewhat loosely) into four categories:

Transmission controls
Format effectors
Device controls
Information separators

However, it doesn't clearly designate which characters are intended to belong to which category, other than explaining that they are largely grouped together as they appear in the code table. Context more or less fills in the gaps, though.

Abbr.	Full name	Type (inferred)
SOM	Start of message	Transmission control
EOA	End of address	Transmission control
EOM	End of message	Transmission control
WRU	"Who are you?"	Transmission control
RU	"Are you...?"	Transmission control
FE₀	Format effector	Format effector
HT/SK	Horizontal tabulation / skip (punched card)	Format effector
LF	Line feed	Format effector
V_TAB	Vertical tabulation	Format effector
FF	Form feed	Format effector
CR	Carriage return	Format effector
SO	Shift out	Format effector
SI	Shift in	Format effector
DC₀–DC₄	Device control 0–4	Device controls
ERR	Error	Transmission control
SYNC	Synchronous idle	Transmission control
LEM	Logical end of media	Transmission control
S₀–S₇	Separator 0–7	Information separators
ƀ	Word separator / space (generally non-printing)	Format effector
ACK	Acknowledge	Transmission control
①	described as an unassigned control character	N/A
ESC	Escape	Format effector
DEL	Delete	not strictly a control character at all; rather, represents all positions of a row on a punched card or perforated tape being punched out: 1111111
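
The reason an all-holes pattern works as "delete" can be sketched in Python. This is a simplified model resting on one assumption: a hole, once punched, cannot be removed, so punching over an existing row behaves like a bitwise OR. The name `overpunch` is hypothetical.

```python
DEL = 0b1111111  # every one of the 7 positions punched


def overpunch(existing, extra):
    """Model punching extra holes: a hole, once made, cannot be removed."""
    return existing | extra


# Under this model, any 7-bit code becomes DEL once all positions are punched.
assert all(overpunch(code, DEL) == DEL for code in range(128))

print(format(overpunch(ord("L"), DEL), "07b"))  # 1111111
```

Whatever character was previously punched in a row, punching out every position turns it into DEL, which is how a mistake on tape could be obliterated.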

Coming soon: ASA X3.4-1965 and beyond