A Systems Study

Encoding is how reality becomes data

Every letter you read. Every frame you watch.
Every note you hear. A single principle underlies all of it.

256 values per byte
1,114,112 Unicode code points
44,100 audio samples/sec
01

Everything is voltage

At the physical layer, information is stored as electrical states. High or low. On or off. One or zero. From this single binary constraint, all digital information emerges.

1 bit: 1 (2 states)
4 bits: 1010 (16 states)
8 bits: 01000001 (256 states)
01000001 = 65 = 0x41 = "A"
Byte value range: 0 to 255
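The doubling shown above follows from n bits having 2^n possible combinations. A minimal Python sketch of that relationship, with the helper names `state_count` and `describe_byte` chosen here purely for illustration:

```python
def state_count(bits: int) -> int:
    """Number of distinct values representable with `bits` binary digits."""
    return 2 ** bits


def describe_byte(value: int) -> dict:
    """Show one byte value as binary, decimal, hex, and a character.

    The character uses Python's chr(), which for 0-255 matches Latin-1.
    """
    if not 0 <= value <= 255:
        raise ValueError("a byte holds values 0 to 255")
    return {
        "binary": format(value, "08b"),
        "decimal": value,
        "hex": f"0x{value:02X}",
        "char": chr(value),
    }


print([state_count(n) for n in (1, 4, 8)])  # [2, 16, 256]
print(describe_byte(0b01000001))
```

The same eight bits read as 65, 0x41, or "A" depending only on which interpretation is applied.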


02

Characters as coordinates

Every character is assigned a unique number, called a code point. That number is then encoded as one or more bytes. ASCII was an early, widely adopted map. Unicode became the universal standard.

Character: A
Unicode code point: U+0041
Decimal: 65
Binary (UTF-8): 01000001
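The two steps, character to code point and code point to bytes, can be sketched with Python's built-in `ord` and `str.encode`. The helper `utf8_bits` and the sample characters are illustrative choices, not part of the original page:

```python
def utf8_bits(ch: str) -> str:
    """Return the UTF-8 bytes of one character as space-separated bit strings."""
    return " ".join(format(b, "08b") for b in ch.encode("utf-8"))


for ch in ("A", "é", "€"):
    code_point = ord(ch)  # step 1: character -> number
    print(ch, f"U+{code_point:04X}", code_point, utf8_bits(ch))  # step 2: number -> bytes
```

"A" fits in a single byte, while characters outside the ASCII range take two or more bytes in UTF-8.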
ASCII (0 – 127): 7-bit. English only. 128 characters.
Latin-1 (0 – 255): 8-bit. Western Europe. 256 characters.
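The practical effect of these ranges shows up when a character falls outside a map. A small sketch using Python's standard codecs; `try_encode` is a hypothetical helper introduced only for this example:

```python
def try_encode(text: str, encoding: str):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


for enc in ("ascii", "latin-1", "utf-8"):
    print(enc, try_encode("é", enc))
```

Here "é" (code point 233) is out of range for ASCII, fits in one byte in Latin-1, and takes two bytes in UTF-8.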
03

Images as matrices

An image is a two-dimensional grid of pixels. Each pixel stores three values (red, green, blue), each an integer from 0 to 255. A 1920×1080 image has about 2.07 million pixels, so it contains over 6.2 million such values.

R: 153
G: 204
B: 255
Hex: #99CCFF
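Both the hex color and the value count come from simple arithmetic. A minimal sketch, with function names invented for illustration:

```python
def rgb_to_hex(r: int, g: int, b: int) -> str:
    """Pack three 0-255 channel values into a #RRGGBB string."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be 0 to 255")
    return f"#{r:02X}{g:02X}{b:02X}"


def channel_values(width: int, height: int, channels: int = 3) -> int:
    """Total number of stored integers in an uncompressed image."""
    return width * height * channels


print(rgb_to_hex(153, 204, 255))   # #99CCFF
print(channel_values(1920, 1080))  # 6220800
```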
Resolution as density: 8×8, 16×16, 32×32, 64×64
RGB Screens — additive light
CMYK Print — subtractive ink
YCbCr Video — luminance + chroma
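Converting between these color models is a fixed linear mapping. As one example, the sketch below converts RGB to YCbCr using the full-range BT.601 coefficients found in JPEG/JFIF. The page does not say which YCbCr variant it has in mind, so that choice is an assumption:

```python
def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple:
    """Convert 8-bit RGB to 8-bit YCbCr (full-range BT.601, as used by JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red-difference chroma
    # Round, then clamp so the result always fits in one byte.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128)
print(rgb_to_ycbcr(153, 204, 255))
```

Any gray pixel maps to neutral chroma (128, 128), which is why the brightness channel can be stored separately from color.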
04

Sound as samples

Sound is continuous pressure variation. To store it digitally, we take snapshots, called samples, at regular intervals. The Nyquist theorem states that a band-limited signal can be reconstructed exactly from its samples, provided the sampling rate is more than twice the highest frequency it contains.

44,100 samples / second
65,536 amplitude levels
1,411 kbps uncompressed
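These three figures are linked by simple arithmetic. The sketch below assumes 16 bits per sample (since 65,536 = 2^16) and two channels (stereo); the page states neither explicitly:

```python
SAMPLE_RATE = 44_100  # samples per second, per channel
BIT_DEPTH = 16        # assumption: 65,536 levels = 2**16
CHANNELS = 2          # assumption: stereo

levels = 2 ** BIT_DEPTH
bits_per_second = SAMPLE_RATE * BIT_DEPTH * CHANNELS

print(levels)                  # 65536
print(bits_per_second / 1000)  # 1411.2 (kbps)
```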
Nyquist Theorem: fs > 2 × fmax

Human hearing reaches ~20 kHz. CD quality samples at 44.1 kHz — more than twice the maximum audible frequency.
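What goes wrong above the limit is aliasing: a tone higher than half the sampling rate produces exactly the same samples as a lower tone. A minimal sketch, with `alias_frequency` and `sample_tone` as illustrative helper names:

```python
import math


def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency in [0, fs/2] whose samples match those of a tone at f_hz."""
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


def sample_tone(f_hz: float, fs_hz: float, count: int) -> list:
    """Sample a cosine of frequency f_hz at rate fs_hz."""
    return [math.cos(2 * math.pi * f_hz * n / fs_hz) for n in range(count)]


fs = 44_100
print(alias_frequency(20_000, fs))  # 20000 (below fs/2, preserved)
print(alias_frequency(30_000, fs))  # 14100 (above fs/2, folds down)

# The 30 kHz and 14.1 kHz tones are indistinguishable once sampled.
high = sample_tone(30_000, fs, 50)
low = sample_tone(14_100, fs, 50)
print(all(abs(a - b) < 1e-9 for a, b in zip(high, low)))
```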

05

Motion as difference

Video is rarely stored as a sequence of independent images. Instead, codecs exploit temporal redundancy: periodic keyframes are stored in full, and for the frames in between only the differences are stored. This is the core insight behind most modern video compression.

I-frame (Keyframe) — full image stored
P-frame (Predicted) — difference from previous
B-frame (Bidirectional) — interpolated from both directions
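The I-frame and P-frame idea can be sketched in a few lines. This is a toy model, not how any real codec is implemented: frames are flat lists of pixel values, the first is kept whole, and each later frame is stored as per-pixel differences from the one before it. B-frames are left out for brevity.

```python
def encode_deltas(frames):
    """Keep the first frame whole (I-frame), store the rest as differences (P-frames)."""
    key = list(frames[0])
    deltas = []
    prev = key
    for frame in frames[1:]:
        deltas.append([cur - old for cur, old in zip(frame, prev)])
        prev = frame
    return key, deltas


def decode_deltas(key, deltas):
    """Rebuild every frame by adding each difference to the previous frame."""
    frames = [list(key)]
    for delta in deltas:
        frames.append([old + d for old, d in zip(frames[-1], delta)])
    return frames


frames = [[10, 10, 10, 10], [10, 10, 12, 10], [10, 10, 12, 11]]
key, deltas = encode_deltas(frames)
print(deltas)  # [[0, 0, 2, 0], [0, 0, 0, 1]]
print(decode_deltas(key, deltas) == frames)  # True
```

When little changes between frames, the differences are mostly zeros, which compress far better than full images.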
Motion Vector Concept

Instead of storing a full frame, the encoder stores vectors: "block at (x,y) moved to (x+Δx, y+Δy)"
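A motion vector can be illustrated by copying a block of the reference frame to a shifted position. The sketch below is deliberately simplified and assumes a frame stored as a list of rows. Real encoders also search for the best-matching block and store a residual, both omitted here. The function name is hypothetical.

```python
def apply_motion_vector(reference, x, y, dx, dy, size):
    """Predict a frame by pasting the size x size block at (x, y) to (x+dx, y+dy).

    The rest of the reference frame is carried over unchanged.
    """
    predicted = [row[:] for row in reference]
    for j in range(size):
        for i in range(size):
            predicted[y + dy + j][x + dx + i] = reference[y + j][x + i]
    return predicted


reference = [
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# "Block at (0, 0) moved to (2, 1)": one vector replaces four pixel values.
for row in apply_motion_vector(reference, x=0, y=0, dx=2, dy=1, size=2):
    print(row)
```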

H.264 (2003): reference
H.265 / HEVC (2013): ~50% smaller
AV1 (2018): ~40% smaller
06

Redundancy eliminated

Compression exploits patterns. If a pattern repeats, store it once and reference it. The more predictable the data, the higher the compression ratio.

Run-Length Encoding
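Run-length encoding is the simplest form of "store it once and reference it": each run of identical symbols becomes a (symbol, count) pair. A minimal sketch, with function names chosen for illustration:

```python
from itertools import groupby


def rle_encode(text: str) -> list:
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: list) -> str:
    """Expand (character, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

The gain depends entirely on run length: data with few repeats can come out larger than the input.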
Lossless: every bit recoverable. ZIP, PNG, FLAC. Size: 100% → ~60%
Lossy: imperceptible data discarded. JPEG, MP3, H.264. Size: 100% → ~5%
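The difference between the two families can be sketched with Python's standard `zlib` module for the lossless case and a simple quantizer as a stand-in for the lossy case. The quantizer is an illustrative simplification, not how JPEG or MP3 actually discard data, and the sizes it produces are not meant to match the percentages above.

```python
import zlib

# Lossless: the round trip returns exactly the original bytes.
data = b"ABABABAB" * 100
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(len(data), len(packed), restored == data)


def quantize(samples, step):
    """Lossy: snap each value down to a multiple of `step`; the remainder is gone."""
    return [s // step * step for s in samples]


samples = [3, 18, 33, 47, 200]
print(quantize(samples, 16))  # [0, 16, 32, 32, 192]
```

After quantizing, 33 and 47 both become 32, so the original values cannot be recovered. That loss is what buys the much smaller size.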
07

One principle.
Four expressions.

Text, image, audio, and video differ only in what they represent and how their data is structured — not in how they are ultimately stored. All collapse to the same substrate.

Binary: 0 & 1
Text: code points → bytes
Image: pixels → RGB triplets
Audio: samples → amplitudes
Video: frames → delta streams
Representation is abstraction. Every encoding is a mapping from human-meaningful signals to machine-operational integers.
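As a closing sketch, the same substrate can be seen by reducing a character, a pixel, and an audio sample to raw bytes. The sample value 1000 and the little-endian signed 16-bit layout are assumptions made for the example:

```python
import struct

text_bytes = "A".encode("utf-8")        # code point -> bytes
pixel_bytes = bytes([153, 204, 255])    # RGB triplet -> bytes
sample_bytes = struct.pack("<h", 1000)  # one signed 16-bit amplitude, little-endian

for label, raw in (("text", text_bytes), ("pixel", pixel_bytes), ("sample", sample_bytes)):
    print(label, " ".join(format(b, "08b") for b in raw))
```

Nothing in the bytes says which of the three they are. The meaning comes from the encoding applied when reading them back.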