Binary Data and Buffers: Working with Raw Computer Data
When you save a photo, download a file, or stream a video, your computer isn't working with the image you see or the text you read—it's working with binary data: raw sequences of 1s and 0s. Understanding binary data and how to handle it through buffers is essential for any developer working with files, networks, or real-time data processing.
Let's explore what binary data really is, why we need special tools (buffers) to work with it, and how modern applications use these concepts every day.
What Is Binary Data?
Binary data is information stored as a sequence of bytes, where each byte is 8 bits (eight binary digits, each a 1 or a 0). It's the most fundamental way computers store and transmit all types of information.
Why "Binary"?
Computers operate using electrical signals that are either ON (1) or OFF (0). Every piece of data—whether it's text, images, videos, or executable programs—must ultimately be represented as a sequence of these two states.
Think of it this way: imagine you're communicating using only two flashlights, one red and one green. To send a message, you flash them in sequence: red means 0, green means 1. That's essentially how computers store information, as sequences of binary digits (bits).
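The flashlight analogy carries straight into code: every byte is just a number from 0 to 255, and you can print its eight ON/OFF states directly. A small sketch (the `toBits` helper is purely illustrative):

```typescript
// Render a byte's bit pattern as a string of 1s and 0s.
// 72 is the byte ASCII/UTF-8 uses for the letter "H".
function toBits(byte: number): string {
  // toString(2) converts to base 2; padStart fills it out to 8 bits
  return byte.toString(2).padStart(8, "0");
}

console.log(toBits(72));  // "01001000" (the letter "H" as ON/OFF signals)
console.log(toBits(0));   // "00000000" (all bits off)
console.log(toBits(255)); // "11111111" (all bits on)
```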
Binary vs Text: A Key Distinction
It's important to understand that all data is binary at the lowest level, but we often categorize data based on how we interpret it:
Text Data (Character-Based):
What you see: "Hello"
What it means: A sequence of characters with human meaning
How it's stored: Binary (but interpreted as text using encoding like UTF-8)
Binary Data (Raw Bytes):
What you see: [72, 101, 108, 108, 111] or [0x48, 0x65, 0x6C, 0x6C, 0x6F]
What it means: Raw bytes that could represent anything
How it's stored: Binary (interpreted as numbers or raw data)
The difference isn't in how it's stored, but in how we interpret it:
- Text data: We interpret bytes as characters using an encoding system (like UTF-8)
- Binary data: We work directly with the raw bytes without text interpretation
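This distinction is easy to demonstrate: the same five bytes can be viewed as raw numbers or decoded as text. A minimal sketch using Node.js's Buffer:

```typescript
// The same five bytes, viewed two ways.
const bytes = Buffer.from([72, 101, 108, 108, 111]);

// Binary view: raw numbers, no text interpretation
console.log(Array.from(bytes));     // [ 72, 101, 108, 108, 111 ]
console.log(bytes.toString("hex")); // "48656c6c6f"

// Text view: the same bytes decoded with the UTF-8 encoding
console.log(bytes.toString("utf8")); // "Hello"
```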
Common Examples of Binary Data
Here are everyday scenarios where you work with binary data:
1. Image Files
A JPEG photo is binary data representing:
- Image dimensions
- Color information for each pixel
- Compression metadata
- Thumbnail data
2. Video Files
An MP4 video contains binary data for:
- Video frames (sequences of images)
- Audio data
- Timing information
- Codec specifications
3. PDF Documents
A PDF file stores binary data including:
- Text content and fonts
- Images and graphics
- Page layout information
- Metadata (author, creation date)
4. Network Communication
When you visit a website:
HTTP headers (text)
HTML, CSS, JavaScript (text)
Images, fonts, video (binary)
- All transmitted as binary data over the network
5. Database Records
Databases store binary data like:
- Serialized objects
- Encrypted passwords
- File attachments
- Compressed data
Why We Need Buffers
Now that we understand binary data, let's explore why we need special tools called buffers to work with it.
What Is a Buffer?
A buffer is a temporary storage area in memory that holds binary data while it's being processed, transferred, or transformed.
Real-world analogy: Think of a buffer like a loading dock at a warehouse:
- Trucks deliver packages (data arrives)
- Packages wait on the dock temporarily (buffer holds data)
- Workers process packages and send them inside (application processes data)
- The dock has limited space (buffer has fixed size)
Without the loading dock, trucks would have to wait for workers to process each package immediately. With the dock (buffer), trucks can unload quickly, and workers can process packages at their own pace.
The Problem Buffers Solve
Imagine you're downloading a large file. Without buffers:
❌ Without Buffers:
1. Network sends 1 byte → Wait for app to process → Send next byte
2. Application must immediately handle each byte as it arrives
3. Can't handle speed differences between network and processing
4. Extremely slow and inefficient
Result: Download takes hours instead of minutes
With buffers:
✅ With Buffers:
1. Network sends chunks of data → Stored in buffer
2. Application reads from buffer when ready
3. Buffer handles speed differences smoothly
4. Efficient use of memory and processing time
Result: Download completes quickly and smoothly
Key Benefits of Buffers
1. Speed Differences Management
Fast source → Buffer → Slow destination
Example: Reading from fast SSD (500 MB/s) and processing
data slowly (50 MB/s). Buffer absorbs the speed difference.
2. Data Accumulation
Small pieces → Buffer → Complete chunk
Example: Network packets arrive in small pieces. Buffer
collects them until you have enough to process.
3. Memory Efficiency
Large file → Buffer (small) → Process in chunks
Example: Processing a 1 GB video file with only a 10 MB
buffer, reading and processing in manageable chunks.
4. Data Transformation
Format A → Buffer → Transform → Format B
Example: Reading binary image data, holding it in buffer,
and converting to a different format.
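Benefit 2 (data accumulation) can be sketched in a few lines: chunks arrive one at a time and are only handed off once enough data has built up. The 8-byte threshold and the `onChunk`/`processed` names are illustrative, not from any particular API:

```typescript
// Accumulate small incoming chunks; flush once a threshold is reached.
const THRESHOLD = 8; // process in 8-byte units (illustrative)
let pending: Buffer[] = [];
let pendingBytes = 0;
const processed: Buffer[] = [];

function onChunk(chunk: Buffer): void {
  pending.push(chunk);
  pendingBytes += chunk.length;
  if (pendingBytes >= THRESHOLD) {
    // Enough data accumulated: combine and hand off one complete block
    processed.push(Buffer.concat(pending));
    pending = [];
    pendingBytes = 0;
  }
}

// Simulate network packets arriving in small pieces
onChunk(Buffer.from("Hel"));
onChunk(Buffer.from("lo "));
onChunk(Buffer.from("Wo")); // threshold reached here: flush
onChunk(Buffer.from("rld!"));

console.log(processed[0].toString()); // "Hello Wo"
console.log(pendingBytes);            // 4 (bytes still waiting in the buffer)
```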
How Buffers Work: A Step-by-Step Example
Let's see how buffers work in a practical scenario: downloading and saving an image file.
Scenario: Downloading an Image
Step-by-Step Process:
1. Request Image
└─> Your app: "I want image.jpg"
└─> Server: "Starting download..."
2. Server Sends Data in Chunks
└─> Chunk 1: [bytes 0-1023] (1 KB)
└─> Chunk 2: [bytes 1024-2047] (1 KB)
└─> Chunk 3: [bytes 2048-3071] (1 KB)
└─> ... continues until complete
3. Buffer Receives Chunks
┌────────────────────────────────┐
│     Buffer (4 KB capacity)     │
├────────────────────────────────┤
│ [Chunk 1][Chunk 2][Chunk 3]... │
└────────────────────────────────┘
4. Application Processes Buffer
└─> Reads data from buffer
└─> Writes to file: image.jpg
└─> Buffer cleared for next chunks
5. Repeat Until Complete
└─> Download continues chunk by chunk
└─> Each chunk: received → buffered → processed
└─> Final result: Complete image.jpg saved
Visual Representation
Network Stream (Fast)
         │
         ▼
    ┌─────────┐
    │ Buffer  │ ◄── Temporary holding area
    │  [...]  │     (Fixed size, e.g., 4 KB)
    └─────────┘
         │
         ▼
Application Processing (Slower)
         │
         ▼
Save to File System
What happens at each stage:
- Network sends data → Fast, unpredictable arrival times
- Buffer holds data → Smooths out timing differences
- Application reads data → At its own pace
- File system writes → Slower than network, but buffer prevents data loss
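The stages above can be sketched with a streaming reader. Here the network is simulated with a locally constructed ReadableStream so the whole flow is testable; in a real download the stream would come from `(await fetch(url)).body`. The `drainStream` helper name is hypothetical:

```typescript
// Drain a streaming body chunk by chunk into one Buffer.
// Each read() resolves with the next chunk the runtime has buffered.
async function drainStream(stream: ReadableStream<Uint8Array>): Promise<Buffer> {
  const reader = stream.getReader();
  const chunks: Buffer[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(Buffer.from(value)); // received → buffered → processed
  }
  return Buffer.concat(chunks);
}

// Simulated "network" delivering data in three chunks
const stream = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(new TextEncoder().encode("chunk1 "));
    controller.enqueue(new TextEncoder().encode("chunk2 "));
    controller.enqueue(new TextEncoder().encode("chunk3"));
    controller.close();
  },
});

drainStream(stream).then((data) => {
  console.log(data.toString()); // "chunk1 chunk2 chunk3"
});
```

In a real download you would write each chunk to disk as it arrives instead of collecting them, which keeps memory use flat regardless of file size.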
Real-World Use Cases for Binary Data and Buffers
Understanding buffers becomes crucial in these common development scenarios:
1. File Upload and Download
Scenario: User uploads a 100 MB video file to your server.
// Without understanding buffers (naive approach)
// ❌ Problem: Tries to load entire 100 MB into memory at once
async function uploadFileNaive(file: File): Promise<void> {
const entireFile = await file.arrayBuffer(); // Loads all 100 MB!
await fetch("/upload", {
method: "POST",
body: entireFile, // Sends all at once
});
// Issues: Memory spike, slow start, no progress tracking
}
// With buffer understanding (better approach)
// ✅ Solution: Process file in manageable chunks
async function uploadFileBuffered(file: File): Promise<void> {
const chunkSize = 1024 * 1024; // 1 MB chunks
const totalChunks = Math.ceil(file.size / chunkSize);
for (let i = 0; i < totalChunks; i++) {
const start = i * chunkSize;
const end = Math.min(start + chunkSize, file.size);
const chunk = file.slice(start, end); // Only loads 1 MB
await fetch("/upload", {
method: "POST",
headers: {
"Content-Range": `bytes ${start}-${end - 1}/${file.size}`, // last byte is inclusive
},
body: chunk, // Sends 1 MB at a time
});
console.log(`Uploaded ${i + 1}/${totalChunks} chunks`);
}
// Benefits: Low memory, progress tracking, resumable uploads
}
2. Image Processing
Scenario: Converting image formats or applying filters.
// Reading and processing an image file
import * as fs from "fs";
async function processImage(inputPath: string): Promise<Buffer> {
// Read image file into buffer
const imageBuffer: Buffer = await fs.promises.readFile(inputPath);
console.log(`Image size: ${imageBuffer.length} bytes`);
console.log(`First 10 bytes: ${imageBuffer.slice(0, 10).toString("hex")}`);
// Image data is now in binary form, ready for processing
// You could:
// - Resize the image
// - Apply filters
// - Convert formats
// - Extract metadata
return imageBuffer;
}
// Example: Checking image type by reading file header
function detectImageType(buffer: Buffer): string {
// PNG files start with: 89 50 4E 47
if (
buffer[0] === 0x89 &&
buffer[1] === 0x50 &&
buffer[2] === 0x4e &&
buffer[3] === 0x47
) {
return "PNG";
}
// JPEG files start with: FF D8 FF
if (buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) {
return "JPEG";
}
// GIF files start with: 47 49 46
if (buffer[0] === 0x47 && buffer[1] === 0x49 && buffer[2] === 0x46) {
return "GIF";
}
return "Unknown";
}
// Usage
async function identifyImage(): Promise<void> {
const buffer = await fs.promises.readFile("photo.jpg");
const type = detectImageType(buffer);
console.log(`Image type: ${type}`); // Output: Image type: JPEG
}
3. Network Communication
Scenario: Sending and receiving data over HTTP or WebSockets.
// Example: Receiving binary data from an API
async function downloadBinaryData(url: string): Promise<Buffer> {
const response = await fetch(url);
// Get binary data as ArrayBuffer (browser) or Buffer (Node.js)
const arrayBuffer = await response.arrayBuffer();
// Convert to Node.js Buffer for easier manipulation
const buffer = Buffer.from(arrayBuffer);
console.log(`Downloaded ${buffer.length} bytes`);
return buffer;
}
// Example: Sending binary data in a POST request
async function uploadBinaryData(data: Buffer): Promise<void> {
await fetch("https://api.example.com/upload", {
method: "POST",
headers: {
"Content-Type": "application/octet-stream", // Binary data
// Content-Length is computed automatically from the body
},
body: data, // Send raw binary data
});
}
4. Database Operations
Scenario: Storing and retrieving binary data in databases.
import { Pool } from "pg";
const pool = new Pool({
host: "localhost",
database: "myapp",
user: "postgres",
password: "password",
});
// Storing binary data (e.g., profile picture)
async function saveProfilePicture(
userId: number,
imageBuffer: Buffer
): Promise<void> {
await pool.query(
"UPDATE users SET profile_picture = $1 WHERE id = $2",
[imageBuffer, userId] // Database stores as binary (bytea in PostgreSQL)
);
}
// Retrieving binary data
async function getProfilePicture(userId: number): Promise<Buffer | null> {
const result = await pool.query(
"SELECT profile_picture FROM users WHERE id = $1",
[userId]
);
if (result.rows.length === 0) {
return null;
}
// Returns as Buffer, ready to send to client or save to file
return result.rows[0].profile_picture;
}
// Usage example
async function updateUserPicture(userId: number): Promise<void> {
// Read image file
const imageBuffer = await fs.promises.readFile("new-profile.jpg");
// Save to database
await saveProfilePicture(userId, imageBuffer);
console.log("Profile picture updated");
}
5. File Compression and Decompression
Scenario: Compressing files before upload or decompressing downloaded files.
import * as zlib from "zlib";
import { promisify } from "util";
// Convert callback-based functions to promises
const gzip = promisify(zlib.gzip);
const gunzip = promisify(zlib.gunzip);
// Compressing data
async function compressData(data: Buffer): Promise<Buffer> {
const compressed: Buffer = await gzip(data);
const originalSize = data.length;
const compressedSize = compressed.length;
const ratio = ((1 - compressedSize / originalSize) * 100).toFixed(2);
console.log(`Original: ${originalSize} bytes`);
console.log(`Compressed: ${compressedSize} bytes`);
console.log(`Saved: ${ratio}% of space`);
return compressed;
}
// Decompressing data
async function decompressData(compressed: Buffer): Promise<Buffer> {
const decompressed: Buffer = await gunzip(compressed);
console.log(`Decompressed to ${decompressed.length} bytes`);
return decompressed;
}
// Practical example: Compress before upload, decompress after download
async function uploadWithCompression(filePath: string): Promise<void> {
// Read file
const originalData = await fs.promises.readFile(filePath);
// Compress
const compressed = await compressData(originalData);
// Upload (smaller size = faster upload)
await uploadBinaryData(compressed);
console.log("Compressed file uploaded successfully");
}
async function downloadWithDecompression(url: string): Promise<Buffer> {
// Download compressed data
const compressed = await downloadBinaryData(url);
// Decompress
const original = await decompressData(compressed);
return original;
}
How Different Environments Handle Buffers
JavaScript/TypeScript runs in two main environments, and each handles binary data slightly differently:
Node.js: Buffer Class
Node.js provides a built-in Buffer class specifically designed for handling binary data.
import * as fs from "fs";
// Creating buffers in Node.js
const buffer1 = Buffer.from("Hello", "utf8"); // From string
const buffer2 = Buffer.from([72, 101, 108, 108, 111]); // From byte array
const buffer3 = Buffer.alloc(10); // Allocate 10 bytes (filled with zeros)
const buffer4 = Buffer.allocUnsafe(10); // Faster but may contain old data
// Reading buffer contents
console.log(buffer1.toString("utf8")); // "Hello"
console.log(buffer1.toString("hex")); // "48656c6c6f"
console.log(buffer1.length); // 5 bytes
// Accessing individual bytes
console.log(buffer1[0]); // 72 (H in ASCII)
console.log(buffer1[1]); // 101 (e in ASCII)
// Practical example: Reading a file
async function readFileAsBuffer(filePath: string): Promise<void> {
const buffer = await fs.promises.readFile(filePath);
console.log(`File size: ${buffer.length} bytes`);
console.log(`First byte: 0x${buffer[0].toString(16)}`);
console.log(`As text: ${buffer.toString("utf8")}`);
}
Key features of Node.js Buffer:
- Fixed-size, allocated in memory
- Efficient for I/O operations (files, network)
- Integrates with all Node.js APIs
- Can convert to/from strings with various encodings
Browser: ArrayBuffer and Typed Arrays
Browsers use ArrayBuffer and Typed Arrays for binary data.
// Creating ArrayBuffer in browser
const arrayBuffer = new ArrayBuffer(10); // 10 bytes
// Can't directly access ArrayBuffer, need a "view"
const uint8View = new Uint8Array(arrayBuffer);
const uint16View = new Uint16Array(arrayBuffer);
// Writing data
uint8View[0] = 72; // H
uint8View[1] = 101; // e
uint8View[2] = 108; // l
uint8View[3] = 108; // l
uint8View[4] = 111; // o
// Reading data
console.log(uint8View[0]); // 72
// Converting to string
const decoder = new TextDecoder("utf-8");
const text = decoder.decode(uint8View.slice(0, 5));
console.log(text); // "Hello"
// Practical example: Reading uploaded file
async function readUploadedFile(file: File): Promise<void> {
const arrayBuffer = await file.arrayBuffer();
const uint8Array = new Uint8Array(arrayBuffer);
console.log(`File size: ${uint8Array.length} bytes`);
console.log(`First byte: 0x${uint8Array[0].toString(16)}`);
}
Key features of browser ArrayBuffer:
- More flexible with Typed Array views
- Used by Fetch API, File API, WebSockets
- Can be transferred between workers efficiently
- Standard across modern browsers
Converting Between Environments
When working in both Node.js and browsers, you might need to convert between Buffer and ArrayBuffer:
// Node.js Buffer → ArrayBuffer (for browser compatibility)
function bufferToArrayBuffer(buffer: Buffer): ArrayBuffer {
return buffer.buffer.slice(
buffer.byteOffset,
buffer.byteOffset + buffer.byteLength
);
}
// ArrayBuffer → Node.js Buffer
function arrayBufferToBuffer(arrayBuffer: ArrayBuffer): Buffer {
return Buffer.from(arrayBuffer);
}
// Practical example: Universal file reader
async function readBinaryFile(input: string | File): Promise<Uint8Array> {
if (typeof input === "string") {
// Node.js: reading from file path
const buffer = await fs.promises.readFile(input);
return new Uint8Array(buffer);
} else {
// Browser: reading from File object
const arrayBuffer = await input.arrayBuffer();
return new Uint8Array(arrayBuffer);
}
}
Common Operations with Binary Data
Here are essential operations you'll frequently perform with binary data:
1. Reading Specific Bytes
// Reading file header to identify file type
function readFileHeader(buffer: Buffer): void {
// Read first 4 bytes
const header = buffer.slice(0, 4);
console.log("Header bytes:");
for (let i = 0; i < header.length; i++) {
console.log(` Byte ${i}: 0x${header[i].toString(16).padStart(2, "0")}`);
}
}
// Example with PNG file
// PNG header: 89 50 4E 47 0D 0A 1A 0A
const pngBuffer = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
readFileHeader(pngBuffer);
// Output:
// Byte 0: 0x89
// Byte 1: 0x50
// Byte 2: 0x4e
// Byte 3: 0x47
2. Concatenating Buffers
// Combining multiple buffers into one
function concatenateBuffers(buffers: Buffer[]): Buffer {
// Calculate total length
const totalLength = buffers.reduce((sum, buf) => sum + buf.length, 0);
// Allocate new buffer
const result = Buffer.allocUnsafe(totalLength);
// Copy each buffer into result
let offset = 0;
for (const buffer of buffers) {
buffer.copy(result, offset);
offset += buffer.length;
}
return result;
}
// Easier way using Buffer.concat
function concatenateBuffersEasy(buffers: Buffer[]): Buffer {
return Buffer.concat(buffers);
}
// Example: Combining file chunks
const chunk1 = Buffer.from("Hello ");
const chunk2 = Buffer.from("World");
const chunk3 = Buffer.from("!");
const combined = Buffer.concat([chunk1, chunk2, chunk3]);
console.log(combined.toString()); // "Hello World!"
3. Comparing Buffers
// Check if two buffers contain the same data
function buffersEqual(buf1: Buffer, buf2: Buffer): boolean {
if (buf1.length !== buf2.length) {
return false;
}
return buf1.equals(buf2); // Built-in method
}
// Compare buffers byte by byte
function compareBuffers(buf1: Buffer, buf2: Buffer): number {
return buf1.compare(buf2);
// Returns:
// 0 if equal
// -1 if buf1 < buf2
// 1 if buf1 > buf2
}
// Example: Verifying file integrity
async function verifyFileIntegrity(
originalPath: string,
copyPath: string
): Promise<boolean> {
const original = await fs.promises.readFile(originalPath);
const copy = await fs.promises.readFile(copyPath);
const identical = original.equals(copy);
console.log(`Files identical: ${identical}`);
return identical;
}
4. Searching Within Buffers
// Find a byte sequence within a buffer
function findSequence(buffer: Buffer, sequence: Buffer): number {
return buffer.indexOf(sequence);
}
// Example: Finding JPEG start marker
function findJpegStartMarker(buffer: Buffer): number {
const jpegMarker = Buffer.from([0xff, 0xd8, 0xff]); // JPEG SOI marker
const position = buffer.indexOf(jpegMarker);
if (position === -1) {
console.log("Not a JPEG file");
} else {
console.log(`JPEG marker found at byte ${position}`);
}
return position;
}
5. Extracting Subsets
// Extract a portion of a buffer
function extractBytes(buffer: Buffer, start: number, length: number): Buffer {
return buffer.slice(start, start + length);
}
// Example: Reading metadata from file
interface FileMetadata {
type: string;
width: number;
height: number;
}
function extractImageMetadata(buffer: Buffer): FileMetadata | null {
// This is simplified - real image parsing is more complex
const header = buffer.slice(0, 10);
// Check if PNG
if (header[0] === 0x89 && header[1] === 0x50) {
// PNG width and height are 32-bit big-endian integers at bytes 16-19 and 20-23
const width = buffer.readUInt32BE(16);
const height = buffer.readUInt32BE(20);
return {
type: "PNG",
width,
height,
};
}
return null;
}
Performance Considerations
Understanding how buffers affect performance helps you write efficient code:
Memory Usage
// ❌ Bad: Loading entire file into memory
async function processLargeFileBad(filePath: string): Promise<void> {
const entireFile = await fs.promises.readFile(filePath);
// If file is 1 GB, this uses 1 GB of RAM!
// Process the file...
processData(entireFile);
}
// ✅ Good: Processing file in chunks
async function processLargeFileGood(filePath: string): Promise<void> {
const chunkSize = 1024 * 1024; // 1 MB chunks
const fileHandle = await fs.promises.open(filePath, "r");
const buffer = Buffer.alloc(chunkSize);
let bytesRead: number;
let position = 0;
do {
// Read 1 MB at a time
({ bytesRead } = await fileHandle.read(buffer, 0, chunkSize, position));
if (bytesRead > 0) {
// Process just this chunk
processData(buffer.slice(0, bytesRead));
position += bytesRead;
}
} while (bytesRead > 0);
await fileHandle.close();
// Never uses more than 1 MB of RAM, regardless of file size!
}
Buffer Allocation
// Understanding allocation performance
function testBufferAllocation(): void {
console.time("allocUnsafe");
for (let i = 0; i < 1000000; i++) {
Buffer.allocUnsafe(1024); // Faster but may contain old data
}
console.timeEnd("allocUnsafe"); // ~100ms
console.time("alloc");
for (let i = 0; i < 1000000; i++) {
Buffer.alloc(1024); // Slower but zero-filled (secure)
}
console.timeEnd("alloc"); // ~500ms
console.time("from");
const data = new Uint8Array(1024);
for (let i = 0; i < 1000000; i++) {
Buffer.from(data); // Copies data
}
console.timeEnd("from"); // ~200ms
}
// Best practice: Choose based on use case
function chooseAllocationMethod(scenario: string): Buffer {
switch (scenario) {
case "temporary-calculation":
// Fast, will overwrite anyway
return Buffer.allocUnsafe(1024);
case "user-facing-data":
// Secure, no leftover data
return Buffer.alloc(1024);
case "from-existing-data": {
// Copy existing data
const source = new Uint8Array(1024);
return Buffer.from(source);
}
default:
// Default to safe option
return Buffer.alloc(1024);
}
}
Common Pitfalls and How to Avoid Them
Pitfall 1: Not Specifying Encoding
// ❌ Wrong: Ambiguous encoding
const buffer1 = Buffer.from("Hello");
// Default encoding is 'utf8', but it's implicit
// ✅ Correct: Explicit encoding
const buffer2 = Buffer.from("Hello", "utf8");
// Clear intent, easier to maintain
// Why it matters with non-ASCII text: the stored bytes are the same either way,
// but you must decode with the same encoding you encoded with
const text = "Café";
const buffer = Buffer.from(text, "utf8"); // "é" takes two bytes in UTF-8
console.log(buffer.toString("latin1")); // Garbled: "CafÃ©" (wrong decoding)
console.log(buffer.toString("utf8")); // Correct: "Café"
Pitfall 2: Modifying Buffers Without Understanding Slices
// Buffers created with slice() share memory!
const original = Buffer.from("Hello World");
const slice = original.slice(0, 5); // "Hello"
// Modifying the slice affects the original!
slice[0] = 74; // Change 'H' (72) to 'J' (74)
console.log(original.toString()); // "Jello World" - Original changed!
console.log(slice.toString()); // "Jello"
// ✅ Solution: Copy instead of slice when you need independence
const copy = Buffer.from(original.slice(0, 5));
copy[0] = 74;
console.log(original.toString()); // "Hello World" - Unchanged
console.log(copy.toString()); // "Jello"
Pitfall 3: Assuming Buffer is Always Available
// Node.js code won't work in browser
function readFileNode(path: string): Buffer {
// ❌ Won't work in browser
return fs.readFileSync(path);
}
// ✅ Better: Write environment-agnostic code
async function readFileCrossPlatform(
input: string | File
): Promise<Uint8Array> {
if (typeof window === "undefined") {
// Node.js environment
const fs = await import("fs");
const buffer = await fs.promises.readFile(input as string);
return new Uint8Array(buffer);
} else {
// Browser environment
const file = input as File;
const arrayBuffer = await file.arrayBuffer();
return new Uint8Array(arrayBuffer);
}
}
Summary: Key Takeaways
Understanding binary data and buffers is fundamental to working with files, networks, and real-time data. Here's what you need to remember:
What Is Binary Data?
- Raw bytes: Sequences of 1s and 0s that represent all computer data
- Interpretation matters: Same bytes can be text, image, video, or anything else
- Universal format: Everything is binary at the lowest level
What Are Buffers?
- Temporary storage: Hold binary data during processing or transfer
- Speed management: Handle differences between fast sources and slow destinations
- Memory efficiency: Process large data in manageable chunks
- Data transformation: Intermediate storage during format conversion
When to Use Buffers
✅ Reading/writing files
- Process files in chunks instead of loading entirely
- Better memory usage, especially for large files
✅ Network communication
- Receive/send data in packets
- Handle streaming data efficiently
✅ Image/video processing
- Manipulate binary media data
- Convert between formats
✅ Database operations
- Store and retrieve binary data
- Handle BLOBs (Binary Large Objects)