Most beginners think the OS is just something that “runs programs.”
In HFT, the OS is:
A real-time resource allocator that can make or break latency guarantees.
Understanding OS behavior is not optional — it is foundational.
1. What the Operating System Actually Does
At a high level, the OS is responsible for:
- CPU scheduling
- Memory management
- I/O handling
- Process and thread isolation
In normal applications, these responsibilities are invisible.
In HFT, every one of them introduces latency, jitter, or unpredictability.
2. Processes vs Threads (Why It Matters)
A process has its own virtual memory space. A thread shares memory with other threads in the same process.
Context switching between:
- Processes → expensive
- Threads → cheaper, but still not free
Each context switch:
- Flushes CPU pipelines
- Evicts cache lines
- Adds microsecond-level delay
HFT systems minimize context switches aggressively.
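A quick way to see the process/thread distinction in action (a Python sketch for brevity; production HFT code would be C++, but the memory semantics are the same — assumes Linux, where child processes are created with fork):

```python
import multiprocessing
import threading

counter = {"value": 0}

def bump():
    counter["value"] += 1

# A thread shares the parent's memory: its change is visible here.
t = threading.Thread(target=bump)
t.start(); t.join()
print("after thread:", counter["value"])    # 1

# A forked process gets its own copy of memory: the child's change
# never reaches the parent.
p = multiprocessing.get_context("fork").Process(target=bump)
p.start(); p.join()
print("after process:", counter["value"])   # still 1
```

The same isolation that makes processes safer also makes communicating between them more expensive, which is one reason latency-critical components often live in a single process.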
3. Context Switches: The Hidden Latency
During a context switch, the OS:
- Pauses one thread
- Saves its register and scheduler state
- Restores another thread’s state in its place
This happens when:
- Threads block on I/O
- The scheduler preempts execution
Even a “small” context switch can cost thousands of CPU cycles.
In HFT, those cycles may be more expensive than an entire trading decision.
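You can watch context switches happen from user space. This sketch (Python, Linux) uses `getrusage`, which reports voluntary switches (the thread blocked, e.g. on sleep or I/O) and involuntary ones (the scheduler preempted it):

```python
import resource
import time

# Snapshot the context-switch counters for this process.
# ru_nvcsw  = voluntary switches (we blocked and gave up the CPU)
# ru_nivcsw = involuntary switches (the scheduler preempted us)
before = resource.getrusage(resource.RUSAGE_SELF)

# Sleeping blocks the thread, so each iteration should add a
# voluntary context switch.
for _ in range(50):
    time.sleep(0.001)

after = resource.getrusage(resource.RUSAGE_SELF)
voluntary = after.ru_nvcsw - before.ru_nvcsw
involuntary = after.ru_nivcsw - before.ru_nivcsw
print(f"voluntary: {voluntary}, involuntary: {involuntary}")
```

On a loaded machine the involuntary count climbs even when your code never blocks — exactly the jitter HFT systems engineer away.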
4. CPU Scheduling: Fairness vs Determinism
General-purpose OS schedulers are designed for:
- Fairness
- Throughput
- Responsiveness
HFT wants:
- Determinism
- Predictable execution windows
This is why HFT systems:
- Pin threads to specific CPU cores
- Disable unnecessary background processes
- Avoid oversubscription
You don’t want your trading thread to “wait its turn.”
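Linux exposes its scheduling policies directly. The sketch below (Python, Linux) checks the current policy and attempts to request SCHED_FIFO, a real-time policy where the thread runs until it blocks or yields; note that this normally requires root or CAP_SYS_NICE, so the refusal is handled:

```python
import os

# The default Linux policy is SCHED_OTHER: fair, time-shared CFS scheduling.
policy = os.sched_getscheduler(0)   # 0 = the calling process
print("running under default SCHED_OTHER:", policy == os.SCHED_OTHER)

# SCHED_FIFO gives deterministic run-until-yield behavior, but granting
# it is privileged: unprivileged processes get PermissionError.
try:
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(10))
    print("now running under SCHED_FIFO")
except PermissionError:
    print("need root or CAP_SYS_NICE for real-time scheduling")
```

Real-time policies trade fairness for determinism — which is precisely the trade HFT wants to make.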
5. CPU Affinity and Core Pinning
Modern CPUs have:
- Multiple cores
- Per-core caches
When a thread migrates between cores:
- Its warm cache contents are left behind on the old core
- Latency spikes while the new core’s caches refill
By pinning threads to cores:
- Cache locality is preserved
- Execution becomes predictable
HFT systems often dedicate entire cores to a single critical thread.
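Pinning is a one-line operation on Linux. This sketch pins the calling thread to core 0 (an arbitrary choice here; a real deployment would pick an isolated core away from OS housekeeping) and verifies the new mask, then restores the original one:

```python
import os

# Which cores may this thread run on right now? Usually all of them.
allowed = os.sched_getaffinity(0)   # 0 = the calling thread
print("allowed cores before pinning:", sorted(allowed))

# Pin to a single core. After this, the scheduler may never migrate us,
# so cache contents on core 0 stay warm.
os.sched_setaffinity(0, {0})
pinned = os.sched_getaffinity(0)
print("allowed cores after pinning:", sorted(pinned))

# Restore the original mask (a demo courtesy, not something a
# trading thread would do).
os.sched_setaffinity(0, allowed)
```

The C equivalent is `pthread_setaffinity_np` / `sched_setaffinity`; the mechanism is identical.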
6. NUMA: Memory Is Not Equally Close
NUMA = Non-Uniform Memory Access
On multi-socket systems:
- Each CPU has its own local memory
- Accessing remote memory is slower
If your thread runs on CPU 0 but accesses memory attached to CPU 1:
- Latency increases
- Cache coherence traffic increases
HFT systems carefully align:
- Threads
- Memory allocation
- Network interrupts
Ignoring NUMA can silently double memory-access latency.
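Linux exposes the NUMA topology through sysfs, so you can inspect it without any extra libraries. A minimal sketch (Linux-only paths; single-socket machines simply report one node):

```python
import glob
import os

# Each NUMA node appears as /sys/devices/system/node/nodeN.
nodes = sorted(glob.glob("/sys/devices/system/node/node[0-9]*"))
print("NUMA nodes:", len(nodes))

# cpulist shows which cores are local to each node's memory --
# the cores you would pin threads to for local access.
for node in nodes:
    with open(os.path.join(node, "cpulist")) as f:
        print(os.path.basename(node), "cpus:", f.read().strip())
```

Production systems go further with `libnuma` (`numa_alloc_onnode`, `numactl --membind`) to force allocations onto the node local to the pinned thread.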
7. System Calls: Crossing the Kernel Boundary
A system call is when user code asks the kernel to do something.
Examples:
- Read from a socket
- Write to disk
- Allocate memory
System calls:
- Switch the CPU between user and kernel mode
- Save and restore register state
- Introduce unpredictable delays (made worse by speculative-execution mitigations)
HFT hot paths minimize or eliminate system calls entirely.
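The kernel-crossing cost is easy to measure. This sketch times a loop of real system calls against a plain user-space function call of similar shape (`os.getpid` performs an actual getpid(2) syscall on modern glibc; absolute numbers include Python overhead, but the crossing is real):

```python
import os
import time

N = 100_000

# Time N genuine kernel crossings.
start = time.perf_counter_ns()
for _ in range(N):
    os.getpid()
elapsed_syscall = time.perf_counter_ns() - start

# Time N calls that never leave user space.
def noop():
    return 0

start = time.perf_counter_ns()
for _ in range(N):
    noop()
elapsed_call = time.perf_counter_ns() - start

print(f"getpid: {elapsed_syscall / N:.0f} ns/call, "
      f"user-space noop: {elapsed_call / N:.0f} ns/call")
```

In C the contrast is starker still: a function call is a few nanoseconds, while a syscall costs tens to hundreds — which is why hot paths avoid them entirely.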
8. Interrupts: The OS Interrupts You
Hardware devices interrupt the CPU to signal events.
Examples:
- Network packets arriving
- Timers expiring
Interrupts:
- Pause your running code
- Execute kernel handlers
- Evict cache lines
HFT systems:
- Control interrupt affinity
- Isolate network interrupts
- Reduce timer interrupts
Less interruption = more determinism.
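Linux reports interrupt delivery per IRQ per core in `/proc/interrupts` — the same file you inspect when checking that network IRQs really land on the cores you assigned them to. A reading sketch (Linux; the "LOC" local-timer row is x86-specific, so it is matched optionally):

```python
# /proc/interrupts: header row of CPU columns, then one row per IRQ.
with open("/proc/interrupts") as f:
    lines = f.read().splitlines()

cpu_columns = lines[0].split()          # e.g. ['CPU0', 'CPU1', ...]
print("CPUs:", len(cpu_columns), "IRQ rows:", len(lines) - 1)

# Example: per-core local timer interrupt counts (x86 "LOC" row).
for line in lines[1:]:
    fields = line.split()
    if fields and fields[0] == "LOC:":
        counts = [int(x) for x in fields[1 : 1 + len(cpu_columns)]]
        print("timer interrupts per CPU:", counts)
        break
```

Steering is done by writing a core mask to `/proc/irq/<N>/smp_affinity` (root required), keeping device interrupts off the trading cores.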
9. Why Busy Polling Exists
Most applications wait (block) for I/O.
Blocking:
- Causes context switches
- Introduces scheduler latency
HFT systems often busy poll:
- Continuously check for data
- Avoid sleeping
This wastes CPU but guarantees:
- Immediate reaction
- Stable latency
In HFT, CPU cycles are cheaper than uncertainty.
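The busy-poll pattern looks like this in miniature (a Python sketch using a pipe as a stand-in for a network feed; a real system would spin on a NIC ring buffer in C++):

```python
import os
import threading
import time

r, w = os.pipe()
os.set_blocking(r, False)   # reads now fail fast instead of sleeping

def producer():
    time.sleep(0.01)        # data arrives "later", as on a market feed
    os.write(w, b"tick")

t = threading.Thread(target=producer)
t.start()

# Busy poll: spin on the descriptor instead of blocking in the kernel.
spins = 0
data = None
while data is None:
    try:
        data = os.read(r, 4)
    except BlockingIOError:
        spins += 1          # nothing yet; burn a cycle and retry

t.join()
print(f"got {data!r} after {spins} spins")
```

The spin count is wasted work by normal standards — but the reacting thread was never descheduled, so reaction latency stays flat.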
10. Linux Is Tuned, Not Taken as Default
Out of the box, Linux is not optimized for low latency.
HFT setups often:
- Disable power-saving states
- Use real-time scheduling policies
- Tune kernel parameters
The OS becomes part of the trading system.
11. Beginner Mental Model
Think of the OS as:
A traffic controller deciding when and where your code runs
If you don’t control it, it will optimize for someone else’s goals.
12. What Comes Next?
Now that we understand how the OS schedules execution, we move to the next bottleneck:
- How packets actually arrive
- Kernel networking stack costs
- Why kernel bypass exists
➡ Article 4: Low-Latency Networking Fundamentals
