Building Axiom
Engineering a trading engine that treats execution as the product, not the strategy.
Most trading projects I've seen are built the wrong way around. Someone has an idea, they write a backtest, the backtest looks good, and then they spend the next three months duct-taping a live execution loop onto the bottom of it. The strategy is the favorite child. The execution is the afterthought.
Axiom is built the other way around.
The strategy is a plugin. The engine is the product. If the strategy has a bad day, we lose a little money. If the engine has a bad day, we lose a lot more than that, because an engine that silently desyncs from the broker, double-fires an order, or forgets a position during a reconnect can vaporize capital faster than any bad signal ever could.
So I built the boring parts first. This post walks through the structure.
The Core Loop
Axiom runs as a single-threaded async event loop on an NVIDIA Jetson AGX Orin under Docker. One process. One loop. No multi-process IPC, no queue servers, no orchestrators. If something goes wrong, there is exactly one place to look.
At the top of the loop is a WebSocket feed from the broker. Bars flow in, state updates, signals evaluate, orders route out. That's it.
┌──────────────────────────────────────────────────────────────┐
│                      EXTERNAL SERVICES                       │
│                                                              │
│  Broker WS (market data)         Broker Trading Stream       │
│  Broker REST (orders)                                        │
└────────┬──────────────────────────────┬──────────────────────┘
         │ bars, trades                 │ fills, cancels, rejects
         ▼                              ▼
┌──────────────────────────────────────────────────────────────┐
│                   EXECUTION ENGINE (async)                   │
│                                                              │
│  WebSocketHandler ──► _on_bar_received()                     │
│                           │                                  │
│                           ▼                                  │
│  IndicatorEngine ─► SessionManager ─► StrategyRegistry       │
│                           │                                  │
│                           ▼                                  │
│  PositionSizer ──► OrderExecutor ──► OSM                     │
│                           │                                  │
│                           ▼                                  │
│                     StateManager                             │
└───────────────────────────┬──────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│                      PERSISTENCE LAYER                       │
│                                                              │
│  Parquet bars      SQLite checkpoint      Snapshots (EOD)    │
└──────────────────────────────────────────────────────────────┘
Everything hangs off of _on_bar_received(). That function is the heartbeat. A bar comes in, the engine reacts, the engine persists, the engine waits. That model is boring on purpose. Boring is auditable.
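To make the shape of that loop concrete, here is a minimal sketch, not Axiom's actual code. Only the name `_on_bar_received()` comes from the design above. The `Bar` fields, the `MiniEngine` class, and the in-memory `fake_feed` are hypothetical stand-ins for the real WebSocket handler and the Parquet store.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass(frozen=True)
class Bar:
    # Hypothetical bar shape; a real feed likely carries more fields.
    symbol: str
    ts: int        # epoch seconds
    close: float
    volume: int


class MiniEngine:
    def __init__(self) -> None:
        self.stored: List[Bar] = []  # stand-in for the append-only bar store
        self.bars_seen = 0

    async def _on_bar_received(self, bar: Bar) -> None:
        # Persist first, then react. Everything else hangs off this method.
        self.stored.append(bar)
        self.bars_seen += 1

    async def run(self, feed: AsyncIterator[Bar]) -> None:
        # One task, one loop: bars are handled strictly in arrival order.
        async for bar in feed:
            await self._on_bar_received(bar)


async def fake_feed(bars: List[Bar]) -> AsyncIterator[Bar]:
    # Stand-in for the broker WebSocket; yields control between bars.
    for bar in bars:
        await asyncio.sleep(0)
        yield bar
```

Running it is a single call such as `asyncio.run(MiniEngine().run(fake_feed(bars)))`. Because each bar is awaited to completion before the next one is read, no two bars are ever processed concurrently.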
What Happens When a Bar Arrives
Every bar runs through the same deterministic pipeline. No branching based on strategy. No conditional shortcuts. Every decision the engine will ever make is a function of the state it has right now plus the bar it just received.
Bar arrives
    │
    ▼
[ Store to Parquet (raw truth, append only) ]
    │
    ▼
[ IndicatorEngine.on_bar() ]
    │  update VWAP, slope, range, event gates
    ▼
┌────────────────────────┐
│  Position for symbol?  │
└──────────┬─────────────┘
           │yes ─► update unrealized PnL ─► TP/SL check ─► maybe force exit
           │no  ─► continue
           ▼
[ Session phase? ]
    │
    ├─ WARMUP / CLOSED ─► skip
    ├─ EOD_LIQUIDATION ─► close all positions, done
    └─ ACTIVE          ─► continue
                             │
                             ▼
[ Symbol warmup complete? (≥ 50 bars of state) ]
    │
    ▼
[ StrategyRegistry.run_strategy() for each enabled strategy ]
    │
    ▼
[ Signal? ] ─► size it ─► execute ─► escalate stale orders ─► checkpoint
    │
    ▼
Wait for next bar
A few things worth calling out here.
The raw bar gets stored before anything else touches it. Parquet, append-only, one file per day. If the engine crashes, if the indicators get a bug, if the strategy needs to be rewritten, the underlying truth is already on disk. You can replay the day from scratch without ever having to trust the in-memory state that produced the live decisions. That separation has paid for itself more times than I want to admit.
Warmup is enforced at the symbol level, not the engine level. If a symbol just entered the universe and we don't have 50 bars of indicator state for it, the engine will not open a position on it. Period. No "we'll get close enough" logic. Either we have the context or we stay flat.
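A per-symbol gate like this can be very small. The sketch below is illustrative only: `WarmupGate` and its method names are hypothetical, and the only number taken from the text is the 50-bar threshold.

```python
from collections import defaultdict

WARMUP_BARS = 50  # threshold from the text


class WarmupGate:
    """Counts bars per symbol and refuses entries until enough have arrived."""

    def __init__(self, required: int = WARMUP_BARS) -> None:
        self.required = required
        self._counts = defaultdict(int)

    def on_bar(self, symbol: str) -> None:
        self._counts[symbol] += 1

    def is_ready(self, symbol: str) -> bool:
        # Hard gate: 49 bars is not "close enough".
        return self._counts[symbol] >= self.required
```

The count is keyed by symbol, so a name that joins the universe mid-session starts from zero even if every other symbol is already warm.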
Checkpointing is cheap and constant. Every 100 bars the engine writes its live state to a local SQLite file. That file is what allows the engine to be killed mid-session and restarted without losing track of what it owns.
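As a rough sketch of that cadence, the snippet below writes a JSON blob to a single-row SQLite table every 100 bars. The table layout, the function names, and the choice of JSON are assumptions made for illustration; the text only says that live state goes to a local SQLite file.

```python
import json
import sqlite3
from typing import Optional

CHECKPOINT_EVERY = 100  # bars, per the text


def open_checkpoint_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Single-row table: the CHECK constraint keeps exactly one checkpoint.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), bar_count INTEGER, state TEXT)"
    )
    return conn


def maybe_checkpoint(conn: sqlite3.Connection, bar_count: int, state: dict) -> bool:
    """Write state on every CHECKPOINT_EVERY-th bar; return True if written."""
    if bar_count == 0 or bar_count % CHECKPOINT_EVERY != 0:
        return False
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO checkpoint (id, bar_count, state) VALUES (1, ?, ?)",
            (bar_count, json.dumps(state)),
        )
    return True


def load_checkpoint(conn: sqlite3.Connection) -> Optional[dict]:
    row = conn.execute("SELECT state FROM checkpoint WHERE id = 1").fetchone()
    return json.loads(row[0]) if row else None
```

Writing inside a transaction means a crash mid-write leaves the previous checkpoint intact rather than a half-written one.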
Strategies Don't Decide Anything
This is the design principle that shaped the rest of the system.
A strategy in Axiom is a function that receives bars and indicator state and either returns a Signal or returns None. That is the entire contract. Strategies do not know about position sizing. They do not know about capital. They do not know about bracket orders, session state, escalation policy, kill-switches, or reconciliation. They cannot open a position. They can only suggest one.
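In code, that contract could look something like the sketch below. The `Signal` fields, the `Strategy` type alias, and the toy `vwap_reclaim` rule are all hypothetical; the point is only the shape: inputs in, `Signal` or `None` out, and no access to capital, sizing, or orders.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence


@dataclass(frozen=True)
class Signal:
    symbol: str
    side: str    # "buy" or "sell"
    reason: str  # free-text tag for the audit log


# (symbol, recent closes, indicator name -> value) -> Signal or None
Strategy = Callable[[str, Sequence[float], Mapping[str, float]], Optional[Signal]]


def vwap_reclaim(
    symbol: str, closes: Sequence[float], indicators: Mapping[str, float]
) -> Optional[Signal]:
    """Toy rule: suggest a buy when price crosses back above VWAP."""
    if len(closes) < 2 or "vwap" not in indicators:
        return None
    vwap = indicators["vwap"]
    if closes[-2] < vwap <= closes[-1]:
        return Signal(symbol=symbol, side="buy", reason="vwap_reclaim")
    return None
```

Notice what the function cannot do: there is no quantity, no price, and no broker handle anywhere in its inputs or outputs.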
Every guardrail lives in the engine:
- Position sizing is centralized in the PositionSizer, driven by config, not strategy.
- Max position percentage, floor-to-whole-shares, and equity sanity checks are centralized.
- Session phase gates decide whether a signal is even eligible to be executed.
- The kill-switch (daily and weekly loss limits) blocks entries regardless of what any strategy says.
- Per-symbol cooldowns after a stop loss prevent re-entry thrash.
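Two of those guardrails, sizing and the kill-switch, are easy to sketch. Everything named here (`RiskConfig`, `size_position`, `entries_blocked`, and the default limits) is a hypothetical illustration, not Axiom's real configuration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskConfig:
    max_position_pct: float = 0.10     # hypothetical: 10% of equity per position
    daily_loss_limit: float = 500.0    # hypothetical dollar limits
    weekly_loss_limit: float = 1500.0


def size_position(equity: float, price: float, cfg: RiskConfig) -> int:
    """Return whole shares to trade, or 0 if the inputs fail sanity checks."""
    if not (math.isfinite(equity) and math.isfinite(price)):
        return 0
    if equity <= 0 or price <= 0:
        return 0
    budget = equity * cfg.max_position_pct
    return math.floor(budget / price)  # floor to whole shares


def entries_blocked(daily_pnl: float, weekly_pnl: float, cfg: RiskConfig) -> bool:
    """Kill-switch: True once either realized loss limit has been reached."""
    return daily_pnl <= -cfg.daily_loss_limit or weekly_pnl <= -cfg.weekly_loss_limit
```

Neither function takes a strategy or a signal as input, which is the whole point: the same limits apply no matter which strategy produced the idea.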
Order Lifecycle and Escalation
The part of the engine I spent the most time on is the order state machine. Brokers do not always do what you ask them to do on the first try. Limit orders sit unfilled. Fills come in partials. Rejects happen. Sometimes a fill acknowledgement arrives after you already sent a cancel. The engine has to be right in all of those cases.
Orders live in an explicit state machine with an audit trail for every transition:
             create_order()
                    │
                    ▼
               ┌─────────┐
               │ PENDING │
               └────┬────┘
                    │ submit
                    ▼
              ┌───────────┐
              │ SUBMITTED │──► REJECTED ──► [end]
              └─────┬─────┘
                    │ broker ACK
                    ▼
              ┌──────────┐
              │ ACCEPTED │
              └─────┬────┘
     ┌─────────┬──────────┬─────────────┐
     ▼         ▼          ▼             ▼
┌────────┐ ┌──────────┐ ┌────────────┐ ┌─────────┐
│ FILLED │ │PARTIAL_F │ │ REPRICING  │ │CANCELED │
└────────┘ └─────┬────┘ └─────┬──────┘ └─────────┘
                 │            │
                 ▼            ▼
              FILLED     ESCALATING ─► SUBMITTED (as market)
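The diagram above can be encoded as a transition table plus an audit log. This is a sketch under assumptions: the state names follow the diagram (with `PARTIAL_F` spelled out as `PARTIALLY_FILLED`), only the edges drawn in the diagram are allowed, and the `Order` class and `InvalidTransition` exception are hypothetical. A real engine would likely need extra edges, for example to absorb a fill that arrives after a cancel was sent.

```python
import time
from enum import Enum
from typing import Dict, List, Set, Tuple


class OrderState(Enum):
    PENDING = "PENDING"
    SUBMITTED = "SUBMITTED"
    ACCEPTED = "ACCEPTED"
    PARTIALLY_FILLED = "PARTIALLY_FILLED"
    REPRICING = "REPRICING"
    ESCALATING = "ESCALATING"
    FILLED = "FILLED"
    CANCELED = "CANCELED"
    REJECTED = "REJECTED"


S = OrderState
ALLOWED: Dict[OrderState, Set[OrderState]] = {
    S.PENDING: {S.SUBMITTED},
    S.SUBMITTED: {S.ACCEPTED, S.REJECTED},
    S.ACCEPTED: {S.FILLED, S.PARTIALLY_FILLED, S.REPRICING, S.CANCELED},
    S.PARTIALLY_FILLED: {S.FILLED},
    S.REPRICING: {S.ESCALATING},
    S.ESCALATING: {S.SUBMITTED},
    # FILLED, CANCELED, and REJECTED are terminal: no outgoing edges.
}


class InvalidTransition(Exception):
    pass


class Order:
    def __init__(self, order_id: str) -> None:
        self.order_id = order_id
        self.state = OrderState.PENDING
        # (timestamp, from_state, to_state, reason) for every transition
        self.audit: List[Tuple[float, OrderState, OrderState, str]] = []

    def transition(self, new_state: OrderState, reason: str) -> None:
        if new_state not in ALLOWED.get(self.state, set()):
            raise InvalidTransition(f"{self.state.name} -> {new_state.name}")
        self.audit.append((time.time(), self.state, new_state, reason))
        self.state = new_state
```

An illegal move raises instead of being silently applied, so a bug in the caller shows up as an exception rather than as an order in an impossible state.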
The escalation policy is simple. An entry order starts as a limit at a calculated fair price. If the order has not been accepted and filled within two seconds, the engine cancels it and reprices halfway between the original limit and the last trade. If that second attempt also times out, the engine escalates to a market order. If the third attempt fails too, the engine gives up on that signal rather than chase it forever.
┌───────────┐   2s timeout   ┌───────────┐   2s timeout   ┌──────────┐
│   LIMIT   │ ─────────────► │  REPRICE  │ ─────────────► │  MARKET  │
│ (at VWAP) │                │ (halfway) │                │  (fill)  │
└───────────┘                └───────────┘                └──────────┘
This matters because it removes a tradeoff that people usually hand-wave. "Do we use limit orders or market orders?" becomes "we start disciplined, we escalate if we need to, and we give up if the market is telling us no." The policy is encoded in the engine, not re-litigated by every strategy.
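The ladder itself reduces to a small pure function. In this sketch, `plan_attempt` and `OrderPlan` are hypothetical names, and "halfway" is read as the midpoint between the original fair price and the last trade. The two-second timeout and the three-attempt cap come from the text.

```python
from dataclasses import dataclass
from typing import Optional

ATTEMPT_TIMEOUT_S = 2.0  # per-attempt timeout from the text
MAX_ATTEMPTS = 3


@dataclass(frozen=True)
class OrderPlan:
    order_type: str               # "limit" or "market"
    limit_price: Optional[float]  # None for market orders


def plan_attempt(attempt: int, fair_price: float, last_trade: float) -> Optional[OrderPlan]:
    """Return the order to send for a 1-based attempt number, or None to give up."""
    if attempt == 1:
        return OrderPlan("limit", fair_price)
    if attempt == 2:
        # Reprice halfway between the original limit and the last trade.
        return OrderPlan("limit", (fair_price + last_trade) / 2.0)
    if attempt == MAX_ATTEMPTS:
        return OrderPlan("market", None)
    return None  # past the last rung: drop the signal
```

For example, with a fair price of 100.00 and a last trade of 100.50, the three rungs are a limit at 100.00, a limit at 100.25, and then a market order. The caller would wait up to `ATTEMPT_TIMEOUT_S` on each rung before cancelling and asking for the next one.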
Belt-and-Suspenders Risk Management
This one took me a while to get right. The first version relied entirely on the broker's bracket orders for take-profit and stop-loss. That worked until the first time a bracket leg didn't register and I had a position sitting naked. Now Axiom runs two independent layers.
Layer 1 is the broker. When an entry fills, the engine submits a bracket order to the broker with the TP and SL legs attached. The broker holds those legs. If the price hits a level, the broker fills the leg and the engine learns about it through the trading stream.
Layer 2 is software. On every incoming bar, the engine independently recomputes unrealized PnL against the entry and checks it against the configured TP and SL thresholds. If the software layer decides the position should close and the bracket legs are still active at the broker, the engine defers. If the software layer decides the position should close and there are no active bracket legs, the engine submits a market exit itself.
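The defer-or-exit decision in the software layer can be sketched as one function. The names (`ExitAction`, `software_exit_check`) are hypothetical, and the sketch assumes TP and SL are configured as dollar PnL thresholds on the whole position; the real thresholds could just as well be percentages or price levels.

```python
from enum import Enum


class ExitAction(Enum):
    HOLD = "hold"
    DEFER_TO_BROKER = "defer_to_broker"
    MARKET_EXIT = "market_exit"


def software_exit_check(
    entry_price: float,
    last_price: float,
    qty: int,                   # signed: positive for long, negative for short
    take_profit: float,         # dollars of gain that triggers an exit
    stop_loss: float,           # dollars of loss that triggers an exit
    bracket_legs_active: bool,  # are the broker-side TP/SL legs still live?
) -> ExitAction:
    unrealized = (last_price - entry_price) * qty
    threshold_hit = unrealized >= take_profit or unrealized <= -stop_loss
    if not threshold_hit:
        return ExitAction.HOLD
    # The broker's legs get first refusal; the engine only acts if they are gone.
    if bracket_legs_active:
        return ExitAction.DEFER_TO_BROKER
    return ExitAction.MARKET_EXIT
```

Deferring while the legs are live avoids the engine and the broker both closing the same position, which would otherwise flip it to the opposite side.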
On top of that, there's a reconciliation loop that runs every 20 bars. It compares local position state against broker position state. If the broker says we're flat but the engine thinks we have a position, the engine cleans up its own state. If the broker says we have a position the engine doesn't know about, that's logged loudly.
Layer 1 (broker):   TP leg ─────► fill ──┐
                    SL leg ─────► fill ──┤
                                         ├──► Trading Stream event
Layer 2 (engine):   bar tick ─► check ───┤          │
                                         │          ▼
                                         │   close local position
                                         │   record realized PnL
                                         │   check kill-switch
                                         │
Reconciler (20b):   broker state ◄──────►│
                    diff, clean up
The phrase I use for this internally is "three ways to notice the same exit." If the trading stream misses a fill event, the reconciler will catch it. If the reconciler is slow, the software layer will catch it. If the software layer has a bug, the bracket leg still fills at the broker. Any one layer failing is not a capital event.
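The reconciler's diff is simple enough to sketch as a pure function. Positions are modeled here as a plain `symbol -> signed share count` mapping, and `reconcile` is a hypothetical name. The two rules are the ones stated above: drop local positions the broker says are flat, and raise an alert for broker positions the engine does not know about. Quantity mismatches on a symbol both sides hold are not covered by the text and are left alone in this sketch.

```python
from typing import Dict, List, Tuple


def reconcile(
    local: Dict[str, int], broker: Dict[str, int]
) -> Tuple[Dict[str, int], List[str]]:
    """Diff local vs broker positions; return (cleaned local state, alerts)."""
    cleaned = dict(local)
    alerts: List[str] = []

    for symbol, qty in local.items():
        if qty != 0 and broker.get(symbol, 0) == 0:
            # Broker is flat: an exit happened and the event was missed.
            del cleaned[symbol]

    for symbol, qty in broker.items():
        if qty != 0 and local.get(symbol, 0) == 0:
            # Never adopt an unknown position silently; surface it.
            alerts.append(f"UNKNOWN POSITION at broker: {symbol} qty={qty}")

    return cleaned, alerts
```

Returning a new dict instead of mutating `local` in place keeps the function easy to test and lets the caller log the before and after states side by side.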
Session Awareness
Markets are not open twenty-four hours, and the hours they are open are not homogeneous. The first fifteen minutes are a different distribution than the last thirty. The engine treats session phase as a first-class input.
04:00          09:30 09:45                  15:30  16:00       20:00
│──PRE_MARKET──│WARM │────── ACTIVE ────────│ WARN │──AFTER_HOURS──│
               │     │                      │      │
               │     │  positions can open  │      │  force close
               │     │  strategies run      │      │  all positions
               │     │                      │      │
               │     │                   no new    │
               │ indicator warmup        entries   │
               │ (50 bars)
The state transitions are strict:
- WARMUP: the open just happened, no new positions. Indicators are filling up their buffers.
- ACTIVE: normal operation.
- EOD_WARNING: thirty minutes before close. No new entries. Existing positions can still hit TP or SL.
- EOD_LIQUIDATION: one minute before close. The engine force-closes everything regardless of PnL. There are no overnight holds.
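Mapping a clock time to a phase might look like the sketch below. The boundaries 04:00, 09:30, 09:45, 15:30, 16:00, and 20:00 are read off the timeline above, and the 15:59 start of EOD_LIQUIDATION assumes a 16:00 close. The function names and the use of naive exchange-local times are assumptions for illustration; holidays and half days are ignored.

```python
import datetime as dt
from enum import Enum


class SessionPhase(Enum):
    CLOSED = "CLOSED"
    PRE_MARKET = "PRE_MARKET"
    WARMUP = "WARMUP"
    ACTIVE = "ACTIVE"
    EOD_WARNING = "EOD_WARNING"
    EOD_LIQUIDATION = "EOD_LIQUIDATION"
    AFTER_HOURS = "AFTER_HOURS"


def session_phase(now: dt.time) -> SessionPhase:
    """Classify an exchange-local wall-clock time into a session phase."""
    if now < dt.time(4, 0) or now >= dt.time(20, 0):
        return SessionPhase.CLOSED
    if now < dt.time(9, 30):
        return SessionPhase.PRE_MARKET
    if now < dt.time(9, 45):
        return SessionPhase.WARMUP
    if now < dt.time(15, 30):
        return SessionPhase.ACTIVE
    if now < dt.time(15, 59):
        return SessionPhase.EOD_WARNING
    if now < dt.time(16, 0):
        return SessionPhase.EOD_LIQUIDATION
    return SessionPhase.AFTER_HOURS


def can_open_new_position(phase: SessionPhase) -> bool:
    # Only ACTIVE allows entries; every other phase is flat-or-exit-only.
    return phase is SessionPhase.ACTIVE
```

Each boundary is half-open, so 15:30:00 exactly is already EOD_WARNING and an entry signal on that bar is not eligible.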
Backtest and Live Are the Same Code Path
This is the property I refused to compromise on. The _on_bar_received() function does not know, and does not need to know, whether the bar came from a live WebSocket or from a parquet file being replayed at a thousand bars per second.
Live:      Broker WebSocket ─────► Bar ─┐
                                        │
                                        ├──► _on_bar_received()
                                        │            │
Backtest:  Parquet ─► sorted stream ────┘            ▼
                                           same everything below
That means anything I can prove about the engine on historical data also holds for the live path. Indicators behave the same. Session logic behaves the same. Order escalation, risk layers, reconciliation logic, all of it is exercised by backtests. I can't fix a bug in backtest and accidentally leave it unfixed in live, because there is only one copy of the code to fix.
The only things that get mocked in backtest mode are the broker-facing calls, and those are gated behind a single interface that the engine talks to. Everything above that interface is the real engine.
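One way to picture that single interface is a `Protocol` that both a live adapter and a simulated broker satisfy. The sketch below is hypothetical throughout: the `Broker` protocol, its one-method surface, `SimBroker`, and the toy entry rule are illustrations, and `_on_bar_received()` is written synchronously here only to keep the example short.

```python
from typing import Iterable, List, Protocol, Tuple


class Broker(Protocol):
    """The only surface the engine uses to reach the outside world."""

    def submit_order(self, symbol: str, qty: int, side: str) -> str: ...


class SimBroker:
    """Backtest stand-in: records orders instead of sending them."""

    def __init__(self) -> None:
        self.orders: List[Tuple[str, int, str]] = []

    def submit_order(self, symbol: str, qty: int, side: str) -> str:
        self.orders.append((symbol, qty, side))
        return f"sim-{len(self.orders)}"


class Engine:
    def __init__(self, broker: Broker) -> None:
        self.broker = broker

    def _on_bar_received(self, symbol: str, close: float) -> None:
        # Toy rule purely for illustration: buy one share above 100.
        if close > 100.0:
            self.broker.submit_order(symbol, 1, "buy")


def replay(engine: Engine, bars: Iterable[Tuple[str, float]]) -> None:
    # Backtest driver: feeds stored bars through the same handler live uses.
    for symbol, close in bars:
        engine._on_bar_received(symbol, close)
```

Swapping `SimBroker` for a live adapter changes nothing inside `Engine`, which is what lets a backtest exercise the same handler the live loop runs.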
State, Crashes, and the Hard Part
The honest part of this post is that most of the time I spend on Axiom is not on strategy or on indicators. It's on making sure the engine cannot drift into a state it doesn't understand.
Three things keep that in check.
Checkpointing. Every 100 bars, full state to SQLite. Positions, bar states, trade counts, cooldowns, kill-switch status. That file is the engine's memory of itself.
Reconciliation on startup. The first thing the engine does when it starts is fetch the broker's positions, orders, and account equity, and compare them to what the checkpoint said was true. The broker is the source of truth. If the checkpoint is wrong, the engine updates itself to match reality before it does anything else.
No guessing. This is the principle I care about the most. If the engine sees something it doesn't understand, it does not make assumptions. It logs loudly, refuses the action, and, in the worst case, halts. A stopped engine that didn't make a trade is always recoverable. A running engine making trades on bad state is a disaster.
Why It's Built This Way
The alternative to building this carefully is building it quickly. I've done that, and the result is a codebase you stop trusting after two weeks of live operation. You start adding manual checks before every run. You start second-guessing every fill. You start restarting the engine "just in case."
Axiom is built so that I don't have to trust it based on gut feeling. I trust it because the invariants are named, enforced in one place, and covered by tests that hit the same code the live engine uses. If it does something I don't expect, there is a log line, a state transition, and a checkpoint that explain why.
The strategy will keep changing. The gates will get tuned. Tiers will be added and pruned. But the execution engine underneath is the part that has to hold still, because everything else is an experiment running on top of it.
TL;DR: The engine is the product. The strategy is a plugin. Build the boring parts first, make them auditable, and never let the thing that touches your money make decisions on state it isn't sure about.