
Building an ML trading system that works in a backtest is easy. Building one that survives real markets — with slippage, latency, regime changes, and correlated drawdowns — is an entirely different discipline. TradeNova is our answer: a 5-agent ensemble running on AWS EKS that combines reinforcement learning, gradient boosting, and Bayesian self-optimization into a production-grade trading platform at $280/month in infrastructure cost.

Why Multi-Agent?

Single-strategy systems are fragile. A trend follower prints money in directional markets and hemorrhages in chop. A mean-reversion system thrives in ranges and capitulates during breakouts. The academic literature on portfolio diversification applies equally to strategy diversification — uncorrelated return streams reduce drawdowns more reliably than any single alpha source.

TradeNova operationalizes this principle with five specialized agents, each an independent ML model with its own feature pipeline, training loop, and risk budget.
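As a rough illustration of what "independent agents behind a common contract" could look like, here is a minimal sketch. The names `TradeProposal`, `Agent`, `ThresholdAgent`, and `collect` are hypothetical and are not taken from TradeNova's codebase; the toy agent stands in for a real model and the `SPY` symbol is only a placeholder.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class TradeProposal:
    """One agent's suggested trade (all field names are illustrative)."""
    agent: str
    symbol: str
    direction: int           # +1 for long, -1 for short
    signal_strength: float   # 0 to 100
    confidence: float        # 0 to 50


class Agent(Protocol):
    """Structural interface each specialized agent is assumed to satisfy."""
    name: str
    risk_budget: float  # fraction of the portfolio this agent may put at risk

    def propose(self, features: Sequence[float]) -> list[TradeProposal]: ...


class ThresholdAgent:
    """Toy stand-in for a real model: goes long when the first feature is positive."""

    def __init__(self, name: str, risk_budget: float) -> None:
        self.name = name
        self.risk_budget = risk_budget

    def propose(self, features: Sequence[float]) -> list[TradeProposal]:
        if not features or features[0] <= 0:
            return []
        strength = min(100.0, 100.0 * features[0])
        # "SPY" is a placeholder symbol, not a claim about what the system trades.
        return [TradeProposal(self.name, "SPY", +1, strength, 25.0)]


def collect(agents: Sequence[Agent], features: Sequence[float]) -> list[TradeProposal]:
    """Gather proposals from every agent; no agent sees another's output."""
    return [p for agent in agents for p in agent.propose(features)]
```

The point of the shared interface is that a downstream aggregation layer only needs `propose`, so each agent can swap its model or feature pipeline without touching the others.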

The 5 Agent Types

Trend Agent: A reinforcement learning agent built on stable-baselines3 (PPO) that ingests multi-timeframe price action, ADX, and Supertrend indicators. It learns to enter trending instruments early and trail stops dynamically.
MeanReversion Agent: A LightGBM classifier trained on Bollinger Band z-scores, RSI divergences, and order-flow imbalance features. It identifies overextended moves and trades the snap-back.
Volatility Agent: Specializes in VIX-regime transitions using a hidden Markov model for state detection paired with a PyTorch policy network for position sizing.
EMA Agent: A fast-reacting agent that trades exponential moving average crossovers with adaptive lookback periods tuned by Bayesian optimization. Designed for high-frequency signals in liquid instruments.
Options Agent: Prices and executes options spreads based on implied-vs-realized volatility divergence, using a PyTorch model trained on historical options chains and Greeks surfaces.
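To make one of these agents concrete, the crossover idea behind the EMA Agent can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the `fast` and `slow` spans are fixed here, whereas the text describes lookbacks tuned by Bayesian optimization, and the function names are hypothetical.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    The first output equals the first input (a common seeding choice).
    """
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def crossover_signals(prices, fast=3, slow=8):
    """Return one signal per bar: +1 when the fast EMA crosses above the
    slow EMA, -1 when it crosses below, and 0 otherwise."""
    f, s = ema(prices, fast), ema(prices, slow)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        prev = f[i - 1] - s[i - 1]
        cur = f[i] - s[i]
        if prev < 0 <= cur:
            signals[i] = 1
        elif prev > 0 >= cur:
            signals[i] = -1
    return signals


if __name__ == "__main__":
    prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 11]
    print(crossover_signals(prices))
```

On the falling-then-rising series in the usage example, the only signal is a single +1 at the bar where the fast average overtakes the slow one. A production agent would add a warm-up period and trading costs, both of which are omitted here.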

Master Picks: Unified Scoring

Each agent produces independent trade proposals. The Master Picks layer aggregates them into a unified scoring system on a 0–350+ point scale. Points are allocated across five dimensions: signal strength (0–100), agent confidence (0–50), cross-agent agreement (0–75), regime alignment (0–75), and risk-budget availability (0–50+). The nominal maximums sum to 350; the open-ended risk-budget component is what allows totals above that.

Proposals scoring below a dynamic threshold, calibrated nightly using the previous 30 days' hit rate, are filtered out. When multiple agents converge on the same instrument and direction, the agreement bonus raises the score, creating a natural wisdom-of-crowds effect.
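The scoring arithmetic can be sketched as follows. The point ranges come from the description above; everything else is an assumption made for illustration: the dictionary keys, the linear shape of the agreement bonus, and the treatment of the threshold as a plain input (the nightly calibration rule is not specified in the text).

```python
# Caps for the four bounded dimensions, taken from the ranges in the text.
CAPS = {
    "signal_strength": 100.0,
    "agent_confidence": 50.0,
    "cross_agent_agreement": 75.0,
    "regime_alignment": 75.0,
}


def agreement_bonus(n_agreeing, n_agents=5, max_points=75.0):
    """Assumed linear bonus: 0 for a lone agent, max_points when all agree."""
    if n_agents < 2 or n_agreeing < 1:
        return 0.0
    return max_points * (min(n_agreeing, n_agents) - 1) / (n_agents - 1)


def master_score(components):
    """Sum the dimension scores, clamping each bounded one to its range."""
    total = 0.0
    for name, cap in CAPS.items():
        total += min(max(components.get(name, 0.0), 0.0), cap)
    # Risk-budget availability is described as 0-50+, so it is floored at 0
    # but deliberately left without an upper cap in this sketch.
    total += max(components.get("risk_budget", 0.0), 0.0)
    return total


def filter_proposals(scored, threshold):
    """Keep (proposal, score) pairs that meet the current threshold."""
    return [(p, s) for p, s in scored if s >= threshold]


if __name__ == "__main__":
    score = master_score({
        "signal_strength": 80,
        "agent_confidence": 40,
        "cross_agent_agreement": agreement_bonus(3),
        "regime_alignment": 60,
        "risk_budget": 20,
    })
    print(score)
```

With three of five agents agreeing, the assumed bonus is 37.5 points and the example proposal totals 237.5.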

The 7-Layer Market Weather System

Before any trade is evaluated, the system computes a holistic market context through seven analytical layers:

Macro regime: Bull, bear, or transition based on broad index trends and yield curve signals
Volatility regime: Low, normal, elevated, or crisis using VIX term structure
Sector rotation: Relative strength across 11 GICS sectors with momentum scoring
Correlation regime: Dispersion vs. correlation clustering across the S&P 500
Liquidity: Bid-ask spread trends, volume profiles, and market depth metrics
Sentiment: Put/call ratios, AAII surveys, and social-media NLP scores
Calendar effects: FOMC dates, earnings seasons, options expiration cycles, and seasonality patterns

Each agent receives the current weather vector as input features, allowing it to condition its signals on regime context without hardcoded rules.
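One plausible way to turn the seven layers into a feature vector is sketched below. The regime labels for the macro and volatility layers come from the list above; the field names, the one-hot encoding, and the reduction of each remaining layer to a single number (for example, one scalar for all 11 sectors) are simplifying assumptions, not the system's actual schema.

```python
from dataclasses import dataclass

MACRO_REGIMES = ("bull", "bear", "transition")
VOL_REGIMES = ("low", "normal", "elevated", "crisis")


def one_hot(value, categories):
    """Encode a categorical label as a list of 0.0/1.0 indicators."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


@dataclass(frozen=True)
class MarketWeather:
    """Seven-layer market context; numeric fields are illustrative scalars."""
    macro: str
    volatility: str
    sector_momentum: float
    correlation: float
    liquidity: float
    sentiment: float
    calendar_risk: float

    def to_vector(self):
        """Flatten to 12 floats: 3 macro + 4 volatility + 5 numeric layers."""
        return (
            one_hot(self.macro, MACRO_REGIMES)
            + one_hot(self.volatility, VOL_REGIMES)
            + [self.sector_momentum, self.correlation, self.liquidity,
               self.sentiment, self.calendar_risk]
        )


def agent_features(own_features, weather):
    """Append the shared weather vector to an agent's own features."""
    return list(own_features) + weather.to_vector()
```

Because the weather enters as ordinary input features, a model can learn regime-dependent behavior from data instead of relying on branches such as "if crisis, halve size".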

The Moonshot Engine

Separately from the core agents, the Moonshot engine scans for asymmetric setups — trades with 5x+ reward-to-risk ratios that conventional scoring would rank modestly. Moonshots receive a small, capped allocation (never exceeding 2% of portfolio) and are evaluated on a separate P&L track to avoid contaminating the main performance metrics.
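The two rules in this paragraph, a minimum reward-to-risk ratio and a hard allocation cap, are easy to express directly. In the sketch below the function names are hypothetical, and the 2% cap is interpreted as a limit on total moonshot exposure, which is one reading of the text; a per-trade cap would be an equally valid interpretation.

```python
def reward_to_risk(entry, stop, target):
    """Ratio of potential gain (entry to target) to potential loss (entry to stop)."""
    risk = abs(entry - stop)
    if risk == 0:
        raise ValueError("stop must differ from entry")
    return abs(target - entry) / risk


def is_moonshot(entry, stop, target, min_rr=5.0):
    """True when the setup offers at least min_rr units of reward per unit of risk."""
    return reward_to_risk(entry, stop, target) >= min_rr


def moonshot_allocation(requested, current_exposure, portfolio_value, cap=0.02):
    """Dollar amount that may be allocated so that total moonshot exposure
    (assumed to be the capped quantity) stays within cap * portfolio_value."""
    room = cap * portfolio_value - current_exposure
    return max(0.0, min(requested, room))


if __name__ == "__main__":
    print(reward_to_risk(100.0, 95.0, 130.0))           # 30 of reward vs 5 of risk
    print(moonshot_allocation(3000.0, 500.0, 100_000.0))
```

In the usage example, a setup risking 5 to make 30 qualifies at 6x, and a 3,000 request is trimmed to the 1,500 of room left under a 2,000 cap.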

Self-Improving ML

TradeNova's most powerful feature is its self-improving feedback loop. Every closed trade is logged with full feature snapshots. Each night, a retraining pipeline runs:

Bayesian weight updating: Agent weights in Master Picks are adjusted based on trailing 30-day Sharpe ratios using Thompson sampling.
Nightly retraining: Each agent's model is retrained on the expanded dataset with walk-forward validation to prevent look-ahead bias.
Regime drift detection: If an agent's out-of-sample performance degrades beyond a threshold, its allocation is automatically reduced until the next successful retraining cycle.
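The weight-update step can be illustrated with a small Thompson-style sketch. This is not the production algorithm, whose details are not given in the text. It assumes each agent's trailing Sharpe ratio has an approximately normal uncertainty with standard error sqrt((1 + S²/2) / n), the usual large-sample approximation for independent returns; it then draws a plausible Sharpe for every agent many times and sets each weight to the share of draws that agent wins. Sharpe ratios here are per period and not annualized.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (mean / sample std); 0.0 if returns are constant."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def thompson_weights(returns_by_agent, draws=2000, seed=None):
    """Weight each agent by how often its sampled Sharpe is the highest.

    returns_by_agent maps an agent name to its trailing per-period returns
    (at least two observations each).
    """
    rng = random.Random(seed)
    posterior = {}
    for agent, rets in returns_by_agent.items():
        s = sharpe(rets)
        # Assumed normal approximation to the uncertainty of the estimate.
        se = ((1.0 + 0.5 * s * s) / len(rets)) ** 0.5
        posterior[agent] = (s, se)

    wins = dict.fromkeys(posterior, 0)
    for _ in range(draws):
        sampled = {a: rng.gauss(mu, se) for a, (mu, se) in posterior.items()}
        wins[max(sampled, key=sampled.get)] += 1
    return {a: w / draws for a, w in wins.items()}


if __name__ == "__main__":
    history = {
        "trend": [0.010, 0.012, 0.008, 0.011, 0.009] * 6,
        "mean_reversion": [0.002, -0.004, 0.003, -0.001, 0.001] * 6,
    }
    print(thompson_weights(history, seed=0))
```

Sampling instead of ranking by point estimate keeps some weight on agents whose recent record is mediocre but uncertain, which is the exploration behavior Thompson sampling is known for.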

5-Tier Profit Cascade

Position management follows a structured exit framework. As a trade moves in its favor, profits are locked in across five tiers, at 1R, 2R, 3R, 5R, and 8R, where R is the initial risk on the trade. Each tier closes a percentage of the position and tightens the trailing stop on the remainder. This ensures that winning trades contribute realized gains while allowing runners to capture tail moves.
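A worked sketch of the cascade for a long position follows. The R levels are the ones listed above; the fraction closed at each tier and the stop-tightening rule (trail the stop to the previous tier's level, starting at breakeven) are illustrative assumptions, since the text does not specify them.

```python
# (R multiple, fraction of the ORIGINAL position closed at that tier).
# The R levels come from the text; the fractions are assumed and leave
# a 10% runner after the final tier.
TIERS = ((1.0, 0.20), (2.0, 0.20), (3.0, 0.20), (5.0, 0.15), (8.0, 0.15))


def cascade_state(entry, initial_stop, best_price):
    """State of a long position once price has traded as high as best_price."""
    risk = entry - initial_stop
    if risk <= 0:
        raise ValueError("initial_stop must be below entry for a long trade")

    reached_r = (best_price - entry) / risk
    hit = [(r, frac) for r, frac in TIERS if reached_r >= r]

    if not hit:
        stop = initial_stop
    else:
        # Assumed rule: after tier k is hit, the stop trails to tier k-1
        # (breakeven after the first tier).
        prev_r = TIERS[len(hit) - 2][0] if len(hit) >= 2 else 0.0
        stop = entry + prev_r * risk

    return {
        "tiers_hit": len(hit),
        "closed_fraction": sum(frac for _, frac in hit),
        # Profit already banked, in units of R on the original position.
        "realized_r": sum(r * frac for r, frac in hit),
        "stop": stop,
    }


if __name__ == "__main__":
    print(cascade_state(entry=100.0, initial_stop=98.0, best_price=106.5))
```

In the usage example R is 2.0, so a high of 106.5 is 3.25R: three tiers have fired, about 60% of the position is closed, roughly 1.2R is banked, and the stop on the remainder sits at 104.0.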

Infrastructure at $280/Month

The entire system runs on AWS EKS with infrastructure defined in Terraform. Cost optimization is aggressive: Spot instances for nightly retraining, reserved-instance pricing for the always-on nodes that host the inference pods, and S3 Intelligent-Tiering for historical data storage.

EKS cluster: 3 t3.medium nodes (on-demand) for inference — $110/mo
Spot training: g4dn.xlarge GPU instances for nightly retraining — $85/mo average
Data & networking: S3, ECR, NAT Gateway, CloudWatch — $55/mo
Managed services: RDS PostgreSQL (db.t3.micro) for trade logs — $30/mo
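The line items above can be checked with a few lines of arithmetic. The dictionary keys are shorthand labels chosen for this sketch; the dollar figures are the ones quoted in the list.

```python
# Monthly figures in USD, copied from the breakdown above.
MONTHLY_COSTS_USD = {
    "eks_inference_nodes": 110,
    "spot_training": 85,
    "data_and_networking": 55,
    "managed_services": 30,
}


def cost_breakdown(costs):
    """Return the total and each item's share of it as a percentage."""
    total = sum(costs.values())
    shares = {name: 100.0 * amount / total for name, amount in costs.items()}
    return total, shares


if __name__ == "__main__":
    total, shares = cost_breakdown(MONTHLY_COSTS_USD)
    print(f"total: ${total}/mo")
    for name, pct in shares.items():
        print(f"{name}: {pct:.1f}%")
```

The four items sum to $280, with the always-on inference nodes accounting for roughly 39% of the bill and Spot training for about 30%.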

At $280/month total, TradeNova demonstrates that production ML systems don't require six-figure cloud bills. Thoughtful architecture — right-sizing instances, exploiting spot pricing, and separating training from inference — makes sophisticated multi-agent ML accessible to teams of any size.

Technology Stack

Language: Python 3.11 with async execution via asyncio
Infrastructure: AWS EKS orchestrated by Terraform
Deep learning: PyTorch for policy networks and options pricing
Reinforcement learning: stable-baselines3 (PPO, A2C)
Gradient boosting: LightGBM for classification and feature ranking

The lesson from TradeNova is architectural: don't build one model that tries to learn everything. Build specialized agents, give them independent training loops, and let a meta-learner discover the optimal blend. The market is a multi-regime system — your trading system should be too.

Multi-Agent ML Systems: Reinforcement Learning Meets Production Trading