# Design Document: Dual-Pipeline Signal Engine

## Overview

The dual-pipeline signal engine is a new service at `services/signal_engine/` that runs as an independent Kubernetes deployment alongside the existing aggregation → recommendation pipeline. It implements a concurrent dual-pipeline architecture in which a heuristic (deterministic scoring) pipeline and a probabilistic (Bayesian inference) pipeline evaluate the same normalized inputs per ticker per evaluation tick, each producing an independent BUY/WATCH/SKIP verdict. A delta analyzer compares the two verdicts, and an output formatter assembles a structured `SignalOutput` contract that is published to the existing `trading_decisions` Redis queue.

The engine introduces several new components: Input Normalizer, Signal Library (Fibonacci, MA Stack, RSI, Cup & Handle, Elliott Wave), Multi-Timeframe Engine, Hard Filter Engine, Exit Engine, Delta Analyzer, and Output Formatter. It reuses existing infrastructure from `services/aggregation/`: `compute_signal_weight`, `compute_bayesian_posterior`, `classify_regime`, `WeightedSignal`, `BayesianPosterior`, and `RegimeClassification`.

The service is toggled via `dual_pipeline_enabled` in the `risk_configs` table (default: false, fail-safe). When disabled, the existing pipeline operates unchanged. When enabled, the signal engine runs alongside the existing pipeline with support for shadow mode (dual-pipeline output persisted but not forwarded to trading).

### Design Rationale

- **Separate service, not inline extension**: The signal engine has a fundamentally different evaluation cadence (multi-timeframe technical signals) and data flow (OHLCV bars, not document intelligence). Embedding it in the aggregation worker would couple two distinct concerns.
- **Reuse existing math**: The Bayesian posterior, regime classification, and signal weighting functions are battle-tested.
  The probabilistic pipeline wraps them with regime-based priors and likelihood ratio accumulation rather than reimplementing them.
- **Concurrent pipelines via asyncio.gather**: Both pipelines share the same `NormalizedInput` reference and run concurrently. If one fails, the other completes normally and the failed pipeline produces a SKIP verdict.
- **Signal clustering for correlation penalty**: The Bayesian pipeline groups signals into four clusters (momentum, structure, volatility, fundamentals) and applies exponential decay within each cluster, so that correlated signals cannot inflate the posterior by stacking likelihood ratios.

---

## Architecture

### High-Level Flow

```mermaid
graph TD
    A[Evaluation Tick<br/>Redis queue: signal_engine] --> B[Input Normalizer]
    B --> C[Hard Filter Engine]
    C -->|filtered out| D[SKIP verdict for both pipelines]
    C -->|passed| E[Signal Library]
    E --> F[Multi-Timeframe Engine]
    F --> G{asyncio.gather}
    G --> H[Heuristic Pipeline]
    G --> I[Probabilistic Pipeline]
    H --> J[Delta Analyzer]
    I --> J
    J --> K[Output Formatter]
    K --> L[SignalOutput]
    L --> M[Redis: trading_decisions queue]
    L --> N[PostgreSQL: signal_engine_outputs]

    subgraph Exit Path
        B --> O[Exit Engine]
        O --> K
    end
```

### Trigger Mechanism

The signal engine polls a new Redis queue, `stonks:queue:signal_engine`. Evaluation ticks are enqueued by the scheduler service after aggregation completes for a ticker. Each queue message has the form `{"ticker": "AAPL", "triggered_at": "2024-01-15T10:00:00Z"}`.

### Integration Points

| Component | Integration | Direction |
|---|---|---|
| Scheduler | Enqueues ticks to `signal_engine` queue | Scheduler → Signal Engine |
| Market data tables | OHLCV bars, closing prices, returns | Signal Engine reads |
| `macro_impact_records` | Macro bias computation | Signal Engine reads |
| `trend_windows` | Fundamental/valuation context | Signal Engine reads |
| `risk_configs` | Feature flags, thresholds | Signal Engine reads |
| `classify_regime()` | Regime classification for priors | Signal Engine calls |
| `compute_signal_weight()` | Heuristic signal weighting | Signal Engine calls |
| `compute_bayesian_posterior()` | Bayesian accumulation | Signal Engine calls |
| Redis `trading_decisions` | SignalOutput publication | Signal Engine → Trading Engine |
| `signal_engine_outputs` table | Persistence for audit | Signal Engine writes |
| Redis rolling agreement | Delta analyzer metrics | Signal Engine writes |

---

## Components and Interfaces

### Module Structure

```
services/signal_engine/
├── __init__.py
├── main.py              # Entry point: asyncio event loop, queue polling
├── worker.py            # Top-level orchestrator per evaluation tick
├── config.py            # SignalEngineConfig, loaded from risk_configs + env
├── models.py            # All Pydantic models (NormalizedInput, SignalResult, etc.)
├── normalizer.py        # Input Normalizer: fetches and assembles NormalizedInput
├── signals/
│   ├── __init__.py
│   ├── base.py          # SignalEvaluator protocol, SignalResult model
│   ├── fibonacci.py     # Fibonacci retracement evaluator
│   ├── ma_stack.py      # Moving average stack evaluator
│   ├── rsi.py           # RSI evaluator
│   ├── cup_handle.py    # Cup & Handle pattern detector
│   └── elliott_wave.py  # Elliott Wave detector
├── confluence.py        # Multi-Timeframe Confluence Engine
├── hard_filter.py       # Hard Filter Engine
├── heuristic.py         # Heuristic Pipeline (Pipeline A)
├── probabilistic.py     # Probabilistic Pipeline (Pipeline B)
├── correlation.py       # Signal cluster classification + correlation penalty
├── exit_engine.py       # Exit Engine: position-level exit management
├── delta.py             # Delta Analyzer
├── formatter.py         # Output Formatter
└── persistence.py       # Database persistence for signal_engine_outputs
```

### Key Function Signatures

#### `main.py`: Entry Point

```python
async def main() -> None:
    """Start the signal engine worker loop.

    Connects to PostgreSQL and Redis, loads config from risk_configs,
    and polls the signal_engine queue indefinitely.
    """
```

#### `worker.py`: Orchestrator

```python
async def evaluate_tick(
    pool: asyncpg.Pool,
    redis: redis.asyncio.Redis,
    ticker: str,
    config: SignalEngineConfig,
) -> SignalOutput | None:
    """Run a full evaluation tick for a single ticker.

    1. Normalize inputs
    2. Evaluate exit conditions for open positions
    3. Run hard filters
    4. Evaluate signals across timeframes
    5. Run both pipelines concurrently
    6. Compute delta analysis
    7. Format and publish output

    Returns None if the ticker is hard-filtered or both pipelines fail.
    """
```

#### `normalizer.py`: Input Normalizer

```python
async def normalize_input(
    pool: asyncpg.Pool,
    ticker: str,
    config: SignalEngineConfig,
) -> NormalizedInput:
    """Fetch and assemble all data needed for a single evaluation tick.
    Sources:
    - OHLCV bars from market_data_bars (M30, H1, H4, D, W, M)
    - Fundamental metrics from trend_windows + companies
    - Macro context from macro_impact_records + global_events
    - Open position state from the trading engine's portfolio

    Missing data sources produce sentinel values (None/empty list)
    with a logged warning.
    """
```

#### `signals/base.py`: Signal Evaluator Protocol

```python
from typing import Protocol


class SignalEvaluator(Protocol):
    """Protocol for all signal evaluators in the Signal Library."""

    def evaluate(
        self,
        bars: list[OHLCVBar],
        timeframe: str,
    ) -> SignalResult | None:
        """Evaluate a signal on a single timeframe's bar data.

        Returns None when insufficient data is available.
        """
        ...
```

#### `confluence.py`: Multi-Timeframe Engine

```python
def compute_confluence(
    signal_results: dict[str, dict[str, SignalResult]],
    weights: dict[str, float],
) -> list[ConfluenceSignal]:
    """Compute weighted confluence scores across timeframes.

    Args:
        signal_results: {signal_type: {timeframe: SignalResult}}
        weights: {timeframe: weight} e.g. {"M30": 0.03, "D": 0.30, ...}

    Returns:
        List of ConfluenceSignal objects that pass the minimum
        confluence threshold (≥2 timeframes, ≥1 of D/W/M).
    """
```

#### `hard_filter.py`: Hard Filter Engine

```python
def evaluate_hard_filters(
    normalized: NormalizedInput,
    config: HardFilterConfig,
) -> HardFilterResult:
    """Evaluate pre-pipeline hard filters.

    Checks:
    - macro_bias == -1.0 → SKIP
    - valuation_score < threshold → SKIP
    - earnings_proximity_days <= threshold → SKIP

    Returns HardFilterResult with filtered=True/False and all triggered reasons.
    """
```

#### `heuristic.py`: Heuristic Pipeline

```python
def run_heuristic_pipeline(
    normalized: NormalizedInput,
    confluence_signals: list[ConfluenceSignal],
    config: HeuristicConfig,
) -> HeuristicResult:
    """Run the deterministic heuristic pipeline.

    Computes S_total = S_company + S_macro + S_competitive using existing
    compute_signal_weight() and weighted sentiment averaging.
    Produces BUY/WATCH/SKIP verdict based on confidence and score thresholds.
    """
```

#### `probabilistic.py`: Probabilistic Pipeline

```python
def run_probabilistic_pipeline(
    normalized: NormalizedInput,
    confluence_signals: list[ConfluenceSignal],
    regime: RegimeClassification,
    config: ProbabilisticConfig,
) -> ProbabilisticResult:
    """Run the Bayesian probabilistic pipeline.

    1. Initialize regime-based prior (bull=0.58, range=0.50, bear=0.42)
    2. Compute likelihood ratios per signal with correlation penalty
    3. Accumulate via log-odds: logit(P_post) = logit(P_prior) + Σ log(LR_i)
    4. Apply entropy gating
    5. Compute EV_R = P_up · E[win_R] - (1 - P_up) · 1.0
    6. Produce BUY/WATCH/SKIP verdict
    """
```

#### `correlation.py`: Signal Correlation Penalty

```python
from enum import Enum


class SignalCluster(str, Enum):
    MOMENTUM = "momentum"          # MA stack, RSI
    STRUCTURE = "structure"        # Fibonacci, Elliott Wave
    VOLATILITY = "volatility"      # ATR-based, Bollinger-derived
    FUNDAMENTALS = "fundamentals"  # valuation, earnings, macro


def classify_signal(signal_type: str) -> SignalCluster:
    """Map a signal type to its correlation cluster."""


def apply_correlation_penalty(
    likelihood_ratios: list[LikelihoodRatio],
) -> list[LikelihoodRatio]:
    """Apply within-cluster decay penalty to correlated signals.

    Within each cluster, signals are ranked by LR magnitude. The strongest
    contributes at full weight; the signal ranked n-th contributes at
    0.5^(n-1) of its weight. Signals in different clusters are treated as
    independent (no penalty).
    """
```

#### `exit_engine.py`: Exit Engine

```python
def evaluate_exits(
    positions: list[OpenPositionState],
    current_prices: dict[str, float],
    config: ExitConfig,
) -> list[ExitSignal]:
    """Evaluate exit conditions for all open positions.

    Checks: stop_loss hit, target_1 hit (EXIT_HALF), target_2 hit (EXIT_FULL),
    trailing stop hit (EXIT_FULL for remaining).

    Trailing stop activates after EXIT_HALF and ratchets upward only.
    """
```

#### `delta.py`: Delta Analyzer

```python
async def analyze_delta(
    heuristic: HeuristicResult,
    probabilistic: ProbabilisticResult,
    redis: redis.asyncio.Redis,
    ticker: str,
) -> DeltaResult:
    """Compare pipeline verdicts and track agreement metrics.

    Computes agreement flag, confidence delta, disagreement reasons.
    Updates rolling 100-evaluation agreement rate in Redis.
    Logs warning when agreement rate drops below 0.50.
    """
```

#### `formatter.py`: Output Formatter

```python
def format_output(
    ticker: str,
    price: float,
    heuristic: HeuristicResult,
    probabilistic: ProbabilisticResult,
    delta: DeltaResult,
    exit_signals: list[ExitSignal],
    config: SignalEngineConfig,
) -> SignalOutput:
    """Assemble the structured SignalOutput contract.

    Populates trade_plan based on verdict combination:
    - Both BUY → dual_confirmed, full position sizing
    - Probabilistic-only BUY → probabilistic_only, 50% position sizing
    - Heuristic-only BUY → standard position sizing
    - No BUY → no trade_plan (WATCH/SKIP persisted for analysis)
    """


def signal_output_to_recommendation(output: SignalOutput) -> Recommendation:
    """Map a SignalOutput to the existing Recommendation schema.

    Enables the trading engine to consume dual-pipeline outputs without
    modification to its core evaluate_recommendation logic.
    """
```

#### `persistence.py`: Database Persistence

```python
async def persist_signal_output(
    pool: asyncpg.Pool,
    output: SignalOutput,
) -> None:
    """Persist a SignalOutput to the signal_engine_outputs table.

    Logs and continues on database errors (persistence failure does not
    block signal emission to the trading queue).
    """
```

---

## Data Models

All new data models are Pydantic `BaseModel` subclasses defined in `services/signal_engine/models.py`. Existing models (`WeightedSignal`, `BayesianPosterior`, `RegimeClassification`, `TrendSummary`, `Recommendation`, `PositionSizing`) are imported from `services/aggregation/` and `services/shared/schemas.py`.
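Several of the models in this section exist to carry intermediate Bayesian quantities (`LikelihoodRatio.penalized_log_lr`, and `prior`, `posterior`, `ev_r` on `ProbabilisticResult`). As a rough, stdlib-only illustration of the arithmetic named in the `apply_correlation_penalty` and `run_probabilistic_pipeline` docstrings, the sketch below applies the within-cluster decay, accumulates in log-odds space, and computes EV_R. It is not the actual implementation: the helper names are hypothetical, the likelihood ratios and `E[win_R] = 2.0` are made-up inputs, and ranking by `|log(LR)|` is only one reading of "ranked by LR magnitude".

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def penalize_cluster(log_lrs: list[float]) -> list[float]:
    """Decay log-LRs within one cluster: rank n (1-based) is scaled by 0.5 ** (n - 1).

    Assumption: "LR magnitude" is interpreted as |log(LR)|.
    """
    ranked = sorted(log_lrs, key=abs, reverse=True)
    return [value * 0.5 ** rank for rank, value in enumerate(ranked)]


def posterior_from_log_lrs(prior: float, penalized_log_lrs: list[float]) -> float:
    """logit(P_post) = logit(P_prior) + sum of penalized log(LR_i)."""
    return sigmoid(logit(prior) + sum(penalized_log_lrs))


def expected_value_r(p_up: float, expected_win_r: float) -> float:
    """EV_R = P_up * E[win_R] - (1 - P_up) * 1.0, in units of risk (R)."""
    return p_up * expected_win_r - (1.0 - p_up) * 1.0


# Illustrative inputs only: two momentum signals and one structure signal.
momentum = penalize_cluster([math.log(1.5), math.log(1.2)])
structure = penalize_cluster([math.log(1.3)])

# Bull-regime prior of 0.58, as listed in the pipeline docstring.
p_up = posterior_from_log_lrs(prior=0.58, penalized_log_lrs=momentum + structure)
ev_r = expected_value_r(p_up, expected_win_r=2.0)
print(f"p_up={p_up:.3f} ev_r={ev_r:.3f}")
```

Working in log-odds keeps the update additive, which is what makes a per-signal multiplicative decay on `log(LR)` straightforward to apply before summation.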
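The `rolling_agreement_rate` field on `DeltaResult` comes from the 100-evaluation window that `analyze_delta` maintains in Redis. The sketch below illustrates only the windowing and warning logic, using an in-memory `collections.deque` as a stand-in for whatever Redis structure is actually used; the class name `RollingAgreement` and its method are hypothetical, and the storage layout is not specified in this document.

```python
import logging
from collections import deque

logger = logging.getLogger("signal_engine.delta")


class RollingAgreement:
    """Track pipeline agreement over the most recent `window` evaluations."""

    def __init__(self, window: int = 100, warn_below: float = 0.50) -> None:
        # A bounded deque silently drops the oldest flag once it is full.
        self._flags: deque[bool] = deque(maxlen=window)
        self._warn_below = warn_below

    def record(self, heuristic_verdict: str, probabilistic_verdict: str) -> float:
        """Record one evaluation and return the updated agreement rate."""
        self._flags.append(heuristic_verdict == probabilistic_verdict)
        rate = sum(self._flags) / len(self._flags)
        if rate < self._warn_below:
            logger.warning("pipeline agreement rate dropped to %.2f", rate)
        return rate


# Small window for illustration; the design calls for 100 evaluations.
tracker = RollingAgreement(window=4)
for pair in [("BUY", "BUY"), ("BUY", "WATCH"), ("SKIP", "SKIP"), ("WATCH", "WATCH")]:
    rate = tracker.record(*pair)
# The window now holds three agreements out of four evaluations.
```

In the real service the window would need to live in Redis so that it survives restarts and is shared across replicas; the in-memory version here is only meant to show the bookkeeping.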
### OHLCVBar

```python
# Imports shared by the model snippets in this section
import uuid
from datetime import datetime
from enum import Enum

from pydantic import BaseModel, Field


class OHLCVBar(BaseModel):
    """Single OHLCV bar for a timeframe."""

    timestamp: datetime
    open: float
    high: float
    low: float
    close: float
    volume: float
```

### NormalizedInput

```python
class NormalizedInput(BaseModel):
    """Unified input structure consumed by both pipelines."""

    ticker: str
    evaluated_at: datetime

    # Multi-timeframe OHLCV bars
    bars: dict[str, list[OHLCVBar]]  # {"M30": [...], "H1": [...], ...}

    # Fundamental metrics
    valuation_score: float | None = None  # [0.0, 1.0]
    earnings_proximity_days: int | None = None

    # Macro context
    macro_bias: float = 0.0  # [-1.0, 1.0]

    # Open position state (for exit engine)
    open_positions: list[OpenPositionState] = Field(default_factory=list)

    # Market data for regime classification
    closing_prices: list[float] = Field(default_factory=list)
    returns: list[float] = Field(default_factory=list)

    # Current price (latest close from shortest available timeframe)
    current_price: float | None = None
```

### OpenPositionState

```python
class OpenPositionState(BaseModel):
    """Snapshot of an open position for exit evaluation."""

    position_id: str
    ticker: str
    entry_price: float
    current_price: float
    stop_loss: float
    target_1: float
    target_2: float
    trailing_stop: float | None = None
    partial_exit_done: bool = False
    atr: float | None = None
```

### SignalResult

```python
class SignalDirection(str, Enum):
    BULLISH = "bullish"
    BEARISH = "bearish"
    NEUTRAL = "neutral"


class SignalResult(BaseModel):
    """Output from a single signal evaluator on a single timeframe."""

    signal_type: str  # e.g. "fibonacci", "ma_stack", "rsi"
    timeframe: str    # e.g. "D", "H4"
    strength: float = Field(ge=0.0, le=1.0)
    direction: SignalDirection
    confidence: float = Field(ge=0.0, le=1.0)
    metadata: dict = Field(default_factory=dict)  # signal-specific details
```

### ConfluenceSignal

```python
class ConfluenceSignal(BaseModel):
    """A signal that passed multi-timeframe confluence filtering."""

    signal_type: str
    direction: SignalDirection
    confluence_score: float            # weighted sum across timeframes
    active_timeframes: list[str]       # which timeframes triggered
    per_timeframe: dict[str, float]    # {timeframe: strength}
```

### Verdict

```python
class Verdict(str, Enum):
    BUY = "BUY"
    WATCH = "WATCH"
    SKIP = "SKIP"
```

### HeuristicResult

```python
class HeuristicResult(BaseModel):
    """Output from the heuristic (deterministic) pipeline."""

    verdict: Verdict
    confidence: float = Field(ge=0.0, le=1.0)
    s_total: float
    s_company: float
    s_macro: float
    s_competitive: float
    signal_weights: list[dict] = Field(default_factory=list)
    reasoning: list[str] = Field(default_factory=list)
```

### LikelihoodRatio

```python
class LikelihoodRatio(BaseModel):
    """A single signal's likelihood ratio for Bayesian updating."""

    signal_type: str
    cluster: str             # SignalCluster value
    lr: float                # P(sig|up) / P(sig|down)
    log_lr: float            # log(lr)
    penalized_log_lr: float  # after correlation penalty
    hit_rate: float
    strength: float
```

### ProbabilisticResult

```python
class ProbabilisticResult(BaseModel):
    """Output from the probabilistic (Bayesian) pipeline."""

    verdict: Verdict
    p_up: float = Field(ge=0.0, le=1.0)
    entropy: float = Field(ge=0.0, le=1.0)
    ev_r: float
    prior: float
    posterior: float
    likelihood_ratios: list[LikelihoodRatio] = Field(default_factory=list)
    regime: str
    reasoning: list[str] = Field(default_factory=list)
```

### DeltaResult

```python
class DeltaResult(BaseModel):
    """Output from the delta analyzer comparing both pipelines."""

    agreement: bool
    confidence_delta: float
    heuristic_verdict: str
    probabilistic_verdict: str
    disagreement_reasons: list[str] = Field(default_factory=list)
    rolling_agreement_rate: float | None = None
```

### ExitSignal

```python
class ExitType(str, Enum):
    EXIT_HALF = "EXIT_HALF"
    EXIT_FULL = "EXIT_FULL"


class ExitSignal(BaseModel):
    """An exit signal for an open position."""

    position_id: str
    ticker: str
    exit_type: ExitType
    reason: str  # "stop_hit", "target_1_hit", "target_2_hit", "trailing_stop_hit"
    price: float
```

### TradePlan

```python
class TradePlan(BaseModel):
    """Optional trade plan attached to a BUY signal."""

    entry_price: float
    stop_loss: float
    target_1: float
    target_2: float
    position_size_pct: float = Field(ge=0.0, le=1.0)
    max_loss_pct: float = Field(ge=0.0, le=1.0)
    dual_confirmed: bool = False
    probabilistic_only: bool = False
```

### SignalOutput

```python
class SignalOutput(BaseModel):
    """The structured output contract consumed by the trading engine and audit systems."""

    output_id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    ticker: str
    timestamp: datetime
    price: float

    # Heuristic pipeline results
    heuristic_verdict: str
    heuristic_confidence: float
    heuristic_s_total: float

    # Probabilistic pipeline results
    probabilistic_verdict: str
    probabilistic_p_up: float
    probabilistic_entropy: float
    probabilistic_ev_r: float

    # Delta analysis
    delta_agreement: bool
    delta_confidence_delta: float
    delta_reasons: list[str] = Field(default_factory=list)

    # Optional trade plan (populated when at least one pipeline says BUY)
    trade_plan: TradePlan | None = None

    # Exit signals for open positions
    exit_signals: list[ExitSignal] = Field(default_factory=list)

    # Full pipeline results for audit (stored as JSONB)
    heuristic_detail: dict = Field(default_factory=dict)
    probabilistic_detail: dict = Field(default_factory=dict)

    # Pipeline mode metadata
    pipeline_mode: str = "dual_pipeline"
    shadow_mode: bool = False
```

### SignalEngineConfig

```python
from dataclasses import dataclass, field


@dataclass
class SignalEngineConfig:
    """Configuration loaded from risk_configs + environment."""

    dual_pipeline_enabled: bool = False
    heuristic_pipeline_enabled: bool = True
    probabilistic_pipeline_enabled: bool = True
    shadow_mode: bool = False

    # Timeframe weights
    timeframe_weights: dict[str, float] = field(default_factory=lambda: {
        "M30": 0.03,
        "H1": 0.07,
        "H4": 0.15,
        "D": 0.30,
        "W": 0.30,
        "M": 0.15,
    })

    # Hard filter thresholds
    hard_filter_valuation_min: float = 0.3
    hard_filter_earnings_days: int = 5
    hard_filter_macro_bias_skip: float = -1.0

    # Heuristic verdict thresholds
    heuristic_buy_confidence: float = 0.70
    heuristic_buy_s_total: float = 1.2
    heuristic_buy_valuation_min: float = 0.5
    heuristic_watch_confidence: float = 0.55

    # Probabilistic verdict thresholds
    prob_buy_p_up: float = 0.60
    prob_buy_entropy_max: float = 0.90
    prob_buy_ev_r_min: float = 1.5
    prob_buy_valuation_min: float = 0.5
    prob_watch_p_up: float = 0.55
    prob_watch_entropy_max: float = 0.95
    prob_entropy_skip: float = 0.95

    # Regime priors
    regime_prior_bull: float = 0.58
    regime_prior_range: float = 0.50
    regime_prior_bear: float = 0.42

    # Exit engine
    trailing_stop_atr_multiplier: float = 2.0

    # Polling
    polling_interval_seconds: int = 30
```

### HardFilterConfig / HeuristicConfig / ProbabilisticConfig / ExitConfig

These are simple `@dataclass` wrappers over the relevant subset of `SignalEngineConfig` values, derived from its fields to keep function signatures clean.

---

### Database Migration (039)

```sql
-- Migration 039: Signal Engine Outputs
-- Creates the signal_engine_outputs table for persisting dual-pipeline evaluations.
CREATE TABLE IF NOT EXISTS signal_engine_outputs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    ticker TEXT NOT NULL,
    evaluated_at TIMESTAMPTZ NOT NULL,
    price NUMERIC NOT NULL,

    -- Heuristic pipeline
    heuristic_verdict TEXT NOT NULL,
    heuristic_confidence NUMERIC NOT NULL,
    heuristic_s_total NUMERIC NOT NULL,

    -- Probabilistic pipeline
    probabilistic_verdict TEXT NOT NULL,
    probabilistic_p_up NUMERIC NOT NULL,
    probabilistic_entropy NUMERIC NOT NULL,
    probabilistic_ev_r NUMERIC NOT NULL,

    -- Delta analysis
    delta_agreement BOOLEAN NOT NULL,
    delta_confidence_delta NUMERIC NOT NULL,
    delta_reasons JSONB NOT NULL DEFAULT '[]'::jsonb,

    -- Trade plan (null when no BUY verdict)
    trade_plan JSONB,

    -- Full output for audit
    full_output JSONB NOT NULL,

    -- Exit signals
    exit_signals JSONB NOT NULL DEFAULT '[]'::jsonb,

    -- Metadata
    pipeline_mode TEXT NOT NULL DEFAULT 'dual_pipeline',
    shadow_mode BOOLEAN NOT NULL DEFAULT FALSE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Index for per-ticker time-range queries
CREATE INDEX IF NOT EXISTS idx_signal_engine_outputs_ticker_time
    ON signal_engine_outputs (ticker, evaluated_at);

-- Index for global time-range queries
CREATE INDEX IF NOT EXISTS idx_signal_engine_outputs_evaluated
    ON signal_engine_outputs (evaluated_at);

-- Index for filtering by verdict
CREATE INDEX IF NOT EXISTS idx_signal_engine_outputs_verdicts
    ON signal_engine_outputs (heuristic_verdict, probabilistic_verdict);
```

### Helm / Deployment Configuration

Add to `values.yaml` under `services:`:

```yaml
signalEngine:
  replicas: 1
  pipeline: true
  image: signal-engine
  command: "python -m services.signal_engine.main"
  tier: processing
  secrets: [stonks-core-secrets, stonks-market-secrets]
  resources:
    requests: { cpu: 100m, memory: 128Mi }
    limits: { cpu: 500m, memory: 256Mi }
```

Add to `redis_keys.py`:

```python
QUEUE_SIGNAL_ENGINE = "signal_engine"
```

The service uses the existing `stonks-config` ConfigMap and `stonks-core-secrets` for database/Redis credentials.
No new ingress or network policy is needed, because the signal engine is a queue-polling worker with no HTTP interface.

---