From 4954318f7bbcce19e24215a4cf0d7bf8668f8658 Mon Sep 17 00:00:00 2001
From: Celes Renata
Date: Tue, 28 Apr 2026 17:01:03 +0000
Subject: [PATCH] docs: add comprehensive mathematical reference for all pipeline equations

---
 docs/equations.md | 651 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 651 insertions(+)
 create mode 100644 docs/equations.md

diff --git a/docs/equations.md b/docs/equations.md
new file mode 100644
index 0000000..0aaa138
--- /dev/null
+++ b/docs/equations.md
@@ -0,0 +1,651 @@
+# Stonks Oracle — Mathematical Reference
+
+Every equation, formula, threshold, and constant used in the signal processing, aggregation, recommendation, and trading pipeline. Organized by pipeline stage.
+
+Code references are provided so each formula can be traced to its implementation.
+
+---
+
+## 1. Signal Scoring
+
+**Source:** `services/aggregation/scoring.py`
+
+### 1.1 Combined Signal Weight
+
+Each document signal receives a composite weight:
+
+```
+W_combined = G_conf × W_recency × W_credibility × (1 + B_novelty) × M_context
+```
+
+| Component | Symbol | Formula | Range |
+|---|---|---|---|
+| Confidence gate | G_conf | 1 if extraction_confidence ≥ 0.2, else 0 | {0, 1} |
+| Recency decay | W_recency | 2^(−t_age / t_half) | [0.01, 1.0] |
+| Credibility | W_credibility | clamp(credibility, 0.1, 1.0)^α | [0.1, 1.0] |
+| Novelty bonus | B_novelty | novelty_score × 0.25 | [0, 0.25] |
+| Market context | M_context | 1 + boost_vol + boost_surge | [1.0, 1.45] |
+
+### 1.2 Recency Decay
+
+```
+W_recency = max( 2^(−t_age / t_half), 0.01 )
+```
+
+where `t_age` is document age in hours and half-lives by window are:
+
+| Window | t_half (hours) |
+|---|---|
+| intraday | 2 |
+| 1d | 12 |
+| 7d | 72 |
+| 30d | 240 |
+| 90d | 720 |
+
+### 1.3 Credibility Weight
+
+```
+W_credibility = clamp(c_raw, 0.1, 1.0)^α    where α = 1.0 (default)
+```
+
+α > 1 penalizes low-credibility sources more aggressively; α < 1 flattens the curve.
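The weighting rules in §1.1 to §1.3 could be composed roughly as follows. This is a minimal sketch, not the code in `scoring.py`: the function names, parameter names, and the `HALF_LIFE_HOURS` table name are assumptions made for illustration, and the market-context multiplier is passed in by the caller with a neutral default of 1.0.

```python
# Half-lives in hours per aggregation window, copied from the table in section 1.2.
HALF_LIFE_HOURS = {"intraday": 2, "1d": 12, "7d": 72, "30d": 240, "90d": 720}


def recency_weight(age_hours: float, window: str) -> float:
    """Half-life decay of document age, floored at 0.01."""
    return max(2.0 ** (-age_hours / HALF_LIFE_HOURS[window]), 0.01)


def credibility_weight(credibility: float, alpha: float = 1.0) -> float:
    """Clamp raw credibility to [0.1, 1.0], then raise it to the power alpha."""
    return min(max(credibility, 0.1), 1.0) ** alpha


def combined_weight(
    extraction_confidence: float,
    age_hours: float,
    window: str,
    credibility: float,
    novelty_score: float,
    m_context: float = 1.0,  # assumed to be computed elsewhere
    alpha: float = 1.0,
) -> float:
    """Product of gate, recency, credibility, novelty, and market context."""
    gate = 1.0 if extraction_confidence >= 0.2 else 0.0
    novelty_bonus = novelty_score * 0.25
    return (
        gate
        * recency_weight(age_hours, window)
        * credibility_weight(credibility, alpha)
        * (1.0 + novelty_bonus)
        * m_context
    )
```

For example, a 12-hour-old document in the `1d` window with credibility 0.8 and novelty 0.4 gets 1 × 0.5 × 0.8 × 1.1 × 1.0 ≈ 0.44, while any document with extraction confidence below 0.2 is gated to 0.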
+
+### 1.4 Market Context Multiplier
+
+```
+boost_vol = min( ln(1 + max(σ − 1.0, 0)) × 0.15, 0.30 )
+
+boost_surge = 0.15 if ΔV% > 50%, else 0
+
+M_context = 1.0 + boost_vol + boost_surge
+```
+
+where σ is price volatility and ΔV% is volume change percentage.
+
+### 1.5 Weighted Sentiment Average
+
+```
+S_avg = Σ(W_combined_i × impact_i × sentiment_i) / Σ(W_combined_i × impact_i)
+```
+
+- sentiment_i ∈ {+1.0 (positive), −1.0 (negative), 0.0 (neutral/mixed)}
+- impact_i ∈ [0, 1] from extraction
+- Returns 0.0 when denominator = 0
+
+---
+
+## 2. Trend Summary Assembly
+
+**Source:** `services/aggregation/worker.py`
+
+### 2.1 Trend Direction
+
+| Condition | Direction |
+|---|---|
+| S_avg ≥ 0.15 | Bullish |
+| S_avg ≤ −0.15 | Bearish |
+| contradiction > 0.10 AND \|S_avg\| < 0.30 | Mixed |
+| otherwise | Neutral |
+
+### 2.2 Trend Strength
+
+```
+strength = min(|S_avg|, 1.0)
+```
+
+### 2.3 Contradiction Score
+
+**Source:** `services/aggregation/contradiction.py`
+
+```
+contradiction = W_minority / (W_positive + W_negative)
+```
+
+where:
+```
+W_positive = Σ(W_combined_i × impact_i)    for signals with sentiment > 0
+W_negative = Σ(W_combined_i × impact_i)    for signals with sentiment < 0
+W_minority = min(W_positive, W_negative)
+```
+
+Range: [0, 0.5]. 0 = full agreement, 0.5 = equal-weight disagreement.
+
+### 2.4 Trend Confidence
+
+```
+confidence = clamp(0.3 × F_count + 0.3 × C_avg + 0.4 × A_agreement − P_contradiction, 0, 1)
+```
+
+| Component | Formula |
+|---|---|
+| F_count (source count) | min(N_unique / 15, 0.8) |
+| C_avg (extraction confidence) | mean of extraction confidences |
+| A_agreement (signal agreement) | fraction_same_direction × min(1, log₂(N_unique + 1) / log₂(8)) |
+| P_contradiction | contradiction_score × 0.4 |
+
+---
+
+## 3. Macro Impact Scoring (Layer 2)
+
+**Source:** `services/aggregation/interpolation.py`
+
+### 3.1 Overlap Components
+
+**Geographic overlap:**
+```
+O_geo = Σ revenue_pct_r    for each event region r in company's revenue mix
+```
+Range: [0, 1]
+
+**Supply chain overlap:**
+```
+O_supply = |event_regions ∩ supply_regions| / |supply_regions|
+```
+
+**Commodity overlap:**
+```
+O_commodity = |event_commodities ∩ company_commodities| / |company_commodities|
+```
+
+**Sector overlap:**
+```
+O_sector = 1.0 if company_sector ∈ event_affected_sectors, else 0.0
+```
+
+### 3.2 Raw Macro Impact Score
+
+```
+S_raw = W_severity × (0.35 × O_geo + 0.25 × O_supply + 0.25 × O_commodity + 0.15 × O_sector)
+```
+
+Severity weights:
+
+| Severity | W_severity |
+|---|---|
+| critical | 1.0 |
+| high | 0.75 |
+| moderate | 0.5 |
+| low | 0.25 |
+
+### 3.3 Resilience Modifier
+
+For international events, the raw score is adjusted by market position:
+
+```
+S_final = clamp(S_raw × R_tier, 0, 1)
+```
+
+| Market Position Tier | R_tier |
+|---|---|
+| Global leader | 0.70 |
+| Multinational | 0.85 |
+| Regional | 1.00 |
+| Domestic | 1.20 |
+
+For domestic-only events, R_tier = 1.0 regardless of tier.
+
+### 3.4 Macro Impact Confidence
+
+```
+confidence = min(event_confidence × min(O_total + 0.3, 1.0), 1.0)
+```
+
+where O_total = O_geo + O_supply + O_commodity + O_sector.
+
+### 3.5 Accelerated Staleness Decay
+
+For short-term events older than 48 hours:
+
+```
+decay_standard = e^(−0.693 × t_age_hours / t_half_hours)    (t_half default = 168h)
+decay_accelerated = decay_standard × 0.5
+```
+
+### 3.6 Macro Signal as WeightedSignal
+
+When merged into the aggregation engine:
+
+```
+impact_score_macro = macro_impact_score × W_macro    (W_macro = 0.3 default)
+sentiment_value = +1 if positive, −1 if negative
+```
+
+Recency decay uses the global event's publication time.
+
+---
+
+## 4. Competitive Signals (Layer 3)
+
+### 4.1 Pattern Confidence
+
+**Source:** `services/aggregation/pattern_matcher.py`
+
+```
+confidence = F_sample × 0.4 + F_consistency × 0.4 + F_recency × 0.2
+```
+
+| Factor | Formula |
+|---|---|
+| F_sample | min(N_samples / 20, 1.0) |
+| F_consistency | max(pct_bullish, pct_bearish) |
+| F_recency | 1.0 if age ≤ 7d; 0.7 if age ≤ 90d; 0.4 otherwise |
+
+**Modifiers:**
+- Major corporate decision (m&a, earnings, legal): confidence × 1.3
+- Insufficient data (N_samples < min_pattern_samples): cap at 0.25
+- Stale data (age > staleness_window_days): confidence × staleness_decay_penalty
+
+**Lookback windows:**
+- Routine signals: 180 days
+- Major corporate decisions: 365 days
+
+### 4.2 Cross-Company Signal Strength
+
+**Source:** `services/aggregation/signal_propagation.py`
+
+```
+S_competitive = clamp(S_pattern_avg × R_relationship × C_pattern × I_source, 0, 1)
+```
+
+| Component | Description |
+|---|---|
+| S_pattern_avg | Average historical outcome strength [0, 1] |
+| R_relationship | Relationship strength from competitor_relationships [0, 1] |
+| C_pattern | Pattern confidence from §4.1 |
+| I_source | Source document's impact_score [0, 1] |
+
+**Threshold gate:** Skipped if R_relationship < propagation_strength_threshold (default 0.2).
+
+### 4.3 Competitive Signal as WeightedSignal
+
+```
+impact_score_competitive = S_competitive × W_competitive    (W_competitive = 0.2 default)
+direction = majority historical outcome (bullish or bearish)
+```
+
+---
+
+## 5. Trend Projection
+
+**Source:** `services/aggregation/projection.py`
+
+### 5.1 Trend Momentum
+
+```
+momentum = S_current_signed − S_previous_signed
+```
+
+where `S_signed = direction_sign × strength` (bullish = +1, bearish = −1, neutral = 0).
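The signed-strength difference above might be computed as in this small sketch. The names `DIRECTION_SIGN`, `signed_strength`, and `momentum` are illustrative rather than taken from `projection.py`, and treating any direction other than the three listed (for example "mixed") as a sign of 0 is an assumption.

```python
# Sign per trend direction, as listed in the text above.
DIRECTION_SIGN = {"bullish": 1.0, "bearish": -1.0, "neutral": 0.0}


def signed_strength(direction: str, strength: float) -> float:
    """direction_sign x strength; unlisted directions get sign 0 (assumption)."""
    return DIRECTION_SIGN.get(direction, 0.0) * strength


def momentum(current: tuple[str, float], previous: tuple[str, float]) -> float:
    """Raw difference of signed strengths; this sketch applies no clamping."""
    return signed_strength(*current) - signed_strength(*previous)


# A bullish trend strengthening from 0.4 to 0.6 gives momentum of about +0.2.
strengthening = momentum(("bullish", 0.6), ("bullish", 0.4))
# A flip from bullish 0.25 to bearish 0.5 gives -0.75.
reversal = momentum(("bearish", 0.5), ("bullish", 0.25))
```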
+
+When no previous data exists:
+```
+momentum = direction_sign × strength × 0.5
+```
+
+Range: [−1, 1]
+
+### 5.2 Macro Decay Projection
+
+For each active macro event projected forward by `H` days:
+
+```
+F_future = 2^(−(t_current + H) / t_half)
+I_projected = macro_impact_score × F_future × W_severity
+```
+
+Decay half-lives:
+
+| Duration | t_half (days) |
+|---|---|
+| short_term | 1.0 |
+| medium_term | 7.0 |
+| long_term | 30.0 |
+
+Aggregate direction: bullish if W_pos > 1.2 × W_neg; bearish if W_neg > 1.2 × W_pos; mixed if both > 0.
+
+### 5.3 Projection Blending
+
+```
+W_macro_blend = min(S_macro_projected × 0.4, 0.4)
+W_company = 1.0 − W_macro_blend
+
+S_blended = W_company × S_momentum_projected + W_macro_blend × S_macro_signed
+```
+
+**Catalyst boost:** `min(N_catalysts × 0.02, 0.1)` added to projected strength.
+
+**Projected confidence:**
+```
+C_projected = C_base × 0.8 + min(S_macro × 0.15, 0.1)
+```
+
+**Divergence detection:** Flagged when projected direction ≠ current trend direction.
+
+---
+
+## 6. Data Quality Suppression
+
+**Source:** `services/recommendation/suppression.py`
+
+### 6.1 Data Quality Score
+
+```
+Q = 0.4 × Q_confidence + 0.3 × Q_freshness + 0.3 × Q_coverage
+```
+
+| Component | Formula |
+|---|---|
+| Q_confidence | min(C_avg_extraction / 0.8, 1.0) |
+| Q_freshness | max(0, 1 − t_newest_hours / 168) |
+| Q_coverage | (N_valid / N_total) × min(N_valid / 10, 1.0) |
+
+**Suppression triggers** (any one → informational only):
+
+| Check | Threshold |
+|---|---|
+| Avg extraction confidence | < 0.40 |
+| Evidence staleness | > 168 hours (7 days) |
+| Source type diversity | < 1 distinct type |
+| Extraction failure rate | > 50% |
+| Valid document count | < 2 |
+| Data quality score | < 0.30 |
+
+### 6.2 Safety Suppression
+
+- **Macro-only:** If trend driven solely by macro signals with zero company evidence → forced informational
+- **Pattern-only:** If trend driven solely by pattern/competitive signals with no company or macro support → forced informational
+
+---
+
+## 7. Recommendation Eligibility
+
+**Source:** `services/recommendation/eligibility.py`
+
+### 7.1 Gate Checks (all must pass)
+
+| Check | Threshold |
+|---|---|
+| Confidence | ≥ 0.35 |
+| Trend strength | ≥ 0.10 |
+| Contradiction score | ≤ 0.60 |
+| Evidence count | ≥ 2 |
+| Direction | ≠ neutral |
+
+### 7.2 Action Mapping
+
+| Condition | Action |
+|---|---|
+| Bullish AND strength ≥ 0.25 | BUY |
+| Bearish AND strength ≥ 0.25 | SELL |
+| Directional AND confidence ≥ 0.50 | HOLD |
+| Mixed or weak | WATCH |
+
+### 7.3 Mode Escalation
+
+| Mode | Requirements |
+|---|---|
+| live_eligible | confidence ≥ 0.70, contradiction ≤ 0.25, evidence ≥ 5 |
+| paper_eligible | confidence ≥ 0.50 |
+| informational | everything else (WATCH/HOLD always informational) |
+
+### 7.4 Position Sizing
+
+```
+portfolio_pct = base + C_factor × S_factor × range × P_contradiction × P_evidence
+```
+
+| Component | Formula | Default |
+|---|---|---|
+| base | base_portfolio_pct | 0.01 (1%) |
+| range | max_portfolio_pct − base_portfolio_pct | 0.09 (9%) |
+| C_factor | confidence_sizing_weight × confidence | 0.8 × confidence |
+| S_factor | 0.5 + 0.5 × trend_strength | [0.5, 1.0] |
+| P_contradiction | 1 − (contradiction_penalty × contradiction_score) | penalty = 0.5 |
+| P_evidence | 0.50 if evidence < 3; 0.75 if evidence < 5; 1.0 otherwise | |
+
+Clamped to [base × 0.5, max_portfolio_pct].
+
+**Max loss percentage** uses the same structure with base = 0.003 (0.3%) and max = 0.02 (2%).
+
+---
+
+## 8. Trading Engine — Position Sizing
+
+**Source:** `services/trading/position_sizer.py`
+
+### 8.1 Base Allocation
+
+```
+raw_pct = (max_position_pct × 0.5) × (confidence / min_confidence) × multiplier
+clamped_pct = min(raw_pct, max_position_pct)
+dollar_amount = min(active_pool × clamped_pct, absolute_position_cap)
+```
+
+### 8.2 Correlation Reduction
+
+```
+ρ_avg = Σ(ρ_i × w_i) / Σ(w_i)    for existing positions
+```
+
+| ρ_avg | Action |
+|---|---|
+| > 0.8 | Reject order |
+| 0.5 < ρ_avg ≤ 0.8 | Reduce: factor = 1 − (ρ_avg − 0.5) / 0.3 |
+| ≤ 0.5 | No reduction |
+
+### 8.3 Sector Exposure Reduction
+
+```
+available = max(max_sector_pct × active_pool − current_sector_exposure, 0)
+dollar_amount = min(dollar_amount, available)
+```
+
+### 8.4 Diversification Bonus
+
+If < 3 sectors held AND entering a new sector: dollar_amount × 1.2 (capped at max_position_pct).
+
+### 8.5 Earnings Proximity
+
+| Days to earnings | Action |
+|---|---|
+| ≤ 1 | Reject |
+| 1–3 | 50% reduction |
+| > 3 | No adjustment |
+
+### 8.6 Portfolio Heat Check
+
+```
+heat_new = dollar_amount × atr_multiplier × 0.02
+heat_max = max_portfolio_heat × active_pool
+
+Reject if: heat_current + heat_new > heat_max
+```
+
+### 8.7 Share Rounding
+
+```
+shares = floor(dollar_amount / current_price)
+final_dollar = shares × current_price
+```
+
+Reject if shares = 0.
+
+---
+
+## 9. Stop-Loss and Take-Profit
+
+**Source:** `services/trading/stop_loss_manager.py`
+
+### 9.1 Initial Levels
+
+```
+stop_distance = ATR × M_atr
+stop_loss = entry_price − stop_distance
+take_profit = entry_price + stop_distance × R_reward_risk
+```
+
+| Trade type | M_atr | R_reward_risk |
+|---|---|---|
+| Standard | risk_tier.stop_loss_atr_multiplier | risk_tier.reward_risk_ratio |
+| Micro-trade | 1.0 | 1.5 |
+
+### 9.2 Dynamic Tightening
+
+| Condition | Effective multiplier |
+|---|---|
+| High-severity macro event | base × 0.5 |
+| Earnings within 3 days | base × 0.7 |
+| Portfolio heat > 80% of max | base × 0.7 |
+| Normal | base |
+
+### 9.3 Trailing Stop Activation
+
+Activates when:
+```
+favorable_move = current_price − entry_price > 0.5 × (take_profit − entry_price)
+```
+
+Once active, stop-loss floor = entry_price (breakeven).
+
+---
+
+## 10. Risk Management
+
+### 10.1 Position Limits
+
+**Source:** `services/risk/engine.py`
+
+| Limit | Default | Formula |
+|---|---|---|
+| Max position % | 5% | position_value / portfolio_value ≤ 0.05 |
+| Max position value | $10,000 | existing + new ≤ $10,000 |
+| Max shares/order | 1,000 | quantity ≤ 1,000 |
+| Max sector % | 25% | sector_value / portfolio_value ≤ 0.25 |
+| Max daily loss % | 2% | \|daily_pnl\| / portfolio_value ≤ 0.02 |
+| Max daily loss $ | $1,000 | \|daily_pnl\| ≤ $1,000 |
+| Max daily trades | 20 | trade_count < 20 |
+
+### 10.2 Order Clamping
+
+**Source:** `services/risk/engine.py` — `clamp_order_to_position_limits()`
+
+When a buy order exceeds position limits, instead of rejecting:
+
+```
+max_allowed_value = min(
+    max_position_value − existing_value,
+    max_position_pct × portfolio_value − existing_value
+)
+clamped_shares = min( floor(max_allowed_value / price_per_share), max_shares_per_order )
+```
+
+### 10.3 News Shock Lockout
+
+Trigger: impact_score ≥ 0.80 for catalyst ∈ {earnings, legal, m_and_a}
+Duration: 60 minutes (configurable)
+
+### 10.4 Symbol Cooldown
+
+Duration: 15 minutes between trades on same symbol.
+Max concurrent positions per symbol: 1.
+
+---
+
+## 11. Circuit Breaker
+
+**Source:** `services/trading/circuit_breaker.py`
+
+| Trigger | Condition | Cooldown |
+|---|---|---|
+| Daily loss | \|daily_pnl\| / portfolio_value > 0.05 | 2 hours |
+| Single position | position_loss_pct > 0.15 | 48 hours |
+| Volatility | ≥ 3 stop-losses within 30-minute window | 2 hours |
+
+---
+
+## 12. Risk Tier Auto-Adjustment
+
+**Source:** `services/trading/risk_tier_controller.py`
+
+Tiers: conservative → moderate → aggressive
+
+**Downgrade** (any one triggers, drops one level):
+- 30-day win rate < 40%
+- Current drawdown > 15%
+
+**Upgrade** (all must be true, raises one level):
+- 30-day win rate > 55%
+- Reserve pool > 20% of portfolio
+- Current drawdown < 5%
+
+---
+
+## 13. Portfolio Rebalancing
+
+**Source:** `services/trading/rebalancer.py`
+
+### 13.1 Single-Stock Rebalancing
+
+```
+excess = market_value − max_position_pct × active_pool
+sell_qty = min( floor(excess / current_price), position_quantity )
+```
+
+### 13.2 Sector Rebalancing
+
+```
+sector_excess = Σ(market_value_i) − max_sector_pct × active_pool
+```
+
+Sell from lowest-confidence positions first until excess is covered.
+
+### 13.3 Max Positions Enforcement
+
+```
+excess_count = N_positions − max_positions
+```
+
+Sell entire lowest-confidence positions until count is within limit.
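The single-stock and max-positions rules in §13.1 and §13.3 could be sketched as follows. This is an illustration under stated assumptions, not the code in `rebalancer.py`: the `Position` dataclass, its field names, and both function names are hypothetical, and returning 0 or an empty list when nothing is over the limit is an assumed convention.

```python
import math
from dataclasses import dataclass


@dataclass
class Position:
    """Hypothetical holding record used only for this sketch."""
    symbol: str
    quantity: int
    price: float
    confidence: float

    @property
    def market_value(self) -> float:
        return self.quantity * self.price


def single_stock_sell_qty(pos: Position, max_position_pct: float, active_pool: float) -> int:
    """Whole shares to sell to reduce the excess over the position cap (section 13.1)."""
    excess = pos.market_value - max_position_pct * active_pool
    if excess <= 0:
        return 0  # already within the cap
    return min(math.floor(excess / pos.price), pos.quantity)


def positions_to_close(positions: list[Position], max_positions: int) -> list[str]:
    """Symbols of the lowest-confidence positions to sell entirely (section 13.3)."""
    excess_count = len(positions) - max_positions
    if excess_count <= 0:
        return []
    by_confidence = sorted(positions, key=lambda p: p.confidence)
    return [p.symbol for p in by_confidence[:excess_count]]
```

With a 5% cap on a $100,000 active pool, a holding of 100 shares at $80 is $3,000 over the cap, so the sketch sells floor(3000 / 80) = 37 shares.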
+
+---
+
+## Constants Summary
+
+| Constant | Value | Location |
+|---|---|---|
+| Confidence gate floor | 0.20 | scoring.py |
+| Min recency weight | 0.01 | scoring.py |
+| Credibility floor/ceiling | 0.10 / 1.0 | scoring.py |
+| Novelty bonus max | 0.25 (25%) | scoring.py |
+| Volatility boost threshold | 1.0 price units | scoring.py |
+| Volatility boost max | 0.30 (30%) | scoring.py |
+| Volume surge threshold | 50% | scoring.py |
+| Volume surge boost | 0.15 (15%) | scoring.py |
+| Bullish/bearish threshold | ±0.15 | worker.py |
+| Mixed threshold | contradiction > 0.10, \|S\| < 0.30 | worker.py |
+| Macro signal weight | 0.30 | config.py |
+| Competitive signal weight | 0.20 | config.py |
+| Macro confidence threshold | 0.40 | interpolation.py |
+| Staleness accelerated decay | 0.50× | interpolation.py |
+| Short-term staleness hours | 48 | interpolation.py |
+| Pattern min samples | configurable | pattern_matcher.py |
+| Major decision weight multiplier | 1.3× | pattern_matcher.py |
+| Routine lookback | 180 days | pattern_matcher.py |
+| Major decision lookback | 365 days | pattern_matcher.py |
+| Propagation strength threshold | 0.20 | signal_propagation.py |
+| Data quality min score | 0.30 | suppression.py |
+| Evidence staleness max | 168 hours (7 days) | suppression.py |
+| Recommendation min confidence | 0.35 | eligibility.py |
+| Recommendation min strength | 0.10 | eligibility.py |
+| Action strength threshold | 0.25 | eligibility.py |
+| Live confidence threshold | 0.70 | eligibility.py |
+| Paper confidence threshold | 0.50 | eligibility.py |
+| Base portfolio allocation | 1% | eligibility.py |
+| Max portfolio allocation | 10% | eligibility.py |
+| Circuit breaker daily loss | 5% | circuit_breaker.py |
+| Circuit breaker single position | 15% | circuit_breaker.py |
+| Stop-loss cluster threshold | 3 hits / 30 min | circuit_breaker.py |
+| Tier downgrade win rate | < 40% | risk_tier_controller.py |
+| Tier upgrade win rate | > 55% | risk_tier_controller.py |
+| Tier upgrade max drawdown | < 5% | risk_tier_controller.py |
+| Tier upgrade min reserve | > 20% | risk_tier_controller.py |