docs: add comprehensive mathematical reference for all pipeline equations

Celes Renata
2026-04-28 17:01:03 +00:00
parent 3b22f5e1fc
commit 4954318f7b
@@ -0,0 +1,651 @@
# Stonks Oracle — Mathematical Reference
Every equation, formula, threshold, and constant used in the signal processing, aggregation, recommendation, and trading pipeline. Organized by pipeline stage.
Code references are provided so each formula can be traced to its implementation.
---
## 1. Signal Scoring
**Source:** `services/aggregation/scoring.py`
### 1.1 Combined Signal Weight
Each document signal receives a composite weight:
```
W_combined = G_conf × W_recency × W_credibility × (1 + B_novelty) × M_context
```
| Component | Symbol | Formula | Range |
|---|---|---|---|
| Confidence gate | G_conf | 1 if extraction_confidence ≥ 0.2, else 0 | {0, 1} |
| Recency decay | W_recency | 2^(−t_age / t_half) | [0.01, 1.0] |
| Credibility | W_credibility | clamp(credibility, 0.1, 1.0)^α | [0.1, 1.0] |
| Novelty bonus | B_novelty | novelty_score × 0.25 | [0, 0.25] |
| Market context | M_context | 1 + boost_vol + boost_surge | [1.0, 1.45] |
### 1.2 Recency Decay
```
W_recency = max( 2^(−t_age / t_half), 0.01 )
```
where `t_age` is document age in hours and half-lives by window are:
| Window | t_half (hours) |
|---|---|
| intraday | 2 |
| 1d | 12 |
| 7d | 72 |
| 30d | 240 |
| 90d | 720 |
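A minimal sketch of the decay, assuming the exponent is negative so that the weight halves every `t_half` hours. The function and constant names are hypothetical and are not taken from `scoring.py`.

```python
# Half-lives in hours, copied from the table above.
HALF_LIFE_HOURS = {"intraday": 2, "1d": 12, "7d": 72, "30d": 240, "90d": 720}
MIN_RECENCY_WEIGHT = 0.01  # floor from the formula above


def recency_weight(age_hours: float, window: str) -> float:
    """Half-life decay with a floor (illustrative helper, not the real API)."""
    t_half = HALF_LIFE_HOURS[window]
    return max(2.0 ** (-age_hours / t_half), MIN_RECENCY_WEIGHT)


print(recency_weight(0, "1d"))          # 1.0
print(recency_weight(12, "1d"))         # 0.5
print(recency_weight(24, "1d"))         # 0.25
print(recency_weight(500, "intraday"))  # 0.01 (floor applies)
```

A document exactly one half-life old carries half the weight of a fresh one, and the floor keeps very old documents from vanishing entirely.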
### 1.3 Credibility Weight
```
W_credibility = clamp(c_raw, 0.1, 1.0)^α where α = 1.0 (default)
```
α > 1 penalizes low-credibility sources more aggressively; α < 1 flattens the curve.
### 1.4 Market Context Multiplier
```
boost_vol = min( ln(1 + max(σ − 1.0, 0)) × 0.15, 0.30 )
boost_surge = 0.15 if ΔV% > 50%, else 0
M_context = 1.0 + boost_vol + boost_surge
```
where σ is price volatility and ΔV% is volume change percentage.
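The multiplier can be sketched as follows. This assumes ΔV% is passed as a percentage (50 means 50%) and that the volatility threshold of 1.0 is subtracted before the logarithm; the function name is hypothetical.

```python
import math


def market_context_multiplier(volatility: float, volume_change_pct: float) -> float:
    """Illustrative M_context: 1.0 plus a capped volatility boost and a flat surge boost."""
    # Only volatility above the 1.0 threshold contributes; log1p(x) = ln(1 + x).
    boost_vol = min(math.log1p(max(volatility - 1.0, 0.0)) * 0.15, 0.30)
    # Flat boost when volume is up by more than 50%.
    boost_surge = 0.15 if volume_change_pct > 50.0 else 0.0
    return 1.0 + boost_vol + boost_surge


print(market_context_multiplier(0.5, 10.0))    # quiet market, no boost
print(market_context_multiplier(3.0, 0.0))     # volatility boost only
print(market_context_multiplier(1000.0, 60.0)) # both boosts at their caps
```

With both boosts at their maximum the multiplier reaches 1.45, which matches the upper bound listed in §1.1.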
### 1.5 Weighted Sentiment Average
```
S_avg = Σ(W_combined_i × impact_i × sentiment_i) / Σ(W_combined_i × impact_i)
```
- sentiment_i ∈ {+1.0 (positive), −1.0 (negative), 0.0 (neutral/mixed)}
- impact_i ∈ [0, 1] from extraction
- Returns 0.0 when denominator = 0
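As a worked sketch of the average, assuming each signal is reduced to its combined weight, impact, and sentiment value. The `Signal` container and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """Hypothetical container for one scored document signal."""
    weight: float     # W_combined
    impact: float     # impact_i in [0, 1]
    sentiment: float  # +1.0, -1.0, or 0.0


def weighted_sentiment_average(signals: list[Signal]) -> float:
    """Impact- and weight-weighted mean sentiment; 0.0 when there is no weight."""
    denom = sum(s.weight * s.impact for s in signals)
    if denom == 0:
        return 0.0
    num = sum(s.weight * s.impact * s.sentiment for s in signals)
    return num / denom


signals = [
    Signal(weight=1.0, impact=0.8, sentiment=+1.0),
    Signal(weight=0.5, impact=0.4, sentiment=-1.0),
    Signal(weight=1.0, impact=0.5, sentiment=0.0),
]
print(round(weighted_sentiment_average(signals), 4))  # (0.8 - 0.2) / 1.5
```

Neutral signals still add to the denominator, so they pull the average toward zero rather than being ignored.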
---
## 2. Trend Summary Assembly
**Source:** `services/aggregation/worker.py`
### 2.1 Trend Direction
| Condition | Direction |
|---|---|
| S_avg ≥ 0.15 | Bullish |
| S_avg ≤ −0.15 | Bearish |
| contradiction > 0.10 AND \|S_avg\| < 0.30 | Mixed |
| otherwise | Neutral |
### 2.2 Trend Strength
```
strength = min(|S_avg|, 1.0)
```
### 2.3 Contradiction Score
**Source:** `services/aggregation/contradiction.py`
```
contradiction = W_minority / (W_positive + W_negative)
```
where:
```
W_positive = Σ(W_combined_i × impact_i) for signals with sentiment > 0
W_negative = Σ(W_combined_i × impact_i) for signals with sentiment < 0
W_minority = min(W_positive, W_negative)
```
Range: [0, 0.5], because the minority side can never exceed half of the total directional weight. 0 = full agreement, 0.5 = equal-weight disagreement.
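A short sketch of the score, assuming signals are given as `(w_combined, impact, sentiment)` tuples. The zero-weight fallback is an assumption, since the formula above does not define the case where no directional signals exist.

```python
def contradiction_score(signals: list[tuple[float, float, float]]) -> float:
    """Share of directional weight held by the minority side (illustrative)."""
    w_pos = sum(w * i for w, i, s in signals if s > 0)
    w_neg = sum(w * i for w, i, s in signals if s < 0)
    total = w_pos + w_neg
    if total == 0:
        return 0.0  # assumption: no directional evidence means no contradiction
    return min(w_pos, w_neg) / total


print(contradiction_score([(1, 1, +1), (1, 1, -1)]))          # 0.5
print(contradiction_score([(1.0, 0.75, +1), (1.0, 0.25, -1)]))  # 0.25
print(contradiction_score([(1, 1, +1)]))                        # 0.0
```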
### 2.4 Trend Confidence
```
confidence = clamp(0.3 × F_count + 0.3 × C_avg + 0.4 × A_agreement − P_contradiction, 0, 1)
```
| Component | Formula |
|---|---|
| F_count (source count) | min(N_unique / 15, 0.8) |
| C_avg (extraction confidence) | mean of extraction confidences |
| A_agreement (signal agreement) | fraction_same_direction × min(1, log₂(N_unique + 1) / log₂(8)) |
| P_contradiction | contradiction_score × 0.4 |
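The components combine as in the sketch below, which assumes the contradiction penalty is subtracted before clamping. The function and argument names are hypothetical.

```python
import math


def trend_confidence(
    n_unique: int,
    avg_extraction_conf: float,
    fraction_same_direction: float,
    contradiction: float,
) -> float:
    """Illustrative trend confidence built from the four tabulated components."""
    f_count = min(n_unique / 15, 0.8)
    # Agreement is scaled down until roughly 7 unique sources are present.
    scale = min(1.0, math.log2(n_unique + 1) / math.log2(8))
    a_agreement = fraction_same_direction * scale
    p_contradiction = contradiction * 0.4
    raw = 0.3 * f_count + 0.3 * avg_extraction_conf + 0.4 * a_agreement - p_contradiction
    return max(0.0, min(raw, 1.0))


# Seven unanimous sources with 0.8 average extraction confidence.
print(round(trend_confidence(7, 0.8, 1.0, 0.0), 4))
```

Because `log2(8) = 3`, the agreement term reaches full strength at seven unique sources; with fewer sources even unanimous agreement is discounted.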
---
## 3. Macro Impact Scoring (Layer 2)
**Source:** `services/aggregation/interpolation.py`
### 3.1 Overlap Components
**Geographic overlap:**
```
O_geo = Σ revenue_pct_r for each event region r in company's revenue mix
```
Range: [0, 1]
**Supply chain overlap:**
```
O_supply = |event_regions ∩ supply_regions| / |supply_regions|
```
**Commodity overlap:**
```
O_commodity = |event_commodities ∩ company_commodities| / |company_commodities|
```
**Sector overlap:**
```
O_sector = 1.0 if company_sector ∈ event_affected_sectors, else 0.0
```
### 3.2 Raw Macro Impact Score
```
S_raw = W_severity × (0.35 × O_geo + 0.25 × O_supply + 0.25 × O_commodity + 0.15 × O_sector)
```
Severity weights:
| Severity | W_severity |
|---|---|
| critical | 1.0 |
| high | 0.75 |
| moderate | 0.5 |
| low | 0.25 |
### 3.3 Resilience Modifier
For international events, the raw score is adjusted by market position:
```
S_final = clamp(S_raw × R_tier, 0, 1)
```
| Market Position Tier | R_tier |
|---|---|
| Global leader | 0.70 |
| Multinational | 0.85 |
| Regional | 1.00 |
| Domestic | 1.20 |
For domestic-only events, R_tier = 1.0 regardless of tier.
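Sections 3.2 and 3.3 can be sketched together. The dictionary keys for market position tiers and the `international` flag are hypothetical names chosen for illustration.

```python
SEVERITY_WEIGHT = {"critical": 1.0, "high": 0.75, "moderate": 0.5, "low": 0.25}
# Hypothetical keys for the four market position tiers listed above.
RESILIENCE = {"global_leader": 0.70, "multinational": 0.85, "regional": 1.00, "domestic": 1.20}


def macro_impact(
    severity: str,
    o_geo: float,
    o_supply: float,
    o_commodity: float,
    o_sector: float,
    tier: str,
    international: bool,
) -> float:
    """Illustrative S_final: severity-weighted overlap, adjusted by resilience."""
    overlap = 0.35 * o_geo + 0.25 * o_supply + 0.25 * o_commodity + 0.15 * o_sector
    s_raw = SEVERITY_WEIGHT[severity] * overlap
    # Domestic-only events ignore the tier modifier.
    r_tier = RESILIENCE[tier] if international else 1.0
    return max(0.0, min(s_raw * r_tier, 1.0))


# High-severity international event, 40% revenue exposure, global leader.
print(round(macro_impact("high", 0.4, 0.0, 0.0, 0.0, "global_leader", True), 4))
```

The clamp matters for domestic-tier companies: a critical event with full overlap would otherwise score 1.2.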
### 3.4 Macro Impact Confidence
```
confidence = min(event_confidence × min(O_total + 0.3, 1.0), 1.0)
```
where O_total = O_geo + O_supply + O_commodity + O_sector.
### 3.5 Accelerated Staleness Decay
For short-term events older than 48 hours:
```
decay_standard = e^(−0.693 × t_age_hours / t_half_hours)    (t_half default = 168h)
decay_accelerated = decay_standard × 0.5
```
### 3.6 Macro Signal as WeightedSignal
When merged into the aggregation engine:
```
impact_score_macro = macro_impact_score × W_macro (W_macro = 0.3 default)
sentiment_value = +1 if positive, −1 if negative
```
Recency decay uses the global event's publication time.
---
## 4. Competitive Signals (Layer 3)
### 4.1 Pattern Confidence
**Source:** `services/aggregation/pattern_matcher.py`
```
confidence = F_sample × 0.4 + F_consistency × 0.4 + F_recency × 0.2
```
| Factor | Formula |
|---|---|
| F_sample | min(N_samples / 20, 1.0) |
| F_consistency | max(pct_bullish, pct_bearish) |
| F_recency | 1.0 if age ≤ 7d; 0.7 if age ≤ 90d; 0.4 otherwise |
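A sketch of the base confidence before any modifiers are applied. The function name and arguments are hypothetical, and `pct_bullish` / `pct_bearish` are assumed to be fractions in [0, 1].

```python
def pattern_confidence(
    n_samples: int, pct_bullish: float, pct_bearish: float, age_days: float
) -> float:
    """Illustrative base pattern confidence (no modifiers applied)."""
    f_sample = min(n_samples / 20, 1.0)
    f_consistency = max(pct_bullish, pct_bearish)
    if age_days <= 7:
        f_recency = 1.0
    elif age_days <= 90:
        f_recency = 0.7
    else:
        f_recency = 0.4
    return f_sample * 0.4 + f_consistency * 0.4 + f_recency * 0.2


# 10 samples, 70% bullish, newest sample 30 days old.
print(round(pattern_confidence(10, 0.7, 0.3, 30), 4))
```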
**Modifiers:**
- Major corporate decision (m&a, earnings, legal): confidence × 1.3
- Insufficient data (N_samples < min_pattern_samples): cap at 0.25
- Stale data (age > staleness_window_days): confidence × staleness_decay_penalty
**Lookback windows:**
- Routine signals: 180 days
- Major corporate decisions: 365 days
### 4.2 Cross-Company Signal Strength
**Source:** `services/aggregation/signal_propagation.py`
```
S_competitive = clamp(S_pattern_avg × R_relationship × C_pattern × I_source, 0, 1)
```
| Component | Description |
|---|---|
| S_pattern_avg | Average historical outcome strength [0, 1] |
| R_relationship | Relationship strength from competitor_relationships [0, 1] |
| C_pattern | Pattern confidence from §4.1 |
| I_source | Source document's impact_score [0, 1] |
**Threshold gate:** Skipped if R_relationship < propagation_strength_threshold (default 0.2).
### 4.3 Competitive Signal as WeightedSignal
```
impact_score_competitive = S_competitive × W_competitive (W_competitive = 0.2 default)
direction = majority historical outcome (bullish or bearish)
```
---
## 5. Trend Projection
**Source:** `services/aggregation/projection.py`
### 5.1 Trend Momentum
```
momentum = S_current_signed − S_previous_signed
```
where `S_signed = direction_sign × strength` (bullish = +1, bearish = −1, neutral = 0).
When no previous data exists:
```
momentum = direction_sign × strength × 0.5
```
Range: [−1, 1]
### 5.2 Macro Decay Projection
For each active macro event projected forward by `H` days:
```
F_future = 2^(−(t_current + H) / t_half)
I_projected = macro_impact_score × F_future × W_severity
```
Decay half-lives:
| Duration | t_half (days) |
|---|---|
| short_term | 1.0 |
| medium_term | 7.0 |
| long_term | 30.0 |
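The forward projection can be sketched as below, assuming `t_current` is the event's current age in days and that the exponent is negative so impact decays over time. Names are hypothetical.

```python
# Half-lives in days, copied from the table above.
HALF_LIFE_DAYS = {"short_term": 1.0, "medium_term": 7.0, "long_term": 30.0}


def projected_macro_impact(
    macro_impact_score: float,
    severity_weight: float,
    age_days: float,
    horizon_days: float,
    duration: str,
) -> float:
    """Illustrative I_projected for one event, H days ahead."""
    t_half = HALF_LIFE_DAYS[duration]
    f_future = 2.0 ** (-(age_days + horizon_days) / t_half)
    return macro_impact_score * f_future * severity_weight


# A week-old medium-term event projected 7 days ahead: two half-lives elapsed.
print(round(projected_macro_impact(0.8, 1.0, 7.0, 7.0, "medium_term"), 4))
```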
Aggregate direction: bullish if W_pos > 1.2 × W_neg; bearish if W_neg > 1.2 × W_pos; mixed if both > 0.
### 5.3 Projection Blending
```
W_macro_blend = min(S_macro_projected × 0.4, 0.4)
W_company = 1.0 − W_macro_blend
S_blended = W_company × S_momentum_projected + W_macro_blend × S_macro_signed
```
**Catalyst boost:** `min(N_catalysts × 0.02, 0.1)` added to projected strength.
**Projected confidence:**
```
C_projected = C_base × 0.8 + min(S_macro × 0.15, 0.1)
```
**Divergence detection:** Flagged when projected direction ≠ current trend direction.
---
## 6. Data Quality Suppression
**Source:** `services/recommendation/suppression.py`
### 6.1 Data Quality Score
```
Q = 0.4 × Q_confidence + 0.3 × Q_freshness + 0.3 × Q_coverage
```
| Component | Formula |
|---|---|
| Q_confidence | min(C_avg_extraction / 0.8, 1.0) |
| Q_freshness | max(0, 1 − t_newest_hours / 168) |
| Q_coverage | (N_valid / N_total) × min(N_valid / 10, 1.0) |
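A sketch of the score under the component definitions above. The function name is hypothetical, and returning zero coverage when `n_total` is 0 is an assumption.

```python
def data_quality_score(
    avg_extraction_conf: float, newest_age_hours: float, n_valid: int, n_total: int
) -> float:
    """Illustrative Q: weighted blend of confidence, freshness, and coverage."""
    q_confidence = min(avg_extraction_conf / 0.8, 1.0)
    q_freshness = max(0.0, 1.0 - newest_age_hours / 168)
    if n_total == 0:
        q_coverage = 0.0  # assumption: no documents means no coverage
    else:
        q_coverage = (n_valid / n_total) * min(n_valid / 10, 1.0)
    return 0.4 * q_confidence + 0.3 * q_freshness + 0.3 * q_coverage


# Average confidence 0.4, newest evidence 84 hours old, 5 of 10 documents valid.
print(round(data_quality_score(0.4, 84, 5, 10), 4))
```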
**Suppression triggers** (any one → informational only):
| Check | Threshold |
|---|---|
| Avg extraction confidence | < 0.40 |
| Evidence staleness | > 168 hours (7 days) |
| Source type diversity | < 1 distinct type |
| Extraction failure rate | > 50% |
| Valid document count | < 2 |
| Data quality score | < 0.30 |
### 6.2 Safety Suppression
- **Macro-only:** If trend driven solely by macro signals with zero company evidence → forced informational
- **Pattern-only:** If trend driven solely by pattern/competitive signals with no company or macro support → forced informational
---
## 7. Recommendation Eligibility
**Source:** `services/recommendation/eligibility.py`
### 7.1 Gate Checks (all must pass)
| Check | Threshold |
|---|---|
| Confidence | ≥ 0.35 |
| Trend strength | ≥ 0.10 |
| Contradiction score | ≤ 0.60 |
| Evidence count | ≥ 2 |
| Direction | ≠ neutral |
### 7.2 Action Mapping
| Condition | Action |
|---|---|
| Bullish AND strength ≥ 0.25 | BUY |
| Bearish AND strength ≥ 0.25 | SELL |
| Directional AND confidence ≥ 0.50 | HOLD |
| Mixed or weak | WATCH |
### 7.3 Mode Escalation
| Mode | Requirements |
|---|---|
| live_eligible | confidence ≥ 0.70, contradiction ≤ 0.25, evidence ≥ 5 |
| paper_eligible | confidence ≥ 0.50 |
| informational | everything else (WATCH/HOLD always informational) |
### 7.4 Position Sizing
```
portfolio_pct = base + C_factor × S_factor × range × P_contradiction × P_evidence
```
| Component | Formula | Default |
|---|---|---|
| base | base_portfolio_pct | 0.01 (1%) |
| range | max_portfolio_pct − base_portfolio_pct | 0.09 (9%) |
| C_factor | confidence_sizing_weight × confidence | 0.8 × confidence |
| S_factor | 0.5 + 0.5 × trend_strength | [0.5, 1.0] |
| P_contradiction | 1 − (contradiction_penalty × contradiction_score) | penalty = 0.5 |
| P_evidence | 0.50 if evidence < 3; 0.75 if evidence < 5; 1.0 otherwise | |
Clamped to [base × 0.5, max_portfolio_pct].
**Max loss percentage** uses the same structure with base = 0.003 (0.3%) and max = 0.02 (2%).
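The portfolio percentage calculation can be sketched with the defaults from the table. The function name is hypothetical; the keyword arguments reuse the configuration names listed above.

```python
def portfolio_pct(
    confidence: float,
    trend_strength: float,
    contradiction_score: float,
    evidence_count: int,
    base_portfolio_pct: float = 0.01,
    max_portfolio_pct: float = 0.10,
    confidence_sizing_weight: float = 0.8,
    contradiction_penalty: float = 0.5,
) -> float:
    """Illustrative recommendation-level position size as a portfolio fraction."""
    size_range = max_portfolio_pct - base_portfolio_pct
    c_factor = confidence_sizing_weight * confidence
    s_factor = 0.5 + 0.5 * trend_strength
    p_contradiction = 1.0 - contradiction_penalty * contradiction_score
    if evidence_count < 3:
        p_evidence = 0.50
    elif evidence_count < 5:
        p_evidence = 0.75
    else:
        p_evidence = 1.0
    raw = base_portfolio_pct + c_factor * s_factor * size_range * p_contradiction * p_evidence
    # Clamp to [base * 0.5, max_portfolio_pct].
    return max(base_portfolio_pct * 0.5, min(raw, max_portfolio_pct))


# Best case: full confidence and strength, no contradiction, 5+ documents.
print(round(portfolio_pct(1.0, 1.0, 0.0, 5), 4))
```

With the default sizing weight of 0.8, even a perfect signal tops out at 8.2% rather than the 10% maximum.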
---
## 8. Trading Engine — Position Sizing
**Source:** `services/trading/position_sizer.py`
### 8.1 Base Allocation
```
raw_pct = (max_position_pct × 0.5) × (confidence / min_confidence) × multiplier
clamped_pct = min(raw_pct, max_position_pct)
dollar_amount = min(active_pool × clamped_pct, absolute_position_cap)
```
### 8.2 Correlation Reduction
```
ρ_avg = Σ(ρ_i × w_i) / Σ(w_i) for existing positions
```
| ρ_avg | Action |
|---|---|
| > 0.8 | Reject order |
| 0.5 < ρ_avg ≤ 0.8 | Reduce: factor = 1 − (ρ_avg − 0.5) / 0.3 |
| ≤ 0.5 | No reduction |
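A sketch of the three correlation bands. Returning `None` to signal rejection and treating an empty portfolio as "no reduction" are assumptions made for this example; the function name is hypothetical.

```python
from typing import Optional


def correlation_factor(correlations: list[float], weights: list[float]) -> Optional[float]:
    """Illustrative sizing factor from weighted average correlation.

    Returns None when the order should be rejected.
    """
    total_weight = sum(weights)
    if total_weight == 0:
        return 1.0  # assumption: no existing positions, nothing to correlate with
    rho_avg = sum(r * w for r, w in zip(correlations, weights)) / total_weight
    if rho_avg > 0.8:
        return None
    if rho_avg > 0.5:
        return 1.0 - (rho_avg - 0.5) / 0.3
    return 1.0


print(correlation_factor([0.9], [1.0]))             # None (rejected)
print(round(correlation_factor([0.65], [1.0]), 4))  # halfway through the reduction band
print(correlation_factor([0.2, 0.4], [1.0, 1.0]))   # 1.0 (no reduction)
```

Note that the linear reduction reaches a factor of 0 at ρ_avg = 0.8, so orders just below the rejection threshold are already sized close to nothing.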
### 8.3 Sector Exposure Reduction
```
available = max(max_sector_pct × active_pool − current_sector_exposure, 0)
dollar_amount = min(dollar_amount, available)
```
### 8.4 Diversification Bonus
If < 3 sectors held AND entering a new sector: dollar_amount × 1.2 (capped at max_position_pct).
### 8.5 Earnings Proximity
| Days to earnings | Action |
|---|---|
| ≤ 1 | Reject |
| 1–3 | 50% reduction |
| > 3 | No adjustment |
### 8.6 Portfolio Heat Check
```
heat_new = dollar_amount × atr_multiplier × 0.02
heat_max = max_portfolio_heat × active_pool
Reject if: heat_current + heat_new > heat_max
```
### 8.7 Share Rounding
```
shares = floor(dollar_amount / current_price)
final_dollar = shares × current_price
```
Reject if shares = 0.
---
## 9. Stop-Loss and Take-Profit
**Source:** `services/trading/stop_loss_manager.py`
### 9.1 Initial Levels
```
stop_distance = ATR × M_atr
stop_loss = entry_price − stop_distance
take_profit = entry_price + stop_distance × R_reward_risk
```
| Trade type | M_atr | R_reward_risk |
|---|---|---|
| Standard | risk_tier.stop_loss_atr_multiplier | risk_tier.reward_risk_ratio |
| Micro-trade | 1.0 | 1.5 |
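For a long position, the initial levels work out as in this sketch. The function name is hypothetical, and the example uses the micro-trade values from the table.

```python
def initial_levels(
    entry_price: float, atr: float, m_atr: float, reward_risk: float
) -> tuple[float, float]:
    """Illustrative (stop_loss, take_profit) for a long entry."""
    stop_distance = atr * m_atr
    stop_loss = entry_price - stop_distance
    take_profit = entry_price + stop_distance * reward_risk
    return stop_loss, take_profit


# Micro-trade: entry at 100 with ATR 2.0, multiplier 1.0, reward/risk 1.5.
print(initial_levels(100.0, 2.0, 1.0, 1.5))  # (98.0, 103.0)
```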
### 9.2 Dynamic Tightening
| Condition | Effective multiplier |
|---|---|
| High-severity macro event | base × 0.5 |
| Earnings within 3 days | base × 0.7 |
| Portfolio heat > 80% of max | base × 0.7 |
| Normal | base |
### 9.3 Trailing Stop Activation
Activates when:
```
favorable_move = current_price − entry_price > 0.5 × (take_profit − entry_price)
```
Once active, stop-loss floor = entry_price (breakeven).
---
## 10. Risk Management
### 10.1 Position Limits
**Source:** `services/risk/engine.py`
| Limit | Default | Formula |
|---|---|---|
| Max position % | 5% | position_value / portfolio_value ≤ 0.05 |
| Max position value | $10,000 | existing + new ≤ $10,000 |
| Max shares/order | 1,000 | quantity ≤ 1,000 |
| Max sector % | 25% | sector_value / portfolio_value ≤ 0.25 |
| Max daily loss % | 2% | \|daily_pnl\| / portfolio_value ≤ 0.02 |
| Max daily loss $ | $1,000 | \|daily_pnl\| ≤ $1,000 |
| Max daily trades | 20 | trade_count < 20 |
### 10.2 Order Clamping
**Source:** `services/risk/engine.py`, `clamp_order_to_position_limits()`
When a buy order exceeds position limits, instead of rejecting:
```
max_allowed_value = min(
    max_position_value − existing_value,
    max_position_pct × portfolio_value − existing_value
)
clamped_shares = min( floor(max_allowed_value / price_per_share), max_shares_per_order )
```
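The clamping arithmetic can be sketched as below, using the defaults from §10.1. This is not the body of `clamp_order_to_position_limits()`; the helper name is hypothetical, and returning 0 when no headroom remains is an assumption.

```python
import math


def clamp_buy_shares(
    price_per_share: float,
    existing_value: float,
    portfolio_value: float,
    max_position_value: float = 10_000.0,
    max_position_pct: float = 0.05,
    max_shares_per_order: int = 1_000,
) -> int:
    """Illustrative maximum share count that still respects position limits."""
    max_allowed_value = min(
        max_position_value - existing_value,
        max_position_pct * portfolio_value - existing_value,
    )
    if max_allowed_value <= 0:
        return 0  # assumption: no headroom left means nothing can be bought
    return min(math.floor(max_allowed_value / price_per_share), max_shares_per_order)


# $2,000 already held in a $100,000 portfolio; the 5% cap leaves $3,000 of headroom.
print(clamp_buy_shares(70.0, 2_000.0, 100_000.0))  # 42
```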
### 10.3 News Shock Lockout
Trigger: impact_score ≥ 0.80 for catalyst ∈ {earnings, legal, m_and_a}
Duration: 60 minutes (configurable)
### 10.4 Symbol Cooldown
Duration: 15 minutes between trades on same symbol.
Max concurrent positions per symbol: 1.
---
## 11. Circuit Breaker
**Source:** `services/trading/circuit_breaker.py`
| Trigger | Condition | Cooldown |
|---|---|---|
| Daily loss | \|daily_pnl\| / portfolio_value > 0.05 | 2 hours |
| Single position | position_loss_pct > 0.15 | 48 hours |
| Volatility | ≥ 3 stop-losses within 30-minute window | 2 hours |
---
## 12. Risk Tier Auto-Adjustment
**Source:** `services/trading/risk_tier_controller.py`
Tiers: conservative → moderate → aggressive
**Downgrade** (any one triggers, drops one level):
- 30-day win rate < 40%
- Current drawdown > 15%
**Upgrade** (all must be true, raises one level):
- 30-day win rate > 55%
- Reserve pool > 20% of portfolio
- Current drawdown < 5%
---
## 13. Portfolio Rebalancing
**Source:** `services/trading/rebalancer.py`
### 13.1 Single-Stock Rebalancing
```
excess = market_value − max_position_pct × active_pool
sell_qty = min( floor(excess / current_price), position_quantity )
```
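A sketch of the sell quantity for one oversized position. The function name is hypothetical, and returning 0 when the position is within its limit is an assumption.

```python
import math


def single_stock_sell_qty(
    market_value: float,
    position_quantity: int,
    current_price: float,
    active_pool: float,
    max_position_pct: float,
) -> int:
    """Illustrative number of shares to sell to bring a position back under its cap."""
    excess = market_value - max_position_pct * active_pool
    if excess <= 0:
        return 0  # assumption: positions within the limit are left alone
    return min(math.floor(excess / current_price), position_quantity)


# 125 shares at $64 ($8,000) against a 5% cap on a $100,000 pool: $3,000 excess.
print(single_stock_sell_qty(8_000.0, 125, 64.0, 100_000.0, 0.05))  # 46
```

Because the share count is floored, the position can remain slightly above the cap after the sale, by less than the price of one share.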
### 13.2 Sector Rebalancing
```
sector_excess = Σ(market_value_i) − max_sector_pct × active_pool
```
Sell from lowest-confidence positions first until excess is covered.
### 13.3 Max Positions Enforcement
```
excess_count = N_positions − max_positions
```
Sell entire lowest-confidence positions until count is within limit.
---
## Constants Summary
| Constant | Value | Location |
|---|---|---|
| Confidence gate floor | 0.20 | scoring.py |
| Min recency weight | 0.01 | scoring.py |
| Credibility floor/ceiling | 0.10 / 1.0 | scoring.py |
| Novelty bonus max | 0.25 (25%) | scoring.py |
| Volatility boost threshold | 1.0 price units | scoring.py |
| Volatility boost max | 0.30 (30%) | scoring.py |
| Volume surge threshold | 50% | scoring.py |
| Volume surge boost | 0.15 (15%) | scoring.py |
| Bullish/bearish threshold | ±0.15 | worker.py |
| Mixed threshold | contradiction > 0.10, \|S\| < 0.30 | worker.py |
| Macro signal weight | 0.30 | config.py |
| Competitive signal weight | 0.20 | config.py |
| Macro confidence threshold | 0.40 | interpolation.py |
| Staleness accelerated decay | 0.50× | interpolation.py |
| Short-term staleness hours | 48 | interpolation.py |
| Pattern min samples | configurable | pattern_matcher.py |
| Major decision weight multiplier | 1.3× | pattern_matcher.py |
| Routine lookback | 180 days | pattern_matcher.py |
| Major decision lookback | 365 days | pattern_matcher.py |
| Propagation strength threshold | 0.20 | signal_propagation.py |
| Data quality min score | 0.30 | suppression.py |
| Evidence staleness max | 168 hours (7 days) | suppression.py |
| Recommendation min confidence | 0.35 | eligibility.py |
| Recommendation min strength | 0.10 | eligibility.py |
| Action strength threshold | 0.25 | eligibility.py |
| Live confidence threshold | 0.70 | eligibility.py |
| Paper confidence threshold | 0.50 | eligibility.py |
| Base portfolio allocation | 1% | eligibility.py |
| Max portfolio allocation | 10% | eligibility.py |
| Circuit breaker daily loss | 5% | circuit_breaker.py |
| Circuit breaker single position | 15% | circuit_breaker.py |
| Stop-loss cluster threshold | 3 hits / 30 min | circuit_breaker.py |
| Tier downgrade win rate | < 40% | risk_tier_controller.py |
| Tier upgrade win rate | > 55% | risk_tier_controller.py |
| Tier upgrade max drawdown | < 5% | risk_tier_controller.py |
| Tier upgrade min reserve | > 20% | risk_tier_controller.py |