feat: trading feedback engine — periodic performance reports with AI summarization

- Migration 038: trading_reports table + report-summarizer agent seed
- 6 reporting modules: models, collector, sections, validator, summarizer, generator
- API endpoints: GET /api/reports (paginated, filterable), GET /api/reports/{id}
- Frontend hooks: useReports, useReport with TanStack Query
- Scheduler: daily (after 16:30 ET) and weekly (Saturday) report triggers
- Redis queue consumer for async report generation with retry/dedup
- 5 property-based tests (chunking, serialization, validation, accuracy, deltas)
- 109 unit/integration tests across all modules
- 6 frontend hook tests with MSW mocks
Celes Renata
2026-05-01 22:13:09 +00:00
parent 376fcb4bb4
commit bc077bfcc8
28 changed files with 6771 additions and 1 deletions
+110
@@ -0,0 +1,110 @@
# Feature: trading-feedback-engine, Property 1: Chunking round-trip and size constraint
"""Property-based tests for report data chunking.

Feature: trading-feedback-engine

Tests the chunking round-trip and size constraint property from the design
specification: for any input string, splitting it into chunks with a maximum
size limit produces chunks where (a) every chunk is ≤ the size limit in
characters (for chunks that don't contain a single oversized line), (b) no
chunk is empty (except when the input itself is empty, which produces exactly
one empty chunk), and (c) concatenating all chunks in order reconstructs the
original input string.
"""

from __future__ import annotations

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.summarizer import chunk_data

# ---------------------------------------------------------------------------
# Property 1: Chunking Round-Trip and Size Constraint
# Validates: Requirements 2.2
# ---------------------------------------------------------------------------


@given(
    text=st.text(),
    max_chars=st.integers(min_value=1, max_value=10000),
)
@settings(max_examples=100)
def test_chunk_data_round_trip(text: str, max_chars: int) -> None:
    """**Validates: Requirements 2.2**

    For any input string and any max_chars ≥ 1, concatenating all chunks
    produced by chunk_data SHALL reconstruct the original input string
    exactly (round-trip property).
    """
    chunks = chunk_data(text, max_chars)
    reconstructed = "".join(chunks)
    assert reconstructed == text, (
        f"Round-trip failed: concatenation of {len(chunks)} chunks does not "
        f"equal original input.\n"
        f" original length: {len(text)}\n"
        f" reconstructed length: {len(reconstructed)}\n"
        f" max_chars: {max_chars}"
    )


@given(
    text=st.text(),
    max_chars=st.integers(min_value=1, max_value=10000),
)
@settings(max_examples=100)
def test_chunk_data_no_empty_chunks(text: str, max_chars: int) -> None:
    """**Validates: Requirements 2.2**

    For any input string and any max_chars ≥ 1, chunk_data SHALL produce
    no empty chunks, except when the input itself is empty, in which case
    it SHALL produce exactly one empty chunk.
    """
    chunks = chunk_data(text, max_chars)
    if text == "":
        assert chunks == [""], (
            f"Empty input should produce exactly [''], got {chunks!r}"
        )
    else:
        for i, chunk in enumerate(chunks):
            assert chunk != "", (
                f"Chunk {i} is empty for non-empty input.\n"
                f" input length: {len(text)}\n"
                f" max_chars: {max_chars}\n"
                f" total chunks: {len(chunks)}"
            )


@given(
    text=st.text(),
    max_chars=st.integers(min_value=1, max_value=10000),
)
@settings(max_examples=100)
def test_chunk_data_size_constraint(text: str, max_chars: int) -> None:
    """**Validates: Requirements 2.2**

    For any input string and any max_chars ≥ 1, every chunk produced by
    chunk_data SHALL be ≤ max_chars in length, UNLESS the chunk contains
    a single line that by itself exceeds max_chars (since chunk_data never
    breaks mid-line, such a line is emitted as its own chunk).

    A chunk is considered "oversized due to a single long line" when it
    consists of exactly one segment (a line with its trailing newline, or
    the final line without one) whose length exceeds max_chars.
    """
    chunks = chunk_data(text, max_chars)
    for i, chunk in enumerate(chunks):
        if len(chunk) > max_chars:
            # This chunk exceeds the limit. It must be because it contains
            # a single line that is itself longer than max_chars.
            # A single-segment chunk has at most one newline (at the end).
            lines_in_chunk = chunk.split("\n")
            # If the chunk ends with \n, split produces a trailing empty string
            non_empty_lines = [ln for ln in lines_in_chunk if ln]
            assert len(non_empty_lines) <= 1, (
                f"Chunk {i} exceeds max_chars={max_chars} "
                f"(len={len(chunk)}) but contains multiple non-empty lines, "
                f"which should not happen.\n"
                f" lines: {non_empty_lines!r}"
            )
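
Reviewer note: the three properties tested above (round-trip, no empty chunks, size limit except for a single oversized line) can be met by a simple line-preserving chunker. The sketch below is only an illustration under those stated properties; `chunk_data_sketch` is a hypothetical name and this is not the `chunk_data` implementation in `services.reporting.summarizer`, which is not shown in this diff.

```python
from __future__ import annotations


def chunk_data_sketch(text: str, max_chars: int) -> list[str]:
    """Hypothetical chunker: split on "\n" only, never break inside a line."""
    if max_chars < 1:
        raise ValueError("max_chars must be >= 1")
    if text == "":
        # Empty input yields exactly one empty chunk.
        return [""]

    # Build segments that each keep their trailing newline, so that
    # "".join(segments) == text. str.split("\n") is used instead of
    # splitlines() because splitlines() also splits on other separators.
    parts = text.split("\n")
    segments = [part + "\n" for part in parts[:-1]]
    if parts[-1]:
        segments.append(parts[-1])

    chunks: list[str] = []
    current = ""
    for segment in segments:
        # Flush the current chunk before it would exceed the limit.
        if current and len(current) + len(segment) > max_chars:
            chunks.append(current)
            current = ""
        # A segment longer than max_chars ends up alone in its chunk.
        current += segment
    if current:
        chunks.append(current)
    return chunks
```

With this approach a chunk can exceed `max_chars` only when it consists of a single line longer than the limit, which is the exception the size-constraint test allows.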
+423
@@ -0,0 +1,423 @@
# Feature: trading-feedback-engine, Property 4: Recommendation accuracy aggregation
# Feature: trading-feedback-engine, Property 5: Portfolio period-over-period delta computation
"""Property-based tests for report section builders.

Feature: trading-feedback-engine

Property 4 tests the recommendation accuracy aggregation property from the
design specification: for any non-empty list of trading decisions with
associated prediction outcomes, the computed acted_win_rate SHALL equal the
count of profitable outcomes divided by total acted outcomes with prediction
data, and all rate values SHALL be in [0.0, 1.0].

Property 5 tests the portfolio period-over-period delta computation property
from the design specification: for any two valid portfolio snapshots (current
and previous), the period-over-period deltas SHALL equal (current - previous)
for each field. When no previous snapshot exists, the deltas SHALL be zero.
"""

from __future__ import annotations

import uuid

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.collector import CollectedData
from services.reporting.sections import (
    build_pnl_section,
    build_recommendation_accuracy_section,
)

# ---------------------------------------------------------------------------
# Property 4: Recommendation Accuracy Aggregation
# Validates: Requirements 1.4
# ---------------------------------------------------------------------------

# Strategy: generate a list of unique tickers, then build matching
# trading_decisions, recommendations, and prediction_outcomes.
_ticker_strategy = st.text(
    alphabet=st.characters(whitelist_categories=("Lu",)),
    min_size=1,
    max_size=5,
)
_confidence_strategy = st.floats(
    min_value=0.0, max_value=1.0, allow_nan=False, allow_infinity=False,
)
_excess_return_strategy = st.floats(
    min_value=-1.0, max_value=1.0, allow_nan=False, allow_infinity=False,
)


@st.composite
def recommendation_accuracy_data(draw: st.DrawFn) -> tuple[CollectedData, dict]:
    """Generate CollectedData with matching trading decisions, recommendations,
    and prediction outcomes for testing recommendation accuracy.

    Returns (CollectedData, expected_values) where expected_values contains
    the independently computed expected results.
    """
    # Generate 1-20 trading decisions with unique tickers
    n = draw(st.integers(min_value=1, max_value=20))
    tickers = [draw(_ticker_strategy) for _ in range(n)]
    # Ensure unique tickers by appending index
    tickers = [f"{t}{i}" for i, t in enumerate(tickers)]

    decisions = draw(
        st.lists(
            st.sampled_from(["act", "skip"]),
            min_size=n,
            max_size=n,
        )
    )
    confidences = draw(
        st.lists(
            _confidence_strategy,
            min_size=n,
            max_size=n,
        )
    )
    profitable_flags = draw(
        st.lists(
            st.booleans(),
            min_size=n,
            max_size=n,
        )
    )
    direction_correct_flags = draw(
        st.lists(
            st.booleans(),
            min_size=n,
            max_size=n,
        )
    )
    excess_returns = draw(
        st.lists(
            _excess_return_strategy,
            min_size=n,
            max_size=n,
        )
    )

    trading_decisions = []
    recommendations = []
    prediction_outcomes = []

    # Track expected values
    exp_act_count = 0
    exp_skip_count = 0
    exp_acted_wins = 0
    exp_acted_with_outcome = 0
    exp_confidence_acted: list[float] = []
    exp_confidence_skipped: list[float] = []

    for i in range(n):
        rec_id = str(uuid.uuid4())
        ticker = tickers[i]
        decision = decisions[i]
        confidence = confidences[i]
        profitable = profitable_flags[i]
        direction_correct = direction_correct_flags[i]
        excess_return = excess_returns[i]

        trading_decisions.append(
            {
                "id": str(uuid.uuid4()),
                "recommendation_id": rec_id,
                "decision": decision,
                "ticker": ticker,
            }
        )
        recommendations.append(
            {
                "id": rec_id,
                "confidence": confidence,
            }
        )
        prediction_outcomes.append(
            {
                "ticker": ticker,
                "profitable": profitable,
                "direction_correct": direction_correct,
                "excess_return_vs_spy": excess_return,
            }
        )

        if decision == "act":
            exp_act_count += 1
            exp_confidence_acted.append(confidence)
            # Every acted decision has a matching prediction outcome by ticker
            exp_acted_with_outcome += 1
            if profitable:
                exp_acted_wins += 1
        else:
            exp_skip_count += 1
            exp_confidence_skipped.append(confidence)

    data = CollectedData(
        trading_decisions=trading_decisions,
        recommendations=recommendations,
        prediction_outcomes=prediction_outcomes,
    )

    exp_acted_win_rate = (
        (exp_acted_wins / exp_acted_with_outcome)
        if exp_acted_with_outcome > 0
        else 0.0
    )
    exp_avg_confidence_acted = (
        (sum(exp_confidence_acted) / len(exp_confidence_acted))
        if exp_confidence_acted
        else 0.0
    )
    exp_avg_confidence_skipped = (
        (sum(exp_confidence_skipped) / len(exp_confidence_skipped))
        if exp_confidence_skipped
        else 0.0
    )

    expected = {
        "total_evaluated": exp_act_count + exp_skip_count,
        "act_count": exp_act_count,
        "skip_count": exp_skip_count,
        "acted_win_rate": exp_acted_win_rate,
        "avg_confidence_acted": exp_avg_confidence_acted,
        "avg_confidence_skipped": exp_avg_confidence_skipped,
    }
    return data, expected


@given(data_and_expected=recommendation_accuracy_data())
@settings(max_examples=100)
def test_recommendation_accuracy_aggregation(
    data_and_expected: tuple[CollectedData, dict],
) -> None:
    """**Validates: Requirements 1.4**

    For any non-empty list of trading decisions with associated prediction
    outcomes, the computed acted_win_rate SHALL equal the count of profitable
    outcomes divided by total acted outcomes with prediction data, act/skip
    counts SHALL match, average confidence values SHALL match, and all rate
    values SHALL be in [0.0, 1.0].
    """
    data, expected = data_and_expected
    section = build_recommendation_accuracy_section(data)

    # Verify act/skip counts
    assert section.total_evaluated == expected["total_evaluated"], (
        f"total_evaluated mismatch: got {section.total_evaluated}, "
        f"expected {expected['total_evaluated']}"
    )
    assert section.act_count == expected["act_count"], (
        f"act_count mismatch: got {section.act_count}, "
        f"expected {expected['act_count']}"
    )
    assert section.skip_count == expected["skip_count"], (
        f"skip_count mismatch: got {section.skip_count}, "
        f"expected {expected['skip_count']}"
    )

    # Verify acted win rate
    assert abs(section.acted_win_rate - expected["acted_win_rate"]) < 1e-9, (
        f"acted_win_rate mismatch: got {section.acted_win_rate}, "
        f"expected {expected['acted_win_rate']}"
    )

    # Verify average confidence values
    assert abs(section.avg_confidence_acted - expected["avg_confidence_acted"]) < 1e-9, (
        f"avg_confidence_acted mismatch: got {section.avg_confidence_acted}, "
        f"expected {expected['avg_confidence_acted']}"
    )
    assert abs(section.avg_confidence_skipped - expected["avg_confidence_skipped"]) < 1e-9, (
        f"avg_confidence_skipped mismatch: got {section.avg_confidence_skipped}, "
        f"expected {expected['avg_confidence_skipped']}"
    )

    # All rate values must be in [0.0, 1.0]
    assert 0.0 <= section.acted_win_rate <= 1.0, (
        f"acted_win_rate out of range: {section.acted_win_rate}"
    )
    assert 0.0 <= section.avg_confidence_acted <= 1.0, (
        f"avg_confidence_acted out of range: {section.avg_confidence_acted}"
    )
    assert 0.0 <= section.avg_confidence_skipped <= 1.0, (
        f"avg_confidence_skipped out of range: {section.avg_confidence_skipped}"
    )


# ---------------------------------------------------------------------------
# Property 5: Portfolio Period-Over-Period Delta Computation
# Validates: Requirements 1.3
# ---------------------------------------------------------------------------

_non_negative_float = st.floats(
    min_value=0.0, max_value=1e8, allow_nan=False, allow_infinity=False,
)
_finite_float = st.floats(
    min_value=-1e6, max_value=1e6, allow_nan=False, allow_infinity=False,
)


@st.composite
def portfolio_snapshot_pair(draw: st.DrawFn) -> tuple[dict, dict]:
    """Generate a pair of portfolio snapshots (current, previous) with
    non-negative portfolio_value, active_pool, reserve_pool, and finite
    cumulative_return.
    """
    current = {
        "portfolio_value": draw(_non_negative_float),
        "active_pool": draw(_non_negative_float),
        "reserve_pool": draw(_non_negative_float),
        "cumulative_return": draw(_finite_float),
        "realized_pnl": draw(_finite_float),
        "unrealized_pnl": draw(_finite_float),
        "daily_return": draw(_finite_float),
        "win_count": draw(st.integers(min_value=0, max_value=10000)),
        "loss_count": draw(st.integers(min_value=0, max_value=10000)),
        "win_rate": draw(
            st.floats(
                min_value=0.0, max_value=1.0,
                allow_nan=False, allow_infinity=False,
            )
        ),
        "sharpe_ratio": draw(_finite_float),
    }
    previous = {
        "portfolio_value": draw(_non_negative_float),
        "active_pool": draw(_non_negative_float),
        "reserve_pool": draw(_non_negative_float),
        "cumulative_return": draw(_finite_float),
        "realized_pnl": draw(_finite_float),
        "unrealized_pnl": draw(_finite_float),
        "daily_return": draw(_finite_float),
        "win_count": draw(st.integers(min_value=0, max_value=10000)),
        "loss_count": draw(st.integers(min_value=0, max_value=10000)),
        "win_rate": draw(
            st.floats(
                min_value=0.0, max_value=1.0,
                allow_nan=False, allow_infinity=False,
            )
        ),
        "sharpe_ratio": draw(_finite_float),
    }
    return current, previous


@given(snapshots=portfolio_snapshot_pair())
@settings(max_examples=100)
def test_portfolio_delta_with_both_snapshots(
    snapshots: tuple[dict, dict],
) -> None:
    """**Validates: Requirements 1.3**

    For any two valid portfolio snapshots (current and previous), the
    period-over-period deltas SHALL equal (current - previous) for
    portfolio_value, active_pool, reserve_pool, and cumulative_return.

    The build_pnl_section extracts values from the current snapshot.
    We verify that the delta between the current and previous section
    outputs matches (current - previous) for each field.
    """
    current_snap, previous_snap = snapshots

    # Build sections from current and previous snapshots
    data_current = CollectedData(portfolio_snapshot=current_snap)
    data_previous = CollectedData(portfolio_snapshot=previous_snap)
    section_current = build_pnl_section(data_current)
    section_previous = build_pnl_section(data_previous)

    # Verify deltas: current section values - previous section values
    # should equal current snapshot values - previous snapshot values
    delta_cumulative = section_current.cumulative_return - section_previous.cumulative_return
    expected_delta_cumulative = (
        float(current_snap["cumulative_return"])
        - float(previous_snap["cumulative_return"])
    )
    assert abs(delta_cumulative - expected_delta_cumulative) < 1e-9, (
        f"cumulative_return delta mismatch: "
        f"got {delta_cumulative}, expected {expected_delta_cumulative}"
    )

    delta_realized = section_current.realized_pnl - section_previous.realized_pnl
    expected_delta_realized = (
        float(current_snap["realized_pnl"])
        - float(previous_snap["realized_pnl"])
    )
    assert abs(delta_realized - expected_delta_realized) < 1e-9, (
        f"realized_pnl delta mismatch: "
        f"got {delta_realized}, expected {expected_delta_realized}"
    )

    delta_unrealized = section_current.unrealized_pnl - section_previous.unrealized_pnl
    expected_delta_unrealized = (
        float(current_snap["unrealized_pnl"])
        - float(previous_snap["unrealized_pnl"])
    )
    assert abs(delta_unrealized - expected_delta_unrealized) < 1e-9, (
        f"unrealized_pnl delta mismatch: "
        f"got {delta_unrealized}, expected {expected_delta_unrealized}"
    )

    # Verify that section values faithfully reflect snapshot values
    assert abs(section_current.cumulative_return - float(current_snap["cumulative_return"])) < 1e-9
    assert abs(section_current.realized_pnl - float(current_snap["realized_pnl"])) < 1e-9
    assert abs(section_current.unrealized_pnl - float(current_snap["unrealized_pnl"])) < 1e-9
    assert abs(section_current.daily_return - float(current_snap["daily_return"])) < 1e-9
    assert abs(section_current.win_rate - float(current_snap["win_rate"])) < 1e-9


@given(
    portfolio_value=_non_negative_float,
    active_pool=_non_negative_float,
    reserve_pool=_non_negative_float,
    cumulative_return=_finite_float,
)
@settings(max_examples=100)
def test_portfolio_delta_no_previous_snapshot(
    portfolio_value: float,
    active_pool: float,
    reserve_pool: float,
    cumulative_return: float,
) -> None:
    """**Validates: Requirements 1.3**

    When no previous snapshot exists, the section SHALL use zero values
    for all fields (since portfolio_snapshot is None), meaning the deltas
    from a zero baseline are effectively zero.
    """
    # When portfolio_snapshot is None, build_pnl_section returns all zeros
    data_no_snapshot = CollectedData(portfolio_snapshot=None)
    section = build_pnl_section(data_no_snapshot)

    assert section.realized_pnl == 0.0, (
        f"Expected 0.0 realized_pnl with no snapshot, got {section.realized_pnl}"
    )
    assert section.unrealized_pnl == 0.0, (
        f"Expected 0.0 unrealized_pnl with no snapshot, got {section.unrealized_pnl}"
    )
    assert section.daily_return == 0.0, (
        f"Expected 0.0 daily_return with no snapshot, got {section.daily_return}"
    )
    assert section.cumulative_return == 0.0, (
        f"Expected 0.0 cumulative_return with no snapshot, got {section.cumulative_return}"
    )
    assert section.win_count == 0, (
        f"Expected 0 win_count with no snapshot, got {section.win_count}"
    )
    assert section.loss_count == 0, (
        f"Expected 0 loss_count with no snapshot, got {section.loss_count}"
    )
    assert section.win_rate == 0.0, (
        f"Expected 0.0 win_rate with no snapshot, got {section.win_rate}"
    )
    assert section.sharpe_ratio == 0.0, (
        f"Expected 0.0 sharpe_ratio with no snapshot, got {section.sharpe_ratio}"
    )
    assert section.profit_factor == 0.0, (
        f"Expected 0.0 profit_factor with no snapshot, got {section.profit_factor}"
    )
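
Reviewer note: Property 4 defines `acted_win_rate` as profitable acted outcomes divided by acted outcomes that have prediction data. A minimal sketch of that aggregation follows. It assumes decisions are joined to outcomes by ticker, as the test generator's comment suggests; `acted_win_rate_sketch` is a hypothetical helper and not the code in `build_recommendation_accuracy_section`.

```python
from __future__ import annotations


def acted_win_rate_sketch(decisions: list[dict], outcomes: list[dict]) -> float:
    """Hypothetical aggregation: wins / acted decisions that have an outcome."""
    # Assumption: one outcome per ticker, matched to decisions by ticker.
    outcome_by_ticker = {o["ticker"]: o for o in outcomes}
    acted = [
        d for d in decisions
        if d["decision"] == "act" and d["ticker"] in outcome_by_ticker
    ]
    if not acted:
        # No acted decision with prediction data: report 0.0 rather than divide by zero.
        return 0.0
    wins = sum(1 for d in acted if outcome_by_ticker[d["ticker"]]["profitable"])
    return wins / len(acted)


decisions = [
    {"decision": "act", "ticker": "AAA0"},
    {"decision": "act", "ticker": "BBB1"},
    {"decision": "skip", "ticker": "CCC2"},
]
outcomes = [
    {"ticker": "AAA0", "profitable": True},
    {"ticker": "BBB1", "profitable": False},
    {"ticker": "CCC2", "profitable": True},
]
rate = acted_win_rate_sketch(decisions, outcomes)
```

Because the numerator counts a subset of the denominator, the result always lies in [0.0, 1.0], which is the range check the property test applies.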
+245
@@ -0,0 +1,245 @@
# Feature: trading-feedback-engine, Property 2: Report serialization round-trip
"""Property-based tests for report serialization round-trip.

Feature: trading-feedback-engine

Tests the report serialization round-trip property from the design
specification: for any valid ReportData object (with valid P&L,
recommendation accuracy, position performance, risk metrics, and model
quality sections), serializing to JSON and then deserializing back SHALL
produce a ReportData object equivalent to the original. All datetime fields
in the serialized JSON SHALL be in ISO 8601 format.
"""

from __future__ import annotations

import json
import re
from datetime import date, datetime, timezone

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.models import (
    ModelQualitySection,
    ModelQualityWindow,
    PLSection,
    PositionDetail,
    PositionPerformanceSection,
    RecommendationAccuracySection,
    ReportData,
    ReportType,
    RiskMetricsSection,
    ValidationStatus,
    ValidationWarning,
)

# ---------------------------------------------------------------------------
# Property 2: Report Serialization Round-Trip
# Validates: Requirements 8.1, 8.2, 8.3, 8.4
# ---------------------------------------------------------------------------

# ISO 8601 patterns: one for full datetimes, one for plain dates
_ISO8601_DATETIME_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"  # YYYY-MM-DDTHH:MM:SS
    r"(?:\.\d+)?"  # optional fractional seconds
    r"(?:Z|[+-]\d{2}:\d{2})?$"  # optional timezone
)
_ISO8601_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# ---------------------------------------------------------------------------
# Hypothesis strategies for each model
# ---------------------------------------------------------------------------

_finite_float = st.floats(allow_nan=False, allow_infinity=False)
_non_negative_finite_float = st.floats(
    min_value=0.0, allow_nan=False, allow_infinity=False,
)
_rate_float = st.floats(
    min_value=0.0, max_value=1.0, allow_nan=False, allow_infinity=False,
)
_optional_finite_float = st.one_of(st.none(), _finite_float)

_validation_warning_strategy = st.builds(
    ValidationWarning,
    field_name=st.text(min_size=1, max_size=50),
    computed_value=_finite_float,
    snapshot_value=_finite_float,
    pct_difference=_non_negative_finite_float,
)

_pnl_section_strategy = st.builds(
    PLSection,
    realized_pnl=_finite_float,
    unrealized_pnl=_finite_float,
    daily_return=_finite_float,
    cumulative_return=_finite_float,
    win_count=st.integers(min_value=0, max_value=10000),
    loss_count=st.integers(min_value=0, max_value=10000),
    win_rate=_rate_float,
    profit_factor=_non_negative_finite_float,
    sharpe_ratio=_finite_float,
    summary=st.text(max_size=200),
    validation_warnings=st.lists(
        _validation_warning_strategy, min_size=0, max_size=3,
    ),
)

_recommendation_accuracy_strategy = st.builds(
    RecommendationAccuracySection,
    total_evaluated=st.integers(min_value=0, max_value=10000),
    act_count=st.integers(min_value=0, max_value=10000),
    skip_count=st.integers(min_value=0, max_value=10000),
    acted_win_rate=_rate_float,
    avg_confidence_acted=_rate_float,
    avg_confidence_skipped=_rate_float,
    summary=st.text(max_size=200),
    validation_warnings=st.lists(
        _validation_warning_strategy, min_size=0, max_size=3,
    ),
)

_position_detail_strategy = st.builds(
    PositionDetail,
    ticker=st.text(min_size=1, max_size=10),
    entry_price=_finite_float,
    current_or_exit_price=_finite_float,
    pnl=_finite_float,
    pnl_pct=_finite_float,
    hold_duration_hours=_non_negative_finite_float,
    status=st.sampled_from(["open", "closed"]),
)

_position_performance_strategy = st.builds(
    PositionPerformanceSection,
    positions=st.lists(_position_detail_strategy, min_size=0, max_size=5),
    summary=st.text(max_size=200),
)

_risk_metrics_strategy = st.builds(
    RiskMetricsSection,
    current_risk_tier=st.sampled_from(["low", "moderate", "high", "critical"]),
    portfolio_heat=_non_negative_finite_float,
    max_drawdown=_non_negative_finite_float,
    current_drawdown_pct=_non_negative_finite_float,
    reserve_pool_balance=_non_negative_finite_float,
    circuit_breaker_event_count=st.integers(min_value=0, max_value=100),
    summary=st.text(max_size=200),
)

_model_quality_window_strategy = st.builds(
    ModelQualityWindow,
    lookback=st.sampled_from(["7d", "30d", "90d"]),
    win_rate=_optional_finite_float,
    directional_accuracy=_optional_finite_float,
    information_coefficient=_optional_finite_float,
    calibration_error=_optional_finite_float,
    brier_score=_optional_finite_float,
)

_model_quality_strategy = st.builds(
    ModelQualitySection,
    windows=st.lists(_model_quality_window_strategy, min_size=0, max_size=3),
    summary=st.text(max_size=200),
    validation_warnings=st.lists(
        _validation_warning_strategy, min_size=0, max_size=3,
    ),
)

# Use timezone-aware datetimes for generated_at
_aware_datetime_strategy = st.datetimes(
    min_value=datetime(2020, 1, 1),
    max_value=datetime(2030, 12, 31),
    timezones=st.just(timezone.utc),
)
_date_strategy = st.dates(
    min_value=date(2020, 1, 1),
    max_value=date(2030, 12, 31),
)

_report_data_strategy = st.builds(
    ReportData,
    pnl=_pnl_section_strategy,
    recommendation_accuracy=_recommendation_accuracy_strategy,
    position_performance=_position_performance_strategy,
    risk_metrics=_risk_metrics_strategy,
    model_quality=_model_quality_strategy,
    executive_summary=st.text(max_size=300),
    validation_status=st.sampled_from(list(ValidationStatus)),
    generated_at=_aware_datetime_strategy,
    period_start=_date_strategy,
    period_end=_date_strategy,
    report_type=st.sampled_from(list(ReportType)),
)

# ---------------------------------------------------------------------------
# Helper: recursively find all datetime-like string values in parsed JSON
# ---------------------------------------------------------------------------

_DATETIME_FIELD_NAMES = {"generated_at"}
_DATE_FIELD_NAMES = {"period_start", "period_end"}


def _collect_datetime_strings(
    obj: object,
    key: str | None = None,
) -> list[tuple[str, str]]:
    """Walk parsed JSON and collect (field_name, value) for datetime fields."""
    results: list[tuple[str, str]] = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            results.extend(_collect_datetime_strings(v, k))
    elif isinstance(obj, list):
        for item in obj:
            results.extend(_collect_datetime_strings(item, key))
    elif isinstance(obj, str) and key is not None:
        if key in _DATETIME_FIELD_NAMES or key in _DATE_FIELD_NAMES:
            results.append((key, obj))
    return results

# ---------------------------------------------------------------------------
# Property tests
# ---------------------------------------------------------------------------


@given(report=_report_data_strategy)
@settings(max_examples=100)
def test_report_serialization_round_trip(report: ReportData) -> None:
    """**Validates: Requirements 8.1, 8.2, 8.3, 8.4**

    For any valid ReportData object, serializing to JSON and then
    deserializing back SHALL produce a ReportData object equivalent
    to the original.
    """
    json_str = report.model_dump_json()
    restored = ReportData.model_validate_json(json_str)
    assert restored == report, (
        f"Round-trip failed: deserialized report differs from original.\n"
        f" report_type: {report.report_type}\n"
        f" period: {report.period_start} to {report.period_end}\n"
        f" generated_at: {report.generated_at}"
    )


@given(report=_report_data_strategy)
@settings(max_examples=100)
def test_report_datetime_fields_iso8601(report: ReportData) -> None:
    """**Validates: Requirements 8.4**

    All datetime fields in the serialized JSON SHALL be in ISO 8601 format.
    """
    json_str = report.model_dump_json()
    parsed = json.loads(json_str)
    dt_fields = _collect_datetime_strings(parsed)
    for field_name, value in dt_fields:
        if field_name in _DATETIME_FIELD_NAMES:
            assert _ISO8601_DATETIME_RE.match(value), (
                f"Datetime field '{field_name}' is not ISO 8601: {value!r}"
            )
        elif field_name in _DATE_FIELD_NAMES:
            assert _ISO8601_DATE_RE.match(value), (
                f"Date field '{field_name}' is not ISO 8601: {value!r}"
            )
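
Reviewer note: to make the two ISO 8601 patterns used above concrete, the standalone snippet below applies them to values produced by the standard library's `isoformat()`. The regexes are copied from the test module; the sample values are illustrative and say nothing about the exact strings the report models emit.

```python
from __future__ import annotations

import re
from datetime import date, datetime, timezone

# Same patterns as in the test module above.
ISO8601_DATETIME_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"  # YYYY-MM-DDTHH:MM:SS
    r"(?:\.\d+)?"  # optional fractional seconds
    r"(?:Z|[+-]\d{2}:\d{2})?$"  # optional timezone
)
ISO8601_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Illustrative values only.
generated_at = datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc).isoformat()
with_micros = datetime(2025, 1, 15, 21, 30, 0, 123456, tzinfo=timezone.utc).isoformat()
period_start = date(2025, 1, 15).isoformat()

datetime_ok = bool(ISO8601_DATETIME_RE.match(generated_at))
micros_ok = bool(ISO8601_DATETIME_RE.match(with_micros))
date_ok = bool(ISO8601_DATE_RE.match(period_start))
# A space-separated timestamp lacks the "T" separator and is rejected.
space_rejected = ISO8601_DATETIME_RE.match("2025-01-15 21:30:00") is None
```

The datetime pattern accepts either a `Z` suffix or a numeric offset, so it does not depend on which of the two a serializer chooses for UTC.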
+127
@@ -0,0 +1,127 @@
# Feature: trading-feedback-engine, Property 3: Validation discrepancy detection correctness
"""Property-based tests for report validation discrepancy detection.

Feature: trading-feedback-engine

Tests the validation discrepancy detection correctness property from the
design specification: for any pair of computed metric value and snapshot
metric value (both finite, non-negative floats), the validation function
SHALL produce a warning if and only if the percentage difference exceeds 5%.
The percentage difference SHALL be computed as |computed - snapshot| /
snapshot * 100 when snapshot > 0, and SHALL flag any non-zero computed value
when snapshot is 0.
"""

from __future__ import annotations

import math

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.validator import (
    DISCREPANCY_THRESHOLD_PCT,
    _check_discrepancy,
)

# ---------------------------------------------------------------------------
# Property 3: Validation Discrepancy Detection Correctness
# Validates: Requirements 4.1, 4.2, 4.3, 4.4
# ---------------------------------------------------------------------------

# Strategy: finite, non-negative floats in [0, 1e6]
_metric_float = st.floats(
    min_value=0, max_value=1e6, allow_nan=False, allow_infinity=False,
)


@given(computed=_metric_float, snapshot=_metric_float)
@settings(max_examples=100)
def test_discrepancy_detection_correctness(
    computed: float,
    snapshot: float,
) -> None:
    """**Validates: Requirements 4.1, 4.2, 4.3, 4.4**

    For any pair of computed and snapshot values (finite, non-negative):
    - Both zero → no warning
    - Snapshot zero, computed non-zero → warning (100% discrepancy)
    - Snapshot > 0 → warning iff |computed - snapshot| / snapshot * 100 > 5%
    """
    result = _check_discrepancy("test_field", computed, snapshot)

    if snapshot == 0.0 and computed == 0.0:
        # Both zero → no discrepancy
        assert result is None, (
            f"Expected no warning when both values are 0, got {result}"
        )
    elif snapshot == 0.0:
        # Non-zero computed with zero snapshot → always a warning
        assert result is not None, (
            f"Expected warning for non-zero computed={computed} with "
            f"snapshot=0, got None"
        )
        assert result.pct_difference == 100.0, (
            f"Expected 100% discrepancy for zero snapshot, "
            f"got {result.pct_difference}%"
        )
    else:
        # Normal case: snapshot > 0
        expected_pct = abs(computed - snapshot) / snapshot * 100.0
        if expected_pct > DISCREPANCY_THRESHOLD_PCT:
            assert result is not None, (
                f"Expected warning for {expected_pct:.4f}% discrepancy "
                f"(computed={computed}, snapshot={snapshot}), got None"
            )
            # When expected_pct is inf (very small snapshot), both should be inf
            if math.isinf(expected_pct):
                assert math.isinf(result.pct_difference), (
                    f"Expected inf pct_difference, got {result.pct_difference}"
                )
            else:
                assert abs(result.pct_difference - round(expected_pct, 4)) < 1e-6, (
                    f"Percentage difference mismatch: "
                    f"expected {round(expected_pct, 4)}, "
                    f"got {result.pct_difference}"
                )
        else:
            assert result is None, (
                f"Expected no warning for {expected_pct:.4f}% discrepancy "
                f"(computed={computed}, snapshot={snapshot}), "
                f"got warning with pct_difference={result.pct_difference}"
            )
@given(computed=_metric_float, snapshot=_metric_float)
@settings(max_examples=100)
def test_discrepancy_threshold_is_five_percent(
computed: float,
snapshot: float,
) -> None:
"""**Validates: Requirements 4.1, 4.2, 4.3, 4.4**
Verify that DISCREPANCY_THRESHOLD_PCT = 5.0 is the threshold used:
the function produces a warning if and only if the discrepancy
exceeds exactly 5%.
"""
assert DISCREPANCY_THRESHOLD_PCT == 5.0, (
f"Expected threshold of 5.0%, got {DISCREPANCY_THRESHOLD_PCT}%"
)
result = _check_discrepancy("threshold_check", computed, snapshot)
if snapshot == 0.0 and computed == 0.0:
assert result is None
elif snapshot == 0.0:
# 100% > 5% → always warning
assert result is not None
else:
pct = abs(computed - snapshot) / snapshot * 100.0
should_warn = pct > 5.0
if should_warn:
assert result is not None, (
f"Discrepancy {pct:.4f}% > 5% but no warning produced"
)
else:
assert result is None, (
f"Discrepancy {pct:.4f}% <= 5% but warning produced"
)
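Read together, the two property tests above pin down a small contract for `_check_discrepancy`. The sketch below restates that contract as a standalone function. It is an inferred reconstruction, not the project's implementation: `DiscrepancyWarning`, `check_discrepancy_sketch`, and the use of `abs()` in the denominator are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

DISCREPANCY_THRESHOLD_PCT = 5.0  # same value the tests assert


@dataclass(frozen=True)
class DiscrepancyWarning:
    """Hypothetical stand-in for the project's warning model."""

    field_name: str
    computed_value: float
    snapshot_value: float
    pct_difference: float


def check_discrepancy_sketch(
    field_name: str, computed: float, snapshot: float
) -> DiscrepancyWarning | None:
    """Return a warning when computed and snapshot differ by more than 5%."""
    if snapshot == 0.0:
        if computed == 0.0:
            return None  # both zero: nothing to compare
        # Non-zero value against a zero snapshot is reported as 100%.
        return DiscrepancyWarning(field_name, computed, snapshot, 100.0)

    # Assumption: abs() in the denominator; for positive snapshots this
    # matches the formula used in the tests.
    pct = abs(computed - snapshot) / abs(snapshot) * 100.0
    if pct <= DISCREPANCY_THRESHOLD_PCT:
        return None  # the comparison is strict: exactly 5% does not warn
    return DiscrepancyWarning(field_name, computed, snapshot, round(pct, 4))
```

The strict `>` comparison is the detail most worth keeping in mind: a value sitting exactly on the threshold passes silently.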
@@ -0,0 +1,256 @@
"""API integration tests for trading report endpoints.
Tests GET /api/reports (list with pagination/filtering) and
GET /api/reports/{report_id} (detail with full report_data).
Uses httpx.AsyncClient with the FastAPI app and mocks the module-level
``pool`` variable in services.api.app.
Requirements validated: 5.4, 5.5, 5.6
"""
from __future__ import annotations
import uuid
from datetime import date, datetime, timezone
from unittest.mock import AsyncMock, patch
import httpx
import pytest
from services.api.app import app
# ── Helpers ──────────────────────────────────────────────────────────────
class FakeRecord(dict):
"""Dict subclass standing in for an asyncpg Record.
Bracket access comes from dict; attribute access is added for convenience.
"""
def __getattr__(self, name: str):
try:
return self[name]
except KeyError:
raise AttributeError(name) from None
def _make_list_record(**overrides) -> FakeRecord:
"""Build a FakeRecord matching the list-endpoint SELECT columns."""
defaults = {
"id": uuid.uuid4(),
"report_type": "daily",
"period_start": date(2025, 1, 15),
"period_end": date(2025, 1, 15),
"validation_status": "passed",
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
}
defaults.update(overrides)
return FakeRecord(**defaults)
def _make_detail_record(**overrides) -> FakeRecord:
"""Build a FakeRecord matching the detail-endpoint SELECT columns."""
defaults = {
"id": uuid.uuid4(),
"report_type": "daily",
"period_start": date(2025, 1, 15),
"period_end": date(2025, 1, 15),
"report_data": {
"pnl": {"realized_pnl": 125.50, "unrealized_pnl": -30.20},
"executive_summary": "Test summary",
},
"validation_status": "passed",
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
"created_at": datetime(2025, 1, 15, 21, 30, 5, tzinfo=timezone.utc),
}
defaults.update(overrides)
return FakeRecord(**defaults)
_POOL_PATCH = "services.api.app.pool"
# ═══════════════════════════════════════════════════════════════════════
# 1. GET /api/reports — list endpoint
# Requirements validated: 5.4, 5.6
# ═══════════════════════════════════════════════════════════════════════
class TestListReports:
"""Tests for GET /api/reports."""
@pytest.mark.asyncio
async def test_default_pagination(self) -> None:
"""List reports with no params returns rows using default limit/offset."""
r1 = _make_list_record()
r2 = _make_list_record(
report_type="weekly",
period_start=date(2025, 1, 13),
period_end=date(2025, 1, 17),
)
mock_pool = AsyncMock()
mock_pool.fetch = AsyncMock(return_value=[r1, r2])
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get("/api/reports")
assert resp.status_code == 200
data = resp.json()
assert len(data) == 2
# UUID fields are serialized as strings
assert data[0]["id"] == str(r1["id"])
assert data[0]["report_type"] == "daily"
assert data[0]["period_start"] == "2025-01-15"
assert data[0]["period_end"] == "2025-01-15"
assert data[0]["validation_status"] == "passed"
assert "generated_at" in data[0]
# pool.fetch called with default limit=20, offset=0
call_args = mock_pool.fetch.call_args
sql = call_args[0][0]
assert "LIMIT" in sql
assert "OFFSET" in sql
# Last two positional args are limit and offset
assert call_args[0][-2] == 20
assert call_args[0][-1] == 0
@pytest.mark.asyncio
async def test_filter_by_report_type(self) -> None:
"""Filtering by report_type=weekly passes the value to the query."""
r1 = _make_list_record(report_type="weekly")
mock_pool = AsyncMock()
mock_pool.fetch = AsyncMock(return_value=[r1])
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get("/api/reports", params={"report_type": "weekly"})
assert resp.status_code == 200
data = resp.json()
assert len(data) == 1
assert data[0]["report_type"] == "weekly"
# Verify the SQL includes a report_type condition
call_args = mock_pool.fetch.call_args
sql = call_args[0][0]
assert "report_type" in sql
# "weekly" should be among the positional params
assert "weekly" in call_args[0]
@pytest.mark.asyncio
async def test_filter_by_date_range(self) -> None:
"""Filtering by start_date and end_date passes dates to the query."""
mock_pool = AsyncMock()
mock_pool.fetch = AsyncMock(return_value=[])
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get(
"/api/reports",
params={"start_date": "2025-01-01", "end_date": "2025-01-31"},
)
assert resp.status_code == 200
call_args = mock_pool.fetch.call_args
sql = call_args[0][0]
assert "period_start" in sql
assert "period_end" in sql
# Date strings should be among the positional params
assert "2025-01-01" in call_args[0]
assert "2025-01-31" in call_args[0]
@pytest.mark.asyncio
async def test_invalid_report_type_returns_400(self) -> None:
"""An invalid report_type value returns HTTP 400."""
mock_pool = AsyncMock()
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get(
"/api/reports", params={"report_type": "monthly"}
)
assert resp.status_code == 400
detail = resp.json()["detail"].lower()
assert "daily" in detail or "weekly" in detail
# pool.fetch should NOT have been called
mock_pool.fetch.assert_not_awaited()
@pytest.mark.asyncio
async def test_invalid_date_format_returns_400(self) -> None:
"""A malformed start_date returns HTTP 400."""
mock_pool = AsyncMock()
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get(
"/api/reports", params={"start_date": "not-a-date"}
)
assert resp.status_code == 400
assert "YYYY-MM-DD" in resp.json()["detail"]
mock_pool.fetch.assert_not_awaited()
# ═══════════════════════════════════════════════════════════════════════
# 2. GET /api/reports/{report_id} — detail endpoint
# Requirements validated: 5.4, 5.5
# ═══════════════════════════════════════════════════════════════════════
class TestGetReport:
"""Tests for GET /api/reports/{report_id}."""
@pytest.mark.asyncio
async def test_valid_id_returns_full_report(self) -> None:
"""A valid report_id returns the full report including report_data."""
record = _make_detail_record()
mock_pool = AsyncMock()
mock_pool.fetchrow = AsyncMock(return_value=record)
report_id = str(record["id"])
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get(f"/api/reports/{report_id}")
assert resp.status_code == 200
data = resp.json()
assert data["id"] == report_id
assert data["report_type"] == "daily"
assert data["period_start"] == "2025-01-15"
assert data["period_end"] == "2025-01-15"
assert data["validation_status"] == "passed"
assert "generated_at" in data
assert "created_at" in data
# report_data is included as a dict
assert isinstance(data["report_data"], dict)
assert data["report_data"]["pnl"]["realized_pnl"] == 125.50
assert data["report_data"]["executive_summary"] == "Test summary"
@pytest.mark.asyncio
async def test_nonexistent_id_returns_404(self) -> None:
"""A non-existent report_id returns HTTP 404."""
mock_pool = AsyncMock()
mock_pool.fetchrow = AsyncMock(return_value=None)
fake_id = str(uuid.uuid4())
with patch(_POOL_PATCH, mock_pool):
async with httpx.AsyncClient(
transport=httpx.ASGITransport(app=app), base_url="http://test"
) as client:
resp = await client.get(f"/api/reports/{fake_id}")
assert resp.status_code == 404
assert "not found" in resp.json()["detail"].lower()
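The list-endpoint tests above assert a particular call shape: optional filters become positional parameters, `limit` and `offset` are always the last two, and invalid input is rejected before the pool is touched. A minimal sketch of a query builder with that shape follows. `build_list_query` is a hypothetical helper, and the comparison operators and `ORDER BY` clause are assumptions; only the column names, the `trading_reports` table name, and the defaults of 20 and 0 come from the tests in this change.

```python
from __future__ import annotations

from datetime import date

VALID_REPORT_TYPES = ("daily", "weekly")


def build_list_query(
    report_type: str | None = None,
    start_date: str | None = None,
    end_date: str | None = None,
    limit: int = 20,
    offset: int = 0,
) -> tuple[str, list[object]]:
    """Return (sql, params); raise ValueError on invalid filters."""
    conditions: list[str] = []
    params: list[object] = []

    if report_type is not None:
        if report_type not in VALID_REPORT_TYPES:
            raise ValueError("report_type must be 'daily' or 'weekly'")
        params.append(report_type)
        conditions.append(f"report_type = ${len(params)}")

    # Assumed operators: start_date bounds period_start, end_date bounds period_end.
    for raw, column, op in (
        (start_date, "period_start", ">="),
        (end_date, "period_end", "<="),
    ):
        if raw is None:
            continue
        try:
            date.fromisoformat(raw)  # validation only
        except ValueError:
            raise ValueError("dates must use YYYY-MM-DD format") from None
        # The tests expect the raw strings among the positional params.
        params.append(raw)
        conditions.append(f"{column} {op} ${len(params)}")

    sql = (
        "SELECT id, report_type, period_start, period_end, "
        "validation_status, generated_at FROM trading_reports"
    )
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    params.extend([limit, offset])
    sql += f" ORDER BY generated_at DESC LIMIT ${len(params) - 1} OFFSET ${len(params)}"
    return sql, params
```

An endpoint built this way would map the `ValueError` to an HTTP 400 and otherwise call `pool.fetch(sql, *params)`, which is what makes the `call_args[0][-2]` and `call_args[0][-1]` assertions meaningful.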
@@ -0,0 +1,273 @@
"""Unit tests for the report data collector.
Tests the CollectedData dataclass defaults, _row_dict UUID conversion,
and collect_report_data with mocked asyncpg pool.
Requirements: 1.1, 1.2, 1.3, 1.4, 1.5
"""
from __future__ import annotations
import uuid
from datetime import date
from unittest.mock import AsyncMock, MagicMock
import pytest
from services.reporting.collector import CollectedData, _row_dict, collect_report_data
# ===================================================================
# _row_dict tests
# ===================================================================
class TestRowDict:
"""Tests for _row_dict UUID→str conversion."""
def test_uuid_fields_converted_to_str(self):
"""UUID values in the record are converted to strings."""
test_uuid = uuid.uuid4()
# A plain dict subclass is enough to stand in for an asyncpg Record here.
class FakeRecord(dict):
pass
record = FakeRecord(id=test_uuid, name="test", count=42)
result = _row_dict(record)
assert result["id"] == str(test_uuid)
assert result["name"] == "test"
assert result["count"] == 42
def test_no_uuid_fields_unchanged(self):
"""Non-UUID values pass through unchanged."""
class FakeRecord(dict):
pass
record = FakeRecord(ticker="AAPL", price=185.50, active=True)
result = _row_dict(record)
assert result["ticker"] == "AAPL"
assert result["price"] == 185.50
assert result["active"] is True
def test_multiple_uuid_fields(self):
"""Multiple UUID fields are all converted."""
class FakeRecord(dict):
pass
id1 = uuid.uuid4()
id2 = uuid.uuid4()
record = FakeRecord(id=id1, recommendation_id=id2, ticker="MSFT")
result = _row_dict(record)
assert result["id"] == str(id1)
assert result["recommendation_id"] == str(id2)
assert result["ticker"] == "MSFT"
def test_empty_record(self):
"""Empty record returns empty dict."""
class FakeRecord(dict):
pass
record = FakeRecord()
result = _row_dict(record)
assert result == {}
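The behaviour exercised by `TestRowDict` is small enough to restate in a few lines. The function below is a hypothetical sketch of what a helper like `_row_dict` might do, assuming only that the input supports `dict(row)`; the real helper in `services.reporting.collector` may differ.

```python
from __future__ import annotations

import uuid
from typing import Any, Mapping


def row_dict_sketch(row: Mapping[str, Any]) -> dict[str, Any]:
    """Copy a record into a plain dict, turning UUID values into strings."""
    return {
        key: str(value) if isinstance(value, uuid.UUID) else value
        for key, value in dict(row).items()
    }
```

Converting at this boundary keeps every downstream structure JSON-serializable without a custom encoder.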
# ===================================================================
# CollectedData defaults
# ===================================================================
class TestCollectedDataDefaults:
"""Tests for CollectedData dataclass default values."""
def test_default_empty_lists(self):
"""All list fields default to empty lists."""
data = CollectedData()
assert data.trading_decisions == []
assert data.orders == []
assert data.open_positions == []
assert data.closed_positions == []
assert data.recommendations == []
assert data.prediction_outcomes == []
assert data.model_metric_snapshots == []
assert data.circuit_breaker_events == []
def test_default_none_snapshots(self):
"""Snapshot fields default to None."""
data = CollectedData()
assert data.portfolio_snapshot is None
assert data.previous_portfolio_snapshot is None
def test_default_zero_balance(self):
"""Reserve pool balance defaults to 0.0."""
data = CollectedData()
assert data.reserve_pool_balance == 0.0
def test_independent_list_instances(self):
"""Each CollectedData instance has independent list instances."""
data1 = CollectedData()
data2 = CollectedData()
data1.trading_decisions.append({"id": "test"})
assert data2.trading_decisions == []
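The independence check in `test_independent_list_instances` passes only if each list field gets a fresh list per instance. In a dataclass that is usually done with `field(default_factory=list)`. The class below is a hypothetical mirror of the fields these tests inspect, not the project's `CollectedData`; the element and snapshot types are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class CollectedDataSketch:
    """Hypothetical mirror of the fields checked by the tests above."""

    # default_factory builds a new list for every instance.
    trading_decisions: list[dict[str, Any]] = field(default_factory=list)
    orders: list[dict[str, Any]] = field(default_factory=list)
    open_positions: list[dict[str, Any]] = field(default_factory=list)
    closed_positions: list[dict[str, Any]] = field(default_factory=list)
    recommendations: list[dict[str, Any]] = field(default_factory=list)
    prediction_outcomes: list[dict[str, Any]] = field(default_factory=list)
    model_metric_snapshots: list[dict[str, Any]] = field(default_factory=list)
    circuit_breaker_events: list[dict[str, Any]] = field(default_factory=list)
    portfolio_snapshot: dict[str, Any] | None = None
    previous_portfolio_snapshot: dict[str, Any] | None = None
    reserve_pool_balance: float = 0.0
```

Writing `trading_decisions: list = []` instead would be rejected by `dataclasses` with a `ValueError`, which is why the factory form is the idiomatic choice.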
# ===================================================================
# collect_report_data with mocked pool
# ===================================================================
def _make_mock_pool():
"""Create a mock asyncpg pool with async context manager support."""
pool = MagicMock()
conn = AsyncMock()
# pool.acquire() returns a sync object that supports async context manager
ctx = MagicMock()
ctx.__aenter__ = AsyncMock(return_value=conn)
ctx.__aexit__ = AsyncMock(return_value=False)
pool.acquire.return_value = ctx
return pool, conn
class TestCollectReportData:
"""Tests for collect_report_data with mocked asyncpg."""
@pytest.mark.asyncio
async def test_zero_activity_returns_empty_lists(self):
"""When no data exists, all lists are empty and snapshots are None."""
pool, conn = _make_mock_pool()
# All queries return empty results
conn.fetch.return_value = []
conn.fetchrow.return_value = None
result = await collect_report_data(
pool, date(2025, 1, 15), date(2025, 1, 15)
)
assert isinstance(result, CollectedData)
assert result.trading_decisions == []
assert result.orders == []
assert result.open_positions == []
assert result.closed_positions == []
assert result.portfolio_snapshot is None
assert result.previous_portfolio_snapshot is None
assert result.recommendations == []
assert result.prediction_outcomes == []
assert result.model_metric_snapshots == []
assert result.circuit_breaker_events == []
assert result.reserve_pool_balance == 0.0
@pytest.mark.asyncio
async def test_queries_use_correct_date_range(self):
"""Verify that the expected number of fetch and fetchrow queries is issued for the period."""
pool, conn = _make_mock_pool()
conn.fetch.return_value = []
conn.fetchrow.return_value = None
start = date(2025, 1, 13)
end = date(2025, 1, 17)
await collect_report_data(pool, start, end)
# Verify fetch was called (trading_decisions, orders, open_positions,
# closed_positions, recommendations, prediction_outcomes,
# model_metric_snapshots, circuit_breaker_events)
assert conn.fetch.call_count == 8
# Verify fetchrow was called (portfolio_snapshot, previous_snapshot,
# reserve_pool_balance)
assert conn.fetchrow.call_count == 3
@pytest.mark.asyncio
async def test_reserve_pool_balance_from_ledger(self):
"""Reserve pool balance is read from the latest ledger entry."""
pool, conn = _make_mock_pool()
conn.fetch.return_value = []
# Mock fetchrow to return different values for different queries
balance_row = {"balance_after": 450.75}
async def mock_fetchrow(query, *args):
if "reserve_pool_ledger" in query:
return balance_row
return None
conn.fetchrow.side_effect = mock_fetchrow
result = await collect_report_data(
pool, date(2025, 1, 15), date(2025, 1, 15)
)
assert result.reserve_pool_balance == 450.75
@pytest.mark.asyncio
async def test_portfolio_snapshots_populated(self):
"""Portfolio snapshot and previous snapshot are populated when data exists."""
pool, conn = _make_mock_pool()
conn.fetch.return_value = []
current_snapshot = {
"id": uuid.uuid4(),
"snapshot_date": date(2025, 1, 15),
"portfolio_value": 10500.0,
"active_pool": 8000.0,
"reserve_pool": 2500.0,
"cumulative_return": 0.05,
}
previous_snapshot = {
"id": uuid.uuid4(),
"snapshot_date": date(2025, 1, 14),
"portfolio_value": 10000.0,
"active_pool": 7500.0,
"reserve_pool": 2500.0,
"cumulative_return": 0.0,
}
async def mock_fetchrow(query, *args):
if "reserve_pool_ledger" in query:
return None
if "snapshot_date >=" in query:
# current snapshot query (snapshot_date >= $1 AND snapshot_date <= $2)
return current_snapshot
if "snapshot_date <" in query:
# previous snapshot query (snapshot_date < $1)
return previous_snapshot
return None
conn.fetchrow.side_effect = mock_fetchrow
result = await collect_report_data(
pool, date(2025, 1, 15), date(2025, 1, 15)
)
assert result.portfolio_snapshot is not None
assert result.portfolio_snapshot["portfolio_value"] == 10500.0
# UUID fields should be converted to str
assert isinstance(result.portfolio_snapshot["id"], str)
assert result.previous_portfolio_snapshot is not None
assert result.previous_portfolio_snapshot["portfolio_value"] == 10000.0
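Two mocking techniques carry most of the collector tests above: a pool whose `acquire()` is a plain call returning an async context manager, and a `fetchrow` side effect that routes on the query text. The standalone sketch below shows both working together. `read_balance` and its SQL are hypothetical stand-ins for the real collector, written only so the example can run on its own.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_mock_pool():
    """Build a pool whose acquire() works with `async with`."""
    pool = MagicMock()  # acquire() itself is called, not awaited
    conn = AsyncMock()  # fetch/fetchrow on the connection are awaited
    ctx = MagicMock()
    ctx.__aenter__ = AsyncMock(return_value=conn)
    ctx.__aexit__ = AsyncMock(return_value=False)
    pool.acquire.return_value = ctx
    return pool, conn


async def fake_fetchrow(query, *args):
    """Route on query text, the way the snapshot tests do."""
    if "reserve_pool_ledger" in query:
        return {"balance_after": 450.75}
    return None


async def read_balance(pool):
    """Hypothetical caller; the SQL text is illustrative only."""
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT balance_after FROM reserve_pool_ledger "
            "ORDER BY id DESC LIMIT 1"
        )
        return float(row["balance_after"]) if row else 0.0


pool, conn = make_mock_pool()
conn.fetchrow.side_effect = fake_fetchrow
balance = asyncio.run(read_balance(pool))
```

Using `MagicMock` for the pool matters: with an `AsyncMock` pool, `pool.acquire()` would return a coroutine, and `async with` on a coroutine fails.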
@@ -0,0 +1,678 @@
"""Unit tests for report generator orchestrator.
Tests the orchestration flow in services.reporting.generator with mocked
dependencies (collector, section builders, validator, summarizer).
Requirements validated: 5.1, 5.2, 5.3
"""
from __future__ import annotations
import uuid
from datetime import date, datetime, timezone
from unittest.mock import AsyncMock, patch
import pytest
from services.reporting.collector import CollectedData
from services.reporting.generator import (
_in_progress_jobs,
generate_report,
process_report_job,
store_report,
)
from services.reporting.models import (
ModelQualitySection,
ModelQualityWindow,
PLSection,
PositionPerformanceSection,
RecommendationAccuracySection,
ReportData,
ReportType,
RiskMetricsSection,
ValidationStatus,
)
# ── Helpers ──────────────────────────────────────────────────────────────
def _make_report_data(**overrides: object) -> ReportData:
"""Build a minimal valid ReportData for testing."""
defaults = {
"pnl": PLSection(
realized_pnl=100.0,
unrealized_pnl=-20.0,
daily_return=0.01,
cumulative_return=0.05,
win_count=5,
loss_count=2,
win_rate=0.71,
profit_factor=2.0,
sharpe_ratio=1.2,
summary="P&L summary",
),
"recommendation_accuracy": RecommendationAccuracySection(
total_evaluated=10,
act_count=6,
skip_count=4,
acted_win_rate=0.67,
avg_confidence_acted=0.75,
avg_confidence_skipped=0.40,
summary="Rec accuracy summary",
),
"position_performance": PositionPerformanceSection(
positions=[],
summary="Position summary",
),
"risk_metrics": RiskMetricsSection(
current_risk_tier="moderate",
portfolio_heat=0.12,
max_drawdown=0.06,
current_drawdown_pct=0.02,
reserve_pool_balance=500.0,
circuit_breaker_event_count=0,
summary="Risk summary",
),
"model_quality": ModelQualitySection(
windows=[
ModelQualityWindow(
lookback="7d",
win_rate=0.65,
directional_accuracy=0.62,
information_coefficient=0.08,
calibration_error=0.12,
brier_score=0.22,
),
],
summary="Model quality summary",
),
"executive_summary": "Executive summary text",
"validation_status": ValidationStatus.PASSED,
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
"period_start": date(2025, 1, 15),
"period_end": date(2025, 1, 15),
"report_type": ReportType.DAILY,
}
defaults.update(overrides)
return ReportData(**defaults)
def _empty_collected_data() -> CollectedData:
"""Build a zero-activity CollectedData."""
return CollectedData()
def _mock_pool() -> AsyncMock:
"""Create a mock asyncpg pool."""
pool = AsyncMock()
return pool
# Patch targets (all in the generator module namespace)
_PATCH_COLLECT = "services.reporting.generator.collect_report_data"
_PATCH_BUILD_PNL = "services.reporting.generator.build_pnl_section"
_PATCH_BUILD_REC = "services.reporting.generator.build_recommendation_accuracy_section"
_PATCH_BUILD_POS = "services.reporting.generator.build_position_performance_section"
_PATCH_BUILD_RISK = "services.reporting.generator.build_risk_metrics_section"
_PATCH_BUILD_MQ = "services.reporting.generator.build_model_quality_section"
_PATCH_VALIDATE_REC = "services.reporting.generator.validate_recommendation_accuracy"
_PATCH_VALIDATE_MQ = "services.reporting.generator.validate_model_quality"
_PATCH_COMPUTE_STATUS = "services.reporting.generator.compute_validation_status"
_PATCH_SUMMARIZE = "services.reporting.generator.summarize_section"
_PATCH_EXEC_SUMMARY = "services.reporting.generator.generate_executive_summary"
_PATCH_RESOLVER = "services.reporting.generator.AgentConfigResolver"
# ═══════════════════════════════════════════════════════════════════════
# 1. generate_report — orchestration flow
# Requirements validated: 5.1
# ═══════════════════════════════════════════════════════════════════════
class TestGenerateReport:
"""Tests for generate_report orchestration."""
@pytest.mark.asyncio
async def test_orchestration_calls_all_steps(self) -> None:
"""generate_report calls the collector, builders, validators, and summarizers."""
pool = _mock_pool()
collected = _empty_collected_data()
pnl = PLSection(
realized_pnl=0, unrealized_pnl=0, daily_return=0,
cumulative_return=0, win_count=0, loss_count=0,
win_rate=0, profit_factor=0, sharpe_ratio=0,
)
rec = RecommendationAccuracySection(
total_evaluated=0, act_count=0, skip_count=0,
acted_win_rate=0, avg_confidence_acted=0, avg_confidence_skipped=0,
)
pos = PositionPerformanceSection()
risk = RiskMetricsSection(
current_risk_tier="low", portfolio_heat=0, max_drawdown=0,
current_drawdown_pct=0, reserve_pool_balance=0,
circuit_breaker_event_count=0,
)
mq = ModelQualitySection()
with (
patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected) as mock_collect,
patch(_PATCH_BUILD_PNL, return_value=pnl) as mock_pnl,
patch(_PATCH_BUILD_REC, return_value=rec) as mock_rec,
patch(_PATCH_BUILD_POS, return_value=pos) as mock_pos,
patch(_PATCH_BUILD_RISK, return_value=risk) as mock_risk,
patch(_PATCH_BUILD_MQ, return_value=mq) as mock_mq,
patch(_PATCH_VALIDATE_REC, return_value=[]) as mock_val_rec,
patch(_PATCH_VALIDATE_MQ, return_value=[]) as mock_val_mq,
patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.PASSED) as mock_status,
patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="summary") as mock_sum,
patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="exec summary") as mock_exec,
patch(_PATCH_RESOLVER) as mock_resolver_cls,
):
result = await generate_report(
pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
)
# Collector called with pool and dates
mock_collect.assert_awaited_once_with(pool, date(2025, 1, 15), date(2025, 1, 15))
# All section builders called with collected data
mock_pnl.assert_called_once_with(collected)
mock_rec.assert_called_once_with(collected)
mock_pos.assert_called_once_with(collected)
mock_risk.assert_called_once_with(collected)
mock_mq.assert_called_once_with(collected)
# Validators called
mock_val_rec.assert_called_once_with(rec, collected.prediction_outcomes)
mock_val_mq.assert_called_once_with(mq, collected.model_metric_snapshots)
# Summarizer called 5 times (one per section)
assert mock_sum.await_count == 5
# Executive summary called
mock_exec.assert_awaited_once()
# Validation status computed
mock_status.assert_called_once()
# Result is a ReportData
assert isinstance(result, ReportData)
assert result.report_type == ReportType.DAILY
assert result.period_start == date(2025, 1, 15)
assert result.period_end == date(2025, 1, 15)
assert result.executive_summary == "exec summary"
@pytest.mark.asyncio
async def test_zero_activity_report(self) -> None:
"""generate_report handles zero-activity data (empty CollectedData)."""
pool = _mock_pool()
collected = _empty_collected_data()
pnl = PLSection(
realized_pnl=0, unrealized_pnl=0, daily_return=0,
cumulative_return=0, win_count=0, loss_count=0,
win_rate=0, profit_factor=0, sharpe_ratio=0,
)
rec = RecommendationAccuracySection(
total_evaluated=0, act_count=0, skip_count=0,
acted_win_rate=0, avg_confidence_acted=0, avg_confidence_skipped=0,
)
pos = PositionPerformanceSection()
risk = RiskMetricsSection(
current_risk_tier="unknown", portfolio_heat=0, max_drawdown=0,
current_drawdown_pct=0, reserve_pool_balance=0,
circuit_breaker_event_count=0,
)
mq = ModelQualitySection()
with (
patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected),
patch(_PATCH_BUILD_PNL, return_value=pnl),
patch(_PATCH_BUILD_REC, return_value=rec),
patch(_PATCH_BUILD_POS, return_value=pos),
patch(_PATCH_BUILD_RISK, return_value=risk),
patch(_PATCH_BUILD_MQ, return_value=mq),
patch(_PATCH_VALIDATE_REC, return_value=[]),
patch(_PATCH_VALIDATE_MQ, return_value=[]),
patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.PASSED),
patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="No activity"),
patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="No trading activity"),
patch(_PATCH_RESOLVER),
):
result = await generate_report(
pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
)
assert result.pnl.realized_pnl == 0.0
assert result.pnl.win_count == 0
assert result.recommendation_accuracy.total_evaluated == 0
assert result.position_performance.positions == []
assert result.risk_metrics.current_risk_tier == "unknown"
assert result.validation_status == ValidationStatus.PASSED
@pytest.mark.asyncio
async def test_validation_warnings_attached(self) -> None:
"""Validation warnings from validators are attached to sections."""
pool = _mock_pool()
collected = _empty_collected_data()
from services.reporting.models import ValidationWarning
rec_warning = ValidationWarning(
field_name="acted_win_rate",
computed_value=0.80,
snapshot_value=0.60,
pct_difference=33.33,
)
pnl = PLSection(
realized_pnl=0, unrealized_pnl=0, daily_return=0,
cumulative_return=0, win_count=0, loss_count=0,
win_rate=0, profit_factor=0, sharpe_ratio=0,
)
rec = RecommendationAccuracySection(
total_evaluated=5, act_count=3, skip_count=2,
acted_win_rate=0.80, avg_confidence_acted=0.7, avg_confidence_skipped=0.4,
)
pos = PositionPerformanceSection()
risk = RiskMetricsSection(
current_risk_tier="moderate", portfolio_heat=0.1, max_drawdown=0.05,
current_drawdown_pct=0.02, reserve_pool_balance=100,
circuit_breaker_event_count=0,
)
mq = ModelQualitySection()
with (
patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected),
patch(_PATCH_BUILD_PNL, return_value=pnl),
patch(_PATCH_BUILD_REC, return_value=rec),
patch(_PATCH_BUILD_POS, return_value=pos),
patch(_PATCH_BUILD_RISK, return_value=risk),
patch(_PATCH_BUILD_MQ, return_value=mq),
patch(_PATCH_VALIDATE_REC, return_value=[rec_warning]),
patch(_PATCH_VALIDATE_MQ, return_value=[]),
patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.WARNINGS),
patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="summary"),
patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="exec"),
patch(_PATCH_RESOLVER),
):
result = await generate_report(
pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
)
assert result.validation_status == ValidationStatus.WARNINGS
assert len(result.recommendation_accuracy.validation_warnings) == 1
assert result.recommendation_accuracy.validation_warnings[0].field_name == "acted_win_rate"
@pytest.mark.asyncio
async def test_weekly_report_type(self) -> None:
"""generate_report correctly sets weekly report type."""
pool = _mock_pool()
collected = _empty_collected_data()
pnl = PLSection(
realized_pnl=0, unrealized_pnl=0, daily_return=0,
cumulative_return=0, win_count=0, loss_count=0,
win_rate=0, profit_factor=0, sharpe_ratio=0,
)
rec = RecommendationAccuracySection(
total_evaluated=0, act_count=0, skip_count=0,
acted_win_rate=0, avg_confidence_acted=0, avg_confidence_skipped=0,
)
pos = PositionPerformanceSection()
risk = RiskMetricsSection(
current_risk_tier="low", portfolio_heat=0, max_drawdown=0,
current_drawdown_pct=0, reserve_pool_balance=0,
circuit_breaker_event_count=0,
)
mq = ModelQualitySection()
with (
patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected),
patch(_PATCH_BUILD_PNL, return_value=pnl),
patch(_PATCH_BUILD_REC, return_value=rec),
patch(_PATCH_BUILD_POS, return_value=pos),
patch(_PATCH_BUILD_RISK, return_value=risk),
patch(_PATCH_BUILD_MQ, return_value=mq),
patch(_PATCH_VALIDATE_REC, return_value=[]),
patch(_PATCH_VALIDATE_MQ, return_value=[]),
patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.PASSED),
patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="summary"),
patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="exec"),
patch(_PATCH_RESOLVER),
):
result = await generate_report(
pool, ReportType.WEEKLY, date(2025, 1, 13), date(2025, 1, 17),
)
assert result.report_type == ReportType.WEEKLY
assert result.period_start == date(2025, 1, 13)
assert result.period_end == date(2025, 1, 17)
# ═══════════════════════════════════════════════════════════════════════
# 2. store_report — upsert behavior
# Requirements validated: 5.2, 5.3
# ═══════════════════════════════════════════════════════════════════════
class TestStoreReport:
"""Tests for store_report upsert behavior."""
@pytest.mark.asyncio
async def test_store_calls_upsert_sql(self) -> None:
"""store_report calls pool.fetchrow with the upsert SQL and correct params."""
pool = _mock_pool()
report_id = str(uuid.uuid4())
pool.fetchrow = AsyncMock(return_value={"id": report_id})
report = _make_report_data()
result = await store_report(pool, report)
assert result == report_id
pool.fetchrow.assert_awaited_once()
call_args = pool.fetchrow.call_args
sql = call_args[0][0]
assert "INSERT INTO trading_reports" in sql
assert "ON CONFLICT" in sql
assert "DO UPDATE" in sql
# Verify the positional parameters
assert call_args[0][1] == report.report_type.value
assert call_args[0][2] == report.period_start
assert call_args[0][3] == report.period_end
# param 4 is the JSON string
assert call_args[0][4] == report.model_dump_json()
assert call_args[0][5] == report.validation_status.value
assert call_args[0][6] == report.generated_at
@pytest.mark.asyncio
async def test_store_returns_uuid_string(self) -> None:
"""store_report returns the UUID as a string."""
pool = _mock_pool()
expected_id = str(uuid.uuid4())
pool.fetchrow = AsyncMock(return_value={"id": expected_id})
report = _make_report_data()
result = await store_report(pool, report)
assert isinstance(result, str)
assert result == expected_id
@pytest.mark.asyncio
async def test_store_upsert_regeneration(self) -> None:
"""store_report handles regeneration (upsert) for existing period."""
pool = _mock_pool()
report_id = str(uuid.uuid4())
pool.fetchrow = AsyncMock(return_value={"id": report_id})
# First store
report1 = _make_report_data()
result1 = await store_report(pool, report1)
# Second store (regeneration) — same period, different data
report2 = _make_report_data(
executive_summary="Updated executive summary",
generated_at=datetime(2025, 1, 15, 22, 0, tzinfo=timezone.utc),
)
result2 = await store_report(pool, report2)
# Both calls succeed (upsert handles the conflict)
assert result1 == report_id
assert result2 == report_id
assert pool.fetchrow.await_count == 2
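The `TestStoreReport` cases check for `INSERT INTO trading_reports`, `ON CONFLICT`, and `DO UPDATE`, plus a fixed parameter order. One way such an upsert could look is sketched below. The statement is an illustration under assumptions: the conflict target `(report_type, period_start, period_end)`, the `report_data` column name, and the `::jsonb` cast are guesses, and `store_report_sketch` takes plain values rather than a `ReportData` so the example stays self-contained.

```python
import asyncio
import uuid
from datetime import date, datetime, timezone
from unittest.mock import AsyncMock

# Assumed statement shape; only the table name and the ON CONFLICT /
# DO UPDATE keywords are confirmed by the tests.
UPSERT_SQL = """
INSERT INTO trading_reports
    (report_type, period_start, period_end, report_data,
     validation_status, generated_at)
VALUES ($1, $2, $3, $4::jsonb, $5, $6)
ON CONFLICT (report_type, period_start, period_end)
DO UPDATE SET
    report_data = EXCLUDED.report_data,
    validation_status = EXCLUDED.validation_status,
    generated_at = EXCLUDED.generated_at
RETURNING id
"""


async def store_report_sketch(
    pool,
    report_type: str,
    period_start: date,
    period_end: date,
    report_json: str,
    validation_status: str,
    generated_at: datetime,
) -> str:
    """Insert or overwrite the report for a period and return its id."""
    row = await pool.fetchrow(
        UPSERT_SQL,
        report_type,
        period_start,
        period_end,
        report_json,
        validation_status,
        generated_at,
    )
    return str(row["id"])


# Exercise the sketch against a mocked pool, as the tests above do.
fake_id = uuid.uuid4()
mock_pool = AsyncMock()
mock_pool.fetchrow = AsyncMock(return_value={"id": fake_id})
stored_id = asyncio.run(
    store_report_sketch(
        mock_pool,
        "daily",
        date(2025, 1, 15),
        date(2025, 1, 15),
        "{}",
        "passed",
        datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
    )
)
```

Because the conflict branch rewrites the existing row, regenerating a report for the same period returns the same id, which is the property `test_store_upsert_regeneration` relies on.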
# ═══════════════════════════════════════════════════════════════════════
# 3. process_report_job — job processing
# Requirements validated: 5.1, 5.3
# ═══════════════════════════════════════════════════════════════════════
class TestProcessReportJob:
"""Tests for process_report_job."""
@pytest.mark.asyncio
async def test_valid_job_calls_generate_and_store(self) -> None:
"""A valid job payload triggers generate_report and store_report."""
pool = _mock_pool()
report = _make_report_data()
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
return_value=report,
) as mock_gen,
patch(
"services.reporting.generator.store_report",
new_callable=AsyncMock,
return_value=str(uuid.uuid4()),
) as mock_store,
):
job = {
"report_type": "daily",
"period_start": "2025-01-15",
"period_end": "2025-01-15",
}
await process_report_job(pool, job)
mock_gen.assert_awaited_once_with(
pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
)
mock_store.assert_awaited_once_with(pool, report)
@pytest.mark.asyncio
async def test_invalid_report_type_returns_early(self) -> None:
"""An invalid report_type in the job payload causes early return."""
pool = _mock_pool()
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
) as mock_gen,
):
job = {
"report_type": "invalid_type",
"period_start": "2025-01-15",
"period_end": "2025-01-15",
}
await process_report_job(pool, job)
mock_gen.assert_not_awaited()
@pytest.mark.asyncio
async def test_invalid_date_returns_early(self) -> None:
"""An invalid date in the job payload causes early return."""
pool = _mock_pool()
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
) as mock_gen,
):
job = {
"report_type": "daily",
"period_start": "not-a-date",
"period_end": "2025-01-15",
}
await process_report_job(pool, job)
mock_gen.assert_not_awaited()
@pytest.mark.asyncio
async def test_missing_fields_returns_early(self) -> None:
"""Missing fields in the job payload cause an early return."""
pool = _mock_pool()
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
) as mock_gen,
):
job = {}
await process_report_job(pool, job)
mock_gen.assert_not_awaited()
@pytest.mark.asyncio
async def test_duplicate_job_rejected(self) -> None:
"""A duplicate in-progress job is rejected without calling generate_report."""
pool = _mock_pool()
key = "daily:2025-01-20:2025-01-20"
# Simulate an in-progress job
_in_progress_jobs.add(key)
try:
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
) as mock_gen,
):
job = {
"report_type": "daily",
"period_start": "2025-01-20",
"period_end": "2025-01-20",
}
await process_report_job(pool, job)
mock_gen.assert_not_awaited()
finally:
_in_progress_jobs.discard(key)
@pytest.mark.asyncio
async def test_job_cleans_up_in_progress_on_success(self) -> None:
"""After successful completion, the job key is removed from _in_progress_jobs."""
pool = _mock_pool()
report = _make_report_data(
period_start=date(2025, 1, 21),
period_end=date(2025, 1, 21),
)
key = "daily:2025-01-21:2025-01-21"
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
return_value=report,
),
patch(
"services.reporting.generator.store_report",
new_callable=AsyncMock,
return_value=str(uuid.uuid4()),
),
):
job = {
"report_type": "daily",
"period_start": "2025-01-21",
"period_end": "2025-01-21",
}
await process_report_job(pool, job)
assert key not in _in_progress_jobs
@pytest.mark.asyncio
async def test_job_cleans_up_in_progress_on_failure(self) -> None:
"""After all retries fail, the job key is still removed from _in_progress_jobs."""
pool = _mock_pool()
key = "daily:2025-01-22:2025-01-22"
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
side_effect=RuntimeError("DB down"),
),
patch("asyncio.sleep", new_callable=AsyncMock),
):
job = {
"report_type": "daily",
"period_start": "2025-01-22",
"period_end": "2025-01-22",
}
await process_report_job(pool, job)
assert key not in _in_progress_jobs
@pytest.mark.asyncio
async def test_retries_on_failure(self) -> None:
"""process_report_job retries up to 3 times on failure."""
pool = _mock_pool()
report = _make_report_data(
period_start=date(2025, 1, 23),
period_end=date(2025, 1, 23),
)
call_count = 0
async def _gen_side_effect(*args, **kwargs):
nonlocal call_count
call_count += 1
if call_count < 3:
raise RuntimeError("Transient error")
return report
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
side_effect=_gen_side_effect,
),
patch(
"services.reporting.generator.store_report",
new_callable=AsyncMock,
return_value=str(uuid.uuid4()),
) as mock_store,
patch("asyncio.sleep", new_callable=AsyncMock) as mock_sleep,
):
job = {
"report_type": "daily",
"period_start": "2025-01-23",
"period_end": "2025-01-23",
}
await process_report_job(pool, job)
# generate_report called 3 times (2 failures + 1 success)
assert call_count == 3
# store_report called once on success
mock_store.assert_awaited_once()
# sleep called twice (between retries)
assert mock_sleep.await_count == 2
@pytest.mark.asyncio
async def test_weekly_job(self) -> None:
"""A weekly job payload is processed correctly."""
pool = _mock_pool()
report = _make_report_data(
report_type=ReportType.WEEKLY,
period_start=date(2025, 1, 13),
period_end=date(2025, 1, 17),
)
with (
patch(
"services.reporting.generator.generate_report",
new_callable=AsyncMock,
return_value=report,
) as mock_gen,
patch(
"services.reporting.generator.store_report",
new_callable=AsyncMock,
return_value=str(uuid.uuid4()),
),
):
job = {
"report_type": "weekly",
"period_start": "2025-01-13",
"period_end": "2025-01-17",
}
await process_report_job(pool, job)
mock_gen.assert_awaited_once_with(
pool, ReportType.WEEKLY, date(2025, 1, 13), date(2025, 1, 17),
)
@@ -0,0 +1,578 @@
"""Unit tests for report section builders.
Tests each section builder from services.reporting.sections with known
inputs and expected outputs, including edge cases for zero-activity,
single positions, and missing portfolio snapshots.
Requirements validated: 3.1, 3.2, 3.3, 3.4, 3.5
"""
from __future__ import annotations
import uuid
from datetime import datetime, timezone
from services.reporting.collector import CollectedData
from services.reporting.models import (
ModelQualitySection,
PLSection,
PositionPerformanceSection,
RecommendationAccuracySection,
RiskMetricsSection,
)
from services.reporting.sections import (
build_model_quality_section,
build_pnl_section,
build_position_performance_section,
build_recommendation_accuracy_section,
build_risk_metrics_section,
)
# ── Helpers ──────────────────────────────────────────────────────────────
def _make_snapshot(**overrides: object) -> dict:
"""Build a portfolio snapshot dict with sensible defaults."""
snap = {
"realized_pnl": 100.0,
"unrealized_pnl": -20.0,
"daily_return": 0.015,
"cumulative_return": 0.08,
"win_count": 7,
"loss_count": 3,
"win_rate": 0.7,
"sharpe_ratio": 1.5,
"portfolio_heat": 0.12,
"max_drawdown": 0.06,
"current_drawdown_pct": 0.02,
"risk_tier": "moderate",
}
snap.update(overrides)
return snap
def _make_closed_position(
ticker: str,
entry: float,
exit_price: float,
realized_pnl: float,
updated_at: datetime | None = None,
) -> dict:
"""Build a closed position dict."""
return {
"id": str(uuid.uuid4()),
"ticker": ticker,
"avg_entry_price": entry,
"current_price": exit_price,
"realized_pnl": realized_pnl,
"quantity": 0,
"updated_at": updated_at or datetime(2025, 1, 15, 20, 0, tzinfo=timezone.utc),
}
def _make_open_position(
ticker: str,
entry: float,
current: float,
quantity: float,
updated_at: datetime | None = None,
) -> dict:
"""Build an open position dict."""
return {
"id": str(uuid.uuid4()),
"ticker": ticker,
"avg_entry_price": entry,
"current_price": current,
"quantity": quantity,
"updated_at": updated_at or datetime(2025, 1, 14, 10, 0, tzinfo=timezone.utc),
}
# ═══════════════════════════════════════════════════════════════════════
# 1. build_pnl_section
# Requirements validated: 3.1
# ═══════════════════════════════════════════════════════════════════════
class TestBuildPnlSection:
"""Tests for build_pnl_section."""
def test_with_portfolio_snapshot(self) -> None:
"""Section values are extracted from the portfolio snapshot."""
snap = _make_snapshot()
data = CollectedData(portfolio_snapshot=snap)
section = build_pnl_section(data)
assert isinstance(section, PLSection)
assert section.realized_pnl == 100.0
assert section.unrealized_pnl == -20.0
assert section.daily_return == 0.015
assert section.cumulative_return == 0.08
assert section.win_count == 7
assert section.loss_count == 3
assert section.win_rate == 0.7
assert section.sharpe_ratio == 1.5
def test_no_snapshot_returns_zeros(self) -> None:
"""When no portfolio snapshot exists, all values are zero."""
data = CollectedData(portfolio_snapshot=None)
section = build_pnl_section(data)
assert section.realized_pnl == 0.0
assert section.unrealized_pnl == 0.0
assert section.daily_return == 0.0
assert section.cumulative_return == 0.0
assert section.win_count == 0
assert section.loss_count == 0
assert section.win_rate == 0.0
assert section.profit_factor == 0.0
assert section.sharpe_ratio == 0.0
def test_profit_factor_from_closed_positions(self) -> None:
"""Profit factor = sum(gains) / abs(sum(losses)) from closed positions."""
snap = _make_snapshot()
closed = [
_make_closed_position("AAPL", 100.0, 110.0, 50.0), # gain
_make_closed_position("MSFT", 200.0, 190.0, -20.0), # loss
_make_closed_position("GOOG", 150.0, 160.0, 30.0), # gain
]
data = CollectedData(portfolio_snapshot=snap, closed_positions=closed)
section = build_pnl_section(data)
# gains = 50 + 30 = 80, losses = 20
expected_pf = 80.0 / 20.0
assert abs(section.profit_factor - expected_pf) < 1e-9
def test_profit_factor_no_losses(self) -> None:
"""When there are no losses, profit factor is 0.0 (no divisor)."""
snap = _make_snapshot()
closed = [
_make_closed_position("AAPL", 100.0, 110.0, 50.0),
]
data = CollectedData(portfolio_snapshot=snap, closed_positions=closed)
section = build_pnl_section(data)
assert section.profit_factor == 0.0
def test_profit_factor_no_closed_positions(self) -> None:
"""When there are no closed positions, profit factor is 0.0."""
snap = _make_snapshot()
data = CollectedData(portfolio_snapshot=snap, closed_positions=[])
section = build_pnl_section(data)
assert section.profit_factor == 0.0
def test_snapshot_with_none_values(self) -> None:
"""Snapshot fields that are None are coerced to zero."""
snap = _make_snapshot(
realized_pnl=None,
unrealized_pnl=None,
daily_return=None,
win_count=None,
)
data = CollectedData(portfolio_snapshot=snap)
section = build_pnl_section(data)
assert section.realized_pnl == 0.0
assert section.unrealized_pnl == 0.0
assert section.daily_return == 0.0
assert section.win_count == 0
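Reviewer note: the profit-factor tests above pin down three cases: gains over absolute losses, 0.0 when there are no losses, and 0.0 when there are no closed positions. A minimal sketch of a computation that satisfies all three follows. `profit_factor_sketch` is a hypothetical helper, and treating a `None` P&L as zero is an assumption. The real logic inside `build_pnl_section` may differ.

```python
from typing import List


def profit_factor_sketch(closed_positions: List[dict]) -> float:
    """Return sum(gains) / abs(sum(losses)), or 0.0 when there are no losses."""
    pnls = [float(p.get("realized_pnl") or 0.0) for p in closed_positions]
    gains = sum(v for v in pnls if v > 0)
    losses = sum(v for v in pnls if v < 0)
    if losses == 0:
        return 0.0  # no divisor available; the tests expect 0.0 here
    return gains / abs(losses)
```

Returning 0.0 for a loss-free period follows the tests, although it makes a perfect record indistinguishable from an empty one in this field.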
# ═══════════════════════════════════════════════════════════════════════
# 2. build_recommendation_accuracy_section
# Requirements validated: 3.2
# ═══════════════════════════════════════════════════════════════════════
class TestBuildRecommendationAccuracySection:
"""Tests for build_recommendation_accuracy_section."""
def test_with_act_and_skip_decisions(self) -> None:
"""Correctly counts act/skip and computes win rate and confidence."""
rec_id_1 = str(uuid.uuid4())
rec_id_2 = str(uuid.uuid4())
rec_id_3 = str(uuid.uuid4())
data = CollectedData(
trading_decisions=[
{"id": "td1", "recommendation_id": rec_id_1, "decision": "act", "ticker": "AAPL"},
{"id": "td2", "recommendation_id": rec_id_2, "decision": "skip", "ticker": "MSFT"},
{"id": "td3", "recommendation_id": rec_id_3, "decision": "act", "ticker": "GOOG"},
],
recommendations=[
{"id": rec_id_1, "confidence": 0.8},
{"id": rec_id_2, "confidence": 0.3},
{"id": rec_id_3, "confidence": 0.9},
],
prediction_outcomes=[
{"ticker": "AAPL", "profitable": True, "direction_correct": True},
{"ticker": "GOOG", "profitable": False, "direction_correct": False},
],
)
section = build_recommendation_accuracy_section(data)
assert isinstance(section, RecommendationAccuracySection)
assert section.total_evaluated == 3
assert section.act_count == 2
assert section.skip_count == 1
# 1 win out of 2 acted with outcomes
assert abs(section.acted_win_rate - 0.5) < 1e-9
# avg confidence acted = (0.8 + 0.9) / 2 = 0.85
assert abs(section.avg_confidence_acted - 0.85) < 1e-9
# avg confidence skipped = 0.3
assert abs(section.avg_confidence_skipped - 0.3) < 1e-9
def test_no_decisions_returns_zeros(self) -> None:
"""When there are no trading decisions, all values are zero."""
data = CollectedData(trading_decisions=[])
section = build_recommendation_accuracy_section(data)
assert section.total_evaluated == 0
assert section.act_count == 0
assert section.skip_count == 0
assert section.acted_win_rate == 0.0
assert section.avg_confidence_acted == 0.0
assert section.avg_confidence_skipped == 0.0
def test_all_act_decisions(self) -> None:
"""When all decisions are 'act', skip_count is 0."""
rec_id = str(uuid.uuid4())
data = CollectedData(
trading_decisions=[
{"id": "td1", "recommendation_id": rec_id, "decision": "act", "ticker": "AAPL"},
],
recommendations=[
{"id": rec_id, "confidence": 0.75},
],
prediction_outcomes=[
{"ticker": "AAPL", "profitable": True, "direction_correct": True},
],
)
section = build_recommendation_accuracy_section(data)
assert section.act_count == 1
assert section.skip_count == 0
assert section.acted_win_rate == 1.0
assert abs(section.avg_confidence_acted - 0.75) < 1e-9
assert section.avg_confidence_skipped == 0.0
def test_act_without_prediction_outcome(self) -> None:
"""When an acted decision has no matching prediction outcome, win rate is 0."""
rec_id = str(uuid.uuid4())
data = CollectedData(
trading_decisions=[
{"id": "td1", "recommendation_id": rec_id, "decision": "act", "ticker": "AAPL"},
],
recommendations=[
{"id": rec_id, "confidence": 0.6},
],
prediction_outcomes=[], # no outcomes
)
section = build_recommendation_accuracy_section(data)
assert section.act_count == 1
assert section.acted_win_rate == 0.0
# ═══════════════════════════════════════════════════════════════════════
# 3. build_position_performance_section
# Requirements validated: 3.3
# ═══════════════════════════════════════════════════════════════════════
class TestBuildPositionPerformanceSection:
"""Tests for build_position_performance_section."""
def test_with_open_positions(self) -> None:
"""Open positions are listed with computed P&L and P&L%."""
pos = _make_open_position("AAPL", 150.0, 160.0, 10.0)
data = CollectedData(open_positions=[pos])
section = build_position_performance_section(data)
assert isinstance(section, PositionPerformanceSection)
assert len(section.positions) == 1
p = section.positions[0]
assert p.ticker == "AAPL"
assert p.entry_price == 150.0
assert p.current_or_exit_price == 160.0
assert p.status == "open"
# pnl = (160 - 150) * 10 = 100
assert abs(p.pnl - 100.0) < 1e-9
# pnl_pct = 100 / (150 * 10) * 100 = 6.666...%
assert abs(p.pnl_pct - (100.0 / 1500.0 * 100)) < 1e-6
def test_with_closed_positions(self) -> None:
"""Closed positions use realized_pnl directly."""
pos = _make_closed_position("MSFT", 200.0, 210.0, 50.0)
data = CollectedData(closed_positions=[pos])
section = build_position_performance_section(data)
assert len(section.positions) == 1
p = section.positions[0]
assert p.ticker == "MSFT"
assert p.status == "closed"
assert p.pnl == 50.0
def test_empty_positions(self) -> None:
"""When there are no positions, the list is empty."""
data = CollectedData(open_positions=[], closed_positions=[])
section = build_position_performance_section(data)
assert isinstance(section, PositionPerformanceSection)
assert len(section.positions) == 0
def test_mixed_open_and_closed(self) -> None:
"""Both open and closed positions appear in the output."""
open_pos = _make_open_position("AAPL", 150.0, 160.0, 10.0)
closed_pos = _make_closed_position("GOOG", 100.0, 90.0, -25.0)
data = CollectedData(open_positions=[open_pos], closed_positions=[closed_pos])
section = build_position_performance_section(data)
assert len(section.positions) == 2
tickers = {p.ticker for p in section.positions}
assert tickers == {"AAPL", "GOOG"}
statuses = {p.ticker: p.status for p in section.positions}
assert statuses["AAPL"] == "open"
assert statuses["GOOG"] == "closed"
def test_single_position(self) -> None:
"""A single open position is handled correctly."""
pos = _make_open_position("TSLA", 250.0, 250.0, 5.0)
data = CollectedData(open_positions=[pos])
section = build_position_performance_section(data)
assert len(section.positions) == 1
p = section.positions[0]
# pnl = (250 - 250) * 5 = 0
assert p.pnl == 0.0
assert p.pnl_pct == 0.0
def test_hold_duration_computed(self) -> None:
"""Hold duration is computed from updated_at to now."""
# Use a fixed updated_at far enough in the past to get a positive duration
updated = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
pos = _make_open_position("AAPL", 100.0, 110.0, 1.0, updated_at=updated)
data = CollectedData(open_positions=[pos])
section = build_position_performance_section(data)
# Hold duration should be positive (since updated_at is in the past)
assert section.positions[0].hold_duration_hours > 0.0
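Reviewer note: the arithmetic the open-position tests expect can be written out as a small helper. This is a sketch under the assumption that P&L percent is measured against cost basis (entry price times quantity), which is what the expected values in the tests imply. `open_position_pnl_sketch` is a hypothetical name, and the zero-cost-basis guard is an assumption.

```python
from typing import Tuple


def open_position_pnl_sketch(
    entry: float, current: float, quantity: float
) -> Tuple[float, float]:
    """Return (pnl, pnl_pct) for an open position."""
    pnl = (current - entry) * quantity
    cost_basis = entry * quantity
    # Guard against a zero cost basis (assumed behavior, not shown in the tests).
    pnl_pct = pnl / cost_basis * 100 if cost_basis else 0.0
    return pnl, pnl_pct
```

For the AAPL case in the tests (entry 150, current 160, quantity 10) this gives a P&L of 100 against a cost basis of 1500, roughly 6.67%.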
# ═══════════════════════════════════════════════════════════════════════
# 4. build_risk_metrics_section
# Requirements validated: 3.4
# ═══════════════════════════════════════════════════════════════════════
class TestBuildRiskMetricsSection:
"""Tests for build_risk_metrics_section."""
def test_with_snapshot(self) -> None:
"""Risk metrics are extracted from the portfolio snapshot."""
snap = _make_snapshot(
risk_tier="high",
portfolio_heat=0.25,
max_drawdown=0.10,
current_drawdown_pct=0.05,
)
data = CollectedData(
portfolio_snapshot=snap,
reserve_pool_balance=500.0,
circuit_breaker_events=[{"id": "cb1"}, {"id": "cb2"}],
)
section = build_risk_metrics_section(data)
assert isinstance(section, RiskMetricsSection)
assert section.current_risk_tier == "high"
assert section.portfolio_heat == 0.25
assert section.max_drawdown == 0.10
assert section.current_drawdown_pct == 0.05
assert section.reserve_pool_balance == 500.0
assert section.circuit_breaker_event_count == 2
def test_no_snapshot(self) -> None:
"""When no snapshot exists, risk tier is 'unknown' and metrics are zero."""
data = CollectedData(
portfolio_snapshot=None,
reserve_pool_balance=300.0,
circuit_breaker_events=[],
)
section = build_risk_metrics_section(data)
assert section.current_risk_tier == "unknown"
assert section.portfolio_heat == 0.0
assert section.max_drawdown == 0.0
assert section.current_drawdown_pct == 0.0
assert section.reserve_pool_balance == 300.0
assert section.circuit_breaker_event_count == 0
def test_circuit_breaker_count(self) -> None:
"""Circuit breaker event count matches the number of events."""
events = [{"id": f"cb{i}"} for i in range(5)]
data = CollectedData(
portfolio_snapshot=_make_snapshot(),
circuit_breaker_events=events,
reserve_pool_balance=0.0,
)
section = build_risk_metrics_section(data)
assert section.circuit_breaker_event_count == 5
def test_zero_circuit_breaker_events(self) -> None:
"""Zero circuit breaker events when list is empty."""
data = CollectedData(
portfolio_snapshot=_make_snapshot(),
circuit_breaker_events=[],
reserve_pool_balance=100.0,
)
section = build_risk_metrics_section(data)
assert section.circuit_breaker_event_count == 0
# ═══════════════════════════════════════════════════════════════════════
# 5. build_model_quality_section
# Requirements validated: 3.5
# ═══════════════════════════════════════════════════════════════════════
class TestBuildModelQualitySection:
"""Tests for build_model_quality_section."""
def test_with_all_windows(self) -> None:
"""Model quality section extracts metrics for 7d, 30d, 90d windows."""
snapshots = [
{
"lookback_window": "7d",
"generated_at": "2025-01-15T20:00:00Z",
"win_rate": 0.65,
"directional_accuracy": 0.62,
"information_coefficient": 0.08,
"calibration_error": 0.12,
"brier_score": 0.22,
},
{
"lookback_window": "30d",
"generated_at": "2025-01-15T20:00:00Z",
"win_rate": 0.60,
"directional_accuracy": 0.58,
"information_coefficient": 0.06,
"calibration_error": 0.15,
"brier_score": 0.25,
},
{
"lookback_window": "90d",
"generated_at": "2025-01-15T20:00:00Z",
"win_rate": 0.55,
"directional_accuracy": 0.53,
"information_coefficient": 0.04,
"calibration_error": 0.18,
"brier_score": 0.28,
},
]
data = CollectedData(model_metric_snapshots=snapshots)
section = build_model_quality_section(data)
assert isinstance(section, ModelQualitySection)
assert len(section.windows) == 3
by_lookback = {w.lookback: w for w in section.windows}
assert by_lookback["7d"].win_rate == 0.65
assert by_lookback["7d"].directional_accuracy == 0.62
assert by_lookback["7d"].information_coefficient == 0.08
assert by_lookback["7d"].calibration_error == 0.12
assert by_lookback["7d"].brier_score == 0.22
assert by_lookback["30d"].win_rate == 0.60
assert by_lookback["90d"].win_rate == 0.55
def test_no_snapshots(self) -> None:
"""When there are no model metric snapshots, windows list is empty."""
data = CollectedData(model_metric_snapshots=[])
section = build_model_quality_section(data)
assert isinstance(section, ModelQualitySection)
assert len(section.windows) == 0
def test_partial_windows(self) -> None:
"""When only some lookback windows are present, missing ones get None values."""
snapshots = [
{
"lookback_window": "7d",
"generated_at": "2025-01-15T20:00:00Z",
"win_rate": 0.70,
"directional_accuracy": 0.68,
"information_coefficient": 0.10,
"calibration_error": 0.08,
"brier_score": 0.18,
},
]
data = CollectedData(model_metric_snapshots=snapshots)
section = build_model_quality_section(data)
assert len(section.windows) == 3
by_lookback = {w.lookback: w for w in section.windows}
# 7d has values
assert by_lookback["7d"].win_rate == 0.70
# 30d and 90d have None values
assert by_lookback["30d"].win_rate is None
assert by_lookback["30d"].directional_accuracy is None
assert by_lookback["90d"].win_rate is None
assert by_lookback["90d"].brier_score is None
def test_takes_latest_snapshot_per_window(self) -> None:
"""When multiple snapshots exist for a window, the first (latest) is used."""
snapshots = [
{
"lookback_window": "7d",
"generated_at": "2025-01-15T20:00:00Z",
"win_rate": 0.70,
"directional_accuracy": None,
"information_coefficient": None,
"calibration_error": None,
"brier_score": None,
},
{
"lookback_window": "7d",
"generated_at": "2025-01-14T20:00:00Z",
"win_rate": 0.50,
"directional_accuracy": None,
"information_coefficient": None,
"calibration_error": None,
"brier_score": None,
},
]
data = CollectedData(model_metric_snapshots=snapshots)
section = build_model_quality_section(data)
by_lookback = {w.lookback: w for w in section.windows}
# Collector orders by generated_at DESC, so first entry (0.70) is latest
assert by_lookback["7d"].win_rate == 0.70
def test_none_metric_values(self) -> None:
"""Snapshot with None metric values produces None in the window."""
snapshots = [
{
"lookback_window": "7d",
"generated_at": "2025-01-15T20:00:00Z",
"win_rate": None,
"directional_accuracy": None,
"information_coefficient": None,
"calibration_error": None,
"brier_score": None,
},
]
data = CollectedData(model_metric_snapshots=snapshots)
section = build_model_quality_section(data)
w = section.windows[0]
assert w.win_rate is None
assert w.directional_accuracy is None
assert w.information_coefficient is None
assert w.calibration_error is None
assert w.brier_score is None
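Reviewer note: the model-quality tests imply a specific selection rule. No snapshots gives an empty list. Otherwise all three lookback windows are emitted, the first snapshot per window wins (the collector is said to order by `generated_at` DESC), and missing windows carry `None` metrics. A minimal sketch of that rule, using plain dicts instead of the real `ModelQualityWindow` model, with `select_windows_sketch` as a hypothetical name:

```python
from typing import Dict, List

LOOKBACKS = ("7d", "30d", "90d")
METRICS = (
    "win_rate",
    "directional_accuracy",
    "information_coefficient",
    "calibration_error",
    "brier_score",
)


def select_windows_sketch(snapshots: List[dict]) -> List[dict]:
    """Pick the first snapshot per lookback window; pad missing windows with None."""
    if not snapshots:
        return []  # the tests expect an empty list, not three empty windows
    latest: Dict[str, dict] = {}
    for snap in snapshots:
        # Input is assumed to be newest-first, so the first hit per window is kept.
        latest.setdefault(snap["lookback_window"], snap)
    windows = []
    for lookback in LOOKBACKS:
        snap = latest.get(lookback, {})
        window = {"lookback": lookback}
        window.update({metric: snap.get(metric) for metric in METRICS})
        windows.append(window)
    return windows
```

The sketch depends on the input ordering. If the collector ever stopped sorting by `generated_at`, a version that compares timestamps explicitly would be safer.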
@@ -0,0 +1,203 @@
"""Unit tests for AI summarizer.
Tests the deterministic fallback summary generation and chunk_data edge cases
from services.reporting.summarizer.
Requirements validated: 2.2, 2.6
"""
from __future__ import annotations
from services.reporting.summarizer import build_deterministic_summary, chunk_data
# ═══════════════════════════════════════════════════════════════════════
# 1. chunk_data — edge cases
# Requirements validated: 2.2
# ═══════════════════════════════════════════════════════════════════════
class TestChunkDataEdgeCases:
"""Tests for chunk_data edge cases."""
def test_empty_input_returns_single_empty_chunk(self) -> None:
"""Empty input produces exactly one empty-string chunk."""
result = chunk_data("", max_chars=100)
assert result == [""]
def test_single_character_returns_one_chunk(self) -> None:
"""A single character fits in one chunk."""
result = chunk_data("x", max_chars=100)
assert result == ["x"]
def test_exactly_at_limit_returns_one_chunk(self) -> None:
"""A string exactly at the limit fits in one chunk."""
data = "a" * 50
result = chunk_data(data, max_chars=50)
assert result == [data]
def test_one_char_over_limit_with_newline_returns_two_chunks(self) -> None:
"""A string one char over the limit (with a newline) splits into two chunks."""
# 25 chars + newline + 25 chars = 51 chars total, limit=50
data = "a" * 25 + "\n" + "b" * 25
result = chunk_data(data, max_chars=50)
assert len(result) == 2
# First chunk: "aaa...a\n" (26 chars), second chunk: "bbb...b" (25 chars)
assert result[0] == "a" * 25 + "\n"
assert result[1] == "b" * 25
# Round-trip: concatenation reconstructs original
assert "".join(result) == data
def test_no_newlines_in_long_string_returns_one_chunk(self) -> None:
"""A long string with no newlines is never broken mid-line — stays as one chunk."""
data = "x" * 200
result = chunk_data(data, max_chars=50)
# No newlines means no split points, so the entire string is one chunk
assert result == [data]
def test_multiple_newlines_proper_splitting(self) -> None:
"""Multiple newlines produce proper splitting at line boundaries."""
# Three lines: two of 30 chars (including the newline) and a final one of 29: "aaa...\n" "bbb...\n" "ccc..."
line_a = "a" * 29 + "\n" # 30 chars
line_b = "b" * 29 + "\n" # 30 chars
line_c = "c" * 29 # 29 chars
data = line_a + line_b + line_c # 89 chars total
result = chunk_data(data, max_chars=60)
# First chunk: line_a + line_b = 60 chars (exactly at limit)
# Second chunk: line_c = 29 chars
assert len(result) == 2
assert result[0] == line_a + line_b
assert result[1] == line_c
assert "".join(result) == data
def test_round_trip_concatenation(self) -> None:
"""Concatenating all chunks reconstructs the original string."""
data = "line1\nline2\nline3\nline4\n"
result = chunk_data(data, max_chars=12)
assert "".join(result) == data
def test_max_chars_one(self) -> None:
"""With max_chars=1, each line-segment becomes its own chunk."""
data = "a\nb"
result = chunk_data(data, max_chars=1)
# "a\n" is 2 chars but no split point within it, so it's one chunk
# "b" is 1 char, another chunk
assert "".join(result) == data
assert len(result) >= 2
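Reviewer note: the edge cases above amount to a greedy, line-boundary chunker: lines are never split, a chunk may exceed `max_chars` only when a single line is longer than the limit, and concatenating the chunks reproduces the input. The block below is a sketch that satisfies those cases. It is not the `chunk_data` from `services.reporting.summarizer`, and `chunk_by_lines_sketch` is a hypothetical name.

```python
from typing import List


def chunk_by_lines_sketch(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole lines into chunks of at most max_chars characters."""
    if not text:
        return [""]  # the tests expect one empty chunk for empty input
    chunks: List[str] = []
    current = ""
    # keepends=True keeps each line terminator, so "".join(chunks) == text.
    # Note: splitlines also breaks on terminators other than "\n".
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = line
        else:
            current += line  # an oversized single line stays whole
    if current:
        chunks.append(current)
    return chunks
```

Keeping oversized lines intact trades a hard size guarantee for never cutting a record in half. A caller that needs a strict limit would have to handle that case separately.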
# ═══════════════════════════════════════════════════════════════════════
# 2. build_deterministic_summary — section type templates
# Requirements validated: 2.6
# ═══════════════════════════════════════════════════════════════════════
class TestBuildDeterministicSummary:
"""Tests for build_deterministic_summary with each section type."""
def test_pnl_section(self) -> None:
"""P&L section uses the pnl template with realized_pnl, unrealized_pnl, etc."""
data = {
"realized_pnl": 125.50,
"unrealized_pnl": -30.20,
"daily_return": 1.2,
"win_rate": 72.7,
}
result = build_deterministic_summary("pnl", data)
assert "125.5" in result
assert "-30.2" in result
assert "1.2" in result
assert "72.7" in result
assert result.startswith("P&L Summary:")
def test_recommendation_accuracy_section(self) -> None:
"""Recommendation accuracy section uses the template with total_evaluated, act_count, etc."""
data = {
"total_evaluated": 15,
"act_count": 8,
"acted_win_rate": 75.0,
"skip_count": 7,
"avg_confidence_acted": 0.72,
"avg_confidence_skipped": 0.48,
}
result = build_deterministic_summary("recommendation_accuracy", data)
assert "15" in result
assert "8" in result
assert "75" in result
assert "7" in result
assert result.startswith("Recommendation Accuracy:")
def test_position_performance_section(self) -> None:
"""Position performance section uses the template with position count."""
data = {
"positions": [
{"ticker": "AAPL", "pnl": 68.0},
{"ticker": "MSFT", "pnl": -12.0},
{"ticker": "GOOG", "pnl": 25.0},
],
}
result = build_deterministic_summary("position_performance", data)
assert "3" in result
assert "Position Performance:" in result
def test_position_performance_empty_positions(self) -> None:
"""Position performance with no positions reports 0."""
data = {"positions": []}
result = build_deterministic_summary("position_performance", data)
assert "0" in result
def test_risk_metrics_section(self) -> None:
"""Risk metrics section uses the template with risk_tier, portfolio_heat, etc."""
data = {
"current_risk_tier": "moderate",
"portfolio_heat": 0.12,
"max_drawdown": 0.08,
"current_drawdown_pct": 3.0,
"reserve_pool_balance": 450.00,
"circuit_breaker_event_count": 1,
}
result = build_deterministic_summary("risk_metrics", data)
assert "moderate" in result
assert "0.12" in result
assert "0.08" in result
assert "3" in result
assert "450" in result
assert "1" in result
assert result.startswith("Risk Metrics:")
def test_model_quality_section(self) -> None:
"""Model quality section uses the template with window count."""
data = {
"windows": [
{"lookback": "7d"},
{"lookback": "30d"},
{"lookback": "90d"},
],
}
result = build_deterministic_summary("model_quality", data)
assert "3" in result
assert "Model Quality:" in result
def test_model_quality_no_windows(self) -> None:
"""Model quality with no windows reports 0."""
data = {"windows": []}
result = build_deterministic_summary("model_quality", data)
assert "0" in result
def test_unknown_section_generic_fallback(self) -> None:
"""An unknown section name produces a generic fallback summary."""
data = {"metric_a": 1, "metric_b": 2, "metric_c": 3}
result = build_deterministic_summary("unknown_section", data)
assert "unknown_section" in result
assert "3 metrics reported" in result
def test_unknown_section_empty_data(self) -> None:
"""An unknown section with empty data reports 0 metrics."""
result = build_deterministic_summary("totally_new", {})
assert "totally_new" in result
assert "0 metrics reported" in result
def test_pnl_missing_key_falls_back(self) -> None:
"""P&L template with missing keys falls back to an error message."""
data = {"realized_pnl": 100.0} # missing other keys
result = build_deterministic_summary("pnl", data)
# Should fall back to the error message since template.format() will raise KeyError
assert "template formatting failed" in result
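Reviewer note: the summary tests suggest a template lookup with two fallbacks: a generic line for unknown sections and an error line when `str.format` raises `KeyError`. A minimal sketch of that shape is below. The template wording, the `_TEMPLATES_SKETCH` table, and the function name are invented for illustration. Only the `P&L Summary:` prefix and the `metrics reported` and `template formatting failed` phrases come from the assertions.

```python
_TEMPLATES_SKETCH = {
    # Hypothetical wording; only the "P&L Summary:" prefix is taken from the tests.
    "pnl": (
        "P&L Summary: realized {realized_pnl}, unrealized {unrealized_pnl}, "
        "daily return {daily_return}%, win rate {win_rate}%."
    ),
}


def deterministic_summary_sketch(section: str, data: dict) -> str:
    """Render a fixed template for a section, with two fallbacks."""
    template = _TEMPLATES_SKETCH.get(section)
    if template is None:
        # Unknown section: report only how many metrics were supplied.
        return f"{section}: {len(data)} metrics reported."
    try:
        return template.format(**data)
    except KeyError as exc:
        # A key the template needs is absent from the data.
        return f"{section}: template formatting failed (missing {exc})."
```

Catching `KeyError` keeps the fallback path deterministic, which matters because this function is itself the fallback for the AI summarizer.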
@@ -0,0 +1,551 @@
"""Unit tests for report validator.
Tests the validation functions from services.reporting.validator with
specific discrepancy scenarios, boundary cases, and edge cases.
Requirements validated: 4.1, 4.2, 4.3, 4.4
"""
from __future__ import annotations
from datetime import date, datetime, timezone
from services.reporting.models import (
ModelQualitySection,
ModelQualityWindow,
PLSection,
PositionPerformanceSection,
RecommendationAccuracySection,
ReportData,
ReportType,
RiskMetricsSection,
ValidationStatus,
)
from services.reporting.validator import (
_check_discrepancy,
compute_validation_status,
validate_model_quality,
validate_recommendation_accuracy,
)
# ── Helpers ──────────────────────────────────────────────────────────────
def _make_report(**overrides: object) -> ReportData:
"""Build a minimal ReportData with sensible defaults."""
defaults: dict = {
"pnl": PLSection(
realized_pnl=0.0,
unrealized_pnl=0.0,
daily_return=0.0,
cumulative_return=0.0,
win_count=0,
loss_count=0,
win_rate=0.0,
profit_factor=0.0,
sharpe_ratio=0.0,
),
"recommendation_accuracy": RecommendationAccuracySection(
total_evaluated=0,
act_count=0,
skip_count=0,
acted_win_rate=0.0,
avg_confidence_acted=0.0,
avg_confidence_skipped=0.0,
),
"position_performance": PositionPerformanceSection(),
"risk_metrics": RiskMetricsSection(
current_risk_tier="moderate",
portfolio_heat=0.0,
max_drawdown=0.0,
current_drawdown_pct=0.0,
reserve_pool_balance=0.0,
circuit_breaker_event_count=0,
),
"model_quality": ModelQualitySection(),
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
"period_start": date(2025, 1, 15),
"period_end": date(2025, 1, 15),
"report_type": ReportType.DAILY,
}
defaults.update(overrides)
return ReportData(**defaults)
# ═══════════════════════════════════════════════════════════════════════
# 1. _check_discrepancy — boundary tests
# Requirements validated: 4.1, 4.2, 4.3
# ═══════════════════════════════════════════════════════════════════════
class TestCheckDiscrepancy:
    """Tests for _check_discrepancy boundary and edge cases."""

    def test_exactly_5_percent_no_warning(self) -> None:
        """Exactly 5% discrepancy does NOT trigger a warning (threshold is >5%)."""
        # snapshot=100, computed=105 → |105-100|/100*100 = 5.0%
        result = _check_discrepancy("test_field", 105.0, 100.0)
        assert result is None

    def test_just_above_5_percent_triggers_warning(self) -> None:
        """5.1% discrepancy triggers a warning."""
        # snapshot=100, computed=105.1 → |105.1-100|/100*100 = 5.1%
        result = _check_discrepancy("test_field", 105.1, 100.0)
        assert result is not None
        assert result.field_name == "test_field"
        assert result.computed_value == 105.1
        assert result.snapshot_value == 100.0
        assert abs(result.pct_difference - 5.1) < 0.01

    def test_snapshot_zero_computed_nonzero_warns(self) -> None:
        """snapshot=0 with computed≠0 → 100% discrepancy → warning."""
        result = _check_discrepancy("test_field", 42.0, 0.0)
        assert result is not None
        assert result.pct_difference == 100.0

    def test_both_zero_no_warning(self) -> None:
        """Both snapshot=0 and computed=0 → no warning."""
        result = _check_discrepancy("test_field", 0.0, 0.0)
        assert result is None

    def test_large_discrepancy(self) -> None:
        """A large discrepancy (50%) triggers a warning."""
        # snapshot=100, computed=150 → 50%
        result = _check_discrepancy("big_diff", 150.0, 100.0)
        assert result is not None
        assert abs(result.pct_difference - 50.0) < 0.01

    def test_small_discrepancy_no_warning(self) -> None:
        """A small discrepancy (1%) does not trigger a warning."""
        # snapshot=100, computed=101 → 1%
        result = _check_discrepancy("small_diff", 101.0, 100.0)
        assert result is None

    def test_computed_below_snapshot(self) -> None:
        """Discrepancy is detected when computed < snapshot too."""
        # snapshot=100, computed=94 → 6%
        result = _check_discrepancy("below", 94.0, 100.0)
        assert result is not None
        assert abs(result.pct_difference - 6.0) < 0.01

    def test_nan_computed_sanitized_to_zero(self) -> None:
        """NaN computed value is sanitized to 0.0 before comparison."""
        result = _check_discrepancy("nan_field", float("nan"), 100.0)
        # sanitized computed=0.0, snapshot=100 → 100% discrepancy
        assert result is not None
        assert result.computed_value == 0.0
        assert result.pct_difference == 100.0

    def test_inf_computed_sanitized_to_zero(self) -> None:
        """Infinity computed value is sanitized to 0.0 before comparison."""
        result = _check_discrepancy("inf_field", float("inf"), 100.0)
        # sanitized computed=0.0, snapshot=100 → 100% discrepancy (same as the NaN case)
        assert result is not None
        assert result.computed_value == 0.0
        assert result.pct_difference == 100.0

    def test_snapshot_zero_computed_zero_small(self) -> None:
        """snapshot=0.0 and computed=0.0 exactly → no warning."""
        result = _check_discrepancy("zero_zero", 0.0, 0.0)
        assert result is None
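# Illustrative sketch only, not the module under test: one way the behaviour
# pinned down by TestCheckDiscrepancy could be implemented. The names
# `check_discrepancy_sketch`, `WarningSketch` and `THRESHOLD_PCT` are
# hypothetical, and dividing by abs(snapshot) is an assumption (the tests
# above only cover non-negative snapshots).

```python
from __future__ import annotations

import math
from dataclasses import dataclass

# Assumed from the tests: a warning needs strictly more than 5%.
THRESHOLD_PCT = 5.0


@dataclass
class WarningSketch:
    """Hypothetical stand-in for the real warning model."""

    field_name: str
    computed_value: float
    snapshot_value: float
    pct_difference: float


def check_discrepancy_sketch(
    field_name: str, computed: float, snapshot: float
) -> WarningSketch | None:
    """Return a warning when computed and snapshot differ by more than 5%."""
    # NaN and +/-inf are replaced by 0.0 before comparing.
    if not math.isfinite(computed):
        computed = 0.0

    if snapshot == 0.0:
        # No baseline to divide by: equal zeros agree, anything else is 100%.
        if computed == 0.0:
            return None
        pct = 100.0
    else:
        pct = abs(computed - snapshot) / abs(snapshot) * 100.0

    # Strict comparison: exactly 5% does not warn.
    if pct <= THRESHOLD_PCT:
        return None
    return WarningSketch(field_name, computed, snapshot, pct)
```

# The strict `>` comparison is what makes the 5.0% boundary test pass while the
# 5.1% test warns; the zero-snapshot branch avoids a division by zero.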
# ═══════════════════════════════════════════════════════════════════════
# 2. validate_recommendation_accuracy
# Requirements validated: 4.1
# ═══════════════════════════════════════════════════════════════════════
class TestValidateRecommendationAccuracy:
    """Tests for validate_recommendation_accuracy."""

    def test_matching_data_no_warnings(self) -> None:
        """When section win rate matches prediction outcomes, no warnings."""
        # 2 out of 4 profitable → 0.5 win rate
        section = RecommendationAccuracySection(
            total_evaluated=4,
            act_count=4,
            skip_count=0,
            acted_win_rate=0.5,
            avg_confidence_acted=0.7,
            avg_confidence_skipped=0.0,
        )
        outcomes = [
            {"profitable": True},
            {"profitable": False},
            {"profitable": True},
            {"profitable": False},
        ]
        warnings = validate_recommendation_accuracy(section, outcomes)
        assert warnings == []

    def test_discrepancy_triggers_warning(self) -> None:
        """When section win rate differs >5% from outcomes, a warning is raised."""
        # outcomes: 1/2 profitable → 0.5, section says 0.8 → 60% discrepancy
        section = RecommendationAccuracySection(
            total_evaluated=2,
            act_count=2,
            skip_count=0,
            acted_win_rate=0.8,
            avg_confidence_acted=0.7,
            avg_confidence_skipped=0.0,
        )
        outcomes = [
            {"profitable": True},
            {"profitable": False},
        ]
        warnings = validate_recommendation_accuracy(section, outcomes)
        assert len(warnings) == 1
        assert warnings[0].field_name == "acted_win_rate"

    def test_no_outcomes_returns_empty(self) -> None:
        """When there are no prediction outcomes, validation is skipped."""
        section = RecommendationAccuracySection(
            total_evaluated=5,
            act_count=3,
            skip_count=2,
            acted_win_rate=0.6,
            avg_confidence_acted=0.7,
            avg_confidence_skipped=0.4,
        )
        warnings = validate_recommendation_accuracy(section, [])
        assert warnings == []

    def test_all_profitable_matching(self) -> None:
        """All outcomes profitable and section says 1.0 → no warning."""
        section = RecommendationAccuracySection(
            total_evaluated=3,
            act_count=3,
            skip_count=0,
            acted_win_rate=1.0,
            avg_confidence_acted=0.9,
            avg_confidence_skipped=0.0,
        )
        outcomes = [
            {"profitable": True},
            {"profitable": True},
            {"profitable": True},
        ]
        warnings = validate_recommendation_accuracy(section, outcomes)
        assert warnings == []
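# Illustrative sketch only: the arithmetic the tests above rely on, written as
# two hypothetical helpers (`win_rate_from_outcomes`, `relative_gap_pct`).
# It assumes the win rate is the share of outcome dicts whose "profitable"
# flag is truthy, and that the outcome-derived rate serves as the baseline.

```python
from __future__ import annotations


def win_rate_from_outcomes(outcomes: list[dict]) -> float | None:
    """Share of profitable outcomes, or None when there is nothing to check."""
    if not outcomes:
        return None
    wins = sum(1 for outcome in outcomes if outcome.get("profitable"))
    return wins / len(outcomes)


def relative_gap_pct(reported: float, baseline: float) -> float:
    """Percentage difference of `reported` relative to `baseline`."""
    if baseline == 0.0:
        # Assumed zero-baseline rule: equal zeros agree, otherwise 100%.
        return 0.0 if reported == 0.0 else 100.0
    return abs(reported - baseline) / abs(baseline) * 100.0


outcomes = [{"profitable": True}, {"profitable": False}]
baseline = win_rate_from_outcomes(outcomes)  # 1 of 2 profitable
gap = relative_gap_pct(0.8, baseline)        # section reports 0.8
print(f"baseline={baseline}, gap={gap:.1f}%")
```

# With a reported rate of 0.8 against a recomputed 0.5, the gap is about 60%,
# well above the 5% threshold, which is why the discrepancy test expects
# exactly one warning.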
# ═══════════════════════════════════════════════════════════════════════
# 3. validate_model_quality
# Requirements validated: 4.2, 4.3
# ═══════════════════════════════════════════════════════════════════════
class TestValidateModelQuality:
    """Tests for validate_model_quality."""

    def test_matching_data_no_warnings(self) -> None:
        """When section metrics match snapshots, no warnings are produced."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="7d",
                    win_rate=0.65,
                    directional_accuracy=0.62,
                    information_coefficient=0.08,
                    calibration_error=0.12,
                    brier_score=0.22,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        assert warnings == []

    def test_discrepancy_triggers_warnings(self) -> None:
        """When section metrics differ >5% from snapshots, warnings are raised."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="7d",
                    win_rate=0.80,  # snapshot says 0.65 → ~23% off
                    directional_accuracy=0.62,
                    information_coefficient=0.08,
                    calibration_error=0.12,
                    brier_score=0.22,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        assert len(warnings) == 1
        assert warnings[0].field_name == "7d_win_rate"

    def test_null_snapshot_value_skipped(self) -> None:
        """When a snapshot metric is NULL (None), that metric is skipped."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="7d",
                    win_rate=0.65,
                    directional_accuracy=0.62,
                    information_coefficient=0.08,
                    calibration_error=0.12,
                    brier_score=0.22,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": None,  # NULL → skip
                "directional_accuracy": None,
                "information_coefficient": None,
                "calibration_error": None,
                "brier_score": None,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        assert warnings == []

    def test_no_snapshots_returns_empty(self) -> None:
        """When there are no metric snapshots, validation is skipped."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="7d",
                    win_rate=0.65,
                    directional_accuracy=0.62,
                    information_coefficient=0.08,
                    calibration_error=0.12,
                    brier_score=0.22,
                ),
            ],
        )
        warnings = validate_model_quality(section, [])
        assert warnings == []

    def test_multiple_windows_validated(self) -> None:
        """Validation runs across all lookback windows."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="7d",
                    win_rate=0.65,
                    directional_accuracy=0.62,
                    information_coefficient=0.08,
                    calibration_error=0.12,
                    brier_score=0.22,
                ),
                ModelQualityWindow(
                    lookback="30d",
                    win_rate=0.90,  # snapshot says 0.60 → 50% off
                    directional_accuracy=0.58,
                    information_coefficient=0.06,
                    calibration_error=0.15,
                    brier_score=0.25,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
            {
                "lookback_window": "30d",
                "win_rate": 0.60,
                "directional_accuracy": 0.58,
                "information_coefficient": 0.06,
                "calibration_error": 0.15,
                "brier_score": 0.25,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        # Only 30d_win_rate should be flagged
        assert len(warnings) == 1
        assert warnings[0].field_name == "30d_win_rate"

    def test_null_section_value_skipped(self) -> None:
        """When a section metric is None, that metric is skipped."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="7d",
                    win_rate=None,
                    directional_accuracy=None,
                    information_coefficient=None,
                    calibration_error=None,
                    brier_score=None,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        assert warnings == []

    def test_no_matching_window_in_snapshots(self) -> None:
        """When section has a window not in snapshots, it is skipped."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="90d",
                    win_rate=0.55,
                    directional_accuracy=0.53,
                    information_coefficient=0.04,
                    calibration_error=0.18,
                    brier_score=0.28,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        assert warnings == []
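# Illustrative sketch only: the per-window comparison these tests describe,
# using plain dicts in place of ModelQualityWindow. `flag_window_metrics` and
# `METRICS` are hypothetical names; the "<lookback>_<metric>" field naming is
# taken from the assertions above ("7d_win_rate", "30d_win_rate").

```python
from __future__ import annotations

METRICS = (
    "win_rate",
    "directional_accuracy",
    "information_coefficient",
    "calibration_error",
    "brier_score",
)


def flag_window_metrics(
    windows: list[dict], snapshots: list[dict], threshold_pct: float = 5.0
) -> list[str]:
    """Return field names whose reported value is >threshold_pct off the snapshot."""
    # Index snapshots by lookback window so each section window is a dict lookup.
    by_window = {snap["lookback_window"]: snap for snap in snapshots}
    flagged: list[str] = []
    for window in windows:
        snap = by_window.get(window["lookback"])
        if snap is None:
            continue  # window has no snapshot: skipped
        for metric in METRICS:
            reported, expected = window.get(metric), snap.get(metric)
            if reported is None or expected is None:
                continue  # NULL on either side: skipped
            if expected == 0:
                pct = 0.0 if reported == 0 else 100.0
            else:
                pct = abs(reported - expected) / abs(expected) * 100.0
            if pct > threshold_pct:
                flagged.append(f"{window['lookback']}_{metric}")
    return flagged


windows = [
    {"lookback": "7d", "win_rate": 0.65, "brier_score": 0.22},
    {"lookback": "30d", "win_rate": 0.90, "brier_score": 0.25},
]
snapshots = [
    {"lookback_window": "7d", "win_rate": 0.65, "brier_score": 0.22},
    {"lookback_window": "30d", "win_rate": 0.60, "brier_score": 0.25},
]
print(flag_window_metrics(windows, snapshots))  # ['30d_win_rate']
```

# Skipping missing windows and None values, rather than treating them as zero,
# keeps absent data from being reported as a 100% discrepancy.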
# ═══════════════════════════════════════════════════════════════════════
# 4. compute_validation_status
# Requirements validated: 4.4
# ═══════════════════════════════════════════════════════════════════════
class TestComputeValidationStatus:
    """Tests for compute_validation_status."""

    def test_no_warnings_returns_passed(self) -> None:
        """When no sections have warnings, status is PASSED."""
        report = _make_report()
        status = compute_validation_status(report)
        assert status == ValidationStatus.PASSED

    def test_pnl_warnings_returns_warnings(self) -> None:
        """When P&L section has warnings, status is WARNINGS."""
        from services.reporting.models import ValidationWarning

        report = _make_report(
            pnl=PLSection(
                realized_pnl=0.0,
                unrealized_pnl=0.0,
                daily_return=0.0,
                cumulative_return=0.0,
                win_count=0,
                loss_count=0,
                win_rate=0.0,
                profit_factor=0.0,
                sharpe_ratio=0.0,
                validation_warnings=[
                    ValidationWarning(
                        field_name="test",
                        computed_value=1.0,
                        snapshot_value=0.5,
                        pct_difference=100.0,
                    ),
                ],
            ),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS

    def test_recommendation_accuracy_warnings_returns_warnings(self) -> None:
        """When recommendation accuracy section has warnings, status is WARNINGS."""
        from services.reporting.models import ValidationWarning

        report = _make_report(
            recommendation_accuracy=RecommendationAccuracySection(
                total_evaluated=0,
                act_count=0,
                skip_count=0,
                acted_win_rate=0.0,
                avg_confidence_acted=0.0,
                avg_confidence_skipped=0.0,
                validation_warnings=[
                    ValidationWarning(
                        field_name="acted_win_rate",
                        computed_value=0.8,
                        snapshot_value=0.5,
                        pct_difference=60.0,
                    ),
                ],
            ),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS

    def test_model_quality_warnings_returns_warnings(self) -> None:
        """When model quality section has warnings, status is WARNINGS."""
        from services.reporting.models import ValidationWarning

        report = _make_report(
            model_quality=ModelQualitySection(
                validation_warnings=[
                    ValidationWarning(
                        field_name="7d_win_rate",
                        computed_value=0.9,
                        snapshot_value=0.65,
                        pct_difference=38.46,
                    ),
                ],
            ),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS

    def test_multiple_sections_with_warnings(self) -> None:
        """When multiple sections have warnings, status is still WARNINGS."""
        from services.reporting.models import ValidationWarning

        w = ValidationWarning(
            field_name="x",
            computed_value=1.0,
            snapshot_value=0.0,
            pct_difference=100.0,
        )
        report = _make_report(
            pnl=PLSection(
                realized_pnl=0.0,
                unrealized_pnl=0.0,
                daily_return=0.0,
                cumulative_return=0.0,
                win_count=0,
                loss_count=0,
                win_rate=0.0,
                profit_factor=0.0,
                sharpe_ratio=0.0,
                validation_warnings=[w],
            ),
            model_quality=ModelQualitySection(validation_warnings=[w]),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS
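# Illustrative sketch only: the aggregation rule these tests check, reduced to
# a mapping of section name to its list of warnings. `StatusSketch` and
# `status_from_sections` are hypothetical; the real ValidationStatus members
# may carry different values, and the real function takes a report object.

```python
from __future__ import annotations

from enum import Enum


class StatusSketch(Enum):
    """Hypothetical two-state status mirroring PASSED / WARNINGS."""

    PASSED = "passed"
    WARNINGS = "warnings"


def status_from_sections(section_warnings: dict[str, list]) -> StatusSketch:
    """WARNINGS if any section carries at least one warning, else PASSED."""
    if any(warnings for warnings in section_warnings.values()):
        return StatusSketch.WARNINGS
    return StatusSketch.PASSED


clean = {"pnl": [], "recommendation_accuracy": [], "model_quality": []}
flagged = {"pnl": ["x"], "recommendation_accuracy": [], "model_quality": ["x"]}
print(status_from_sections(clean).name)    # PASSED
print(status_from_sections(flagged).name)  # WARNINGS
```

# The status does not escalate with the number of affected sections: one
# warning or several produce the same WARNINGS result, as the last test expects.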