feat: trading feedback engine — periodic performance reports with AI summarization
ci/woodpecker/push/test Pipeline was successful
ci/woodpecker/push/build-2 Pipeline was successful
ci/woodpecker/push/build-3 Pipeline was successful
ci/woodpecker/push/build-1 Pipeline was successful
ci/woodpecker/push/finalize Pipeline was successful
Build and Push / lint-and-test (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.adapters.broker_adapter name:broker-adapter]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.aggregation.worker name:aggregation]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.extractor.worker name:extractor]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.ingestion.worker name:ingestion]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.lake_publisher.worker name:lake-publisher]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.parser.worker name:parser]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.recommendation.worker name:recommendation]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.scheduler.app name:scheduler]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.api.app:app --host 0.0.0.0 --port 8000 name:query-api]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.risk.app:app --host 0.0.0.0 --port 8000 name:risk]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.symbol_registry.app:app --host 0.0.0.0 --port 8000 name:symbol-registry]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.trading.app:app --host 0.0.0.0 --port 8000 name:trading-engine]) (push) Has been cancelled
Build and Push / build-dashboard (push) Has been cancelled
Build and Push / build-superset (push) Has been cancelled
Build and Push / integration-test (push) Has been cancelled
Build and Push / beta-gate (push) Has been cancelled
- Migration 038: trading_reports table + report-summarizer agent seed
- 6 reporting modules: models, collector, sections, validator, summarizer, generator
- API endpoints: GET /api/reports (paginated, filterable), GET /api/reports/{id}
- Frontend hooks: useReports, useReport with TanStack Query
- Scheduler: daily (after 16:30 ET) and weekly (Saturday) report triggers
- Redis queue consumer for async report generation with retry/dedup
- 5 property-based tests (chunking, serialization, validation, accuracy, deltas)
- 109 unit/integration tests across all modules
- 6 frontend hook tests with MSW mocks
@@ -0,0 +1,110 @@
# Feature: trading-feedback-engine, Property 1: Chunking round-trip and size constraint
"""Property-based tests for report data chunking.

Feature: trading-feedback-engine

Tests the chunking round-trip and size constraint property from the design
specification: for any input string, splitting it into chunks with a maximum
size limit produces chunks where (a) every chunk is ≤ the size limit in
characters (for chunks that don't contain a single oversized line), (b) no
chunk is empty (except when the input itself is empty, which produces exactly
one empty chunk), and (c) concatenating all chunks in order reconstructs the
original input string.
"""
from __future__ import annotations

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.summarizer import chunk_data

# ---------------------------------------------------------------------------
# Property 1: Chunking Round-Trip and Size Constraint
# Validates: Requirements 2.2
# ---------------------------------------------------------------------------


@given(
    text=st.text(),
    max_chars=st.integers(min_value=1, max_value=10000),
)
@settings(max_examples=100)
def test_chunk_data_round_trip(text: str, max_chars: int) -> None:
    """**Validates: Requirements 2.2**

    For any input string and any max_chars ≥ 1, concatenating all chunks
    produced by chunk_data SHALL reconstruct the original input string
    exactly (round-trip property).
    """
    chunks = chunk_data(text, max_chars)
    reconstructed = "".join(chunks)
    assert reconstructed == text, (
        f"Round-trip failed: concatenation of {len(chunks)} chunks does not "
        f"equal original input.\n"
        f" original length: {len(text)}\n"
        f" reconstructed length: {len(reconstructed)}\n"
        f" max_chars: {max_chars}"
    )


@given(
    text=st.text(),
    max_chars=st.integers(min_value=1, max_value=10000),
)
@settings(max_examples=100)
def test_chunk_data_no_empty_chunks(text: str, max_chars: int) -> None:
    """**Validates: Requirements 2.2**

    For any input string and any max_chars ≥ 1, chunk_data SHALL produce
    no empty chunks, except when the input itself is empty, in which case
    it SHALL produce exactly one empty chunk.
    """
    chunks = chunk_data(text, max_chars)

    if text == "":
        assert chunks == [""], (
            f"Empty input should produce exactly [''], got {chunks!r}"
        )
    else:
        for i, chunk in enumerate(chunks):
            assert chunk != "", (
                f"Chunk {i} is empty for non-empty input.\n"
                f" input length: {len(text)}\n"
                f" max_chars: {max_chars}\n"
                f" total chunks: {len(chunks)}"
            )


@given(
    text=st.text(),
    max_chars=st.integers(min_value=1, max_value=10000),
)
@settings(max_examples=100)
def test_chunk_data_size_constraint(text: str, max_chars: int) -> None:
    """**Validates: Requirements 2.2**

    For any input string and any max_chars ≥ 1, every chunk produced by
    chunk_data SHALL be ≤ max_chars in length, UNLESS the chunk contains
    a single line that by itself exceeds max_chars (since chunk_data never
    breaks mid-line, such a line is emitted as its own chunk).

    A chunk is considered "oversized due to a single long line" when it
    consists of exactly one segment (a line with its trailing newline, or
    the final line without one) whose length exceeds max_chars.
    """
    chunks = chunk_data(text, max_chars)

    for i, chunk in enumerate(chunks):
        if len(chunk) > max_chars:
            # This chunk exceeds the limit. It must be because it contains
            # a single line that is itself longer than max_chars.
            # A single-segment chunk has at most one newline (at the end).
            lines_in_chunk = chunk.split("\n")
            # If the chunk ends with \n, split produces a trailing empty string
            non_empty_lines = [ln for ln in lines_in_chunk if ln]
            assert len(non_empty_lines) <= 1, (
                f"Chunk {i} exceeds max_chars={max_chars} "
                f"(len={len(chunk)}) but contains multiple non-empty lines, "
                f"which should not happen.\n"
                f" lines: {non_empty_lines!r}"
            )
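The three properties tested above (round-trip, no empty chunks, size limit except for a single oversized line) constrain `chunk_data` quite tightly. The following is a minimal sketch of a line-preserving chunker that would satisfy them. It is an illustration only: `chunk_data_sketch` is a hypothetical name, and the real implementation in `services.reporting.summarizer` is not shown in this diff and may differ.

```python
from __future__ import annotations


def chunk_data_sketch(text: str, max_chars: int) -> list[str]:
    """Hypothetical chunker: packs whole lines into chunks of <= max_chars."""
    # Empty input yields exactly one empty chunk, as the tests require.
    if text == "":
        return [""]

    # Split on "\n" only and re-attach the newline to every segment but the
    # last, so that "".join(segments) == text.
    parts = text.split("\n")
    segments = [part + "\n" for part in parts[:-1]]
    if parts[-1]:
        segments.append(parts[-1])

    chunks: list[str] = []
    current = ""
    for segment in segments:
        # Flush before the limit would be exceeded. A line is never split,
        # so a single oversized line ends up alone in its own chunk.
        if current and len(current) + len(segment) > max_chars:
            chunks.append(current)
            current = ""
        current += segment
    if current:
        chunks.append(current)
    return chunks
```

Because segments are only ever appended whole, concatenating the result always reproduces the input, and a chunk can exceed `max_chars` only when it holds exactly one long line.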
@@ -0,0 +1,423 @@
# Feature: trading-feedback-engine, Property 4: Recommendation accuracy aggregation
# Feature: trading-feedback-engine, Property 5: Portfolio period-over-period delta computation
"""Property-based tests for report section builders.

Feature: trading-feedback-engine

Property 4 tests the recommendation accuracy aggregation property from the
design specification: for any non-empty list of trading decisions with
associated prediction outcomes, the computed acted_win_rate SHALL equal the
count of profitable outcomes divided by total acted outcomes with prediction
data, and all rate values SHALL be in [0.0, 1.0].

Property 5 tests the portfolio period-over-period delta computation property
from the design specification: for any two valid portfolio snapshots (current
and previous), the period-over-period deltas SHALL equal (current - previous)
for each field. When no previous snapshot exists, the deltas SHALL be zero.
"""
from __future__ import annotations

import uuid

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.collector import CollectedData
from services.reporting.sections import (
    build_pnl_section,
    build_recommendation_accuracy_section,
)

# ---------------------------------------------------------------------------
# Property 4: Recommendation Accuracy Aggregation
# Validates: Requirements 1.4
# ---------------------------------------------------------------------------

# Strategy: generate a list of unique tickers, then build matching
# trading_decisions, recommendations, and prediction_outcomes.

_ticker_strategy = st.text(
    alphabet=st.characters(whitelist_categories=("Lu",)),
    min_size=1,
    max_size=5,
)

_confidence_strategy = st.floats(
    min_value=0.0, max_value=1.0, allow_nan=False, allow_infinity=False,
)

_excess_return_strategy = st.floats(
    min_value=-1.0, max_value=1.0, allow_nan=False, allow_infinity=False,
)


@st.composite
def recommendation_accuracy_data(draw: st.DrawFn) -> tuple[CollectedData, dict]:
    """Generate CollectedData with matching trading decisions, recommendations,
    and prediction outcomes for testing recommendation accuracy.

    Returns (CollectedData, expected_values) where expected_values contains
    the independently computed expected results.
    """
    # Generate 1-20 trading decisions with unique tickers
    n = draw(st.integers(min_value=1, max_value=20))
    tickers = [draw(_ticker_strategy) for _ in range(n)]
    # Ensure unique tickers by appending index
    tickers = [f"{t}{i}" for i, t in enumerate(tickers)]

    decisions = draw(
        st.lists(
            st.sampled_from(["act", "skip"]),
            min_size=n,
            max_size=n,
        )
    )
    confidences = draw(
        st.lists(
            _confidence_strategy,
            min_size=n,
            max_size=n,
        )
    )
    profitable_flags = draw(
        st.lists(
            st.booleans(),
            min_size=n,
            max_size=n,
        )
    )
    direction_correct_flags = draw(
        st.lists(
            st.booleans(),
            min_size=n,
            max_size=n,
        )
    )
    excess_returns = draw(
        st.lists(
            _excess_return_strategy,
            min_size=n,
            max_size=n,
        )
    )

    trading_decisions = []
    recommendations = []
    prediction_outcomes = []

    # Track expected values
    exp_act_count = 0
    exp_skip_count = 0
    exp_acted_wins = 0
    exp_acted_with_outcome = 0
    exp_confidence_acted: list[float] = []
    exp_confidence_skipped: list[float] = []

    for i in range(n):
        rec_id = str(uuid.uuid4())
        ticker = tickers[i]
        decision = decisions[i]
        confidence = confidences[i]
        profitable = profitable_flags[i]
        direction_correct = direction_correct_flags[i]
        excess_return = excess_returns[i]

        trading_decisions.append(
            {
                "id": str(uuid.uuid4()),
                "recommendation_id": rec_id,
                "decision": decision,
                "ticker": ticker,
            }
        )
        recommendations.append(
            {
                "id": rec_id,
                "confidence": confidence,
            }
        )
        prediction_outcomes.append(
            {
                "ticker": ticker,
                "profitable": profitable,
                "direction_correct": direction_correct,
                "excess_return_vs_spy": excess_return,
            }
        )

        if decision == "act":
            exp_act_count += 1
            exp_confidence_acted.append(confidence)
            # Every acted decision has a matching prediction outcome by ticker
            exp_acted_with_outcome += 1
            if profitable:
                exp_acted_wins += 1
        else:
            exp_skip_count += 1
            exp_confidence_skipped.append(confidence)

    data = CollectedData(
        trading_decisions=trading_decisions,
        recommendations=recommendations,
        prediction_outcomes=prediction_outcomes,
    )

    exp_acted_win_rate = (
        (exp_acted_wins / exp_acted_with_outcome)
        if exp_acted_with_outcome > 0
        else 0.0
    )
    exp_avg_confidence_acted = (
        (sum(exp_confidence_acted) / len(exp_confidence_acted))
        if exp_confidence_acted
        else 0.0
    )
    exp_avg_confidence_skipped = (
        (sum(exp_confidence_skipped) / len(exp_confidence_skipped))
        if exp_confidence_skipped
        else 0.0
    )

    expected = {
        "total_evaluated": exp_act_count + exp_skip_count,
        "act_count": exp_act_count,
        "skip_count": exp_skip_count,
        "acted_win_rate": exp_acted_win_rate,
        "avg_confidence_acted": exp_avg_confidence_acted,
        "avg_confidence_skipped": exp_avg_confidence_skipped,
    }

    return data, expected


@given(data_and_expected=recommendation_accuracy_data())
@settings(max_examples=100)
def test_recommendation_accuracy_aggregation(
    data_and_expected: tuple[CollectedData, dict],
) -> None:
    """**Validates: Requirements 1.4**

    For any non-empty list of trading decisions with associated prediction
    outcomes, the computed acted_win_rate SHALL equal the count of profitable
    outcomes divided by total acted outcomes with prediction data, act/skip
    counts SHALL match, average confidence values SHALL match, and all rate
    values SHALL be in [0.0, 1.0].
    """
    data, expected = data_and_expected
    section = build_recommendation_accuracy_section(data)

    # Verify act/skip counts
    assert section.total_evaluated == expected["total_evaluated"], (
        f"total_evaluated mismatch: got {section.total_evaluated}, "
        f"expected {expected['total_evaluated']}"
    )
    assert section.act_count == expected["act_count"], (
        f"act_count mismatch: got {section.act_count}, "
        f"expected {expected['act_count']}"
    )
    assert section.skip_count == expected["skip_count"], (
        f"skip_count mismatch: got {section.skip_count}, "
        f"expected {expected['skip_count']}"
    )

    # Verify acted win rate
    assert abs(section.acted_win_rate - expected["acted_win_rate"]) < 1e-9, (
        f"acted_win_rate mismatch: got {section.acted_win_rate}, "
        f"expected {expected['acted_win_rate']}"
    )

    # Verify average confidence values
    assert abs(section.avg_confidence_acted - expected["avg_confidence_acted"]) < 1e-9, (
        f"avg_confidence_acted mismatch: got {section.avg_confidence_acted}, "
        f"expected {expected['avg_confidence_acted']}"
    )
    assert abs(section.avg_confidence_skipped - expected["avg_confidence_skipped"]) < 1e-9, (
        f"avg_confidence_skipped mismatch: got {section.avg_confidence_skipped}, "
        f"expected {expected['avg_confidence_skipped']}"
    )

    # All rate values must be in [0.0, 1.0]
    assert 0.0 <= section.acted_win_rate <= 1.0, (
        f"acted_win_rate out of range: {section.acted_win_rate}"
    )
    assert 0.0 <= section.avg_confidence_acted <= 1.0, (
        f"avg_confidence_acted out of range: {section.avg_confidence_acted}"
    )
    assert 0.0 <= section.avg_confidence_skipped <= 1.0, (
        f"avg_confidence_skipped out of range: {section.avg_confidence_skipped}"
    )


# ---------------------------------------------------------------------------
# Property 5: Portfolio Period-Over-Period Delta Computation
# Validates: Requirements 1.3
# ---------------------------------------------------------------------------

_non_negative_float = st.floats(
    min_value=0.0, max_value=1e8, allow_nan=False, allow_infinity=False,
)

_finite_float = st.floats(
    min_value=-1e6, max_value=1e6, allow_nan=False, allow_infinity=False,
)


@st.composite
def portfolio_snapshot_pair(draw: st.DrawFn) -> tuple[dict, dict]:
    """Generate a pair of portfolio snapshots (current, previous) with
    non-negative portfolio_value, active_pool, reserve_pool, and finite
    cumulative_return.
    """
    current = {
        "portfolio_value": draw(_non_negative_float),
        "active_pool": draw(_non_negative_float),
        "reserve_pool": draw(_non_negative_float),
        "cumulative_return": draw(_finite_float),
        "realized_pnl": draw(_finite_float),
        "unrealized_pnl": draw(_finite_float),
        "daily_return": draw(_finite_float),
        "win_count": draw(st.integers(min_value=0, max_value=10000)),
        "loss_count": draw(st.integers(min_value=0, max_value=10000)),
        "win_rate": draw(
            st.floats(
                min_value=0.0, max_value=1.0,
                allow_nan=False, allow_infinity=False,
            )
        ),
        "sharpe_ratio": draw(_finite_float),
    }
    previous = {
        "portfolio_value": draw(_non_negative_float),
        "active_pool": draw(_non_negative_float),
        "reserve_pool": draw(_non_negative_float),
        "cumulative_return": draw(_finite_float),
        "realized_pnl": draw(_finite_float),
        "unrealized_pnl": draw(_finite_float),
        "daily_return": draw(_finite_float),
        "win_count": draw(st.integers(min_value=0, max_value=10000)),
        "loss_count": draw(st.integers(min_value=0, max_value=10000)),
        "win_rate": draw(
            st.floats(
                min_value=0.0, max_value=1.0,
                allow_nan=False, allow_infinity=False,
            )
        ),
        "sharpe_ratio": draw(_finite_float),
    }
    return current, previous


@given(snapshots=portfolio_snapshot_pair())
@settings(max_examples=100)
def test_portfolio_delta_with_both_snapshots(
    snapshots: tuple[dict, dict],
) -> None:
    """**Validates: Requirements 1.3**

    For any two valid portfolio snapshots (current and previous), the
    period-over-period deltas SHALL equal (current - previous) for
    portfolio_value, active_pool, reserve_pool, and cumulative_return.

    The build_pnl_section extracts values from the current snapshot.
    We verify that the delta between the current and previous section
    outputs matches (current - previous) for the fields the section
    exposes and this test checks: cumulative_return, realized_pnl, and
    unrealized_pnl.
    """
    current_snap, previous_snap = snapshots

    # Build sections from current and previous snapshots
    data_current = CollectedData(portfolio_snapshot=current_snap)
    data_previous = CollectedData(portfolio_snapshot=previous_snap)

    section_current = build_pnl_section(data_current)
    section_previous = build_pnl_section(data_previous)

    # Verify deltas: current section values - previous section values
    # should equal current snapshot values - previous snapshot values
    delta_cumulative = section_current.cumulative_return - section_previous.cumulative_return
    expected_delta_cumulative = (
        float(current_snap["cumulative_return"])
        - float(previous_snap["cumulative_return"])
    )
    assert abs(delta_cumulative - expected_delta_cumulative) < 1e-9, (
        f"cumulative_return delta mismatch: "
        f"got {delta_cumulative}, expected {expected_delta_cumulative}"
    )

    delta_realized = section_current.realized_pnl - section_previous.realized_pnl
    expected_delta_realized = (
        float(current_snap["realized_pnl"])
        - float(previous_snap["realized_pnl"])
    )
    assert abs(delta_realized - expected_delta_realized) < 1e-9, (
        f"realized_pnl delta mismatch: "
        f"got {delta_realized}, expected {expected_delta_realized}"
    )

    delta_unrealized = section_current.unrealized_pnl - section_previous.unrealized_pnl
    expected_delta_unrealized = (
        float(current_snap["unrealized_pnl"])
        - float(previous_snap["unrealized_pnl"])
    )
    assert abs(delta_unrealized - expected_delta_unrealized) < 1e-9, (
        f"unrealized_pnl delta mismatch: "
        f"got {delta_unrealized}, expected {expected_delta_unrealized}"
    )

    # Verify that section values faithfully reflect snapshot values
    assert abs(section_current.cumulative_return - float(current_snap["cumulative_return"])) < 1e-9
    assert abs(section_current.realized_pnl - float(current_snap["realized_pnl"])) < 1e-9
    assert abs(section_current.unrealized_pnl - float(current_snap["unrealized_pnl"])) < 1e-9
    assert abs(section_current.daily_return - float(current_snap["daily_return"])) < 1e-9
    assert abs(section_current.win_rate - float(current_snap["win_rate"])) < 1e-9


@given(
    portfolio_value=_non_negative_float,
    active_pool=_non_negative_float,
    reserve_pool=_non_negative_float,
    cumulative_return=_finite_float,
)
@settings(max_examples=100)
def test_portfolio_delta_no_previous_snapshot(
    portfolio_value: float,
    active_pool: float,
    reserve_pool: float,
    cumulative_return: float,
) -> None:
    """**Validates: Requirements 1.3**

    When no previous snapshot exists, the section SHALL use zero values
    for all fields (since portfolio_snapshot is None), meaning the deltas
    from a zero baseline are effectively zero.
    """
    # When portfolio_snapshot is None, build_pnl_section returns all zeros
    data_no_snapshot = CollectedData(portfolio_snapshot=None)
    section = build_pnl_section(data_no_snapshot)

    assert section.realized_pnl == 0.0, (
        f"Expected 0.0 realized_pnl with no snapshot, got {section.realized_pnl}"
    )
    assert section.unrealized_pnl == 0.0, (
        f"Expected 0.0 unrealized_pnl with no snapshot, got {section.unrealized_pnl}"
    )
    assert section.daily_return == 0.0, (
        f"Expected 0.0 daily_return with no snapshot, got {section.daily_return}"
    )
    assert section.cumulative_return == 0.0, (
        f"Expected 0.0 cumulative_return with no snapshot, got {section.cumulative_return}"
    )
    assert section.win_count == 0, (
        f"Expected 0 win_count with no snapshot, got {section.win_count}"
    )
    assert section.loss_count == 0, (
        f"Expected 0 loss_count with no snapshot, got {section.loss_count}"
    )
    assert section.win_rate == 0.0, (
        f"Expected 0.0 win_rate with no snapshot, got {section.win_rate}"
    )
    assert section.sharpe_ratio == 0.0, (
        f"Expected 0.0 sharpe_ratio with no snapshot, got {section.sharpe_ratio}"
    )
    assert section.profit_factor == 0.0, (
        f"Expected 0.0 profit_factor with no snapshot, got {section.profit_factor}"
    )
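Properties 4 and 5 describe two small computations: the acted win rate (profitable acted outcomes divided by acted decisions that have an outcome) and the per-field period-over-period delta (current minus previous, or zero when there is no previous snapshot). A rough sketch of both is given here under stated assumptions: the helper names `acted_win_rate_sketch` and `period_deltas_sketch` and the `DELTA_FIELDS` tuple are hypothetical, decisions are joined to outcomes by ticker as the test data generator does, and none of this is taken from `services.reporting.sections`.

```python
from __future__ import annotations

# Fields named in the Property 5 description (assumed set for this sketch).
DELTA_FIELDS = ("portfolio_value", "active_pool", "reserve_pool", "cumulative_return")


def acted_win_rate_sketch(decisions: list[dict], outcomes: list[dict]) -> float:
    """Profitable acted outcomes / acted decisions that have an outcome."""
    outcome_by_ticker = {o["ticker"]: o for o in outcomes}
    acted = [
        outcome_by_ticker[d["ticker"]]
        for d in decisions
        if d["decision"] == "act" and d["ticker"] in outcome_by_ticker
    ]
    if not acted:
        return 0.0  # no acted outcomes: define the rate as zero
    wins = sum(1 for o in acted if o["profitable"])
    return wins / len(acted)


def period_deltas_sketch(current: dict, previous: dict | None) -> dict[str, float]:
    """Return current - previous per field; all zeros without a previous snapshot."""
    if previous is None:
        return {field: 0.0 for field in DELTA_FIELDS}
    return {
        field: float(current[field]) - float(previous[field])
        for field in DELTA_FIELDS
    }
```

Since the win rate is a count divided by a count at least as large, it stays within [0.0, 1.0] by construction, which is the range check the Property 4 test performs.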
@@ -0,0 +1,245 @@
# Feature: trading-feedback-engine, Property 2: Report serialization round-trip
"""Property-based tests for report serialization round-trip.

Feature: trading-feedback-engine

Tests the report serialization round-trip property from the design
specification: for any valid ReportData object (with valid P&L,
recommendation accuracy, position performance, risk metrics, and model
quality sections), serializing to JSON and then deserializing back SHALL
produce a ReportData object equivalent to the original. All datetime fields
in the serialized JSON SHALL be in ISO 8601 format.
"""
from __future__ import annotations

import json
import re
from datetime import date, datetime, timezone

from hypothesis import given, settings
from hypothesis import strategies as st

from services.reporting.models import (
    ModelQualitySection,
    ModelQualityWindow,
    PLSection,
    PositionDetail,
    PositionPerformanceSection,
    RecommendationAccuracySection,
    ReportData,
    ReportType,
    RiskMetricsSection,
    ValidationStatus,
    ValidationWarning,
)

# ---------------------------------------------------------------------------
# Property 2: Report Serialization Round-Trip
# Validates: Requirements 8.1, 8.2, 8.3, 8.4
# ---------------------------------------------------------------------------

# ISO 8601 patterns: one for full datetimes, one for plain dates
_ISO8601_DATETIME_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"  # YYYY-MM-DDTHH:MM:SS
    r"(?:\.\d+)?"  # optional fractional seconds
    r"(?:Z|[+-]\d{2}:\d{2})?$"  # optional timezone
)
_ISO8601_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# ---------------------------------------------------------------------------
# Hypothesis strategies for each model
# ---------------------------------------------------------------------------

_finite_float = st.floats(allow_nan=False, allow_infinity=False)
_non_negative_finite_float = st.floats(
    min_value=0.0, allow_nan=False, allow_infinity=False,
)
_rate_float = st.floats(
    min_value=0.0, max_value=1.0, allow_nan=False, allow_infinity=False,
)
_optional_finite_float = st.one_of(st.none(), _finite_float)

_validation_warning_strategy = st.builds(
    ValidationWarning,
    field_name=st.text(min_size=1, max_size=50),
    computed_value=_finite_float,
    snapshot_value=_finite_float,
    pct_difference=_non_negative_finite_float,
)

_pnl_section_strategy = st.builds(
    PLSection,
    realized_pnl=_finite_float,
    unrealized_pnl=_finite_float,
    daily_return=_finite_float,
    cumulative_return=_finite_float,
    win_count=st.integers(min_value=0, max_value=10000),
    loss_count=st.integers(min_value=0, max_value=10000),
    win_rate=_rate_float,
    profit_factor=_non_negative_finite_float,
    sharpe_ratio=_finite_float,
    summary=st.text(max_size=200),
    validation_warnings=st.lists(
        _validation_warning_strategy, min_size=0, max_size=3,
    ),
)

_recommendation_accuracy_strategy = st.builds(
    RecommendationAccuracySection,
    total_evaluated=st.integers(min_value=0, max_value=10000),
    act_count=st.integers(min_value=0, max_value=10000),
    skip_count=st.integers(min_value=0, max_value=10000),
    acted_win_rate=_rate_float,
    avg_confidence_acted=_rate_float,
    avg_confidence_skipped=_rate_float,
    summary=st.text(max_size=200),
    validation_warnings=st.lists(
        _validation_warning_strategy, min_size=0, max_size=3,
    ),
)

_position_detail_strategy = st.builds(
    PositionDetail,
    ticker=st.text(min_size=1, max_size=10),
    entry_price=_finite_float,
    current_or_exit_price=_finite_float,
    pnl=_finite_float,
    pnl_pct=_finite_float,
    hold_duration_hours=_non_negative_finite_float,
    status=st.sampled_from(["open", "closed"]),
)

_position_performance_strategy = st.builds(
    PositionPerformanceSection,
    positions=st.lists(_position_detail_strategy, min_size=0, max_size=5),
    summary=st.text(max_size=200),
)

_risk_metrics_strategy = st.builds(
    RiskMetricsSection,
    current_risk_tier=st.sampled_from(["low", "moderate", "high", "critical"]),
    portfolio_heat=_non_negative_finite_float,
    max_drawdown=_non_negative_finite_float,
    current_drawdown_pct=_non_negative_finite_float,
    reserve_pool_balance=_non_negative_finite_float,
    circuit_breaker_event_count=st.integers(min_value=0, max_value=100),
    summary=st.text(max_size=200),
)

_model_quality_window_strategy = st.builds(
    ModelQualityWindow,
    lookback=st.sampled_from(["7d", "30d", "90d"]),
    win_rate=_optional_finite_float,
    directional_accuracy=_optional_finite_float,
    information_coefficient=_optional_finite_float,
    calibration_error=_optional_finite_float,
    brier_score=_optional_finite_float,
)

_model_quality_strategy = st.builds(
    ModelQualitySection,
    windows=st.lists(_model_quality_window_strategy, min_size=0, max_size=3),
    summary=st.text(max_size=200),
    validation_warnings=st.lists(
        _validation_warning_strategy, min_size=0, max_size=3,
    ),
)

# Use timezone-aware datetimes for generated_at
_aware_datetime_strategy = st.datetimes(
    min_value=datetime(2020, 1, 1),
    max_value=datetime(2030, 12, 31),
    timezones=st.just(timezone.utc),
)

_date_strategy = st.dates(
    min_value=date(2020, 1, 1),
    max_value=date(2030, 12, 31),
)

_report_data_strategy = st.builds(
    ReportData,
    pnl=_pnl_section_strategy,
    recommendation_accuracy=_recommendation_accuracy_strategy,
    position_performance=_position_performance_strategy,
    risk_metrics=_risk_metrics_strategy,
    model_quality=_model_quality_strategy,
    executive_summary=st.text(max_size=300),
    validation_status=st.sampled_from(list(ValidationStatus)),
    generated_at=_aware_datetime_strategy,
    period_start=_date_strategy,
    period_end=_date_strategy,
    report_type=st.sampled_from(list(ReportType)),
)


# ---------------------------------------------------------------------------
|
||||
# Helper: recursively find all datetime-like string values in parsed JSON
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_DATETIME_FIELD_NAMES = {"generated_at"}
|
||||
_DATE_FIELD_NAMES = {"period_start", "period_end"}
|
||||
|
||||
|
||||
def _collect_datetime_strings(
    obj: object,
    key: str | None = None,
) -> list[tuple[str, str]]:
    """Walk parsed JSON and collect (field_name, value) for datetime fields."""
    results: list[tuple[str, str]] = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            results.extend(_collect_datetime_strings(v, k))
    elif isinstance(obj, list):
        for item in obj:
            results.extend(_collect_datetime_strings(item, key))
    elif isinstance(obj, str) and key is not None:
        if key in _DATETIME_FIELD_NAMES or key in _DATE_FIELD_NAMES:
            results.append((key, obj))
    return results
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Property tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
@given(report=_report_data_strategy)
@settings(max_examples=100)
def test_report_serialization_round_trip(report: ReportData) -> None:
    """**Validates: Requirements 8.1, 8.2, 8.3, 8.4**

    For any valid ReportData object, serializing to JSON and then
    deserializing back SHALL produce a ReportData object equivalent
    to the original.
    """
    json_str = report.model_dump_json()
    restored = ReportData.model_validate_json(json_str)
    assert restored == report, (
        f"Round-trip failed: deserialized report differs from original.\n"
        f" report_type: {report.report_type}\n"
        f" period: {report.period_start} → {report.period_end}\n"
        f" generated_at: {report.generated_at}"
    )
|
||||
|
||||
|
||||
@given(report=_report_data_strategy)
@settings(max_examples=100)
def test_report_datetime_fields_iso8601(report: ReportData) -> None:
    """**Validates: Requirements 8.4**

    All datetime fields in the serialized JSON SHALL be in ISO 8601 format.
    """
    json_str = report.model_dump_json()
    parsed = json.loads(json_str)
    dt_fields = _collect_datetime_strings(parsed)

    for field_name, value in dt_fields:
        if field_name in _DATETIME_FIELD_NAMES:
            assert _ISO8601_DATETIME_RE.match(value), (
                f"Datetime field '{field_name}' is not ISO 8601: {value!r}"
            )
        elif field_name in _DATE_FIELD_NAMES:
            assert _ISO8601_DATE_RE.match(value), (
                f"Date field '{field_name}' is not ISO 8601: {value!r}"
            )
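
The ISO 8601 property above can also be checked with the standard-library parsers instead of regular expressions. The following is a minimal sketch under stated assumptions: `find_temporal_fields`, `is_iso8601`, and the two field-name sets are hypothetical stand-ins that mirror the helper in this file, and the payload is hand-built rather than produced by `ReportData.model_dump_json()`.

```python
import json
from datetime import date, datetime, timezone

# Hypothetical field names, mirroring the sets used by the test file.
DATETIME_FIELDS = {"generated_at"}
DATE_FIELDS = {"period_start", "period_end"}


def find_temporal_fields(obj, key=None):
    """Yield (field_name, value) for string values stored under known keys."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            yield from find_temporal_fields(v, k)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_temporal_fields(item, key)
    elif isinstance(obj, str) and key in DATETIME_FIELDS | DATE_FIELDS:
        yield key, obj


def is_iso8601(field_name, value):
    """Return True if the value parses with the stdlib ISO 8601 parsers."""
    if field_name in DATETIME_FIELDS:
        parser = datetime.fromisoformat
    else:
        parser = date.fromisoformat
    try:
        parser(value)
    except ValueError:
        return False
    return True


# Hand-built payload standing in for a serialized report.
payload = json.dumps({
    "generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc).isoformat(),
    "period_start": date(2025, 1, 13).isoformat(),
    "period_end": date(2025, 1, 17).isoformat(),
    "executive_summary": "not a date",
})
found = list(find_temporal_fields(json.loads(payload)))
all_valid = all(is_iso8601(name, value) for name, value in found)
```

The stdlib parsers are more lenient than an anchored regex in some respects (for example, `datetime.fromisoformat` accepts a date-only string), and on Python versions before 3.11 `datetime.fromisoformat` rejects a trailing `Z`, so a regex may still be the better fit if the serializer emits that suffix.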
|
||||
@@ -0,0 +1,127 @@
|
||||
# Feature: trading-feedback-engine, Property 3: Validation discrepancy detection correctness
|
||||
"""Property-based tests for report validation discrepancy detection.
|
||||
|
||||
Feature: trading-feedback-engine
|
||||
|
||||
Tests the validation discrepancy detection correctness property from the
|
||||
design specification: for any pair of computed metric value and snapshot
|
||||
metric value (both finite, non-negative floats), the validation function
|
||||
SHALL produce a warning if and only if the percentage difference exceeds 5%.
|
||||
The percentage difference SHALL be computed as |computed - snapshot| /
|
||||
snapshot * 100 when snapshot > 0, and SHALL flag any non-zero computed value
|
||||
when snapshot is 0.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import math
|
||||
|
||||
from hypothesis import given, settings
|
||||
from hypothesis import strategies as st
|
||||
|
||||
from services.reporting.validator import (
|
||||
DISCREPANCY_THRESHOLD_PCT,
|
||||
_check_discrepancy,
|
||||
)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Property 3: Validation Discrepancy Detection Correctness
|
||||
# Validates: Requirements 4.1, 4.2, 4.3, 4.4
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Strategy: finite, non-negative floats in [0, 1e6]
|
||||
_metric_float = st.floats(
|
||||
min_value=0, max_value=1e6, allow_nan=False, allow_infinity=False,
|
||||
)
|
||||
|
||||
|
||||
@given(computed=_metric_float, snapshot=_metric_float)
@settings(max_examples=100)
def test_discrepancy_detection_correctness(
    computed: float,
    snapshot: float,
) -> None:
    """**Validates: Requirements 4.1, 4.2, 4.3, 4.4**

    For any pair of computed and snapshot values (finite, non-negative):
    - Both zero → no warning
    - Snapshot zero, computed non-zero → warning (100% discrepancy)
    - Snapshot > 0 → warning iff |computed - snapshot| / snapshot * 100 > 5%
    """
    result = _check_discrepancy("test_field", computed, snapshot)

    if snapshot == 0.0 and computed == 0.0:
        # Both zero → no discrepancy
        assert result is None, (
            f"Expected no warning when both values are 0, got {result}"
        )
    elif snapshot == 0.0:
        # Non-zero computed with zero snapshot → always a warning
        assert result is not None, (
            f"Expected warning for non-zero computed={computed} with "
            f"snapshot=0, got None"
        )
        assert result.pct_difference == 100.0, (
            f"Expected 100% discrepancy for zero snapshot, "
            f"got {result.pct_difference}%"
        )
    else:
        # Normal case: snapshot > 0
        expected_pct = abs(computed - snapshot) / snapshot * 100.0
        if expected_pct > DISCREPANCY_THRESHOLD_PCT:
            assert result is not None, (
                f"Expected warning for {expected_pct:.4f}% discrepancy "
                f"(computed={computed}, snapshot={snapshot}), got None"
            )
            # When expected_pct is inf (very small snapshot), both should be inf
            if math.isinf(expected_pct):
                assert math.isinf(result.pct_difference), (
                    f"Expected inf pct_difference, got {result.pct_difference}"
                )
            else:
                assert abs(result.pct_difference - round(expected_pct, 4)) < 1e-6, (
                    f"Percentage difference mismatch: "
                    f"expected {round(expected_pct, 4)}, "
                    f"got {result.pct_difference}"
                )
        else:
            assert result is None, (
                f"Expected no warning for {expected_pct:.4f}% discrepancy "
                f"(computed={computed}, snapshot={snapshot}), "
                f"got warning with pct_difference={result.pct_difference}"
            )
|
||||
|
||||
|
||||
@given(computed=_metric_float, snapshot=_metric_float)
@settings(max_examples=100)
def test_discrepancy_threshold_is_five_percent(
    computed: float,
    snapshot: float,
) -> None:
    """**Validates: Requirements 4.1, 4.2, 4.3, 4.4**

    Verify that DISCREPANCY_THRESHOLD_PCT is 5.0 and that it is the
    threshold actually applied: the function produces a warning if and
    only if the discrepancy is strictly greater than 5%.
    """
    assert DISCREPANCY_THRESHOLD_PCT == 5.0, (
        f"Expected threshold of 5.0%, got {DISCREPANCY_THRESHOLD_PCT}%"
    )

    result = _check_discrepancy("threshold_check", computed, snapshot)

    if snapshot == 0.0 and computed == 0.0:
        assert result is None
    elif snapshot == 0.0:
        # 100% > 5% → always warning
        assert result is not None
    else:
        pct = abs(computed - snapshot) / snapshot * 100.0
        should_warn = pct > 5.0
        if should_warn:
            assert result is not None, (
                f"Discrepancy {pct:.4f}% > 5% but no warning produced"
            )
        else:
            assert result is None, (
                f"Discrepancy {pct:.4f}% <= 5% but warning produced"
            )
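
For orientation, the rule these property tests exercise can be sketched as follows. This is not the code in `services.reporting.validator`, which is not shown in this diff: `check_discrepancy_sketch`, `DiscrepancySketch`, and `THRESHOLD_PCT` are hypothetical names, and rounding the reported percentage to 4 decimals is an assumption inferred from the `round(expected_pct, 4)` comparison in the test.

```python
import math
from dataclasses import dataclass
from typing import Optional

THRESHOLD_PCT = 5.0  # assumed to match DISCREPANCY_THRESHOLD_PCT


@dataclass(frozen=True)
class DiscrepancySketch:
    """Hypothetical warning record; the real warning type is not shown."""

    field: str
    computed: float
    snapshot: float
    pct_difference: float


def check_discrepancy_sketch(
    field: str, computed: float, snapshot: float
) -> Optional[DiscrepancySketch]:
    """Return a warning record iff the values differ by more than 5%."""
    if snapshot == 0.0:
        if computed == 0.0:
            return None  # both zero: nothing to compare
        # Zero snapshot with non-zero computed is reported as 100%.
        return DiscrepancySketch(field, computed, snapshot, 100.0)

    pct = abs(computed - snapshot) / snapshot * 100.0
    if pct <= THRESHOLD_PCT:
        return None
    # A tiny snapshot can overflow to inf; keep it as-is (assumed behaviour).
    reported = pct if math.isinf(pct) else round(pct, 4)
    return DiscrepancySketch(field, computed, snapshot, reported)
```

The threshold comparison uses the unrounded percentage, so rounding only affects the reported value and not the decision to warn.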
|
||||
@@ -0,0 +1,256 @@
|
||||
"""API integration tests for trading report endpoints.
|
||||
|
||||
Tests GET /api/reports (list with pagination/filtering) and
|
||||
GET /api/reports/{report_id} (detail with full report_data).
|
||||
|
||||
Uses httpx.AsyncClient with the FastAPI app and mocks the module-level
|
||||
``pool`` variable in services.api.app.
|
||||
|
||||
Requirements validated: 5.4, 5.5, 5.6
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import uuid
|
||||
from datetime import date, datetime, timezone
|
||||
from unittest.mock import AsyncMock, patch
|
||||
|
||||
import httpx
|
||||
import pytest
|
||||
|
||||
from services.api.app import app
|
||||
|
||||
# ── Helpers ──────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
class FakeRecord(dict):
    """Dict subclass standing in for an asyncpg Record.

    Supports bracket access, plus attribute access as a test convenience.
    """

    def __getattr__(self, name: str):
        try:
            return self[name]
        except KeyError:
            # Suppress the KeyError context so failures read as plain attribute errors
            raise AttributeError(name) from None
|
||||
|
||||
|
||||
def _make_list_record(**overrides) -> FakeRecord:
|
||||
"""Build a FakeRecord matching the list-endpoint SELECT columns."""
|
||||
defaults = {
|
||||
"id": uuid.uuid4(),
|
||||
"report_type": "daily",
|
||||
"period_start": date(2025, 1, 15),
|
||||
"period_end": date(2025, 1, 15),
|
||||
"validation_status": "passed",
|
||||
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
|
||||
}
|
||||
defaults.update(overrides)
|
||||
return FakeRecord(**defaults)
|
||||
|
||||
|
||||
def _make_detail_record(**overrides) -> FakeRecord:
|
||||
"""Build a FakeRecord matching the detail-endpoint SELECT columns."""
|
||||
defaults = {
|
||||
"id": uuid.uuid4(),
|
||||
"report_type": "daily",
|
||||
"period_start": date(2025, 1, 15),
|
||||
"period_end": date(2025, 1, 15),
|
||||
"report_data": {
|
||||
"pnl": {"realized_pnl": 125.50, "unrealized_pnl": -30.20},
|
||||
"executive_summary": "Test summary",
|
||||
},
|
||||
"validation_status": "passed",
|
||||
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
|
||||
"created_at": datetime(2025, 1, 15, 21, 30, 5, tzinfo=timezone.utc),
|
||||
}
|
||||
defaults.update(overrides)
|
||||
return FakeRecord(**defaults)
|
||||
|
||||
|
||||
_POOL_PATCH = "services.api.app.pool"
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
# 1. GET /api/reports — list endpoint
|
||||
# Requirements validated: 5.4, 5.6
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
|
||||
|
||||
class TestListReports:
|
||||
"""Tests for GET /api/reports."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_default_pagination(self) -> None:
|
||||
"""List reports with no params returns rows using default limit/offset."""
|
||||
r1 = _make_list_record()
|
||||
r2 = _make_list_record(
|
||||
report_type="weekly",
|
||||
period_start=date(2025, 1, 13),
|
||||
period_end=date(2025, 1, 17),
|
||||
)
|
||||
mock_pool = AsyncMock()
|
||||
mock_pool.fetch = AsyncMock(return_value=[r1, r2])
|
||||
|
||||
with patch(_POOL_PATCH, mock_pool):
|
||||
async with httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=app), base_url="http://test"
|
||||
) as client:
|
||||
resp = await client.get("/api/reports")
|
||||
|
||||
assert resp.status_code == 200
|
||||
data = resp.json()
|
||||
assert len(data) == 2
|
||||
# UUID fields are serialized as strings
|
||||
assert data[0]["id"] == str(r1["id"])
|
||||
assert data[0]["report_type"] == "daily"
|
||||
assert data[0]["period_start"] == "2025-01-15"
|
||||
assert data[0]["period_end"] == "2025-01-15"
|
||||
assert data[0]["validation_status"] == "passed"
|
||||
assert "generated_at" in data[0]
|
||||
|
||||
# pool.fetch called with default limit=20, offset=0
|
||||
call_args = mock_pool.fetch.call_args
|
||||
sql = call_args[0][0]
|
||||
assert "LIMIT" in sql
|
||||
assert "OFFSET" in sql
|
||||
# Last two positional args are limit and offset
|
||||
assert call_args[0][-2] == 20
|
||||
assert call_args[0][-1] == 0
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_filter_by_report_type(self) -> None:
|
||||
"""Filtering by report_type=weekly passes the value to the query."""
|
||||
r1 = _make_list_record(report_type="weekly")
|
||||
mock_pool = AsyncMock()
|
||||
mock_pool.fetch = AsyncMock(return_value=[r1])
|
||||
|
||||
with patch(_POOL_PATCH, mock_pool):
|
||||
async with httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=app), base_url="http://test"
|
||||
) as client:
|
||||
resp = await client.get("/api/reports", params={"report_type": "weekly"})
|
||||
|
||||
assert resp.status_code == 200
|
||||
data = resp.json()
|
||||
assert len(data) == 1
|
||||
assert data[0]["report_type"] == "weekly"
|
||||
|
||||
# Verify the SQL includes a report_type condition
|
||||
call_args = mock_pool.fetch.call_args
|
||||
sql = call_args[0][0]
|
||||
assert "report_type" in sql
|
||||
# "weekly" should be among the positional params
|
||||
assert "weekly" in call_args[0]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_filter_by_date_range(self) -> None:
|
||||
"""Filtering by start_date and end_date passes dates to the query."""
|
||||
mock_pool = AsyncMock()
|
||||
mock_pool.fetch = AsyncMock(return_value=[])
|
||||
|
||||
with patch(_POOL_PATCH, mock_pool):
|
||||
async with httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=app), base_url="http://test"
|
||||
) as client:
|
||||
resp = await client.get(
|
||||
"/api/reports",
|
||||
params={"start_date": "2025-01-01", "end_date": "2025-01-31"},
|
||||
)
|
||||
|
||||
assert resp.status_code == 200
|
||||
call_args = mock_pool.fetch.call_args
|
||||
sql = call_args[0][0]
|
||||
assert "period_start" in sql
|
||||
assert "period_end" in sql
|
||||
# Date strings should be among the positional params
|
||||
assert "2025-01-01" in call_args[0]
|
||||
assert "2025-01-31" in call_args[0]
|
||||
|
||||
    @pytest.mark.asyncio
    async def test_invalid_report_type_returns_400(self) -> None:
        """An invalid report_type value returns HTTP 400."""
        mock_pool = AsyncMock()

        with patch(_POOL_PATCH, mock_pool):
            async with httpx.AsyncClient(
                transport=httpx.ASGITransport(app=app), base_url="http://test"
            ) as client:
                resp = await client.get(
                    "/api/reports", params={"report_type": "monthly"}
                )

        assert resp.status_code == 400
        # The error message should name at least one accepted report type
        detail = resp.json()["detail"].lower()
        assert "daily" in detail or "weekly" in detail
        # pool.fetch should NOT have been called
        mock_pool.fetch.assert_not_awaited()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_invalid_date_format_returns_400(self) -> None:
|
||||
"""A malformed start_date returns HTTP 400."""
|
||||
mock_pool = AsyncMock()
|
||||
|
||||
with patch(_POOL_PATCH, mock_pool):
|
||||
async with httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=app), base_url="http://test"
|
||||
) as client:
|
||||
resp = await client.get(
|
||||
"/api/reports", params={"start_date": "not-a-date"}
|
||||
)
|
||||
|
||||
assert resp.status_code == 400
|
||||
assert "YYYY-MM-DD" in resp.json()["detail"]
|
||||
mock_pool.fetch.assert_not_awaited()
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
# 2. GET /api/reports/{report_id} — detail endpoint
|
||||
# Requirements validated: 5.4, 5.5
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
|
||||
|
||||
class TestGetReport:
|
||||
"""Tests for GET /api/reports/{report_id}."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_valid_id_returns_full_report(self) -> None:
|
||||
"""A valid report_id returns the full report including report_data."""
|
||||
record = _make_detail_record()
|
||||
mock_pool = AsyncMock()
|
||||
mock_pool.fetchrow = AsyncMock(return_value=record)
|
||||
|
||||
report_id = str(record["id"])
|
||||
|
||||
with patch(_POOL_PATCH, mock_pool):
|
||||
async with httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=app), base_url="http://test"
|
||||
) as client:
|
||||
resp = await client.get(f"/api/reports/{report_id}")
|
||||
|
||||
assert resp.status_code == 200
|
||||
data = resp.json()
|
||||
assert data["id"] == report_id
|
||||
assert data["report_type"] == "daily"
|
||||
assert data["period_start"] == "2025-01-15"
|
||||
assert data["period_end"] == "2025-01-15"
|
||||
assert data["validation_status"] == "passed"
|
||||
assert "generated_at" in data
|
||||
assert "created_at" in data
|
||||
# report_data is included as a dict
|
||||
assert isinstance(data["report_data"], dict)
|
||||
assert data["report_data"]["pnl"]["realized_pnl"] == 125.50
|
||||
assert data["report_data"]["executive_summary"] == "Test summary"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_nonexistent_id_returns_404(self) -> None:
|
||||
"""A non-existent report_id returns HTTP 404."""
|
||||
mock_pool = AsyncMock()
|
||||
mock_pool.fetchrow = AsyncMock(return_value=None)
|
||||
|
||||
fake_id = str(uuid.uuid4())
|
||||
|
||||
with patch(_POOL_PATCH, mock_pool):
|
||||
async with httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=app), base_url="http://test"
|
||||
) as client:
|
||||
resp = await client.get(f"/api/reports/{fake_id}")
|
||||
|
||||
assert resp.status_code == 404
|
||||
assert "not found" in resp.json()["detail"].lower()
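
The list-endpoint tests above pin down a few observable facts about the query: optional filters become positional parameters, `LIMIT` and `OFFSET` are always the last two parameters (defaulting to 20 and 0), and invalid input is rejected before the pool is touched. A minimal sketch of a query builder with those properties follows. Everything in it is an assumption for illustration: `build_list_query`, the `trading_reports` table name, the comparison operators, and the use of `ValueError` are hypothetical and are not taken from `services.api.app`.

```python
from datetime import date

# "monthly" is rejected by the tests; the accepted set is assumed.
VALID_REPORT_TYPES = ("daily", "weekly")


def build_list_query(report_type=None, start_date=None, end_date=None,
                     limit=20, offset=0):
    """Return (sql, params) with filters first and limit/offset last."""
    if report_type is not None and report_type not in VALID_REPORT_TYPES:
        allowed = ", ".join(VALID_REPORT_TYPES)
        raise ValueError(f"report_type must be one of: {allowed}")

    conditions = []
    params = []
    if report_type is not None:
        params.append(report_type)
        conditions.append(f"report_type = ${len(params)}")
    if start_date is not None:
        date.fromisoformat(start_date)  # ValueError if not YYYY-MM-DD
        params.append(start_date)
        conditions.append(f"period_start >= ${len(params)}")
    if end_date is not None:
        date.fromisoformat(end_date)  # ValueError if not YYYY-MM-DD
        params.append(end_date)
        conditions.append(f"period_end <= ${len(params)}")

    where = " WHERE " + " AND ".join(conditions) if conditions else ""
    params.extend([limit, offset])
    sql = (
        "SELECT id, report_type, period_start, period_end, "
        "validation_status, generated_at "
        f"FROM trading_reports{where} "  # hypothetical table name
        f"ORDER BY generated_at DESC "
        f"LIMIT ${len(params) - 1} OFFSET ${len(params)}"
    )
    return sql, params
```

An endpoint would translate the `ValueError` into an HTTP 400 response and pass `sql, *params` to the pool. The date values are passed through as received, mirroring the test assertions; whether a driver accepts strings for date columns depends on casts that are not shown here.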
|
||||
@@ -0,0 +1,273 @@
|
||||
"""Unit tests for the report data collector.
|
||||
|
||||
Tests the CollectedData dataclass defaults, _row_dict UUID conversion,
|
||||
and collect_report_data with mocked asyncpg pool.
|
||||
|
||||
Requirements: 1.1, 1.2, 1.3, 1.4, 1.5
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import uuid
|
||||
from datetime import date
|
||||
from unittest.mock import AsyncMock, MagicMock
|
||||
|
||||
import pytest
|
||||
|
||||
from services.reporting.collector import CollectedData, _row_dict, collect_report_data
|
||||
|
||||
# ===================================================================
|
||||
# _row_dict tests
|
||||
# ===================================================================
|
||||
|
||||
|
||||
class TestRowDict:
|
||||
"""Tests for _row_dict UUID→str conversion."""
|
||||
|
||||
    def test_uuid_fields_converted_to_str(self):
        """UUID values in the record are converted to strings."""
        test_uuid = uuid.uuid4()

        # _row_dict only needs a dict-like row, so a plain dict subclass
        # stands in for an asyncpg Record.
        class FakeRecord(dict):
            pass

        record = FakeRecord(id=test_uuid, name="test", count=42)
        result = _row_dict(record)

        assert result["id"] == str(test_uuid)
        assert result["name"] == "test"
        assert result["count"] == 42
|
||||
|
||||
def test_no_uuid_fields_unchanged(self):
|
||||
"""Non-UUID values pass through unchanged."""
|
||||
|
||||
class FakeRecord(dict):
|
||||
pass
|
||||
|
||||
record = FakeRecord(ticker="AAPL", price=185.50, active=True)
|
||||
result = _row_dict(record)
|
||||
|
||||
assert result["ticker"] == "AAPL"
|
||||
assert result["price"] == 185.50
|
||||
assert result["active"] is True
|
||||
|
||||
def test_multiple_uuid_fields(self):
|
||||
"""Multiple UUID fields are all converted."""
|
||||
|
||||
class FakeRecord(dict):
|
||||
pass
|
||||
|
||||
id1 = uuid.uuid4()
|
||||
id2 = uuid.uuid4()
|
||||
record = FakeRecord(id=id1, recommendation_id=id2, ticker="MSFT")
|
||||
result = _row_dict(record)
|
||||
|
||||
assert result["id"] == str(id1)
|
||||
assert result["recommendation_id"] == str(id2)
|
||||
assert result["ticker"] == "MSFT"
|
||||
|
||||
def test_empty_record(self):
|
||||
"""Empty record returns empty dict."""
|
||||
|
||||
class FakeRecord(dict):
|
||||
pass
|
||||
|
||||
record = FakeRecord()
|
||||
result = _row_dict(record)
|
||||
assert result == {}
|
||||
|
||||
|
||||
# ===================================================================
|
||||
# CollectedData defaults
|
||||
# ===================================================================
|
||||
|
||||
|
||||
class TestCollectedDataDefaults:
|
||||
"""Tests for CollectedData dataclass default values."""
|
||||
|
||||
def test_default_empty_lists(self):
|
||||
"""All list fields default to empty lists."""
|
||||
data = CollectedData()
|
||||
assert data.trading_decisions == []
|
||||
assert data.orders == []
|
||||
assert data.open_positions == []
|
||||
assert data.closed_positions == []
|
||||
assert data.recommendations == []
|
||||
assert data.prediction_outcomes == []
|
||||
assert data.model_metric_snapshots == []
|
||||
assert data.circuit_breaker_events == []
|
||||
|
||||
def test_default_none_snapshots(self):
|
||||
"""Snapshot fields default to None."""
|
||||
data = CollectedData()
|
||||
assert data.portfolio_snapshot is None
|
||||
assert data.previous_portfolio_snapshot is None
|
||||
|
||||
def test_default_zero_balance(self):
|
||||
"""Reserve pool balance defaults to 0.0."""
|
||||
data = CollectedData()
|
||||
assert data.reserve_pool_balance == 0.0
|
||||
|
||||
def test_independent_list_instances(self):
|
||||
"""Each CollectedData instance has independent list instances."""
|
||||
data1 = CollectedData()
|
||||
data2 = CollectedData()
|
||||
data1.trading_decisions.append({"id": "test"})
|
||||
assert data2.trading_decisions == []
|
||||
|
||||
|
||||
# ===================================================================
|
||||
# collect_report_data with mocked pool
|
||||
# ===================================================================
|
||||
|
||||
|
||||
def _make_mock_pool():
|
||||
"""Create a mock asyncpg pool with async context manager support."""
|
||||
pool = MagicMock()
|
||||
conn = AsyncMock()
|
||||
|
||||
# pool.acquire() returns a sync object that supports async context manager
|
||||
ctx = MagicMock()
|
||||
ctx.__aenter__ = AsyncMock(return_value=conn)
|
||||
ctx.__aexit__ = AsyncMock(return_value=False)
|
||||
pool.acquire.return_value = ctx
|
||||
|
||||
return pool, conn
|
||||
|
||||
|
||||
class TestCollectReportData:
|
||||
"""Tests for collect_report_data with mocked asyncpg."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_zero_activity_returns_empty_lists(self):
|
||||
"""When no data exists, all lists are empty and snapshots are None."""
|
||||
pool, conn = _make_mock_pool()
|
||||
|
||||
# All queries return empty results
|
||||
conn.fetch.return_value = []
|
||||
conn.fetchrow.return_value = None
|
||||
|
||||
result = await collect_report_data(
|
||||
pool, date(2025, 1, 15), date(2025, 1, 15)
|
||||
)
|
||||
|
||||
assert isinstance(result, CollectedData)
|
||||
assert result.trading_decisions == []
|
||||
assert result.orders == []
|
||||
assert result.open_positions == []
|
||||
assert result.closed_positions == []
|
||||
assert result.portfolio_snapshot is None
|
||||
assert result.previous_portfolio_snapshot is None
|
||||
assert result.recommendations == []
|
||||
assert result.prediction_outcomes == []
|
||||
assert result.model_metric_snapshots == []
|
||||
assert result.circuit_breaker_events == []
|
||||
assert result.reserve_pool_balance == 0.0
|
||||
|
||||
    @pytest.mark.asyncio
    async def test_queries_use_correct_date_range(self):
        """The collector issues the expected number of queries for the period.

        Only call counts are checked here; the date arguments passed to
        each query are not inspected.
        """
        pool, conn = _make_mock_pool()
        conn.fetch.return_value = []
        conn.fetchrow.return_value = None

        start = date(2025, 1, 13)
        end = date(2025, 1, 17)

        await collect_report_data(pool, start, end)

        # Verify fetch was called (trading_decisions, orders, open_positions,
        # closed_positions, recommendations, prediction_outcomes,
        # model_metric_snapshots, circuit_breaker_events)
        assert conn.fetch.call_count == 8

        # Verify fetchrow was called (portfolio_snapshot, previous_snapshot,
        # reserve_pool_balance)
        assert conn.fetchrow.call_count == 3
|
||||
|
||||
    @pytest.mark.asyncio
    async def test_reserve_pool_balance_from_ledger(self):
        """Reserve pool balance is read from the latest ledger entry."""
        pool, conn = _make_mock_pool()
        conn.fetch.return_value = []

        # Mock fetchrow to return different values for different queries
        balance_row = {"balance_after": 450.75}

        async def mock_fetchrow(query, *args):
            # Only the ledger query gets a row; snapshot queries return None
            if "reserve_pool_ledger" in query:
                return balance_row
            return None

        conn.fetchrow.side_effect = mock_fetchrow

        result = await collect_report_data(
            pool, date(2025, 1, 15), date(2025, 1, 15)
        )

        assert result.reserve_pool_balance == 450.75
|
||||
|
||||
    @pytest.mark.asyncio
    async def test_portfolio_snapshots_populated(self):
        """Portfolio snapshot and previous snapshot are populated when data exists."""
        pool, conn = _make_mock_pool()
        conn.fetch.return_value = []

        current_snapshot = {
            "id": uuid.uuid4(),
            "snapshot_date": date(2025, 1, 15),
            "portfolio_value": 10500.0,
            "active_pool": 8000.0,
            "reserve_pool": 2500.0,
            "cumulative_return": 0.05,
        }
        previous_snapshot = {
            "id": uuid.uuid4(),
            "snapshot_date": date(2025, 1, 14),
            "portfolio_value": 10000.0,
            "active_pool": 7500.0,
            "reserve_pool": 2500.0,
            "cumulative_return": 0.0,
        }

        async def mock_fetchrow(query, *args):
            # Route each query to a canned row based on its SQL text
            if "reserve_pool_ledger" in query:
                return None
            # Checked first: the current-snapshot query also contains
            # "snapshot_date <" as part of "snapshot_date <= $2".
            if "snapshot_date >=" in query:
                # current snapshot query (snapshot_date >= $1 AND snapshot_date <= $2)
                return current_snapshot
            if "snapshot_date <" in query:
                # previous snapshot query (snapshot_date < $1)
                return previous_snapshot
            return None

        conn.fetchrow.side_effect = mock_fetchrow

        result = await collect_report_data(
            pool, date(2025, 1, 15), date(2025, 1, 15)
        )

        assert result.portfolio_snapshot is not None
        assert result.portfolio_snapshot["portfolio_value"] == 10500.0
        # UUID fields should be converted to str
        assert isinstance(result.portfolio_snapshot["id"], str)

        assert result.previous_portfolio_snapshot is not None
        assert result.previous_portfolio_snapshot["portfolio_value"] == 10000.0
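
The UUID-to-string behaviour that the `_row_dict` tests in this file describe can be sketched in a few lines. This is an illustration only, assuming the helper copies the row into a plain dict and stringifies `uuid.UUID` values; `row_dict_sketch` is a hypothetical name and the real `_row_dict` in `services.reporting.collector` is not shown in this diff.

```python
import uuid
from typing import Any, Dict, Mapping


def row_dict_sketch(row: Mapping[str, Any]) -> Dict[str, Any]:
    """Copy a dict-like row, converting UUID values to strings."""
    return {
        key: str(value) if isinstance(value, uuid.UUID) else value
        for key, value in dict(row).items()
    }


# Usage: UUIDs become strings, everything else passes through unchanged.
sample_id = uuid.uuid4()
converted = row_dict_sketch({"id": sample_id, "ticker": "MSFT", "count": 42})
```

Converting at the collection boundary keeps the collected rows JSON-friendly, which is consistent with the snapshot test asserting that `id` is a `str`.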
|
||||
@@ -0,0 +1,678 @@
|
||||
"""Unit tests for report generator orchestrator.
|
||||
|
||||
Tests the orchestration flow in services.reporting.generator with mocked
|
||||
dependencies (collector, section builders, validator, summarizer).
|
||||
|
||||
Requirements validated: 5.1, 5.2, 5.3
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import uuid
|
||||
from datetime import date, datetime, timezone
|
||||
from unittest.mock import AsyncMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from services.reporting.collector import CollectedData
|
||||
from services.reporting.generator import (
|
||||
_in_progress_jobs,
|
||||
generate_report,
|
||||
process_report_job,
|
||||
store_report,
|
||||
)
|
||||
from services.reporting.models import (
|
||||
ModelQualitySection,
|
||||
ModelQualityWindow,
|
||||
PLSection,
|
||||
PositionPerformanceSection,
|
||||
RecommendationAccuracySection,
|
||||
ReportData,
|
||||
ReportType,
|
||||
RiskMetricsSection,
|
||||
ValidationStatus,
|
||||
)
|
||||
|
||||
# ── Helpers ──────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def _make_report_data(**overrides: object) -> ReportData:
|
||||
"""Build a minimal valid ReportData for testing."""
|
||||
defaults = {
|
||||
"pnl": PLSection(
|
||||
realized_pnl=100.0,
|
||||
unrealized_pnl=-20.0,
|
||||
daily_return=0.01,
|
||||
cumulative_return=0.05,
|
||||
win_count=5,
|
||||
loss_count=2,
|
||||
win_rate=0.71,
|
||||
profit_factor=2.0,
|
||||
sharpe_ratio=1.2,
|
||||
summary="P&L summary",
|
||||
),
|
||||
"recommendation_accuracy": RecommendationAccuracySection(
|
||||
total_evaluated=10,
|
||||
act_count=6,
|
||||
skip_count=4,
|
||||
acted_win_rate=0.67,
|
||||
avg_confidence_acted=0.75,
|
||||
avg_confidence_skipped=0.40,
|
||||
summary="Rec accuracy summary",
|
||||
),
|
||||
"position_performance": PositionPerformanceSection(
|
||||
positions=[],
|
||||
summary="Position summary",
|
||||
),
|
||||
"risk_metrics": RiskMetricsSection(
|
||||
current_risk_tier="moderate",
|
||||
portfolio_heat=0.12,
|
||||
max_drawdown=0.06,
|
||||
current_drawdown_pct=0.02,
|
||||
reserve_pool_balance=500.0,
|
||||
circuit_breaker_event_count=0,
|
||||
summary="Risk summary",
|
||||
),
|
||||
"model_quality": ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=0.65,
|
||||
directional_accuracy=0.62,
|
||||
information_coefficient=0.08,
|
||||
calibration_error=0.12,
|
||||
brier_score=0.22,
|
||||
),
|
||||
],
|
||||
summary="Model quality summary",
|
||||
),
|
||||
"executive_summary": "Executive summary text",
|
||||
"validation_status": ValidationStatus.PASSED,
|
||||
"generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
|
||||
"period_start": date(2025, 1, 15),
|
||||
"period_end": date(2025, 1, 15),
|
||||
"report_type": ReportType.DAILY,
|
||||
}
|
||||
defaults.update(overrides)
|
||||
return ReportData(**defaults)
|
||||
|
||||
|
||||
def _empty_collected_data() -> CollectedData:
|
||||
"""Build a zero-activity CollectedData."""
|
||||
return CollectedData()
|
||||
|
||||
|
||||
def _mock_pool() -> AsyncMock:
|
||||
"""Create a mock asyncpg pool."""
|
||||
pool = AsyncMock()
|
||||
return pool
|
||||
|
||||
|
||||
# Patch targets (all in the generator module namespace)
|
||||
_PATCH_COLLECT = "services.reporting.generator.collect_report_data"
|
||||
_PATCH_BUILD_PNL = "services.reporting.generator.build_pnl_section"
|
||||
_PATCH_BUILD_REC = "services.reporting.generator.build_recommendation_accuracy_section"
|
||||
_PATCH_BUILD_POS = "services.reporting.generator.build_position_performance_section"
|
||||
_PATCH_BUILD_RISK = "services.reporting.generator.build_risk_metrics_section"
|
||||
_PATCH_BUILD_MQ = "services.reporting.generator.build_model_quality_section"
|
||||
_PATCH_VALIDATE_REC = "services.reporting.generator.validate_recommendation_accuracy"
|
||||
_PATCH_VALIDATE_MQ = "services.reporting.generator.validate_model_quality"
|
||||
_PATCH_COMPUTE_STATUS = "services.reporting.generator.compute_validation_status"
|
||||
_PATCH_SUMMARIZE = "services.reporting.generator.summarize_section"
|
||||
_PATCH_EXEC_SUMMARY = "services.reporting.generator.generate_executive_summary"
|
||||
_PATCH_RESOLVER = "services.reporting.generator.AgentConfigResolver"
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════
# 1. generate_report - orchestration flow
# Requirements validated: 5.1
# ═══════════════════════════════════════════════════════════════════════


class TestGenerateReport:
    """Tests for generate_report orchestration."""

    @pytest.mark.asyncio
    async def test_orchestration_calls_all_steps(self) -> None:
        """generate_report calls the collector, builders, validators, and summarizers."""
        pool = _mock_pool()
        collected = _empty_collected_data()

        pnl = PLSection(
            realized_pnl=0, unrealized_pnl=0, daily_return=0,
            cumulative_return=0, win_count=0, loss_count=0,
            win_rate=0, profit_factor=0, sharpe_ratio=0,
        )
        rec = RecommendationAccuracySection(
            total_evaluated=0, act_count=0, skip_count=0,
            acted_win_rate=0, avg_confidence_acted=0, avg_confidence_skipped=0,
        )
        pos = PositionPerformanceSection()
        risk = RiskMetricsSection(
            current_risk_tier="low", portfolio_heat=0, max_drawdown=0,
            current_drawdown_pct=0, reserve_pool_balance=0,
            circuit_breaker_event_count=0,
        )
        mq = ModelQualitySection()

        with (
            patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected) as mock_collect,
            patch(_PATCH_BUILD_PNL, return_value=pnl) as mock_pnl,
            patch(_PATCH_BUILD_REC, return_value=rec) as mock_rec,
            patch(_PATCH_BUILD_POS, return_value=pos) as mock_pos,
            patch(_PATCH_BUILD_RISK, return_value=risk) as mock_risk,
            patch(_PATCH_BUILD_MQ, return_value=mq) as mock_mq,
            patch(_PATCH_VALIDATE_REC, return_value=[]) as mock_val_rec,
            patch(_PATCH_VALIDATE_MQ, return_value=[]) as mock_val_mq,
            patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.PASSED) as mock_status,
            patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="summary") as mock_sum,
            patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="exec summary") as mock_exec,
            patch(_PATCH_RESOLVER) as mock_resolver_cls,
        ):
            result = await generate_report(
                pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
            )

        # Collector called with pool and dates
        mock_collect.assert_awaited_once_with(pool, date(2025, 1, 15), date(2025, 1, 15))

        # All section builders called with collected data
        mock_pnl.assert_called_once_with(collected)
        mock_rec.assert_called_once_with(collected)
        mock_pos.assert_called_once_with(collected)
        mock_risk.assert_called_once_with(collected)
        mock_mq.assert_called_once_with(collected)

        # Validators called
        mock_val_rec.assert_called_once_with(rec, collected.prediction_outcomes)
        mock_val_mq.assert_called_once_with(mq, collected.model_metric_snapshots)

        # Summarizer called 5 times (one per section)
        assert mock_sum.await_count == 5

        # Executive summary called
        mock_exec.assert_awaited_once()

        # Validation status computed
        mock_status.assert_called_once()

        # Result is a ReportData
        assert isinstance(result, ReportData)
        assert result.report_type == ReportType.DAILY
        assert result.period_start == date(2025, 1, 15)
        assert result.period_end == date(2025, 1, 15)
        assert result.executive_summary == "exec summary"

    @pytest.mark.asyncio
    async def test_zero_activity_report(self) -> None:
        """generate_report handles zero-activity data (empty CollectedData)."""
        pool = _mock_pool()
        collected = _empty_collected_data()

        pnl = PLSection(
            realized_pnl=0, unrealized_pnl=0, daily_return=0,
            cumulative_return=0, win_count=0, loss_count=0,
            win_rate=0, profit_factor=0, sharpe_ratio=0,
        )
        rec = RecommendationAccuracySection(
            total_evaluated=0, act_count=0, skip_count=0,
            acted_win_rate=0, avg_confidence_acted=0, avg_confidence_skipped=0,
        )
        pos = PositionPerformanceSection()
        risk = RiskMetricsSection(
            current_risk_tier="unknown", portfolio_heat=0, max_drawdown=0,
            current_drawdown_pct=0, reserve_pool_balance=0,
            circuit_breaker_event_count=0,
        )
        mq = ModelQualitySection()

        with (
            patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected),
            patch(_PATCH_BUILD_PNL, return_value=pnl),
            patch(_PATCH_BUILD_REC, return_value=rec),
            patch(_PATCH_BUILD_POS, return_value=pos),
            patch(_PATCH_BUILD_RISK, return_value=risk),
            patch(_PATCH_BUILD_MQ, return_value=mq),
            patch(_PATCH_VALIDATE_REC, return_value=[]),
            patch(_PATCH_VALIDATE_MQ, return_value=[]),
            patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.PASSED),
            patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="No activity"),
            patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="No trading activity"),
            patch(_PATCH_RESOLVER),
        ):
            result = await generate_report(
                pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
            )

        assert result.pnl.realized_pnl == 0.0
        assert result.pnl.win_count == 0
        assert result.recommendation_accuracy.total_evaluated == 0
        assert result.position_performance.positions == []
        assert result.risk_metrics.current_risk_tier == "unknown"
        assert result.validation_status == ValidationStatus.PASSED

    @pytest.mark.asyncio
    async def test_validation_warnings_attached(self) -> None:
        """Validation warnings from validators are attached to sections."""
        pool = _mock_pool()
        collected = _empty_collected_data()

        from services.reporting.models import ValidationWarning

        rec_warning = ValidationWarning(
            field_name="acted_win_rate",
            computed_value=0.80,
            snapshot_value=0.60,
            pct_difference=33.33,
        )

        pnl = PLSection(
            realized_pnl=0, unrealized_pnl=0, daily_return=0,
            cumulative_return=0, win_count=0, loss_count=0,
            win_rate=0, profit_factor=0, sharpe_ratio=0,
        )
        rec = RecommendationAccuracySection(
            total_evaluated=5, act_count=3, skip_count=2,
            acted_win_rate=0.80, avg_confidence_acted=0.7, avg_confidence_skipped=0.4,
        )
        pos = PositionPerformanceSection()
        risk = RiskMetricsSection(
            current_risk_tier="moderate", portfolio_heat=0.1, max_drawdown=0.05,
            current_drawdown_pct=0.02, reserve_pool_balance=100,
            circuit_breaker_event_count=0,
        )
        mq = ModelQualitySection()

        with (
            patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected),
            patch(_PATCH_BUILD_PNL, return_value=pnl),
            patch(_PATCH_BUILD_REC, return_value=rec),
            patch(_PATCH_BUILD_POS, return_value=pos),
            patch(_PATCH_BUILD_RISK, return_value=risk),
            patch(_PATCH_BUILD_MQ, return_value=mq),
            patch(_PATCH_VALIDATE_REC, return_value=[rec_warning]),
            patch(_PATCH_VALIDATE_MQ, return_value=[]),
            patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.WARNINGS),
            patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="summary"),
            patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="exec"),
            patch(_PATCH_RESOLVER),
        ):
            result = await generate_report(
                pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
            )

        assert result.validation_status == ValidationStatus.WARNINGS
        assert len(result.recommendation_accuracy.validation_warnings) == 1
        assert result.recommendation_accuracy.validation_warnings[0].field_name == "acted_win_rate"

    @pytest.mark.asyncio
    async def test_weekly_report_type(self) -> None:
        """generate_report correctly sets weekly report type."""
        pool = _mock_pool()
        collected = _empty_collected_data()

        pnl = PLSection(
            realized_pnl=0, unrealized_pnl=0, daily_return=0,
            cumulative_return=0, win_count=0, loss_count=0,
            win_rate=0, profit_factor=0, sharpe_ratio=0,
        )
        rec = RecommendationAccuracySection(
            total_evaluated=0, act_count=0, skip_count=0,
            acted_win_rate=0, avg_confidence_acted=0, avg_confidence_skipped=0,
        )
        pos = PositionPerformanceSection()
        risk = RiskMetricsSection(
            current_risk_tier="low", portfolio_heat=0, max_drawdown=0,
            current_drawdown_pct=0, reserve_pool_balance=0,
            circuit_breaker_event_count=0,
        )
        mq = ModelQualitySection()

        with (
            patch(_PATCH_COLLECT, new_callable=AsyncMock, return_value=collected),
            patch(_PATCH_BUILD_PNL, return_value=pnl),
            patch(_PATCH_BUILD_REC, return_value=rec),
            patch(_PATCH_BUILD_POS, return_value=pos),
            patch(_PATCH_BUILD_RISK, return_value=risk),
            patch(_PATCH_BUILD_MQ, return_value=mq),
            patch(_PATCH_VALIDATE_REC, return_value=[]),
            patch(_PATCH_VALIDATE_MQ, return_value=[]),
            patch(_PATCH_COMPUTE_STATUS, return_value=ValidationStatus.PASSED),
            patch(_PATCH_SUMMARIZE, new_callable=AsyncMock, return_value="summary"),
            patch(_PATCH_EXEC_SUMMARY, new_callable=AsyncMock, return_value="exec"),
            patch(_PATCH_RESOLVER),
        ):
            result = await generate_report(
                pool, ReportType.WEEKLY, date(2025, 1, 13), date(2025, 1, 17),
            )

        assert result.report_type == ReportType.WEEKLY
        assert result.period_start == date(2025, 1, 13)
        assert result.period_end == date(2025, 1, 17)


# ═══════════════════════════════════════════════════════════════════════
# 2. store_report - upsert behavior
# Requirements validated: 5.2, 5.3
# ═══════════════════════════════════════════════════════════════════════


class TestStoreReport:
    """Tests for store_report upsert behavior."""

    @pytest.mark.asyncio
    async def test_store_calls_upsert_sql(self) -> None:
        """store_report calls pool.fetchrow with the upsert SQL and correct params."""
        pool = _mock_pool()
        report_id = str(uuid.uuid4())
        pool.fetchrow = AsyncMock(return_value={"id": report_id})

        report = _make_report_data()
        result = await store_report(pool, report)

        assert result == report_id
        pool.fetchrow.assert_awaited_once()

        call_args = pool.fetchrow.call_args
        sql = call_args[0][0]
        assert "INSERT INTO trading_reports" in sql
        assert "ON CONFLICT" in sql
        assert "DO UPDATE" in sql

        # Verify the positional parameters
        assert call_args[0][1] == report.report_type.value
        assert call_args[0][2] == report.period_start
        assert call_args[0][3] == report.period_end
        # param 4 is the JSON string
        assert call_args[0][4] == report.model_dump_json()
        assert call_args[0][5] == report.validation_status.value
        assert call_args[0][6] == report.generated_at

    @pytest.mark.asyncio
    async def test_store_returns_uuid_string(self) -> None:
        """store_report returns the UUID as a string."""
        pool = _mock_pool()
        expected_id = str(uuid.uuid4())
        pool.fetchrow = AsyncMock(return_value={"id": expected_id})

        report = _make_report_data()
        result = await store_report(pool, report)

        assert isinstance(result, str)
        assert result == expected_id

    @pytest.mark.asyncio
    async def test_store_upsert_regeneration(self) -> None:
        """store_report handles regeneration (upsert) for existing period."""
        pool = _mock_pool()
        report_id = str(uuid.uuid4())
        pool.fetchrow = AsyncMock(return_value={"id": report_id})

        # First store
        report1 = _make_report_data()
        result1 = await store_report(pool, report1)

        # Second store (regeneration): same period, different data
        report2 = _make_report_data(
            executive_summary="Updated executive summary",
            generated_at=datetime(2025, 1, 15, 22, 0, tzinfo=timezone.utc),
        )
        result2 = await store_report(pool, report2)

        # Both calls succeed (upsert handles the conflict)
        assert result1 == report_id
        assert result2 == report_id
        assert pool.fetchrow.await_count == 2


# ═══════════════════════════════════════════════════════════════════════
# 3. process_report_job - job processing
# Requirements validated: 5.1, 5.3
# ═══════════════════════════════════════════════════════════════════════


class TestProcessReportJob:
    """Tests for process_report_job."""

    @pytest.mark.asyncio
    async def test_valid_job_calls_generate_and_store(self) -> None:
        """A valid job payload triggers generate_report and store_report."""
        pool = _mock_pool()
        report = _make_report_data()

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
                return_value=report,
            ) as mock_gen,
            patch(
                "services.reporting.generator.store_report",
                new_callable=AsyncMock,
                return_value=str(uuid.uuid4()),
            ) as mock_store,
        ):
            job = {
                "report_type": "daily",
                "period_start": "2025-01-15",
                "period_end": "2025-01-15",
            }
            await process_report_job(pool, job)

        mock_gen.assert_awaited_once_with(
            pool, ReportType.DAILY, date(2025, 1, 15), date(2025, 1, 15),
        )
        mock_store.assert_awaited_once_with(pool, report)

    @pytest.mark.asyncio
    async def test_invalid_report_type_returns_early(self) -> None:
        """An invalid report_type in the job payload causes early return."""
        pool = _mock_pool()

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
            ) as mock_gen,
        ):
            job = {
                "report_type": "invalid_type",
                "period_start": "2025-01-15",
                "period_end": "2025-01-15",
            }
            await process_report_job(pool, job)

        mock_gen.assert_not_awaited()

    @pytest.mark.asyncio
    async def test_invalid_date_returns_early(self) -> None:
        """An invalid date in the job payload causes early return."""
        pool = _mock_pool()

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
            ) as mock_gen,
        ):
            job = {
                "report_type": "daily",
                "period_start": "not-a-date",
                "period_end": "2025-01-15",
            }
            await process_report_job(pool, job)

        mock_gen.assert_not_awaited()

    @pytest.mark.asyncio
    async def test_missing_fields_returns_early(self) -> None:
        """Missing fields in the job payload cause early return."""
        pool = _mock_pool()

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
            ) as mock_gen,
        ):
            job = {}
            await process_report_job(pool, job)

        mock_gen.assert_not_awaited()

    @pytest.mark.asyncio
    async def test_duplicate_job_rejected(self) -> None:
        """A duplicate in-progress job is rejected without calling generate_report."""
        pool = _mock_pool()
        key = "daily:2025-01-20:2025-01-20"

        # Simulate an in-progress job
        _in_progress_jobs.add(key)
        try:
            with (
                patch(
                    "services.reporting.generator.generate_report",
                    new_callable=AsyncMock,
                ) as mock_gen,
            ):
                job = {
                    "report_type": "daily",
                    "period_start": "2025-01-20",
                    "period_end": "2025-01-20",
                }
                await process_report_job(pool, job)

            mock_gen.assert_not_awaited()
        finally:
            _in_progress_jobs.discard(key)

    @pytest.mark.asyncio
    async def test_job_cleans_up_in_progress_on_success(self) -> None:
        """After successful completion, the job key is removed from _in_progress_jobs."""
        pool = _mock_pool()
        report = _make_report_data(
            period_start=date(2025, 1, 21),
            period_end=date(2025, 1, 21),
        )
        key = "daily:2025-01-21:2025-01-21"

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
                return_value=report,
            ),
            patch(
                "services.reporting.generator.store_report",
                new_callable=AsyncMock,
                return_value=str(uuid.uuid4()),
            ),
        ):
            job = {
                "report_type": "daily",
                "period_start": "2025-01-21",
                "period_end": "2025-01-21",
            }
            await process_report_job(pool, job)

        assert key not in _in_progress_jobs

    @pytest.mark.asyncio
    async def test_job_cleans_up_in_progress_on_failure(self) -> None:
        """After all retries fail, the job key is still removed from _in_progress_jobs."""
        pool = _mock_pool()
        key = "daily:2025-01-22:2025-01-22"

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
                side_effect=RuntimeError("DB down"),
            ),
            patch("asyncio.sleep", new_callable=AsyncMock),
        ):
            job = {
                "report_type": "daily",
                "period_start": "2025-01-22",
                "period_end": "2025-01-22",
            }
            await process_report_job(pool, job)

        assert key not in _in_progress_jobs

    @pytest.mark.asyncio
    async def test_retries_on_failure(self) -> None:
        """process_report_job retries up to 3 times on failure."""
        pool = _mock_pool()
        report = _make_report_data(
            period_start=date(2025, 1, 23),
            period_end=date(2025, 1, 23),
        )

        call_count = 0

        async def _gen_side_effect(*args, **kwargs):
            nonlocal call_count
            call_count += 1
            if call_count < 3:
                raise RuntimeError("Transient error")
            return report

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
                side_effect=_gen_side_effect,
            ),
            patch(
                "services.reporting.generator.store_report",
                new_callable=AsyncMock,
                return_value=str(uuid.uuid4()),
            ) as mock_store,
            patch("asyncio.sleep", new_callable=AsyncMock) as mock_sleep,
        ):
            job = {
                "report_type": "daily",
                "period_start": "2025-01-23",
                "period_end": "2025-01-23",
            }
            await process_report_job(pool, job)

        # generate_report called 3 times (2 failures + 1 success)
        assert call_count == 3
        # store_report called once on success
        mock_store.assert_awaited_once()
        # sleep called twice (between retries)
        assert mock_sleep.await_count == 2

    @pytest.mark.asyncio
    async def test_weekly_job(self) -> None:
        """A weekly job payload is processed correctly."""
        pool = _mock_pool()
        report = _make_report_data(
            report_type=ReportType.WEEKLY,
            period_start=date(2025, 1, 13),
            period_end=date(2025, 1, 17),
        )

        with (
            patch(
                "services.reporting.generator.generate_report",
                new_callable=AsyncMock,
                return_value=report,
            ) as mock_gen,
            patch(
                "services.reporting.generator.store_report",
                new_callable=AsyncMock,
                return_value=str(uuid.uuid4()),
            ),
        ):
            job = {
                "report_type": "weekly",
                "period_start": "2025-01-13",
                "period_end": "2025-01-17",
            }
            await process_report_job(pool, job)

        mock_gen.assert_awaited_once_with(
            pool, ReportType.WEEKLY, date(2025, 1, 13), date(2025, 1, 17),
        )
@@ -0,0 +1,578 @@
"""Unit tests for report section builders.

Tests each section builder from services.reporting.sections with known
inputs and expected outputs, including edge cases for zero-activity,
single positions, and missing portfolio snapshots.

Requirements validated: 3.1, 3.2, 3.3, 3.4, 3.5
"""
from __future__ import annotations

import uuid
from datetime import datetime, timezone

from services.reporting.collector import CollectedData
from services.reporting.models import (
    ModelQualitySection,
    PLSection,
    PositionPerformanceSection,
    RecommendationAccuracySection,
    RiskMetricsSection,
)
from services.reporting.sections import (
    build_model_quality_section,
    build_pnl_section,
    build_position_performance_section,
    build_recommendation_accuracy_section,
    build_risk_metrics_section,
)

# ── Helpers ──────────────────────────────────────────────────────────────


def _make_snapshot(**overrides: object) -> dict:
    """Build a portfolio snapshot dict with sensible defaults."""
    snap = {
        "realized_pnl": 100.0,
        "unrealized_pnl": -20.0,
        "daily_return": 0.015,
        "cumulative_return": 0.08,
        "win_count": 7,
        "loss_count": 3,
        "win_rate": 0.7,
        "sharpe_ratio": 1.5,
        "portfolio_heat": 0.12,
        "max_drawdown": 0.06,
        "current_drawdown_pct": 0.02,
        "risk_tier": "moderate",
    }
    snap.update(overrides)
    return snap


def _make_closed_position(
    ticker: str,
    entry: float,
    exit_price: float,
    realized_pnl: float,
    updated_at: datetime | None = None,
) -> dict:
    """Build a closed position dict."""
    return {
        "id": str(uuid.uuid4()),
        "ticker": ticker,
        "avg_entry_price": entry,
        "current_price": exit_price,
        "realized_pnl": realized_pnl,
        "quantity": 0,
        "updated_at": updated_at or datetime(2025, 1, 15, 20, 0, tzinfo=timezone.utc),
    }


def _make_open_position(
    ticker: str,
    entry: float,
    current: float,
    quantity: float,
    updated_at: datetime | None = None,
) -> dict:
    """Build an open position dict."""
    return {
        "id": str(uuid.uuid4()),
        "ticker": ticker,
        "avg_entry_price": entry,
        "current_price": current,
        "quantity": quantity,
        "updated_at": updated_at or datetime(2025, 1, 14, 10, 0, tzinfo=timezone.utc),
    }


# ═══════════════════════════════════════════════════════════════════════
# 1. build_pnl_section
# Requirements validated: 3.1
# ═══════════════════════════════════════════════════════════════════════


class TestBuildPnlSection:
    """Tests for build_pnl_section."""

    def test_with_portfolio_snapshot(self) -> None:
        """Section values are extracted from the portfolio snapshot."""
        snap = _make_snapshot()
        data = CollectedData(portfolio_snapshot=snap)
        section = build_pnl_section(data)

        assert isinstance(section, PLSection)
        assert section.realized_pnl == 100.0
        assert section.unrealized_pnl == -20.0
        assert section.daily_return == 0.015
        assert section.cumulative_return == 0.08
        assert section.win_count == 7
        assert section.loss_count == 3
        assert section.win_rate == 0.7
        assert section.sharpe_ratio == 1.5

    def test_no_snapshot_returns_zeros(self) -> None:
        """When no portfolio snapshot exists, all values are zero."""
        data = CollectedData(portfolio_snapshot=None)
        section = build_pnl_section(data)

        assert section.realized_pnl == 0.0
        assert section.unrealized_pnl == 0.0
        assert section.daily_return == 0.0
        assert section.cumulative_return == 0.0
        assert section.win_count == 0
        assert section.loss_count == 0
        assert section.win_rate == 0.0
        assert section.profit_factor == 0.0
        assert section.sharpe_ratio == 0.0

    def test_profit_factor_from_closed_positions(self) -> None:
        """Profit factor = sum(gains) / abs(sum(losses)) from closed positions."""
        snap = _make_snapshot()
        closed = [
            _make_closed_position("AAPL", 100.0, 110.0, 50.0),  # gain
            _make_closed_position("MSFT", 200.0, 190.0, -20.0),  # loss
            _make_closed_position("GOOG", 150.0, 160.0, 30.0),  # gain
        ]
        data = CollectedData(portfolio_snapshot=snap, closed_positions=closed)
        section = build_pnl_section(data)

        # gains = 50 + 30 = 80, losses = 20
        expected_pf = 80.0 / 20.0
        assert abs(section.profit_factor - expected_pf) < 1e-9

    def test_profit_factor_no_losses(self) -> None:
        """When there are no losses, profit factor is 0.0 (no divisor)."""
        snap = _make_snapshot()
        closed = [
            _make_closed_position("AAPL", 100.0, 110.0, 50.0),
        ]
        data = CollectedData(portfolio_snapshot=snap, closed_positions=closed)
        section = build_pnl_section(data)

        assert section.profit_factor == 0.0

    def test_profit_factor_no_closed_positions(self) -> None:
        """When there are no closed positions, profit factor is 0.0."""
        snap = _make_snapshot()
        data = CollectedData(portfolio_snapshot=snap, closed_positions=[])
        section = build_pnl_section(data)

        assert section.profit_factor == 0.0

    def test_snapshot_with_none_values(self) -> None:
        """Snapshot fields that are None are coerced to zero."""
        snap = _make_snapshot(
            realized_pnl=None,
            unrealized_pnl=None,
            daily_return=None,
            win_count=None,
        )
        data = CollectedData(portfolio_snapshot=snap)
        section = build_pnl_section(data)

        assert section.realized_pnl == 0.0
        assert section.unrealized_pnl == 0.0
        assert section.daily_return == 0.0
        assert section.win_count == 0


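
The three profit-factor tests above fix the expected arithmetic: gains divided by the absolute value of losses, with 0.0 returned when there is nothing to divide by. The sketch below is one way to express that rule. `profit_factor` is a hypothetical helper written for illustration, not the code inside `build_pnl_section`.

```python
from __future__ import annotations


def profit_factor(realized_pnls: list[float]) -> float:
    """Return sum(gains) / abs(sum(losses)), or 0.0 when there are no losses."""
    gains = sum(p for p in realized_pnls if p > 0)
    losses = abs(sum(p for p in realized_pnls if p < 0))
    # With no losing positions there is no divisor, so report 0.0.
    return gains / losses if losses > 0 else 0.0


# Same inputs as test_profit_factor_from_closed_positions: 80 / 20
print(profit_factor([50.0, -20.0, 30.0]))
```

Returning 0.0 for a loss-free period mirrors what the tests assert; a reader may want to keep in mind that 0.0 here means "undefined", not "no profit".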
# ═══════════════════════════════════════════════════════════════════════
# 2. build_recommendation_accuracy_section
# Requirements validated: 3.2
# ═══════════════════════════════════════════════════════════════════════


class TestBuildRecommendationAccuracySection:
    """Tests for build_recommendation_accuracy_section."""

    def test_with_act_and_skip_decisions(self) -> None:
        """Correctly counts act/skip and computes win rate and confidence."""
        rec_id_1 = str(uuid.uuid4())
        rec_id_2 = str(uuid.uuid4())
        rec_id_3 = str(uuid.uuid4())

        data = CollectedData(
            trading_decisions=[
                {"id": "td1", "recommendation_id": rec_id_1, "decision": "act", "ticker": "AAPL"},
                {"id": "td2", "recommendation_id": rec_id_2, "decision": "skip", "ticker": "MSFT"},
                {"id": "td3", "recommendation_id": rec_id_3, "decision": "act", "ticker": "GOOG"},
            ],
            recommendations=[
                {"id": rec_id_1, "confidence": 0.8},
                {"id": rec_id_2, "confidence": 0.3},
                {"id": rec_id_3, "confidence": 0.9},
            ],
            prediction_outcomes=[
                {"ticker": "AAPL", "profitable": True, "direction_correct": True},
                {"ticker": "GOOG", "profitable": False, "direction_correct": False},
            ],
        )
        section = build_recommendation_accuracy_section(data)

        assert isinstance(section, RecommendationAccuracySection)
        assert section.total_evaluated == 3
        assert section.act_count == 2
        assert section.skip_count == 1
        # 1 win out of 2 acted with outcomes
        assert abs(section.acted_win_rate - 0.5) < 1e-9
        # avg confidence acted = (0.8 + 0.9) / 2 = 0.85
        assert abs(section.avg_confidence_acted - 0.85) < 1e-9
        # avg confidence skipped = 0.3
        assert abs(section.avg_confidence_skipped - 0.3) < 1e-9

    def test_no_decisions_returns_zeros(self) -> None:
        """When there are no trading decisions, all values are zero."""
        data = CollectedData(trading_decisions=[])
        section = build_recommendation_accuracy_section(data)

        assert section.total_evaluated == 0
        assert section.act_count == 0
        assert section.skip_count == 0
        assert section.acted_win_rate == 0.0
        assert section.avg_confidence_acted == 0.0
        assert section.avg_confidence_skipped == 0.0

    def test_all_act_decisions(self) -> None:
        """When all decisions are 'act', skip_count is 0."""
        rec_id = str(uuid.uuid4())
        data = CollectedData(
            trading_decisions=[
                {"id": "td1", "recommendation_id": rec_id, "decision": "act", "ticker": "AAPL"},
            ],
            recommendations=[
                {"id": rec_id, "confidence": 0.75},
            ],
            prediction_outcomes=[
                {"ticker": "AAPL", "profitable": True, "direction_correct": True},
            ],
        )
        section = build_recommendation_accuracy_section(data)

        assert section.act_count == 1
        assert section.skip_count == 0
        assert section.acted_win_rate == 1.0
        assert abs(section.avg_confidence_acted - 0.75) < 1e-9
        assert section.avg_confidence_skipped == 0.0

    def test_act_without_prediction_outcome(self) -> None:
        """When an acted decision has no matching prediction outcome, win rate is 0."""
        rec_id = str(uuid.uuid4())
        data = CollectedData(
            trading_decisions=[
                {"id": "td1", "recommendation_id": rec_id, "decision": "act", "ticker": "AAPL"},
            ],
            recommendations=[
                {"id": rec_id, "confidence": 0.6},
            ],
            prediction_outcomes=[],  # no outcomes
        )
        section = build_recommendation_accuracy_section(data)

        assert section.act_count == 1
        assert section.acted_win_rate == 0.0


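
The recommendation-accuracy tests above imply a small join: decisions are linked to recommendations through `recommendation_id` for confidence, and acted decisions are scored against prediction outcomes. The following sketch reproduces the expected numbers under two assumptions that the tests suggest but do not confirm: outcomes are matched to decisions by `ticker`, and a win is an outcome with `profitable` set. `summarize_decisions` is a hypothetical name, not the real builder.

```python
from __future__ import annotations


def _mean(values: list[float]) -> float:
    """Average of values, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def summarize_decisions(
    decisions: list[dict], recommendations: list[dict], outcomes: list[dict]
) -> dict:
    """Count act/skip decisions and derive win rate and average confidences."""
    confidence = {r["id"]: r["confidence"] for r in recommendations}
    # Assumption: outcomes are matched to decisions by ticker.
    profitable = {o["ticker"]: o["profitable"] for o in outcomes}

    acted = [d for d in decisions if d["decision"] == "act"]
    skipped = [d for d in decisions if d["decision"] == "skip"]
    # Only acted decisions that have an outcome contribute to the win rate.
    wins = [1.0 if profitable[d["ticker"]] else 0.0
            for d in acted if d["ticker"] in profitable]

    def avg_conf(rows: list[dict]) -> float:
        return _mean([confidence[d["recommendation_id"]]
                      for d in rows if d["recommendation_id"] in confidence])

    return {
        "total_evaluated": len(decisions),
        "act_count": len(acted),
        "skip_count": len(skipped),
        "acted_win_rate": _mean(wins),
        "avg_confidence_acted": avg_conf(acted),
        "avg_confidence_skipped": avg_conf(skipped),
    }
```

Every average falls back to 0.0 on empty input, which matches the zero-activity and missing-outcome cases tested above.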
# ═══════════════════════════════════════════════════════════════════════
# 3. build_position_performance_section
# Requirements validated: 3.3
# ═══════════════════════════════════════════════════════════════════════


class TestBuildPositionPerformanceSection:
    """Tests for build_position_performance_section."""

    def test_with_open_positions(self) -> None:
        """Open positions are listed with computed P&L and P&L%."""
        pos = _make_open_position("AAPL", 150.0, 160.0, 10.0)
        data = CollectedData(open_positions=[pos])
        section = build_position_performance_section(data)

        assert isinstance(section, PositionPerformanceSection)
        assert len(section.positions) == 1

        p = section.positions[0]
        assert p.ticker == "AAPL"
        assert p.entry_price == 150.0
        assert p.current_or_exit_price == 160.0
        assert p.status == "open"
        # pnl = (160 - 150) * 10 = 100
        assert abs(p.pnl - 100.0) < 1e-9
        # pnl_pct = 100 / (150 * 10) * 100 = 6.666...%
        assert abs(p.pnl_pct - (100.0 / 1500.0 * 100)) < 1e-6

    def test_with_closed_positions(self) -> None:
        """Closed positions use realized_pnl directly."""
        pos = _make_closed_position("MSFT", 200.0, 210.0, 50.0)
        data = CollectedData(closed_positions=[pos])
        section = build_position_performance_section(data)

        assert len(section.positions) == 1
        p = section.positions[0]
        assert p.ticker == "MSFT"
        assert p.status == "closed"
        assert p.pnl == 50.0

    def test_empty_positions(self) -> None:
        """When there are no positions, the list is empty."""
        data = CollectedData(open_positions=[], closed_positions=[])
        section = build_position_performance_section(data)

        assert isinstance(section, PositionPerformanceSection)
        assert len(section.positions) == 0

def test_mixed_open_and_closed(self) -> None:
|
||||
"""Both open and closed positions appear in the output."""
|
||||
open_pos = _make_open_position("AAPL", 150.0, 160.0, 10.0)
|
||||
closed_pos = _make_closed_position("GOOG", 100.0, 90.0, -25.0)
|
||||
data = CollectedData(open_positions=[open_pos], closed_positions=[closed_pos])
|
||||
section = build_position_performance_section(data)
|
||||
|
||||
assert len(section.positions) == 2
|
||||
tickers = {p.ticker for p in section.positions}
|
||||
assert tickers == {"AAPL", "GOOG"}
|
||||
|
||||
statuses = {p.ticker: p.status for p in section.positions}
|
||||
assert statuses["AAPL"] == "open"
|
||||
assert statuses["GOOG"] == "closed"
|
||||
|
||||
def test_single_position(self) -> None:
|
||||
"""A single open position is handled correctly."""
|
||||
pos = _make_open_position("TSLA", 250.0, 250.0, 5.0)
|
||||
data = CollectedData(open_positions=[pos])
|
||||
section = build_position_performance_section(data)
|
||||
|
||||
assert len(section.positions) == 1
|
||||
p = section.positions[0]
|
||||
# pnl = (250 - 250) * 5 = 0
|
||||
assert p.pnl == 0.0
|
||||
assert p.pnl_pct == 0.0
|
||||
|
||||
def test_hold_duration_computed(self) -> None:
|
||||
"""Hold duration is computed from updated_at to now."""
|
||||
# Use a fixed updated_at far enough in the past to get a positive duration
|
||||
updated = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
|
||||
pos = _make_open_position("AAPL", 100.0, 110.0, 1.0, updated_at=updated)
|
||||
data = CollectedData(open_positions=[pos])
|
||||
section = build_position_performance_section(data)
|
||||
|
||||
# Hold duration should be positive (since updated_at is in the past)
|
||||
assert section.positions[0].hold_duration_hours > 0.0
|
||||
|
||||
|
||||
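The arithmetic in the open-position tests above can be sketched as follows. This is only an illustration of the formula stated in the test comments; `open_position_pnl` is a hypothetical helper, not the function under test, and the zero-cost-basis guard is an assumption.

```python
from __future__ import annotations


def open_position_pnl(
    entry_price: float, current_price: float, quantity: float
) -> tuple[float, float]:
    """Return (pnl, pnl_pct) for an open position.

    Hypothetical helper mirroring the formula in the test comments:
    pnl = (current - entry) * quantity, pnl_pct = pnl / cost_basis * 100.
    """
    pnl = (current_price - entry_price) * quantity
    cost_basis = entry_price * quantity
    # Assumption: a zero cost basis yields 0% rather than a division error.
    pnl_pct = pnl / cost_basis * 100.0 if cost_basis else 0.0
    return pnl, pnl_pct


pnl, pnl_pct = open_position_pnl(150.0, 160.0, 10.0)
print(pnl, round(pnl_pct, 3))
```

A flat position (entry equal to current price) gives zero for both values, which matches what `test_single_position` expects.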
# ═══════════════════════════════════════════════════════════════════════
# 4. build_risk_metrics_section
# Requirements validated: 3.4
# ═══════════════════════════════════════════════════════════════════════


class TestBuildRiskMetricsSection:
    """Tests for build_risk_metrics_section."""

    def test_with_snapshot(self) -> None:
        """Risk metrics are extracted from the portfolio snapshot."""
        snap = _make_snapshot(
            risk_tier="high",
            portfolio_heat=0.25,
            max_drawdown=0.10,
            current_drawdown_pct=0.05,
        )
        data = CollectedData(
            portfolio_snapshot=snap,
            reserve_pool_balance=500.0,
            circuit_breaker_events=[{"id": "cb1"}, {"id": "cb2"}],
        )
        section = build_risk_metrics_section(data)

        assert isinstance(section, RiskMetricsSection)
        assert section.current_risk_tier == "high"
        assert section.portfolio_heat == 0.25
        assert section.max_drawdown == 0.10
        assert section.current_drawdown_pct == 0.05
        assert section.reserve_pool_balance == 500.0
        assert section.circuit_breaker_event_count == 2

    def test_no_snapshot(self) -> None:
        """When no snapshot exists, risk tier is 'unknown' and metrics are zero."""
        data = CollectedData(
            portfolio_snapshot=None,
            reserve_pool_balance=300.0,
            circuit_breaker_events=[],
        )
        section = build_risk_metrics_section(data)

        assert section.current_risk_tier == "unknown"
        assert section.portfolio_heat == 0.0
        assert section.max_drawdown == 0.0
        assert section.current_drawdown_pct == 0.0
        assert section.reserve_pool_balance == 300.0
        assert section.circuit_breaker_event_count == 0

    def test_circuit_breaker_count(self) -> None:
        """Circuit breaker event count matches the number of events."""
        events = [{"id": f"cb{i}"} for i in range(5)]
        data = CollectedData(
            portfolio_snapshot=_make_snapshot(),
            circuit_breaker_events=events,
            reserve_pool_balance=0.0,
        )
        section = build_risk_metrics_section(data)

        assert section.circuit_breaker_event_count == 5

    def test_zero_circuit_breaker_events(self) -> None:
        """Zero circuit breaker events when list is empty."""
        data = CollectedData(
            portfolio_snapshot=_make_snapshot(),
            circuit_breaker_events=[],
            reserve_pool_balance=100.0,
        )
        section = build_risk_metrics_section(data)

        assert section.circuit_breaker_event_count == 0


# ═══════════════════════════════════════════════════════════════════════
# 5. build_model_quality_section
# Requirements validated: 3.5
# ═══════════════════════════════════════════════════════════════════════


class TestBuildModelQualitySection:
    """Tests for build_model_quality_section."""

    def test_with_all_windows(self) -> None:
        """Model quality section extracts metrics for 7d, 30d, 90d windows."""
        snapshots = [
            {
                "lookback_window": "7d",
                "generated_at": "2025-01-15T20:00:00Z",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
            {
                "lookback_window": "30d",
                "generated_at": "2025-01-15T20:00:00Z",
                "win_rate": 0.60,
                "directional_accuracy": 0.58,
                "information_coefficient": 0.06,
                "calibration_error": 0.15,
                "brier_score": 0.25,
            },
            {
                "lookback_window": "90d",
                "generated_at": "2025-01-15T20:00:00Z",
                "win_rate": 0.55,
                "directional_accuracy": 0.53,
                "information_coefficient": 0.04,
                "calibration_error": 0.18,
                "brier_score": 0.28,
            },
        ]
        data = CollectedData(model_metric_snapshots=snapshots)
        section = build_model_quality_section(data)

        assert isinstance(section, ModelQualitySection)
        assert len(section.windows) == 3

        by_lookback = {w.lookback: w for w in section.windows}
        assert by_lookback["7d"].win_rate == 0.65
        assert by_lookback["7d"].directional_accuracy == 0.62
        assert by_lookback["7d"].information_coefficient == 0.08
        assert by_lookback["7d"].calibration_error == 0.12
        assert by_lookback["7d"].brier_score == 0.22

        assert by_lookback["30d"].win_rate == 0.60
        assert by_lookback["90d"].win_rate == 0.55

    def test_no_snapshots(self) -> None:
        """When there are no model metric snapshots, windows list is empty."""
        data = CollectedData(model_metric_snapshots=[])
        section = build_model_quality_section(data)

        assert isinstance(section, ModelQualitySection)
        assert len(section.windows) == 0

    def test_partial_windows(self) -> None:
        """When only some lookback windows are present, missing ones get None values."""
        snapshots = [
            {
                "lookback_window": "7d",
                "generated_at": "2025-01-15T20:00:00Z",
                "win_rate": 0.70,
                "directional_accuracy": 0.68,
                "information_coefficient": 0.10,
                "calibration_error": 0.08,
                "brier_score": 0.18,
            },
        ]
        data = CollectedData(model_metric_snapshots=snapshots)
        section = build_model_quality_section(data)

        assert len(section.windows) == 3
        by_lookback = {w.lookback: w for w in section.windows}

        # 7d has values
        assert by_lookback["7d"].win_rate == 0.70

        # 30d and 90d have None values
        assert by_lookback["30d"].win_rate is None
        assert by_lookback["30d"].directional_accuracy is None
        assert by_lookback["90d"].win_rate is None
        assert by_lookback["90d"].brier_score is None

    def test_takes_latest_snapshot_per_window(self) -> None:
        """When multiple snapshots exist for a window, the first (latest) is used."""
        snapshots = [
            {
                "lookback_window": "7d",
                "generated_at": "2025-01-15T20:00:00Z",
                "win_rate": 0.70,
                "directional_accuracy": None,
                "information_coefficient": None,
                "calibration_error": None,
                "brier_score": None,
            },
            {
                "lookback_window": "7d",
                "generated_at": "2025-01-14T20:00:00Z",
                "win_rate": 0.50,
                "directional_accuracy": None,
                "information_coefficient": None,
                "calibration_error": None,
                "brier_score": None,
            },
        ]
        data = CollectedData(model_metric_snapshots=snapshots)
        section = build_model_quality_section(data)

        by_lookback = {w.lookback: w for w in section.windows}
        # Collector orders by generated_at DESC, so first entry (0.70) is latest
        assert by_lookback["7d"].win_rate == 0.70

    def test_none_metric_values(self) -> None:
        """Snapshot with None metric values produces None in the window."""
        snapshots = [
            {
                "lookback_window": "7d",
                "generated_at": "2025-01-15T20:00:00Z",
                "win_rate": None,
                "directional_accuracy": None,
                "information_coefficient": None,
                "calibration_error": None,
                "brier_score": None,
            },
        ]
        data = CollectedData(model_metric_snapshots=snapshots)
        section = build_model_quality_section(data)

        w = section.windows[0]
        assert w.win_rate is None
        assert w.directional_accuracy is None
        assert w.information_coefficient is None
        assert w.calibration_error is None
        assert w.brier_score is None
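The window-selection behaviour these tests pin down (first snapshot per lookback wins, missing windows are filled with `None`, no snapshots gives no windows) could be sketched roughly like this. The function name, the plain-dict output, and the fixed `LOOKBACKS` order are assumptions made for illustration; the real `build_model_quality_section` returns model objects and may be structured differently.

```python
from __future__ import annotations

from typing import Any

# Assumed fixed window order and metric names, taken from the test fixtures.
LOOKBACKS = ("7d", "30d", "90d")
METRICS = (
    "win_rate",
    "directional_accuracy",
    "information_coefficient",
    "calibration_error",
    "brier_score",
)


def latest_per_window(snapshots: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pick the first snapshot seen per lookback window (hypothetical helper).

    Assumes the input is already ordered by generated_at DESC, so the first
    entry for a window is the latest one.
    """
    if not snapshots:
        return []
    first_seen: dict[str, dict[str, Any]] = {}
    for snap in snapshots:
        # setdefault keeps the earliest-listed (i.e. latest) snapshot.
        first_seen.setdefault(snap["lookback_window"], snap)
    windows = []
    for lookback in LOOKBACKS:
        snap = first_seen.get(lookback, {})
        # Missing windows or missing metrics become None.
        windows.append({"lookback": lookback, **{m: snap.get(m) for m in METRICS}})
    return windows


rows = latest_per_window(
    [
        {"lookback_window": "7d", "win_rate": 0.70},
        {"lookback_window": "7d", "win_rate": 0.50},
    ]
)
print([(r["lookback"], r["win_rate"]) for r in rows])
```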
@@ -0,0 +1,203 @@
"""Unit tests for AI summarizer.

Tests the deterministic fallback summary generation and chunk_data edge cases
from services.reporting.summarizer.

Requirements validated: 2.2, 2.6
"""
from __future__ import annotations

from services.reporting.summarizer import build_deterministic_summary, chunk_data

# ═══════════════════════════════════════════════════════════════════════
# 1. chunk_data — edge cases
# Requirements validated: 2.2
# ═══════════════════════════════════════════════════════════════════════


class TestChunkDataEdgeCases:
    """Tests for chunk_data edge cases."""

    def test_empty_input_returns_single_empty_chunk(self) -> None:
        """Empty input produces exactly one empty-string chunk."""
        result = chunk_data("", max_chars=100)
        assert result == [""]

    def test_single_character_returns_one_chunk(self) -> None:
        """A single character fits in one chunk."""
        result = chunk_data("x", max_chars=100)
        assert result == ["x"]

    def test_exactly_at_limit_returns_one_chunk(self) -> None:
        """A string exactly at the limit fits in one chunk."""
        data = "a" * 50
        result = chunk_data(data, max_chars=50)
        assert result == [data]

    def test_one_char_over_limit_with_newline_returns_two_chunks(self) -> None:
        """A string one char over the limit (with a newline) splits into two chunks."""
        # 25 chars + newline + 25 chars = 51 chars total, limit=50
        data = "a" * 25 + "\n" + "b" * 25
        result = chunk_data(data, max_chars=50)
        assert len(result) == 2
        # First chunk: "aaa...a\n" (26 chars), second chunk: "bbb...b" (25 chars)
        assert result[0] == "a" * 25 + "\n"
        assert result[1] == "b" * 25
        # Round-trip: concatenation reconstructs original
        assert "".join(result) == data

    def test_no_newlines_in_long_string_returns_one_chunk(self) -> None:
        """A long string with no newlines is never broken mid-line — stays as one chunk."""
        data = "x" * 200
        result = chunk_data(data, max_chars=50)
        # No newlines means no split points, so the entire string is one chunk
        assert result == [data]

    def test_multiple_newlines_proper_splitting(self) -> None:
        """Multiple newlines produce proper splitting at line boundaries."""
        # Two 30-char lines (each including its newline) followed by a 29-char
        # line with no trailing newline: "aaa...\n" "bbb...\n" "ccc..."
        line_a = "a" * 29 + "\n"  # 30 chars
        line_b = "b" * 29 + "\n"  # 30 chars
        line_c = "c" * 29  # 29 chars
        data = line_a + line_b + line_c  # 89 chars total
        result = chunk_data(data, max_chars=60)
        # First chunk: line_a + line_b = 60 chars (exactly at limit)
        # Second chunk: line_c = 29 chars
        assert len(result) == 2
        assert result[0] == line_a + line_b
        assert result[1] == line_c
        assert "".join(result) == data

    def test_round_trip_concatenation(self) -> None:
        """Concatenating all chunks reconstructs the original string."""
        data = "line1\nline2\nline3\nline4\n"
        result = chunk_data(data, max_chars=12)
        assert "".join(result) == data

    def test_max_chars_one(self) -> None:
        """With max_chars=1, each line-segment becomes its own chunk."""
        data = "a\nb"
        result = chunk_data(data, max_chars=1)
        # "a\n" is 2 chars but no split point within it, so it's one chunk
        # "b" is 1 char, another chunk
        assert "".join(result) == data
        assert len(result) >= 2


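Taken together, the edge cases above describe a greedy, line-preserving splitter. A minimal sketch that satisfies every case in this class is shown below; `chunk_data_sketch` is a hypothetical stand-in, and the real `chunk_data` in `services.reporting.summarizer` may be implemented differently.

```python
from __future__ import annotations


def chunk_data_sketch(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole lines into chunks of at most max_chars.

    Hypothetical re-implementation inferred from the tests: a line is never
    split, so a single line longer than max_chars becomes an oversized chunk.
    """
    # keepends=True keeps terminators so "".join(chunks) == text.
    # Note: splitlines also treats "\r" and a few other characters as line ends.
    lines = text.splitlines(keepends=True)
    if not lines:
        return [""]  # empty input yields one empty chunk

    chunks: list[str] = []
    current = ""
    for line in lines:
        # Start a new chunk only if adding this line would exceed the limit
        # and the current chunk already holds something.
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = line
        else:
            current += line
    chunks.append(current)
    return chunks


print(chunk_data_sketch("a" * 25 + "\n" + "b" * 25, max_chars=50))
```

The design choice worth noting is that the limit is soft: preserving line boundaries and the round-trip property takes priority over never exceeding `max_chars`.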
# ═══════════════════════════════════════════════════════════════════════
# 2. build_deterministic_summary — section type templates
# Requirements validated: 2.6
# ═══════════════════════════════════════════════════════════════════════


class TestBuildDeterministicSummary:
    """Tests for build_deterministic_summary with each section type."""

    def test_pnl_section(self) -> None:
        """P&L section uses the pnl template with realized_pnl, unrealized_pnl, etc."""
        data = {
            "realized_pnl": 125.50,
            "unrealized_pnl": -30.20,
            "daily_return": 1.2,
            "win_rate": 72.7,
        }
        result = build_deterministic_summary("pnl", data)
        assert "125.5" in result
        assert "-30.2" in result
        assert "1.2" in result
        assert "72.7" in result
        assert result.startswith("P&L Summary:")

    def test_recommendation_accuracy_section(self) -> None:
        """Recommendation accuracy section uses the template with total_evaluated, act_count, etc."""
        data = {
            "total_evaluated": 15,
            "act_count": 8,
            "acted_win_rate": 75.0,
            "skip_count": 7,
            "avg_confidence_acted": 0.72,
            "avg_confidence_skipped": 0.48,
        }
        result = build_deterministic_summary("recommendation_accuracy", data)
        assert "15" in result
        assert "8" in result
        assert "75.0" in result or "75" in result
        assert "7" in result
        assert result.startswith("Recommendation Accuracy:")

    def test_position_performance_section(self) -> None:
        """Position performance section uses the template with position count."""
        data = {
            "positions": [
                {"ticker": "AAPL", "pnl": 68.0},
                {"ticker": "MSFT", "pnl": -12.0},
                {"ticker": "GOOG", "pnl": 25.0},
            ],
        }
        result = build_deterministic_summary("position_performance", data)
        assert "3" in result
        assert "Position Performance:" in result

    def test_position_performance_empty_positions(self) -> None:
        """Position performance with no positions reports 0."""
        data = {"positions": []}
        result = build_deterministic_summary("position_performance", data)
        assert "0" in result

    def test_risk_metrics_section(self) -> None:
        """Risk metrics section uses the template with risk_tier, portfolio_heat, etc."""
        data = {
            "current_risk_tier": "moderate",
            "portfolio_heat": 0.12,
            "max_drawdown": 0.08,
            "current_drawdown_pct": 3.0,
            "reserve_pool_balance": 450.00,
            "circuit_breaker_event_count": 1,
        }
        result = build_deterministic_summary("risk_metrics", data)
        assert "moderate" in result
        assert "0.12" in result
        assert "0.08" in result
        assert "3.0" in result or "3" in result
        assert "450" in result
        assert "1" in result
        assert result.startswith("Risk Metrics:")

    def test_model_quality_section(self) -> None:
        """Model quality section uses the template with window count."""
        data = {
            "windows": [
                {"lookback": "7d"},
                {"lookback": "30d"},
                {"lookback": "90d"},
            ],
        }
        result = build_deterministic_summary("model_quality", data)
        assert "3" in result
        assert "Model Quality:" in result

    def test_model_quality_no_windows(self) -> None:
        """Model quality with no windows reports 0."""
        data = {"windows": []}
        result = build_deterministic_summary("model_quality", data)
        assert "0" in result

    def test_unknown_section_generic_fallback(self) -> None:
        """An unknown section name produces a generic fallback summary."""
        data = {"metric_a": 1, "metric_b": 2, "metric_c": 3}
        result = build_deterministic_summary("unknown_section", data)
        assert "unknown_section" in result
        assert "3 metrics reported" in result

    def test_unknown_section_empty_data(self) -> None:
        """An unknown section with empty data reports 0 metrics."""
        result = build_deterministic_summary("totally_new", {})
        assert "totally_new" in result
        assert "0 metrics reported" in result

    def test_pnl_missing_key_falls_back(self) -> None:
        """P&L template with missing keys falls back to error message."""
        data = {"realized_pnl": 100.0}  # missing other keys
        result = build_deterministic_summary("pnl", data)
        # Should fall back to the error message since template.format() will raise KeyError
        assert "template formatting failed" in result
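The template-with-fallback behaviour exercised above might look something like the sketch below. The template wording, the `TEMPLATES` mapping, and the function name are all invented for illustration; only the "P&L Summary:" prefix, the "N metrics reported" fallback, and the "template formatting failed" message come from the assertions in this file.

```python
from __future__ import annotations

from typing import Any

# Hypothetical template text; the real wording lives in the summarizer module.
TEMPLATES = {
    "pnl": (
        "P&L Summary: realized {realized_pnl}, unrealized {unrealized_pnl}, "
        "daily return {daily_return}%, win rate {win_rate}%."
    ),
}


def deterministic_summary_sketch(section: str, data: dict[str, Any]) -> str:
    """Format a fixed template, degrading gracefully instead of raising."""
    template = TEMPLATES.get(section)
    if template is None:
        # Unknown section: generic summary based on the number of keys.
        return f"{section}: {len(data)} metrics reported."
    try:
        return template.format(**data)
    except KeyError as exc:
        # str.format raises KeyError when a placeholder has no matching key.
        return f"{section}: template formatting failed (missing {exc})."


print(deterministic_summary_sketch("pnl", {"realized_pnl": 100.0}))
```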
@@ -0,0 +1,551 @@
"""Unit tests for report validator.

Tests the validation functions from services.reporting.validator with
specific discrepancy scenarios, boundary cases, and edge cases.

Requirements validated: 4.1, 4.2, 4.3, 4.4
"""
from __future__ import annotations

from datetime import date, datetime, timezone

from services.reporting.models import (
    ModelQualitySection,
    ModelQualityWindow,
    PLSection,
    PositionPerformanceSection,
    RecommendationAccuracySection,
    ReportData,
    ReportType,
    RiskMetricsSection,
    ValidationStatus,
)
from services.reporting.validator import (
    _check_discrepancy,
    compute_validation_status,
    validate_model_quality,
    validate_recommendation_accuracy,
)

# ── Helpers ──────────────────────────────────────────────────────────────


def _make_report(**overrides: object) -> ReportData:
    """Build a minimal ReportData with sensible defaults."""
    defaults: dict = {
        "pnl": PLSection(
            realized_pnl=0.0,
            unrealized_pnl=0.0,
            daily_return=0.0,
            cumulative_return=0.0,
            win_count=0,
            loss_count=0,
            win_rate=0.0,
            profit_factor=0.0,
            sharpe_ratio=0.0,
        ),
        "recommendation_accuracy": RecommendationAccuracySection(
            total_evaluated=0,
            act_count=0,
            skip_count=0,
            acted_win_rate=0.0,
            avg_confidence_acted=0.0,
            avg_confidence_skipped=0.0,
        ),
        "position_performance": PositionPerformanceSection(),
        "risk_metrics": RiskMetricsSection(
            current_risk_tier="moderate",
            portfolio_heat=0.0,
            max_drawdown=0.0,
            current_drawdown_pct=0.0,
            reserve_pool_balance=0.0,
            circuit_breaker_event_count=0,
        ),
        "model_quality": ModelQualitySection(),
        "generated_at": datetime(2025, 1, 15, 21, 30, tzinfo=timezone.utc),
        "period_start": date(2025, 1, 15),
        "period_end": date(2025, 1, 15),
        "report_type": ReportType.DAILY,
    }
    defaults.update(overrides)
    return ReportData(**defaults)


# ═══════════════════════════════════════════════════════════════════════
# 1. _check_discrepancy — boundary tests
# Requirements validated: 4.1, 4.2, 4.3
# ═══════════════════════════════════════════════════════════════════════


class TestCheckDiscrepancy:
    """Tests for _check_discrepancy boundary and edge cases."""

    def test_exactly_5_percent_no_warning(self) -> None:
        """Exactly 5% discrepancy does NOT trigger a warning (threshold is >5%)."""
        # snapshot=100, computed=105 → |105-100|/100*100 = 5.0%
        result = _check_discrepancy("test_field", 105.0, 100.0)
        assert result is None

    def test_just_above_5_percent_triggers_warning(self) -> None:
        """5.1% discrepancy triggers a warning."""
        # snapshot=100, computed=105.1 → |105.1-100|/100*100 = 5.1%
        result = _check_discrepancy("test_field", 105.1, 100.0)
        assert result is not None
        assert result.field_name == "test_field"
        assert result.computed_value == 105.1
        assert result.snapshot_value == 100.0
        assert abs(result.pct_difference - 5.1) < 0.01

    def test_snapshot_zero_computed_nonzero_warns(self) -> None:
        """snapshot=0 with computed≠0 → 100% discrepancy → warning."""
        result = _check_discrepancy("test_field", 42.0, 0.0)
        assert result is not None
        assert result.pct_difference == 100.0

    def test_both_zero_no_warning(self) -> None:
        """Both snapshot=0 and computed=0 → no warning."""
        result = _check_discrepancy("test_field", 0.0, 0.0)
        assert result is None

    def test_large_discrepancy(self) -> None:
        """A large discrepancy (50%) triggers a warning."""
        # snapshot=100, computed=150 → 50%
        result = _check_discrepancy("big_diff", 150.0, 100.0)
        assert result is not None
        assert abs(result.pct_difference - 50.0) < 0.01

    def test_small_discrepancy_no_warning(self) -> None:
        """A small discrepancy (1%) does not trigger a warning."""
        # snapshot=100, computed=101 → 1%
        result = _check_discrepancy("small_diff", 101.0, 100.0)
        assert result is None

    def test_computed_below_snapshot(self) -> None:
        """Discrepancy is detected when computed < snapshot too."""
        # snapshot=100, computed=94 → 6%
        result = _check_discrepancy("below", 94.0, 100.0)
        assert result is not None
        assert abs(result.pct_difference - 6.0) < 0.01

    def test_nan_computed_sanitized_to_zero(self) -> None:
        """NaN computed value is sanitized to 0.0 before comparison."""
        result = _check_discrepancy("nan_field", float("nan"), 100.0)
        # sanitized computed=0.0, snapshot=100 → 100% discrepancy
        assert result is not None
        assert result.computed_value == 0.0
        assert result.pct_difference == 100.0

    def test_inf_computed_sanitized_to_zero(self) -> None:
        """Infinity computed value is sanitized to 0.0 before comparison."""
        result = _check_discrepancy("inf_field", float("inf"), 100.0)
        assert result is not None
        assert result.computed_value == 0.0

    def test_snapshot_zero_computed_zero_small(self) -> None:
        """snapshot=0.0 and computed=0.0 exactly → no warning."""
        result = _check_discrepancy("zero_zero", 0.0, 0.0)
        assert result is None


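The boundary rules these tests encode (strict greater-than-5% threshold, zero-snapshot handling, NaN/inf sanitisation) can be summarised in a small sketch. `DiscrepancyWarning` and `check_discrepancy_sketch` are hypothetical names, the `threshold_pct` parameter is an assumption, and the real `_check_discrepancy` returns the project's own `ValidationWarning` model.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class DiscrepancyWarning:
    """Hypothetical stand-in for the project's ValidationWarning model."""

    field_name: str
    computed_value: float
    snapshot_value: float
    pct_difference: float


def check_discrepancy_sketch(
    field_name: str,
    computed: float,
    snapshot: float,
    threshold_pct: float = 5.0,  # assumed default, matching the ">5%" docstring
) -> DiscrepancyWarning | None:
    """Return a warning when computed deviates from snapshot by more than the threshold."""
    # NaN and +/-inf are sanitised to 0.0 before comparison.
    if not math.isfinite(computed):
        computed = 0.0
    if snapshot == 0.0:
        if computed == 0.0:
            return None
        # A relative difference is undefined here; the tests expect 100%.
        return DiscrepancyWarning(field_name, computed, snapshot, 100.0)
    # Multiply before dividing so the exact-5% boundary stays exact in floats.
    pct = abs(computed - snapshot) * 100.0 / abs(snapshot)
    if pct > threshold_pct:
        return DiscrepancyWarning(field_name, computed, snapshot, pct)
    return None


print(check_discrepancy_sketch("win_rate", 105.0, 100.0))
print(check_discrepancy_sketch("win_rate", 94.0, 100.0))
```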
# ═══════════════════════════════════════════════════════════════════════
# 2. validate_recommendation_accuracy
# Requirements validated: 4.1
# ═══════════════════════════════════════════════════════════════════════


class TestValidateRecommendationAccuracy:
    """Tests for validate_recommendation_accuracy."""

    def test_matching_data_no_warnings(self) -> None:
        """When section win rate matches prediction outcomes, no warnings."""
        # 2 out of 4 profitable → 0.5 win rate
        section = RecommendationAccuracySection(
            total_evaluated=4,
            act_count=4,
            skip_count=0,
            acted_win_rate=0.5,
            avg_confidence_acted=0.7,
            avg_confidence_skipped=0.0,
        )
        outcomes = [
            {"profitable": True},
            {"profitable": False},
            {"profitable": True},
            {"profitable": False},
        ]
        warnings = validate_recommendation_accuracy(section, outcomes)
        assert warnings == []

    def test_discrepancy_triggers_warning(self) -> None:
        """When section win rate differs >5% from outcomes, a warning is raised."""
        # outcomes: 1/2 profitable → 0.5, section says 0.8 → 60% discrepancy
        section = RecommendationAccuracySection(
            total_evaluated=2,
            act_count=2,
            skip_count=0,
            acted_win_rate=0.8,
            avg_confidence_acted=0.7,
            avg_confidence_skipped=0.0,
        )
        outcomes = [
            {"profitable": True},
            {"profitable": False},
        ]
        warnings = validate_recommendation_accuracy(section, outcomes)
        assert len(warnings) == 1
        assert warnings[0].field_name == "acted_win_rate"

    def test_no_outcomes_returns_empty(self) -> None:
        """When there are no prediction outcomes, validation is skipped."""
        section = RecommendationAccuracySection(
            total_evaluated=5,
            act_count=3,
            skip_count=2,
            acted_win_rate=0.6,
            avg_confidence_acted=0.7,
            avg_confidence_skipped=0.4,
        )
        warnings = validate_recommendation_accuracy(section, [])
        assert warnings == []

    def test_all_profitable_matching(self) -> None:
        """All outcomes profitable and section says 1.0 → no warning."""
        section = RecommendationAccuracySection(
            total_evaluated=3,
            act_count=3,
            skip_count=0,
            acted_win_rate=1.0,
            avg_confidence_acted=0.9,
            avg_confidence_skipped=0.0,
        )
        outcomes = [
            {"profitable": True},
            {"profitable": True},
            {"profitable": True},
        ]
        warnings = validate_recommendation_accuracy(section, outcomes)
        assert warnings == []


# ═══════════════════════════════════════════════════════════════════════
|
||||
# 3. validate_model_quality
|
||||
# Requirements validated: 4.2, 4.3
|
||||
# ═══════════════════════════════════════════════════════════════════════
|
||||
|
||||
|
||||
class TestValidateModelQuality:
|
||||
"""Tests for validate_model_quality."""
|
||||
|
||||
def test_matching_data_no_warnings(self) -> None:
|
||||
"""When section metrics match snapshots, no warnings are produced."""
|
||||
section = ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=0.65,
|
||||
directional_accuracy=0.62,
|
||||
information_coefficient=0.08,
|
||||
calibration_error=0.12,
|
||||
brier_score=0.22,
|
||||
),
|
||||
],
|
||||
)
|
||||
snapshots = [
|
||||
{
|
||||
"lookback_window": "7d",
|
||||
"win_rate": 0.65,
|
||||
"directional_accuracy": 0.62,
|
||||
"information_coefficient": 0.08,
|
||||
"calibration_error": 0.12,
|
||||
"brier_score": 0.22,
|
||||
},
|
||||
]
|
||||
warnings = validate_model_quality(section, snapshots)
|
||||
assert warnings == []
|
||||
|
||||
def test_discrepancy_triggers_warnings(self) -> None:
|
||||
"""When section metrics differ >5% from snapshots, warnings are raised."""
|
||||
section = ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=0.80, # snapshot says 0.65 → ~23% off
|
||||
directional_accuracy=0.62,
|
||||
information_coefficient=0.08,
|
||||
calibration_error=0.12,
|
||||
brier_score=0.22,
|
||||
),
|
||||
],
|
||||
)
|
||||
snapshots = [
|
||||
{
|
||||
"lookback_window": "7d",
|
||||
"win_rate": 0.65,
|
||||
"directional_accuracy": 0.62,
|
||||
"information_coefficient": 0.08,
|
||||
"calibration_error": 0.12,
|
||||
"brier_score": 0.22,
|
||||
},
|
||||
]
|
||||
warnings = validate_model_quality(section, snapshots)
|
||||
assert len(warnings) == 1
|
||||
assert warnings[0].field_name == "7d_win_rate"
|
||||
|
||||
def test_null_snapshot_value_skipped(self) -> None:
|
||||
"""When a snapshot metric is NULL (None), that metric is skipped."""
|
||||
section = ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=0.65,
|
||||
directional_accuracy=0.62,
|
||||
information_coefficient=0.08,
|
||||
calibration_error=0.12,
|
||||
brier_score=0.22,
|
||||
),
|
||||
],
|
||||
)
|
||||
snapshots = [
|
||||
{
|
||||
"lookback_window": "7d",
|
||||
"win_rate": None, # NULL → skip
|
||||
"directional_accuracy": None,
|
||||
"information_coefficient": None,
|
||||
"calibration_error": None,
|
||||
"brier_score": None,
|
||||
},
|
||||
]
|
||||
warnings = validate_model_quality(section, snapshots)
|
||||
assert warnings == []
|
||||
|
||||
def test_no_snapshots_returns_empty(self) -> None:
|
||||
"""When there are no metric snapshots, validation is skipped."""
|
||||
section = ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=0.65,
|
||||
directional_accuracy=0.62,
|
||||
information_coefficient=0.08,
|
||||
calibration_error=0.12,
|
||||
brier_score=0.22,
|
||||
),
|
||||
],
|
||||
)
|
||||
warnings = validate_model_quality(section, [])
|
||||
assert warnings == []
|
||||
|
||||
def test_multiple_windows_validated(self) -> None:
|
||||
"""Validation runs across all lookback windows."""
|
||||
section = ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=0.65,
|
||||
directional_accuracy=0.62,
|
||||
information_coefficient=0.08,
|
||||
calibration_error=0.12,
|
||||
brier_score=0.22,
|
||||
),
|
||||
ModelQualityWindow(
|
||||
lookback="30d",
|
||||
win_rate=0.90, # snapshot says 0.60 → 50% off
|
||||
directional_accuracy=0.58,
|
||||
information_coefficient=0.06,
|
||||
calibration_error=0.15,
|
||||
brier_score=0.25,
|
||||
),
|
||||
],
|
||||
)
|
||||
snapshots = [
|
||||
{
|
||||
"lookback_window": "7d",
|
||||
"win_rate": 0.65,
|
||||
"directional_accuracy": 0.62,
|
||||
"information_coefficient": 0.08,
|
||||
"calibration_error": 0.12,
|
||||
"brier_score": 0.22,
|
||||
},
|
||||
{
|
||||
"lookback_window": "30d",
|
||||
"win_rate": 0.60,
|
||||
"directional_accuracy": 0.58,
|
||||
"information_coefficient": 0.06,
|
||||
"calibration_error": 0.15,
|
||||
"brier_score": 0.25,
|
||||
},
|
||||
]
|
||||
warnings = validate_model_quality(section, snapshots)
|
||||
# Only 30d_win_rate should be flagged
|
||||
assert len(warnings) == 1
|
||||
assert warnings[0].field_name == "30d_win_rate"
|
||||
|
||||
def test_null_section_value_skipped(self) -> None:
|
||||
"""When a section metric is None, that metric is skipped."""
|
||||
section = ModelQualitySection(
|
||||
windows=[
|
||||
ModelQualityWindow(
|
||||
lookback="7d",
|
||||
win_rate=None,
|
||||
directional_accuracy=None,
|
||||
information_coefficient=None,
|
||||
calibration_error=None,
|
||||
brier_score=None,
|
||||
),
|
||||
],
|
||||
)
|
||||
snapshots = [
|
||||
{
|
||||
"lookback_window": "7d",
|
||||
"win_rate": 0.65,
|
||||
"directional_accuracy": 0.62,
|
||||
"information_coefficient": 0.08,
|
||||
"calibration_error": 0.12,
|
||||
"brier_score": 0.22,
|
||||
},
|
||||
]
|
||||
warnings = validate_model_quality(section, snapshots)
|
||||
assert warnings == []
|

    def test_no_matching_window_in_snapshots(self) -> None:
        """When section has a window not in snapshots, it is skipped."""
        section = ModelQualitySection(
            windows=[
                ModelQualityWindow(
                    lookback="90d",
                    win_rate=0.55,
                    directional_accuracy=0.53,
                    information_coefficient=0.04,
                    calibration_error=0.18,
                    brier_score=0.28,
                ),
            ],
        )
        snapshots = [
            {
                "lookback_window": "7d",
                "win_rate": 0.65,
                "directional_accuracy": 0.62,
                "information_coefficient": 0.08,
                "calibration_error": 0.12,
                "brier_score": 0.22,
            },
        ]
        warnings = validate_model_quality(section, snapshots)
        assert warnings == []


# ═══════════════════════════════════════════════════════════════════════
# 4. compute_validation_status
# Requirements validated: 4.4
# ═══════════════════════════════════════════════════════════════════════


class TestComputeValidationStatus:
    """Tests for compute_validation_status."""

    def test_no_warnings_returns_passed(self) -> None:
        """When no sections have warnings, status is PASSED."""
        report = _make_report()
        status = compute_validation_status(report)
        assert status == ValidationStatus.PASSED

    def test_pnl_warnings_returns_warnings(self) -> None:
        """When P&L section has warnings, status is WARNINGS."""
        from services.reporting.models import ValidationWarning

        report = _make_report(
            pnl=PLSection(
                realized_pnl=0.0,
                unrealized_pnl=0.0,
                daily_return=0.0,
                cumulative_return=0.0,
                win_count=0,
                loss_count=0,
                win_rate=0.0,
                profit_factor=0.0,
                sharpe_ratio=0.0,
                validation_warnings=[
                    ValidationWarning(
                        field_name="test",
                        computed_value=1.0,
                        snapshot_value=0.5,
                        pct_difference=100.0,
                    ),
                ],
            ),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS

    def test_recommendation_accuracy_warnings_returns_warnings(self) -> None:
        """When recommendation accuracy section has warnings, status is WARNINGS."""
        from services.reporting.models import ValidationWarning

        report = _make_report(
            recommendation_accuracy=RecommendationAccuracySection(
                total_evaluated=0,
                act_count=0,
                skip_count=0,
                acted_win_rate=0.0,
                avg_confidence_acted=0.0,
                avg_confidence_skipped=0.0,
                validation_warnings=[
                    ValidationWarning(
                        field_name="acted_win_rate",
                        computed_value=0.8,
                        snapshot_value=0.5,
                        pct_difference=60.0,
                    ),
                ],
            ),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS

    def test_model_quality_warnings_returns_warnings(self) -> None:
        """When model quality section has warnings, status is WARNINGS."""
        from services.reporting.models import ValidationWarning

        report = _make_report(
            model_quality=ModelQualitySection(
                validation_warnings=[
                    ValidationWarning(
                        field_name="7d_win_rate",
                        computed_value=0.9,
                        snapshot_value=0.65,
                        pct_difference=38.46,
                    ),
                ],
            ),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS

    def test_multiple_sections_with_warnings(self) -> None:
        """When multiple sections have warnings, status is still WARNINGS."""
        from services.reporting.models import ValidationWarning

        w = ValidationWarning(
            field_name="x",
            computed_value=1.0,
            snapshot_value=0.0,
            pct_difference=100.0,
        )
        report = _make_report(
            pnl=PLSection(
                realized_pnl=0.0,
                unrealized_pnl=0.0,
                daily_return=0.0,
                cumulative_return=0.0,
                win_count=0,
                loss_count=0,
                win_rate=0.0,
                profit_factor=0.0,
                sharpe_ratio=0.0,
                validation_warnings=[w],
            ),
            model_quality=ModelQualitySection(validation_warnings=[w]),
        )
        status = compute_validation_status(report)
        assert status == ValidationStatus.WARNINGS
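
The tests above pin down behavior, not implementation, so the following is only a minimal sketch of logic that would satisfy them. The dataclass fields and the `"<lookback>_<metric>"` field-name format are taken from the tests. Everything else is an assumption: the 10% tolerance (`TOLERANCE_PCT`), the decision to skip zero snapshot values, and the hypothetical helper `status_from_sections`, which takes a list of sections instead of the real report object. The actual code in `services.reporting` is not shown in this diff and may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

# Metric names as they appear in the tests above.
METRICS = (
    "win_rate",
    "directional_accuracy",
    "information_coefficient",
    "calibration_error",
    "brier_score",
)
TOLERANCE_PCT = 10.0  # ASSUMPTION: the real threshold is not visible in this diff


@dataclass
class ValidationWarning:
    field_name: str
    computed_value: float
    snapshot_value: float
    pct_difference: float


@dataclass
class ModelQualityWindow:
    lookback: str
    win_rate: Optional[float] = None
    directional_accuracy: Optional[float] = None
    information_coefficient: Optional[float] = None
    calibration_error: Optional[float] = None
    brier_score: Optional[float] = None


@dataclass
class ModelQualitySection:
    windows: list[ModelQualityWindow] = field(default_factory=list)
    validation_warnings: list[ValidationWarning] = field(default_factory=list)


class ValidationStatus(Enum):
    PASSED = "passed"
    WARNINGS = "warnings"


def validate_model_quality(section, snapshots):
    """Compare each window's metrics with the snapshot for the same lookback."""
    by_window = {snap["lookback_window"]: snap for snap in snapshots}
    warnings = []
    for window in section.windows:
        snapshot = by_window.get(window.lookback)
        if snapshot is None:
            continue  # no snapshot for this window, so nothing to compare
        for metric in METRICS:
            computed = getattr(window, metric)
            expected = snapshot.get(metric)
            if computed is None or not expected:
                continue  # skip missing values and (ASSUMPTION) zero baselines
            pct = abs(computed - expected) / abs(expected) * 100.0
            if pct > TOLERANCE_PCT:
                warnings.append(
                    ValidationWarning(
                        field_name=f"{window.lookback}_{metric}",
                        computed_value=computed,
                        snapshot_value=expected,
                        pct_difference=round(pct, 2),
                    )
                )
    return warnings


def status_from_sections(sections):
    """Hypothetical helper: WARNINGS if any section carries a warning."""
    has_warnings = any(s.validation_warnings for s in sections)
    return ValidationStatus.WARNINGS if has_warnings else ValidationStatus.PASSED
```

Under these assumptions a window with `win_rate=0.90` against a snapshot of `0.60` is flagged as `30d_win_rate`, while windows with `None` metrics or without a matching snapshot produce no warnings, which mirrors the three model-quality tests. Reducing the status to "any section has warnings" keeps the result the same whether one section or several are affected.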