Celes Renata c85c0068a2 fix: clean up utcnow deprecation warnings, fix 12 failing tests, add CI/CD pipeline manifests
- Replace all datetime.utcnow() with datetime.now(tz=timezone.utc) across 8 files
- Fix 12 failing tests to match current implementation behavior
- Fix pytest_plugins in non-top-level conftest (moved to root conftest.py)
- Auto-fix 189 lint issues (import sorting, unused imports)
- Add CI/CD pipeline infrastructure (ARC, ArgoCD, Kargo manifests)
- Add values-beta.yaml and values-paper.yaml for staged deployments
- Update GitHub Actions workflow to use self-hosted-gremlin runners
- Add integration-test job to CI pipeline

Result: 1596 passed, 0 failed, 0 warnings
2026-04-18 03:59:28 +00:00

# Implementation Plan: CI/CD Pipeline
## Overview
Build a full CI/CD pipeline for Stonks Oracle using ARC (self-hosted GitHub Actions runners), ArgoCD (GitOps deployment), and Kargo (staged promotion orchestration) on the Gremlin cluster. Pipeline infrastructure scripts go in `~/sources/kube/pipelines/` on gremlin-1. Helm values files and the updated GitHub Actions workflow go in the stonks-oracle repo.
## Tasks
- [x] 1. Create NFS PersistentVolume manifests
- [x] 1.1 Create `~/sources/kube/pipelines/pvs/argocd-pv.yaml` — NFS PV for ArgoCD (5Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/argocd`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-argocd`)
- _Requirements: 1.2, 17.1, 17.4_
- [x] 1.2 Create `~/sources/kube/pipelines/pvs/kargo-pv.yaml` — NFS PV for Kargo (2Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/kargo`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-kargo`)
- _Requirements: 1.2, 17.1, 17.4_
- [x] 1.3 Create `~/sources/kube/pipelines/pvs/arc-pv.yaml` — NFS PV for ARC (2Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/arc`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-arc`)
- _Requirements: 1.2, 17.1, 17.4_
- [x] 2. Create ARC (Actions Runner Controller) manifests
- [x] 2.1 Create `~/sources/kube/pipelines/arc/values.yaml` — Helm values for the ARC controller chart (`oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller`), namespace `arc-system`
- _Requirements: 1.1, 4.1_
- [x] 2.2 Create `~/sources/kube/pipelines/arc/runner-scaleset.yaml` — RunnerScaleSet CR for `celesrenata/stonks-oracle` repo with label `self-hosted-gremlin`, `containerMode.type: kubernetes`, ephemeral pods, 2 CPU / 4Gi memory limits
- _Requirements: 4.1, 4.2, 4.3, 4.4_
- [x] 3. Create ArgoCD manifests
- [x] 3.1 Create `~/sources/kube/pipelines/argocd/values.yaml` — Helm values for `argo/argo-cd` chart in `argocd` namespace, with Traefik ingress at `stonks-argocd.celestium.life`, TLS via `ca-issuer`, NFS PVC for persistence
- _Requirements: 1.1, 1.3, 18.1_
- [x] 3.2 Create `~/sources/kube/pipelines/argocd/repo-secret.yaml` — Kubernetes Secret with Git credentials for the `celesrenata/stonks-oracle` repository, namespace `argocd`
- _Requirements: 18.1_
- [x] 3.3 Create `~/sources/kube/pipelines/argocd/apps/stonks-beta.yaml` — ArgoCD Application for beta stage, pointing at `infra/helm/stonks-oracle/` with `values-beta.yaml`, target namespace `stonks-beta`, auto-sync with prune and selfHeal
- _Requirements: 8.1, 8.4, 18.2, 18.3_
- [x] 3.4 Create `~/sources/kube/pipelines/argocd/apps/stonks-paper.yaml` — ArgoCD Application for paper stage, pointing at `infra/helm/stonks-oracle/` with `values-paper.yaml`, target namespace `stonks-paper`, auto-sync with prune and selfHeal
- _Requirements: 9.1, 9.5, 18.2, 18.3_
- [x] 3.5 Create `~/sources/kube/pipelines/argocd/apps/stonks-live.yaml` — ArgoCD Application for live stage, pointing at `infra/helm/stonks-oracle/` with `values.yaml`, target namespace `stonks-oracle`, auto-sync with prune and selfHeal
- _Requirements: 10.2, 10.5, 18.2, 18.3_
- [x] 4. Checkpoint — Verify ArgoCD and ARC manifests
- Ensure all YAML manifests are syntactically valid. Review that each ArgoCD Application points at the correct chart path, values file, and target namespace. Ask the user if questions arise.
- [x] 5. Create Kargo manifests
- [x] 5.1 Create `~/sources/kube/pipelines/kargo/values.yaml` — Helm values for `oci://ghcr.io/akuity/kargo-charts/kargo` in `kargo` namespace, with Traefik ingress at `stonks-kargo.celestium.life`, TLS via `ca-issuer`, NFS PVC for persistence
- _Requirements: 1.1, 1.4, 16.6_
- [x] 5.2 Create `~/sources/kube/pipelines/kargo/project.yaml` — Kargo Project resource `stonks-oracle` in `stonks-oracle` namespace
- _Requirements: 8.2, 14.1_
- [x] 5.3 Create `~/sources/kube/pipelines/kargo/warehouse.yaml` — Kargo Warehouse `stonks-images` watching `ghcr.io/celesrenata/stonks-oracle/query-api` for new image tags
- _Requirements: 6.5, 14.1_
- [x] 5.4 Create `~/sources/kube/pipelines/kargo/stages/beta.yaml` — Kargo Stage for beta with auto-promotion enabled, promotion template that updates `image.tag` in the `stonks-beta` ArgoCD Application
- _Requirements: 8.1, 8.3, 13.1_
- [x] 5.5 Create `~/sources/kube/pipelines/kargo/stages/paper.yaml` — Kargo Stage for paper with manual promotion, market-hours verification step (AnalysisTemplate), promotion template that updates `image.tag` in the `stonks-paper` ArgoCD Application
- _Requirements: 9.1, 9.3, 9.4, 11.1, 11.2, 13.1_
- [x] 5.6 Create `~/sources/kube/pipelines/kargo/stages/live.yaml` — Kargo Stage for live with manual approval + required notes, market-hours verification step, promotion template that updates `image.tag` in the `stonks-live` ArgoCD Application
- _Requirements: 10.1, 10.3, 10.4, 11.1, 11.2, 12.1, 12.3, 13.1_
- [x] 5.7 Create `~/sources/kube/pipelines/kargo/project-config.yaml` — Kargo ProjectConfig with per-stage `autoPromotionEnabled` settings (beta: true, paper: false, live: false)
- _Requirements: 13.1, 13.2, 13.3_
- [x] 6. Create market-hours AnalysisTemplate
- [x] 6.1 Create the AnalysisTemplate manifest for market-hours verification. It runs an Alpine container that checks Eastern Time (09:30-16:00 ET, Mon-Fri), exiting 0 outside market hours and 1 during market hours. It uses the `America/New_York` timezone for DST correctness. Place it in the `~/sources/kube/pipelines/kargo/` directory.
- _Requirements: 11.1, 11.2, 11.4_
- [x] 7. Checkpoint — Verify Kargo manifests and promotion DAG
- Ensure Kargo stages form the correct linear DAG: beta → paper → live. Verify market-hours AnalysisTemplate is referenced by paper and live stages. Ensure all YAML is syntactically valid. Ask the user if questions arise.
- [x] 8. Create Helm values files for beta and paper stages (in stonks-oracle repo)
- [x] 8.1 Create `infra/helm/stonks-oracle/values-beta.yaml` — lighter resources, `BROKER_MODE: mock`, `BROKER_PROVIDER: mock`, `LOG_LEVEL: DEBUG`, `TRADING_ENABLED: false`, single replicas per service
- _Requirements: 8.4, 9.2_
- [x] 8.2 Create `infra/helm/stonks-oracle/values-paper.yaml` — paper broker config, `BROKER_MODE: paper`, `BROKER_PROVIDER: alpaca`, `BROKER_BASE_URL: https://paper-api.alpaca.markets`, `LOG_LEVEL: INFO`, `TRADING_ENABLED: true`
- _Requirements: 9.2, 9.5_
- [x] 9. Update GitHub Actions workflow (in stonks-oracle repo)
- [x] 9.1 Update `.github/workflows/build.yml` — change `runs-on: ubuntu-latest` to `runs-on: self-hosted-gremlin` on all jobs (`lint-and-test`, `build-services`, `build-dashboard`, `build-superset`)
- _Requirements: 5.1, 4.2_
- [x] 9.2 Add `integration-test` job to `.github/workflows/build.yml` — depends on `build-services` and `build-dashboard`, runs only on push to main, invokes `bash infra/inttest/run_pipeline.sh --image-tag ${{ github.sha }} --results-file inttest-results.json`, uploads `inttest-results.json` as a build artifact via `actions/upload-artifact@v4`
- _Requirements: 7.1, 7.2, 7.3, 7.4, 7.5_
- [x] 10. Checkpoint — Verify workflow and values files
- Ensure the updated workflow YAML is syntactically valid. Verify the integration-test job has correct `needs`, `if` condition, and artifact upload. Confirm values-beta.yaml and values-paper.yaml are valid Helm values. Ask the user if questions arise.
- [x] 11. Create install and teardown scripts
- [x] 11.1 Create `~/sources/kube/pipelines/runmefirst.sh` — full install script: create namespaces (`arc-system`, `argocd`, `kargo`, `stonks-beta`, `stonks-paper`), apply PVs, install ARC controller via Helm, apply runner scaleset, install ArgoCD via Helm with values, apply repo secret + ArgoCD Applications, install Kargo via Helm with values, apply Kargo project + warehouse + stages. Use `set -euo pipefail`, idempotent namespace creation via `--dry-run=client -o yaml | kubectl apply -f -`
- _Requirements: 1.1, 1.2, 1.5, 3.1_
- [x] 11.2 Create `~/sources/kube/pipelines/runmelast.sh` as the teardown script: delete Kargo resources (stages, warehouse, project-config, project), uninstall Kargo Helm release, delete ArgoCD resources (apps, repo-secret), uninstall ArgoCD Helm release, delete ARC resources (runner-scaleset), uninstall ARC Helm release, delete namespaces (`arc-system`, `argocd`, `kargo`). Preserve PVs, NFS data, and the `stonks-oracle`, `stonks-beta`, and `stonks-paper` namespaces. Use `--ignore-not-found` and `|| true` for idempotency.
- _Requirements: 2.1, 2.2, 2.3, 2.4, 3.2, 3.3, 17.2_
- [x] 12. Final checkpoint — Review all artifacts
- Ensure all files are created in the correct locations: pipeline scripts in `~/sources/kube/pipelines/`, Helm values and workflow changes in the stonks-oracle repo. Verify that the install order in `runmefirst.sh` matches the design (PVs → ARC → ArgoCD → Kargo) and that the teardown order in `runmelast.sh` is the reverse (Kargo → ArgoCD → ARC). Ensure all tests pass. Ask the user if questions arise.
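
The market-hours gate from task 6.1 could be sketched roughly as follows. This is a minimal POSIX `sh` sketch, not the script shipped in the AnalysisTemplate: the function names `in_market_hours` and `market_hours_gate` are hypothetical, the 16:00 boundary is assumed to be exclusive, and the container image is assumed to have zoneinfo data installed (on Alpine this typically means adding the `tzdata` package) so that `TZ=America/New_York` resolves correctly.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 (true) when the given New York local time
# falls inside regular market hours, i.e. Mon-Fri from 09:30 up to (but not
# including, by assumption) 16:00.
# Args: $1 = ISO weekday (1=Mon .. 7=Sun), $2 = hour "HH", $3 = minute "MM".
in_market_hours() {
    dow=$1
    # Strip one leading zero so "09" is never read as an octal literal.
    hour=${2#0}; hour=${hour:-0}
    minute=${3#0}; minute=${minute:-0}
    mins=$((hour * 60 + minute))
    # 570 = 09:30 and 960 = 16:00, both in minutes since midnight.
    [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] \
        && [ "$mins" -ge 570 ] && [ "$mins" -lt 960 ]
}

# Hypothetical gate: returns 1 during market hours and 0 otherwise, matching
# the exit-code contract described in task 6.1.
market_hours_gate() {
    # Read weekday, hour, and minute in Eastern Time (DST-aware via zoneinfo).
    set -- $(TZ=America/New_York date '+%u %H %M')
    if in_market_hours "$1" "$2" "$3"; then
        echo "blocked: inside market hours" >&2
        return 1
    fi
    echo "ok: outside market hours"
    return 0
}
```

Used as a container command, the script would end with `market_hours_gate; exit $?`. Keeping the time comparison in a separate function that takes explicit arguments makes the boundary cases (09:29, 09:30, 15:59, 16:00, weekends) checkable without waiting for the clock.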
## Notes
- Pipeline infrastructure scripts (`~/sources/kube/pipelines/`) are created on gremlin-1, separate from the stonks-oracle repo
- Helm values files (`values-beta.yaml`, `values-paper.yaml`) and the GitHub Actions workflow update are in the stonks-oracle repo
- No property-based tests — this feature is entirely IaC (shell scripts, YAML manifests, Helm values)
- The existing `values.yaml` (production) is not modified — live stage uses it as-is
- PVs use `persistentVolumeReclaimPolicy: Retain` so NFS data survives teardowns
- Break-glass is Kargo's built-in manual approval, so no custom code is needed (Requirements 12.1-12.5)
- Audit trail is provided by Kargo's native promotion history (Requirements 15.1-15.4)
- Kargo Dashboard features (stage display, promotion controls, block indicators) are provided by the Kargo chart out of the box (Requirements 14.1-14.3, 16.1-16.5)
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation between major phases
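
To illustrate the `Retain` note together with the PV parameters listed in task 1, the three NFS PV manifests could be rendered from one small shell function. This is only a sketch under stated assumptions: `render_nfs_pv` and the PV name in the usage line are hypothetical, the `ReadWriteMany` access mode is a guess (the plan does not state one), and the server address and export path are taken from the NFS locations given in task 1.

```shell
#!/bin/sh
# Hypothetical generator: prints an NFS PersistentVolume manifest to stdout.
# Args: $1 = PV name, $2 = capacity (e.g. 5Gi), $3 = NFS export path,
#       $4 = value for the `app` label.
# NFS_SERVER may be overridden; the default is the address used in task 1.
render_nfs_pv() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
  labels:
    app: $4
spec:
  capacity:
    storage: $2
  accessModes:
    - ReadWriteMany  # assumption: access mode is not specified in the plan
  persistentVolumeReclaimPolicy: Retain  # keeps NFS data across teardowns
  nfs:
    server: ${NFS_SERVER:-192.168.42.8}
    path: $3
EOF
}
```

For example, `render_nfs_pv pipeline-argocd-pv 5Gi /volume1/Kubernetes/pipelines/argocd pipeline-argocd > argocd-pv.yaml` would produce a manifest along the lines of task 1.1. Generating all three files from one function keeps the reclaim policy and label convention consistent, though hand-written manifests work equally well.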