Implementation Plan: CI/CD Pipeline
Overview
Build a full CI/CD pipeline for Stonks Oracle using ARC (self-hosted GitHub Actions runners), ArgoCD (GitOps deployment), and Kargo (staged promotion orchestration) on the Gremlin cluster. Pipeline infrastructure scripts go in ~/sources/kube/pipelines/ on gremlin-1. Helm values files and the updated GitHub Actions workflow go in the stonks-oracle repo.
Tasks
- 1. Create NFS PersistentVolume manifests
  - 1.1 Create `~/sources/kube/pipelines/pvs/argocd-pv.yaml`: NFS PV for ArgoCD (5Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/argocd`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-argocd`)
    - Requirements: 1.2, 17.1, 17.4
  - 1.2 Create `~/sources/kube/pipelines/pvs/kargo-pv.yaml`: NFS PV for Kargo (2Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/kargo`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-kargo`)
    - Requirements: 1.2, 17.1, 17.4
  - 1.3 Create `~/sources/kube/pipelines/pvs/arc-pv.yaml`: NFS PV for ARC (2Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/arc`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-arc`)
    - Requirements: 1.2, 17.1, 17.4
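As a rough illustration of task 1.1, the ArgoCD PV could look like the sketch below. The capacity, NFS server and path, reclaim policy, and label come from the plan. The resource name, access mode, and empty storage class are assumptions added for the example.

```yaml
# Sketch of pvs/argocd-pv.yaml. Fields marked "assumed" are not from the plan.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pipeline-argocd-pv          # assumed name
  labels:
    app: pipeline-argocd            # lets a PVC select this PV by label
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany                 # assumed; NFS volumes support RWX
  persistentVolumeReclaimPolicy: Retain   # NFS data is kept when the claim is deleted
  storageClassName: ""              # assumed: static binding, no dynamic provisioner
  nfs:
    server: 192.168.42.8
    path: /volume1/Kubernetes/pipelines/argocd
```

A PVC can then bind to this volume with `selector.matchLabels.app: pipeline-argocd`. The Kargo and ARC PVs would differ only in name, label, size (2Gi), and NFS path.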
- 2. Create ARC (Actions Runner Controller) manifests
  - 2.1 Create `~/sources/kube/pipelines/arc/values.yaml`: Helm values for the ARC controller chart (`oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller`), namespace `arc-system`
    - Requirements: 1.1, 4.1
  - 2.2 Create `~/sources/kube/pipelines/arc/runner-scaleset.yaml`: RunnerScaleSet CR for the `celesrenata/stonks-oracle` repo with label `self-hosted-gremlin`, `containerMode.type: kubernetes`, ephemeral pods, 2 CPU / 4Gi memory limits
    - Requirements: 4.1, 4.2, 4.3, 4.4
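The plan describes task 2.2 as a RunnerScaleSet CR. One common way to produce that resource is to install the companion `gha-runner-scale-set` Helm chart, so the sketch below is written as values for that chart. Treat this as an assumption about the approach, not as the plan's chosen layout. The secret name, runner counts, work volume settings, and runner image tag are also assumed.

```yaml
# Hypothetical values for the gha-runner-scale-set chart (assumed approach).
githubConfigUrl: https://github.com/celesrenata/stonks-oracle
githubConfigSecret: arc-github-secret     # assumed: pre-created Secret holding GitHub credentials
runnerScaleSetName: self-hosted-gremlin   # the name workflows use in runs-on
minRunners: 0                             # assumed: scale to zero when idle
maxRunners: 2                             # assumed
containerMode:
  type: kubernetes
  kubernetesModeWorkVolumeClaim:          # kubernetes mode needs a work volume per runner
    accessModes: ["ReadWriteOnce"]
    storageClassName: local-path          # assumed storage class
    resources:
      requests:
        storage: 1Gi                      # assumed size
template:
  spec:
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
```

With scale sets, runner pods are ephemeral by design: each pod takes one job and is then removed.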
- 3. Create ArgoCD manifests
  - 3.1 Create `~/sources/kube/pipelines/argocd/values.yaml`: Helm values for the `argo/argo-cd` chart in the `argocd` namespace, with Traefik ingress at `stonks-argocd.celestium.life`, TLS via `ca-issuer`, NFS PVC for persistence
    - Requirements: 1.1, 1.3, 18.1
  - 3.2 Create `~/sources/kube/pipelines/argocd/repo-secret.yaml`: Kubernetes Secret with Git credentials for the `celesrenata/stonks-oracle` repository, namespace `argocd`
    - Requirements: 18.1
  - 3.3 Create `~/sources/kube/pipelines/argocd/apps/stonks-beta.yaml`: ArgoCD Application for beta stage, pointing at `infra/helm/stonks-oracle/` with `values-beta.yaml`, target namespace `stonks-beta`, auto-sync with prune and selfHeal
    - Requirements: 8.1, 8.4, 18.2, 18.3
  - 3.4 Create `~/sources/kube/pipelines/argocd/apps/stonks-paper.yaml`: ArgoCD Application for paper stage, pointing at `infra/helm/stonks-oracle/` with `values-paper.yaml`, target namespace `stonks-paper`, auto-sync with prune and selfHeal
    - Requirements: 9.1, 9.5, 18.2, 18.3
  - 3.5 Create `~/sources/kube/pipelines/argocd/apps/stonks-live.yaml`: ArgoCD Application for live stage, pointing at `infra/helm/stonks-oracle/` with `values.yaml`, target namespace `stonks-oracle`, auto-sync with prune and selfHeal
    - Requirements: 10.2, 10.5, 18.2, 18.3
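A minimal sketch of the beta Application from task 3.3 follows. The chart path, values file, target namespace, and sync policy come from the plan. The repository URL form, branch, ArgoCD project, and destination server are assumptions.

```yaml
# Sketch of argocd/apps/stonks-beta.yaml. Fields marked "assumed" are not from the plan.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: stonks-beta
  namespace: argocd
  # Kargo usually needs an annotation here authorizing a Stage to update this
  # Application, e.g. kargo.akuity.io/authorized-stage: <project>:<stage>
spec:
  project: default                                   # assumed ArgoCD project
  source:
    repoURL: https://github.com/celesrenata/stonks-oracle.git   # assumed URL form
    targetRevision: main                             # assumed branch
    path: infra/helm/stonks-oracle
    helm:
      valueFiles:
        - values-beta.yaml                           # resolved relative to the chart path
  destination:
    server: https://kubernetes.default.svc           # assumed: same cluster as ArgoCD
    namespace: stonks-beta
  syncPolicy:
    automated:
      prune: true                                    # delete resources removed from Git
      selfHeal: true                                 # revert manual drift in the cluster
```

The paper and live Applications would change only `metadata.name`, the values file (`values-paper.yaml`, `values.yaml`), and the destination namespace (`stonks-paper`, `stonks-oracle`).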
- 4. Checkpoint: Verify ArgoCD and ARC manifests
  - Ensure all YAML manifests are syntactically valid. Review that each ArgoCD Application points at the correct chart path, values file, and target namespace. Ask the user if questions arise.
- 5. Create Kargo manifests
  - 5.1 Create `~/sources/kube/pipelines/kargo/values.yaml`: Helm values for `oci://ghcr.io/akuity/kargo-charts/kargo` in the `kargo` namespace, with Traefik ingress at `stonks-kargo.celestium.life`, TLS via `ca-issuer`, NFS PVC for persistence
    - Requirements: 1.1, 1.4, 16.6
  - 5.2 Create `~/sources/kube/pipelines/kargo/project.yaml`: Kargo Project resource `stonks-oracle` in the `stonks-oracle` namespace
    - Requirements: 8.2, 14.1
  - 5.3 Create `~/sources/kube/pipelines/kargo/warehouse.yaml`: Kargo Warehouse `stonks-images` watching `ghcr.io/celesrenata/stonks-oracle/query-api` for new image tags
    - Requirements: 6.5, 14.1
  - 5.4 Create `~/sources/kube/pipelines/kargo/stages/beta.yaml`: Kargo Stage for beta with auto-promotion enabled, promotion template that updates `image.tag` in the `stonks-beta` ArgoCD Application
    - Requirements: 8.1, 8.3, 13.1
  - 5.5 Create `~/sources/kube/pipelines/kargo/stages/paper.yaml`: Kargo Stage for paper with manual promotion, market-hours verification step (AnalysisTemplate), promotion template that updates `image.tag` in the `stonks-paper` ArgoCD Application
    - Requirements: 9.1, 9.3, 9.4, 11.1, 11.2, 13.1
  - 5.6 Create `~/sources/kube/pipelines/kargo/stages/live.yaml`: Kargo Stage for live with manual approval + required notes, market-hours verification step, promotion template that updates `image.tag` in the `stonks-live` ArgoCD Application
    - Requirements: 10.1, 10.3, 10.4, 11.1, 11.2, 12.1, 12.3, 13.1
  - 5.7 Create `~/sources/kube/pipelines/kargo/project-config.yaml`: Kargo ProjectConfig with per-stage `autoPromotionEnabled` settings (beta: true, paper: false, live: false)
    - Requirements: 13.1, 13.2, 13.3
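To make tasks 5.3 and 5.7 more concrete, here is a hedged sketch of the Warehouse and ProjectConfig. The resource names, image repository, and per-stage settings come from the plan. The field layout follows recent Kargo releases as far as I know and may differ in older versions, so check it against the installed CRDs.

```yaml
# Sketch of kargo/warehouse.yaml: watches one image repository for new tags.
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
  name: stonks-images
  namespace: stonks-oracle          # the Kargo project namespace
spec:
  subscriptions:
    - image:
        repoURL: ghcr.io/celesrenata/stonks-oracle/query-api
---
# Sketch of kargo/project-config.yaml: only beta promotes automatically.
# The stageSelector form is assumed from newer Kargo versions.
apiVersion: kargo.akuity.io/v1alpha1
kind: ProjectConfig
metadata:
  name: stonks-oracle               # assumed to match the Project name
  namespace: stonks-oracle
spec:
  promotionPolicies:
    - stageSelector:
        name: beta
      autoPromotionEnabled: true
    - stageSelector:
        name: paper
      autoPromotionEnabled: false
    - stageSelector:
        name: live
      autoPromotionEnabled: false
```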
- 6. Create market-hours AnalysisTemplate
  - 6.1 Create the AnalysisTemplate manifest for market-hours verification: runs an Alpine container that checks Eastern Time (09:30–16:00 ET, Mon–Fri), exits 0 outside market hours, exits 1 during market hours. Uses the `America/New_York` timezone for DST correctness. Place in the `~/sources/kube/pipelines/kargo/` directory.
    - Requirements: 11.1, 11.2, 11.4
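The market-hours check in task 6.1 might be sketched as a job-based AnalysisTemplate like the one below. The Alpine image, time window, exit codes, and timezone come from the plan. The template name, namespace, image tag, treatment of 16:00 as closed, and the runtime `tzdata` install are assumptions. Market holidays are not handled.

```yaml
# Sketch of a market-hours AnalysisTemplate. Names and image tag are assumed.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: market-hours-check          # hypothetical name
  namespace: stonks-oracle          # assumed: the Kargo project namespace
spec:
  metrics:
    - name: outside-market-hours
      provider:
        job:
          spec:
            backoffLimit: 0         # a failed check should not be retried
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: check
                    image: alpine:3.20            # assumed tag
                    env:
                      - name: TZ
                        value: America/New_York   # DST handled by the zone database
                    command: ["/bin/sh", "-c"]
                    args:
                      - |
                        # Alpine ships without zone data; this needs network access.
                        apk add --no-cache tzdata >/dev/null
                        dow=$(date +%u)     # 1 = Monday ... 7 = Sunday
                        now=$(date +%H%M)   # local time, e.g. 0930
                        if [ "$dow" -le 5 ] && [ "$now" -ge 0930 ] && [ "$now" -lt 1600 ]; then
                          echo "market open: blocking promotion"
                          exit 1
                        fi
                        echo "market closed: ok"
                        exit 0
```

Baking `tzdata` into a small custom image would remove the network dependency at check time.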
- 7. Checkpoint: Verify Kargo manifests and promotion DAG
  - Ensure Kargo stages form the correct linear DAG: beta → paper → live. Verify the market-hours AnalysisTemplate is referenced by the paper and live stages. Ensure all YAML is syntactically valid. Ask the user if questions arise.
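The linear DAG this checkpoint verifies is expressed through each Stage's freight sources. The sketch below shows only that wiring and the verification reference, with promotion templates omitted. The stage names `beta`, `paper`, and `live` and the AnalysisTemplate name `market-hours-check` are assumptions for illustration.

```yaml
# beta takes freight directly from the Warehouse.
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: beta
  namespace: stonks-oracle
spec:
  requestedFreight:
    - origin:
        kind: Warehouse
        name: stonks-images
      sources:
        direct: true
---
# paper accepts only freight that has passed through beta.
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: paper
  namespace: stonks-oracle
spec:
  requestedFreight:
    - origin:
        kind: Warehouse
        name: stonks-images
      sources:
        stages: [beta]
  verification:
    analysisTemplates:
      - name: market-hours-check    # hypothetical template name
---
# live accepts only freight that has passed through paper.
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: live
  namespace: stonks-oracle
spec:
  requestedFreight:
    - origin:
        kind: Warehouse
        name: stonks-images
      sources:
        stages: [paper]
  verification:
    analysisTemplates:
      - name: market-hours-check    # hypothetical template name
```

Checking the DAG then reduces to confirming that each stage lists exactly its predecessor under `sources.stages`, and that only beta uses `direct: true`.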
- 8. Create Helm values files for beta and paper stages (in stonks-oracle repo)
  - 8.1 Create `infra/helm/stonks-oracle/values-beta.yaml`: lighter resources, `BROKER_MODE: mock`, `BROKER_PROVIDER: mock`, `LOG_LEVEL: DEBUG`, `TRADING_ENABLED: false`, single replicas per service
    - Requirements: 8.4, 9.2
  - 8.2 Create `infra/helm/stonks-oracle/values-paper.yaml`: paper broker config, `BROKER_MODE: paper`, `BROKER_PROVIDER: alpaca`, `BROKER_BASE_URL: https://paper-api.alpaca.markets`, `LOG_LEVEL: INFO`, `TRADING_ENABLED: true`
    - Requirements: 9.2, 9.5
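The plan does not show the chart's value keys, so the following is only a sketch of how the two overlays might be shaped. The `env`, `replicaCount`, and `resources` keys are hypothetical and would need to match whatever `values.yaml` in the chart actually defines. Only the setting names and values are taken from tasks 8.1 and 8.2.

```yaml
# values-beta.yaml (hypothetical key layout)
replicaCount: 1                     # single replica per service
resources:                          # "lighter resources"; the numbers are assumed
  requests:
    cpu: 100m
    memory: 128Mi
env:
  BROKER_MODE: mock
  BROKER_PROVIDER: mock
  LOG_LEVEL: DEBUG
  TRADING_ENABLED: "false"          # quoted so it stays a string when rendered as an env var
---
# values-paper.yaml (hypothetical key layout)
env:
  BROKER_MODE: paper
  BROKER_PROVIDER: alpaca
  BROKER_BASE_URL: https://paper-api.alpaca.markets
  LOG_LEVEL: INFO
  TRADING_ENABLED: "true"
```

Because Helm merges a values file over the chart defaults, each overlay only needs the keys that differ from production.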
- 9. Update GitHub Actions workflow (in stonks-oracle repo)
  - 9.1 Update `.github/workflows/build.yml`: change `runs-on: ubuntu-latest` to `runs-on: self-hosted-gremlin` on all jobs (`lint-and-test`, `build-services`, `build-dashboard`, `build-superset`)
    - Requirements: 5.1, 4.2
  - 9.2 Add `integration-test` job to `.github/workflows/build.yml`: depends on `build-services` and `build-dashboard`, runs only on push to main, invokes `bash infra/inttest/run_pipeline.sh --image-tag ${{ github.sha }} --results-file inttest-results.json`, uploads `inttest-results.json` as a build artifact via `actions/upload-artifact@v4`
    - Requirements: 7.1, 7.2, 7.3, 7.4, 7.5
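The job described in task 9.2 could take roughly this shape inside `build.yml`. The dependencies, trigger condition, command, and upload action come from the plan. The checkout step, the artifact name, and uploading even on failure are assumptions.

```yaml
jobs:
  integration-test:
    needs: [build-services, build-dashboard]
    # Run only for pushes to main, not for pull requests or other branches.
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: self-hosted-gremlin
    steps:
      - uses: actions/checkout@v4           # assumed: the script lives in the repo
      - name: Run integration pipeline
        run: >
          bash infra/inttest/run_pipeline.sh
          --image-tag ${{ github.sha }}
          --results-file inttest-results.json
      - name: Upload integration results
        if: always()                        # assumed: keep results even when tests fail
        uses: actions/upload-artifact@v4
        with:
          name: inttest-results             # assumed artifact name
          path: inttest-results.json
```

This fragment belongs under the existing `jobs:` key of the workflow, next to the four build and test jobs.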
- 10. Checkpoint: Verify workflow and values files
  - Ensure the updated workflow YAML is syntactically valid. Verify the integration-test job has correct `needs`, `if` condition, and artifact upload. Confirm `values-beta.yaml` and `values-paper.yaml` are valid Helm values. Ask the user if questions arise.
- 11. Create install and teardown scripts
  - 11.1 Create `~/sources/kube/pipelines/runmefirst.sh`: full install script that creates namespaces (`arc-system`, `argocd`, `kargo`, `stonks-beta`, `stonks-paper`), applies PVs, installs the ARC controller via Helm, applies the runner scaleset, installs ArgoCD via Helm with values, applies the repo secret and ArgoCD Applications, installs Kargo via Helm with values, and applies the Kargo project, warehouse, and stages. Use `set -euo pipefail` and idempotent namespace creation via `--dry-run=client -o yaml | kubectl apply -f -`
    - Requirements: 1.1, 1.2, 1.5, 3.1
  - 11.2 Create `~/sources/kube/pipelines/runmelast.sh`: teardown script that deletes Kargo resources (stages, warehouse, project-config, project), uninstalls the Kargo Helm release, deletes ArgoCD resources (apps, repo-secret), uninstalls the ArgoCD Helm release, deletes ARC resources (runner-scaleset), uninstalls the ARC Helm release, and deletes namespaces (`arc-system`, `argocd`, `kargo`). Preserve PVs, NFS data, and the `stonks-oracle`, `stonks-beta`, and `stonks-paper` namespaces. Use `--ignore-not-found` and `|| true` for idempotency.
    - Requirements: 2.1, 2.2, 2.3, 2.4, 3.2, 3.3, 17.2
- 12. Final checkpoint: Review all artifacts
  - Ensure all files are created in the correct locations: pipeline scripts in `~/sources/kube/pipelines/`, Helm values and workflow changes in the stonks-oracle repo. Verify install order in `runmefirst.sh` matches design (PVs → ARC → ArgoCD → Kargo). Verify teardown order in `runmelast.sh` is reverse (Kargo → ArgoCD → ARC). Ensure all tests pass. Ask the user if questions arise.
Notes
- Pipeline infrastructure scripts (`~/sources/kube/pipelines/`) are created on gremlin-1, separate from the stonks-oracle repo
- Helm values files (`values-beta.yaml`, `values-paper.yaml`) and the GitHub Actions workflow update are in the stonks-oracle repo
- No property-based tests: this feature is entirely IaC (shell scripts, YAML manifests, Helm values)
- The existing `values.yaml` (production) is not modified; the live stage uses it as-is
- PVs use `persistentVolumeReclaimPolicy: Retain` so NFS data survives teardowns
persistentVolumeReclaimPolicy: Retainso NFS data survives teardowns - Break-glass is Kargo's built-in manual approval — no custom code needed (Requirements 12.1–12.5)
- Audit trail is provided by Kargo's native promotion history (Requirements 15.1–15.4)
- Kargo Dashboard features (stage display, promotion controls, block indicators) are provided by the Kargo chart out of the box (Requirements 14.1–14.3, 16.1–16.5)
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation between major phases