calhta DevOps Console

calhta DevOps Console is a permanent Debian 13 server application for managing Codex-driven development across multiple projects and servers.

The console is intended to become a privileged control plane for project registry, remote/local Codex sessions, logs, uploads, SSH targets, Matrix notifications, and continuation workflows. It is not a general public web app and should be operated behind the existing internal/NPM access pattern documented in docs/.

Current Milestone

Version: 0.1.059

Milestone 0.1.059 improves SSH target onboarding defaults for known calhta networks. The currently implemented scope includes:

FastAPI backend skeleton.
/health endpoint.
Environment-based configuration loading.
SQLite connection and migration foundation.
Migration runner command.
SQLite-backed project registry CRUD.
Minimal /api/projects CRUD endpoints.
SQLite-backed SSH target registry CRUD.
Minimal /api/ssh-targets CRUD endpoints.
SQLite-backed session state persistence.
Minimal /api/sessions state endpoints.
One-active-session-per-project validation for v1 active states.
Active session lookup under /api/sessions/active?project_id=... so clients can reconnect to the server-owned session instead of creating a new one.
SQLite-backed project runtime profiles for future Codex session orchestration.
Multi-target runtime profile bindings for projects that need more than one server in one work session.
Session switch policy metadata so active multi-server work can be attached to or confirmed before any coordinator refresh.
SQLite-backed DevOps-side project continuation snapshots.
SQLite-backed generated DevOps context digests.
SQLite-backed Codex handoff packages for project-start preparation.
Protected handoff package preparation/list/get endpoints.
Active-session protection that returns an operator-decision-required package instead of silently refreshing or replacing active work.
WebUI composer submits operator project-start intent to the handoff preparation workflow.
WebUI immediately echoes submitted input and shows missing-token/API failures in the chat stream.
WebUI login uses username/password instead of a visible bearer-token box, with bootstrap admin / admin and forced first-login password change.
WebUI clears the composer on send and keeps the chat transcript scrollable within the viewport.
Protected DevOps coordinator chat endpoint for normal chat messages.
DevOps chat operation route for removing retired SSH targets from active inventory.
Durable DevOps chat operation records with confirmation phrases for risky actions.
Codex-oriented tool bridge that plans backend tool calls before falling back to legacy phrase routing or normal chat.
Codex-first chat agent loop that lets Codex choose and chain backend tools before responding to the operator.
Persistent DevOps Codex coordinator runtime using codex app-server --listen stdio://, reusing one server-owned thread across normal WebUI chat turns.
Streaming /api/session-orchestration/devops-chat/stream endpoint wired through the Codex-first tool loop so WebUI requests can execute validated backend operations.
Protected agent_session_start path for starting a remote Codex project worker on an auth-ready target.
SQLite-backed Codex runtime session records for project workers.
Chat-routed agent_session_start tool selection for real work requests such as creating /opt/test-001 on devOps-test-001.
Fixed remote worker launch command that prepares the requested project directory and starts codex app-server --listen stdio:// over SSH, without accepting arbitrary shell text.
Protected project-worker work endpoint that starts or reuses the worker and sends one operator work turn into the remote Codex app-server.
Chat-routed project-worker relay so real work requests return the remote worker response instead of only reporting that the worker started.
Incremental forwarding of remote project-worker Codex deltas to the WebUI while the worker is still running.
Remote project-worker non-text app-server events are forwarded as status updates.
Remote project-worker command executions are surfaced as status updates, recorded in command audits, and approved through system approval records for the active operator-requested project-worker turn.
Remote project-worker approval detection is limited to explicit app-server approval request methods instead of broad text matching.
Remote project-worker turns time out after 60 seconds of no app-server activity instead of hanging silently.
Controlled Cockpit setup workflow for auth-ready lab targets: prepare project tree, update apt, install Cockpit, enable cockpit.socket, verify port 9090, and return the target URL without routing through a Codex project-worker turn.
WebUI yields one animation frame after echoing operator input so the submitted message appears before the long-running streamed request begins.
Session events and infrastructure audit records for remote project-worker start attempts and outcomes.
Persistent app-server tool planning for chat-routed backend operations, avoiding the previous per-step codex exec startup path.
Chat-routed Codex install-plan approval/execution for lab target onboarding once the controlled workflow reaches install-plan review.
WebUI animated thinking indicator for pending devOps responses before the first streamed/final response text arrives.
Streamed devOps status events before and between backend tool calls so long-running work shows visible progress instead of a silent pending request.
DevOps chat target inventory operations: list targets, show target details, check reachability, rename lab targets, update lab target host metadata, and prepare confirmed production target changes.
DevOps chat SSH host-key trust and active onboarding advance tools, including natural follow-ups such as "trust it and continue" and "go on".
Natural target rename phrasing such as "give target-10-76-22-150 the name devOps-test-001".
DevOps chat onboarding intake for new SSH target candidates, creating metadata and starting the controlled agent target onboarding state machine without install/auth execution.
Codex-assisted structured extraction with heuristic fallback for natural onboarding requests such as "I have a new server for you to onboard, its at 10.76.22.150".
SSH target onboarding defaults to username=root, port=22, and environment=lab unless overridden.
SSH target onboarding infers Dubai DC/LAN/no tunnel for 10.76.20.0/22, and 12Alder lab with wg-12alder for 10.26.0.0/16.
Host-only onboarding refuses to create a target until the operator supplies the desired inventory name.
Normal WebUI chat uses the persistent DevOps Codex app-server runtime with generated coordinator context and persists operator input plus Codex output as session events.
Only project kickoff phrasing routes to handoff preparation; other chat messages no longer trigger it.
WebUI transcript preserves paragraphs and bullet lists from DevOps chat responses.
WebUI assistant label is devOps, with event metadata kept out of the message body.
Remote-access read model includes current TCP reachability for SSH targets.
VPN-required targets report access-path unavailable when their tunnel is not operational instead of pretending the host itself is online.
Connected Servers panel uses current reachability status, not stored readiness metadata.
WebUI records operator input as session events when a session is available.
WebUI renders persisted session events and recent handoff packages in the chat stream.
Minimal /api/session-orchestration metadata endpoints.
SQLite-backed session event/transcript persistence.
Minimal /api/sessions/{session_id}/events create/list endpoints.
SQLite-backed command audit persistence.
Minimal /api/sessions/{session_id}/commands non-executing endpoints.
SQLite-backed approval record persistence.
Minimal /api/sessions/{session_id}/approvals non-executing endpoints.
SQLite-backed system-wide infrastructure audit event persistence.
Protected /api/infrastructure-audit/events create/list/get endpoints.
SQLite-backed agent target onboarding run persistence.
Agent target onboarding history survives SSH target deletion by clearing the retired target reference.
Protected /api/agent-target-onboarding start/list/get/advance/cancel endpoints.
Stateful, resumable lab target onboarding workflow using existing safe host-key, SSH probe, Codex readiness, install-plan, and install execution primitives.
Codex auth readiness representation for installed-but-auth-required targets.
SQLite-backed Codex auth slot metadata.
Protected /api/codex-auth/slots inspect/import/list/get/lease/release/provision/collect endpoints.
Managed Codex auth secret files outside the repository and outside plaintext SQLite, under /etc/calhta-devops-console/codex-auth/slots.
One auth slot / one active target-session leasing model for MVP.
Fixed Codex auth provisioning file-transfer workflow to /root/.codex/auth.json.
Extended SSH target network/access metadata.
Minimal site metadata model.
SQLite-backed tunnel profile metadata persistence.
Minimal /api/tunnels metadata-only endpoints.
Tunnel profile interface_name metadata.
Tunnel operational state fields and remote-execution usability metadata.
Non-mutating WireGuard profile check endpoint under /api/tunnels/{tunnel_id}/check.
Non-mutating local WireGuard tooling readiness endpoint.
Controlled fixed-command WireGuard package install endpoint.
Protected fixed-command WireGuard activation/deactivation/status endpoints for registered profiles.
Infrastructure audit records for WireGuard runtime activation, deactivation, rollback, and status inspection.
Infrastructure audit records for agent target onboarding lifecycle, probe/readiness/install-plan milestones, cancellation, and failures.
First successful controlled onboarding of deb-test-001, ending at installed_but_auth_required.
Default DevOps control-plane SSH identity inheritance for targets without explicit key overrides.
Read-only remote access summaries under /api/remote-access/targets.
Safe fixed-command SSH probe under /api/ssh-targets/{target_id}/probe.
SSH host key discovery and explicit known_hosts trust endpoints.
Non-mutating Codex install readiness checks under /api/ssh-targets/{target_id}/codex-readiness.
SQLite-backed Codex install plan audit metadata.
Minimal /api/ssh-targets/{target_id}/codex-install-plans metadata-only endpoints.
Controlled fixed-sequence Codex install execution endpoint for approved plans.
SQLite-backed operator records.
Hashed API token metadata with one-time raw token output from CLI bootstrap commands.
Password-backed WebUI login with salted PBKDF2 password storage.
Bearer-token authentication dependency for selected privileged routes.
Focused backend tests.
Static chat-first operator WebUI for health, projects, targets, sessions, remote-access readiness, protected infrastructure audit events, and protected Codex auth slot metadata.
Backend serving of the static WebUI at / with frontend assets under /src.
Draft backend systemd service.
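
The onboarding defaults and known-network inference listed above can be sketched with the standard ipaddress module. This is a minimal illustration, not the backend's actual code: the function name, the dictionary keys, and the reading of "12Alder lab" as a site label are assumptions; only the networks, the wg-12alder interface, and the username/port/environment defaults come from the list.

```python
import ipaddress
from typing import Dict, Optional

# Networks and inferred values are taken from the scope list above.
# The key names ("site", "access_path", "tunnel_interface") are hypothetical.
KNOWN_NETWORKS = [
    (ipaddress.ip_network("10.76.20.0/22"),
     {"site": "Dubai DC", "access_path": "lan", "tunnel_interface": None}),
    (ipaddress.ip_network("10.26.0.0/16"),
     {"site": "12Alder lab", "access_path": "tunnel", "tunnel_interface": "wg-12alder"}),
]


def infer_onboarding_defaults(host: str) -> Dict[str, Optional[object]]:
    """Return baseline defaults, plus site/tunnel hints for known networks."""
    defaults: Dict[str, Optional[object]] = {
        "username": "root",
        "port": 22,
        "environment": "lab",
        "site": None,
        "access_path": None,
        "tunnel_interface": None,
    }
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames (not IP literals) only receive the baseline defaults.
        return defaults
    for network, hints in KNOWN_NETWORKS:
        if address in network:
            defaults.update(hints)
            break
    return defaults
```

For example, 10.76.22.150 falls inside 10.76.20.0/22 and would pick up the Dubai DC hints with no tunnel, while an address outside both networks keeps only the baseline defaults.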

The system is incomplete. It does not yet implement automatic tunnel activation, route changes beyond explicit wg-quick profile activation, role-based access control, public exposure configuration, durable project worker reattachment after backend restart, arbitrary command execution, terminal streaming, interactive approval workflow UI, Matrix integration, uploads, expanded authentication flows beyond the managed Codex CLI ChatGPT auth slot MVP, or the near-future polished operator WebUI.

Milestone 0.1.054 starts the first remote project-worker process path and relays one work turn through it. It still does not provide durable process reattachment after backend restart, live incremental remote stream relay, or arbitrary command execution through the console, and it does not stop existing sessions or perform process handoff automation.

Backend Local Run

From backend/, create a virtual environment and install requirements:

python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
python -m app.cli migrate
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

Health check:

curl http://127.0.0.1:8000/health

Expected response:

{"status":"ok"}

Static WebUI:

http://127.0.0.1:8000/

Project API base path:

/api/projects

SSH target API base path:

/api/ssh-targets

SSH target records store key paths/references only. Private key material must not be stored in SQLite or committed to the repository. Targets without an explicit key_path inherit the DevOps control-plane SSH identity at /root/.ssh/calhta_devops_orchestrator_ed25519; explicit per-target key overrides remain supported for advanced cases.

Session API base path:

/api/sessions

Session persistence currently stores state only. It does not start Codex, run SSH commands, stream terminals, or manage approvals.
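
One way to enforce the one-active-session-per-project rule at the storage layer is a SQLite partial unique index. The sketch below is an assumption about how this could be done, not a description of the repository's schema: the table layout, the state names, and the helper functions are hypothetical, and the backend may validate in application code instead.

```python
import sqlite3
from typing import Optional

# Hypothetical set of "active" states; the real v1 state names may differ.
ACTIVE_STATES = ("starting", "running", "waiting")


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE sessions ("
        "id INTEGER PRIMARY KEY, project_id INTEGER NOT NULL, state TEXT NOT NULL)"
    )
    # Partial index predicates cannot use bound parameters, so the constant
    # state names are inlined as SQL literals.
    active_list = ", ".join(f"'{state}'" for state in ACTIVE_STATES)
    conn.execute(
        "CREATE UNIQUE INDEX one_active_session_per_project "
        f"ON sessions(project_id) WHERE state IN ({active_list})"
    )


def try_create_session(conn: sqlite3.Connection, project_id: int, state: str) -> Optional[int]:
    """Insert a session; return its id, or None if an active one already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            cursor = conn.execute(
                "INSERT INTO sessions (project_id, state) VALUES (?, ?)",
                (project_id, state),
            )
    except sqlite3.IntegrityError:
        return None
    return cursor.lastrowid
```

With this index, a second active session for the same project is rejected by the database itself, while inactive history rows remain unrestricted.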

Client disconnects must be treated as view loss only. Browser close, app close, phone lock, or network loss must not create, stop, or replace a server-side session. Reconnecting clients should query /api/sessions/active?project_id=... and reopen their window onto the existing active session when one exists.
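
A reconnecting client could follow the rule above roughly as sketched here. The response shape (a JSON object with an "id" field, or nothing when no session is active) and the injected fetch_json callable are assumptions made for illustration; only the /api/sessions/active?project_id=... path comes from this README.

```python
from typing import Callable, Optional
from urllib.parse import urlencode


def active_session_url(base_url: str, project_id: int) -> str:
    """Build the active-session lookup URL for one project."""
    query = urlencode({"project_id": project_id})
    return f"{base_url.rstrip('/')}/api/sessions/active?{query}"


def find_session_to_reopen(
    base_url: str,
    project_id: int,
    fetch_json: Callable[[str], Optional[dict]],
) -> Optional[object]:
    """Return the existing active session id, or None if there is none.

    The client never creates or replaces a session here; it only looks up
    the server-owned session so it can reattach its view.
    """
    payload = fetch_json(active_session_url(base_url, project_id))
    if not payload:
        return None
    return payload.get("id")  # "id" is an assumed field name
```

Passing the HTTP call in as fetch_json keeps the lookup logic testable without a running backend.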

Session event API base path:

/api/sessions/{session_id}/events

Session events are persisted records only. They are not streamed yet and do not implement command approval workflows.

Command audit API base path:

/api/sessions/{session_id}/commands

Command audit records are metadata only. The backend does not execute commands or implement approval workflows yet.

Approval record API base path:

/api/sessions/{session_id}/approvals

Approval records are persisted decisions only. They do not execute commands and do not automatically update command audit status.

Session orchestration metadata API base path:

/api/session-orchestration

These endpoints store runtime profiles, multi-target profile bindings, continuation snapshots, generated DevOps context digests, handoff packages, and Codex runtime session records. The agent session paths can start a remote project worker, relay one work turn, stream remote worker deltas, surface non-text remote worker events, and record remote worker command executions with approval/audit metadata. Durable process reattachment and interactive approval UI remain later work. They require an operator bearer token because continuation text and runtime routing metadata may contain sensitive operational context.

Infrastructure audit API base path:

/api/infrastructure-audit/events

Infrastructure audit records capture system-wide control-plane actions that may happen outside a project session. The endpoints require a bearer token and return concise operator-facing fields such as timestamp, summary, action, status, resource name, and risk level. POST rejects obvious secret-looking keys or values in details_json.
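
The secret-looking check on details_json could work along the lines of the sketch below. The patterns and the function name are assumptions; the backend's real rules are not documented here and may be stricter or looser.

```python
import re
from typing import Any, List

# Hypothetical patterns for "obvious" secret-looking keys and values.
SECRET_KEY_PATTERN = re.compile(
    r"(password|passwd|secret|token|api[_-]?key|private[_-]?key|authorization)",
    re.IGNORECASE,
)
SECRET_VALUE_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"^Bearer\s+\S+", re.IGNORECASE),
]


def find_secret_like_entries(details: Any, path: str = "details_json") -> List[str]:
    """Return the paths of keys or string values that look like secrets."""
    findings: List[str] = []
    if isinstance(details, dict):
        for key, value in details.items():
            child = f"{path}.{key}"
            if SECRET_KEY_PATTERN.search(str(key)):
                findings.append(child)  # flag the key; do not inspect its value
                continue
            findings.extend(find_secret_like_entries(value, child))
    elif isinstance(details, list):
        for index, item in enumerate(details):
            findings.extend(find_secret_like_entries(item, f"{path}[{index}]"))
    elif isinstance(details, str):
        if any(pattern.search(details) for pattern in SECRET_VALUE_PATTERNS):
            findings.append(path)
    return findings
```

A POST handler could reject the request whenever this function returns a non-empty list, reporting only the offending paths and never the values.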

Agent target onboarding API base path:

/api/agent-target-onboarding

The protected onboarding workflow stores resumable runs for SSH targets that may become Codex agent candidates. It validates target metadata, checks LAN/tunnel readiness from stored metadata, stops for explicit host-key trust when needed, runs only the existing fixed SSH probe and Codex readiness checks, creates/reuses Codex install plans when the CLI is missing, and can call the existing approved fixed-sequence install executor only when explicitly advanced with install confirmation. It does not deploy Codex auth material, accept arbitrary commands, auto-trust host keys, or auto-activate tunnels.

Codex auth direction: OpenAI API keys are not the intended path. DevOps-managed Codex CLI ChatGPT auth material is treated as secret data and provisioned through protected auth slots. Manual per-target SSH login is not acceptable as the long-term product path for Codex auth setup. A target with the Codex binary installed but no managed auth state is represented as installed_but_auth_required.

Codex auth slot API base path:

/api/codex-auth

The protected auth slot workflow treats the DevOps server as the trusted Codex CLI ChatGPT auth source for MVP. It imports /root/.codex/auth.json into a root-owned managed secret path outside the repo, stores only metadata/path references in SQLite, leases one slot to one active target/session at a time, and provisions only to the fixed target path /root/.codex/auth.json. API responses never include auth file contents or token values. The source file may contain key-shaped internal field names such as OPENAI_API_KEY, but this is not an OpenAI API key workflow and values must never be exposed.
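
The one-slot, one-active-lease rule can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the class and method names are hypothetical, and the real backend persists lease metadata in SQLite rather than in a dictionary.

```python
from typing import Dict, Optional


class AuthSlotLeases:
    """Tracks which target currently holds each auth slot (in memory only)."""

    def __init__(self) -> None:
        self._holders: Dict[str, str] = {}  # slot_id -> target_id

    def lease(self, slot_id: str, target_id: str) -> bool:
        """Lease a slot; refuse if another target already holds it."""
        holder = self._holders.get(slot_id)
        if holder is not None and holder != target_id:
            return False
        self._holders[slot_id] = target_id
        return True

    def release(self, slot_id: str, target_id: str) -> bool:
        """Release a slot; only the current holder may release it."""
        if self._holders.get(slot_id) != target_id:
            return False
        del self._holders[slot_id]
        return True

    def holder(self, slot_id: str) -> Optional[str]:
        return self._holders.get(slot_id)
```

Re-leasing to the same target is treated as idempotent here; whether the real endpoints behave that way is not stated in this README.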

Tunnel profile API base path:

/api/tunnels

Tunnel profiles store metadata and config paths only. They do not start tunnels, store private keys, or activate WireGuard/Tailscale/VPN connections.

Remote access read model API base path:

/api/remote-access/targets

Remote access summaries describe expected access paths only. They do not test SSH, activate tunnels, or connect to remote machines.

Remote access responses include site metadata, tunnel operational status, and ready_for_execution. A target is ready when it does not require a tunnel, or when the required tunnel profile exists, is operational, and is marked usable for remote execution.
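
The readiness rule above reduces to a short predicate. The dataclass and its field names are illustrative assumptions; only the rule itself comes from this README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TunnelState:
    # Hypothetical field names mirroring the tunnel metadata described above.
    operational: bool
    usable_for_remote_execution: bool


def ready_for_execution(requires_tunnel: bool, tunnel: Optional[TunnelState]) -> bool:
    """A target is ready if no tunnel is needed, or the tunnel is fully usable."""
    if not requires_tunnel:
        return True
    return (
        tunnel is not None
        and tunnel.operational
        and tunnel.usable_for_remote_execution
    )
```

A target that requires a tunnel but has no registered profile (tunnel is None) is therefore never reported as ready.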

SSH probe endpoint:

POST /api/ssh-targets/{target_id}/probe

The probe runs only fixed commands: hostname, whoami, pwd, and codex --version. It does not accept arbitrary commands.
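
A fixed-command probe can be modelled by building the full set of SSH invocations from constants, so the caller has no parameter through which to supply a command. The function name and the specific ssh options chosen are assumptions for illustration; the four commands and the default key path come from this README.

```python
from typing import List

# The only commands the probe may run, as documented above.
PROBE_COMMANDS = ("hostname", "whoami", "pwd", "codex --version")
DEFAULT_KEY_PATH = "/root/.ssh/calhta_devops_orchestrator_ed25519"


def build_probe_plan(
    host: str,
    username: str = "root",
    port: int = 22,
    key_path: str = DEFAULT_KEY_PATH,
) -> List[List[str]]:
    """Return one ssh argv per fixed probe command; no caller-supplied command."""
    base = [
        "ssh",
        "-i", key_path,
        "-p", str(port),
        "-o", "BatchMode=yes",        # never prompt for passwords
        "-o", "ConnectTimeout=10",    # fail fast on unreachable hosts
        f"{username}@{host}",
    ]
    return [base + [command] for command in PROBE_COMMANDS]
```

Each argv list is suitable for subprocess.run without shell=True, which keeps the host and username from being interpreted by a local shell.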

SSH identity model:

Default control-plane identity: /root/.ssh/calhta_devops_orchestrator_ed25519

The DevOps orchestrator key is a dedicated control-plane identity, distinct from human administrator identities. The recommended operational model is that the public key for this identity is included in the standard authorized_keys payload/template installed on managed servers, so new lab/development servers can be onboarded without generating or copying per-target keys. SSH target API responses report whether access uses default_control_plane_identity or target_override_key.
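
The inheritance rule can be expressed as a small resolver. The function name is hypothetical; the default path and the two reported labels, default_control_plane_identity and target_override_key, are the ones named in this README.

```python
from typing import Optional, Tuple

DEFAULT_IDENTITY = "/root/.ssh/calhta_devops_orchestrator_ed25519"


def resolve_ssh_identity(key_path: Optional[str]) -> Tuple[str, str]:
    """Return (key path, identity source) for an SSH target record."""
    if key_path and key_path.strip():
        return key_path.strip(), "target_override_key"
    # No explicit key_path: inherit the control-plane identity.
    return DEFAULT_IDENTITY, "default_control_plane_identity"
```

Only the path is resolved; private key material is never read or stored by this step.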

SSH host key onboarding:

GET  /api/ssh-targets/{target_id}/host-key
POST /api/ssh-targets/{target_id}/host-key/trust

Trusting a host key requires:

{"confirm_trust": true}

Host key onboarding uses ssh-keyscan and writes to the configured known_hosts path. It does not open an SSH session.
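
The trust step might be structured as in this sketch: a strict confirmation check, an ssh-keyscan argument list, and an idempotent merge into known_hosts content. The helper names and the five-second timeout are assumptions; -T and -p are standard ssh-keyscan options.

```python
from typing import List


def trust_confirmed(body: dict) -> bool:
    """Require the JSON boolean true, not a truthy string such as "true"."""
    return body.get("confirm_trust") is True


def keyscan_argv(host: str, port: int = 22, timeout_seconds: int = 5) -> List[str]:
    """Build the ssh-keyscan invocation; it fetches keys without logging in."""
    return ["ssh-keyscan", "-T", str(timeout_seconds), "-p", str(port), host]


def merge_known_hosts(existing: str, scanned: str) -> str:
    """Append scanned key lines that are not already present."""
    lines = existing.splitlines()
    seen = set(lines)
    for raw in scanned.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        if line not in seen:
            lines.append(line)
            seen.add(line)
    return "\n".join(lines) + ("\n" if lines else "")
```

Because the merge is idempotent, trusting the same host key twice leaves the known_hosts content unchanged.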

Codex install readiness endpoint:

POST /api/ssh-targets/{target_id}/codex-readiness

The readiness check runs only fixed safe commands for hostname, user, working directory, Node, npm, Codex, and bubblewrap detection. It returns an advisory text install plan only; it does not install Codex or execute suggested install commands.

Codex install plan API base path:

/api/ssh-targets/{target_id}/codex-install-plans

Install plan records persist readiness-derived plan text, warnings, status, requester, decision metadata, execution timestamps, and result summaries.

Controlled install execution endpoint:

POST /api/ssh-targets/{target_id}/codex-install-plans/{plan_id}/execute

Execution is allowed only for approved plans and requires confirm_execute=true. Production targets also require confirm_production=true. The executor runs only the fixed sequence apt update, apt install -y nodejs npm bubblewrap, npm install -g @openai/codex, codex --version, and bwrap --version.
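
The gating rules above can be sketched as a single check that returns a refusal reason or nothing. The status value "approved" and the environment value "production" are assumed string encodings, and the function name is hypothetical; the command sequence is copied from the paragraph above.

```python
from typing import Optional

# Fixed sequence documented above; the executor accepts no other commands.
INSTALL_SEQUENCE = (
    "apt update",
    "apt install -y nodejs npm bubblewrap",
    "npm install -g @openai/codex",
    "codex --version",
    "bwrap --version",
)


def execution_refusal(
    plan_status: str,
    environment: str,
    confirm_execute: bool,
    confirm_production: bool = False,
) -> Optional[str]:
    """Return why execution must be refused, or None if it may proceed."""
    if plan_status != "approved":
        return "plan is not approved"
    if confirm_execute is not True:
        return "confirm_execute=true is required"
    if environment == "production" and confirm_production is not True:
        return "confirm_production=true is required for production targets"
    return None
```

Returning a reason string instead of a bare boolean gives the operator a concrete message when a request is refused.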

Operator bootstrap:

python -m app.cli create-operator --username calum --display-name "Calum"
python -m app.cli create-token --username calum --name local-admin

The raw API token is printed once. SQLite stores only a token prefix and SHA-256 hash.
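
Storing only a prefix and a SHA-256 hash can be done with the standard library, as in the sketch below. The function names, the record layout, and the eight-character prefix length are assumptions; the backend's actual token format is not documented here.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


def issue_token(prefix_length: int = 8) -> Tuple[str, Dict[str, str]]:
    """Return (raw token to print once, record that is safe to store)."""
    raw = secrets.token_urlsafe(32)
    record = {
        "prefix": raw[:prefix_length],  # helps operators identify a token
        "sha256": hashlib.sha256(raw.encode("utf-8")).hexdigest(),
    }
    return raw, record


def verify_token(presented: str, record: Dict[str, str]) -> bool:
    """Hash the presented token and compare in constant time."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, record["sha256"])
```

The raw token never appears in the stored record, so a leaked database row cannot be replayed as a bearer token.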

Protected routes currently include:

/api/operators/*
POST /api/ssh-targets/{target_id}/codex-install-plans/{plan_id}/execute
/api/infrastructure-audit/events
/api/agent-target-onboarding/*
POST /api/tunnels/wireguard-tooling/install
POST /api/tunnels/{profile_id}/activate
POST /api/tunnels/{profile_id}/deactivate
GET /api/tunnels/{profile_id}/runtime-status

Health and the existing registry/readiness metadata endpoints remain open for now.

WireGuard tunnel check:

POST /api/tunnels/{tunnel_id}/check

The check validates file existence, restrictive permissions, basic [Interface] and [Peer] structure, non-secret WireGuard fields, wg/wg-quick binary availability, and optional interface state. It does not run wg-quick up, wg-quick down, systemctl, or file edits.
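
Part of such a non-mutating check, covering file existence, permissions, and section structure, could be sketched as follows. The function names and message wording are assumptions, and the binary availability and interface state checks are omitted. The code only reads the file and never runs wg-quick.

```python
import os
import stat
from typing import List


def permission_problems(mode: int) -> List[str]:
    """Flag any group or other permission bits on the config file."""
    if mode & 0o077:
        return [f"mode {mode:04o} grants group/other access; expected 0600 or stricter"]
    return []


def structure_problems(text: str) -> List[str]:
    """Check for exactly one [Interface] and at least one [Peer] section."""
    headers = [line.strip() for line in text.splitlines() if line.strip().startswith("[")]
    problems: List[str] = []
    if headers.count("[Interface]") != 1:
        problems.append("expected exactly one [Interface] section")
    if "[Peer]" not in headers:
        problems.append("expected at least one [Peer] section")
    return problems


def check_config(path: str) -> List[str]:
    """Read-only check of a WireGuard config file; returns a list of problems."""
    if not os.path.isfile(path):
        return ["config file does not exist"]
    mode = stat.S_IMODE(os.stat(path).st_mode)
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    return permission_problems(mode) + structure_problems(text)
```

A real check should report only non-secret fields; this sketch avoids echoing any config values, including PrivateKey, in its messages.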

WireGuard tooling readiness:

POST /api/tunnels/wireguard-readiness

This endpoint checks local wg and wg-quick availability and Debian package presence where practical. It is non-mutating.

Controlled WireGuard tooling install:

POST /api/tunnels/wireguard-tooling/install

This endpoint requires a bearer token, confirm_execute=true, and executed_by matching the authenticated operator. It may only run apt update and apt install -y wireguard wireguard-tools. It does not activate tunnels, run wg-quick up/down, change routes, or edit WireGuard config files.

Protected WireGuard runtime endpoints:

POST /api/tunnels/{profile_id}/activate
POST /api/tunnels/{profile_id}/deactivate
GET  /api/tunnels/{profile_id}/runtime-status

These endpoints require a bearer token. Activation requires confirm_activate=true, a registered enabled wireguard tunnel profile, and a valid registered interface_name. Activation may only run wg-quick up <interface_name>, followed by fixed status checks for ip link, ip -brief address, and wg show. If validation does not find a peer plus either a latest handshake or received bytes, the backend attempts rollback with wg-quick down <interface_name>.
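
The post-activation validation and rollback decision might look like the sketch below, which parses the human-readable wg show output. The function names are hypothetical, and the parsing assumes the usual "peer:", "latest handshake:", and "transfer: ... received, ... sent" line layout.

```python
import re
from typing import List

RECEIVED = re.compile(r"transfer:\s*([\d.]+)\s*\S+\s+received")


def tunnel_validated(wg_show_output: str) -> bool:
    """True if a peer exists and there is a handshake or received bytes."""
    has_peer = False
    has_handshake = False
    has_received = False
    for raw in wg_show_output.splitlines():
        line = raw.strip()
        if line.startswith("peer:"):
            has_peer = True
        elif line.startswith("latest handshake:"):
            has_handshake = True
        elif line.startswith("transfer:"):
            match = RECEIVED.match(line)
            if match and float(match.group(1)) > 0:
                has_received = True
    return has_peer and (has_handshake or has_received)


def rollback_commands(interface_name: str, wg_show_output: str) -> List[List[str]]:
    """Return the fixed rollback command if validation fails, else nothing."""
    if tunnel_validated(wg_show_output):
        return []
    return [["wg-quick", "down", interface_name]]
```

Only the boolean result and summary fields should leave this step; the raw wg show text is not returned to API clients.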

Deactivation requires confirm_deactivate=true and may only run wg-quick down <interface_name>. Already-down interfaces are handled as a safe no-op. Runtime status returns interface, handshake, and transfer summaries only; it does not expose WireGuard private keys or raw wg show output.

The wg-quick interface name must match a valid config basename under /etc/wireguard/<name>.conf. For the validated 12Alder tunnel, use wg-12alder with /etc/wireguard/wg-12alder.conf; older wg-12alder-lab metadata is not the operational activation name.
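
A syntactic guard for interface_name could be sketched as below. The character set and 15-character limit are an assumption modelled on Linux interface name limits and on what wg-quick is understood to accept; the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed pattern: short names without slashes, so no path traversal is possible.
INTERFACE_NAME = re.compile(r"[A-Za-z0-9_=+.-]{1,15}")


def wireguard_config_path(interface_name: str) -> Optional[str]:
    """Map a valid interface name to its config path, or return None."""
    if not INTERFACE_NAME.fullmatch(interface_name):
        return None
    return f"/etc/wireguard/{interface_name}.conf"
```

Passing this check is necessary but not sufficient: wg-12alder-lab is syntactically valid, so the backend would still need to confirm that the resulting config file exists before activation.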

Backend Tests

From backend/ after installing requirements:

pytest

WebUI Shell

The WebUI shell lives in frontend-web/. Milestone 0.1.026 uses a static chat-first operator console served by the backend and does not require frontend dependencies.

cd frontend-web
python3 -m http.server 5173

Runtime

Draft systemd units live in packaging/systemd/. They are not installed by this repository milestone.

Documentation

Start with:

docs/ARCHITECTURE.md
docs/SESSION_ORCHESTRATION.md
docs/SECURITY.md
docs/OPERATIONS.md
docs/PROJECT_SETUP.md
docs/DECISIONS.md
docs/CONTINUATION.md