activity-core

Author	SHA1	Message	Date
tegwick	65ef005c2d	docs(ACTIVITY-WP-0014): close T05 in-repo; split beachhead adoption to WP-0015 Idempotent-writes half of T05 is done in-repo; the externally-blocked endpoint adoption + actcore-state-hub-bridge proxy retirement move to ACTIVITY-WP-0015 (blocked on the state-hub beachhead) so WP-0014 can close on completed work. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 12:41:21 +02:00
tegwick	0e75aaec01	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-23: - update .custodian-brief.md for activity-core	2026-06-23 21:39:32 +02:00
tegwick	b2e57707a7	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-23: - ACTIVITY-WP-0014-T05: todo → progress	2026-06-23 21:39:28 +02:00
tegwick	88fe359385	feat(ACTIVITY-WP-0014): idempotency-keyed State Hub writes (T05, in-repo part) Add activity_core/state_hub_write: every State Hub write (report-sink, ops-evidence, schedule-miss) now sends a stable Idempotency-Key header derived from run_id:instruction_id:event_type. Makes writes safe to buffer/replay under the future state-hub beachhead without duplicate progress/triage events. The read-based _progress_exists dedup is now best-effort (returns False on connection error instead of hard-failing), so the guarantee lives on the keyed write rather than a live read. Tests + runbook note. Endpoint adoption / proxy retirement stays blocked on the state-hub beachhead capability. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 21:38:46 +02:00
tegwick	f90591c5f1	docs(ACTIVITY-WP-0014): rescope T05 to thin client under State Hub beachhead model Resilience (queue/cache) is handed to custodian/state-hub as a per-machine beachhead; activity-core keeps only idempotent writes + adopt-beachhead-endpoint and retires its bespoke actcore-state-hub-bridge proxy. Proposal sent to state-hub. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 21:18:01 +02:00
tegwick	cf7a11dcd9	docs(ACTIVITY-WP-0014): correct Motivation to match T01 findings Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 17:16:17 +02:00
tegwick	99e5d525a8	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-23: - update .custodian-brief.md for activity-core	2026-06-23 17:15:41 +02:00
tegwick	8424c13783	docs(ACTIVITY-WP-0014): T01 root cause — State Hub Connection refused, not misfire Live inspection of railiance01 (ssh + in-node kubectl/temporal) overturns the catchup_window hypothesis: the daily-triage schedule is healthy (CatchupWindow 365d default, 0 MissedCatchupWindow). The 2026-06-23T05:20Z fire ran but Failed at the report sink with '[Errno 111] Connection refused' posting to State Hub. railiance01 reaches State Hub via a reverse tunnel back to the workstation, which is unreachable at 07:20 Europe/Berlin (102 resolver timeouts in 24h). Mark T01 done; add T05 for resilient sinks/resolvers as the real incident fix. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 17:14:04 +02:00
tegwick	864f90f9b9	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-23: - update .custodian-brief.md for activity-core	2026-06-23 14:27:54 +02:00
tegwick	053d18b24a	feat(ACTIVITY-WP-0014): missed-fire detection & alert sink (T03) Add activity_core/schedule_health: a pure evaluate_schedule_health() verdict (built on Temporal's num_actions_missed_catchup_window plus a staleness check), an async check_schedule_health() reader, and post_missed_fire_alert() that emits a schedule_miss State Hub progress event. Makes a missed fire visible even under misfire_policy=skip, where Temporal drops it by design. Unit tests for the verdict logic. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 14:25:33 +02:00
tegwick	77af65afb2	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-23: - update .custodian-brief.md for activity-core	2026-06-23 14:17:14 +02:00
tegwick	0495f8a43f	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-23: - ACTIVITY-WP-0014-T04: progress → wait	2026-06-23 14:17:06 +02:00
tegwick	c6cad9e7b3	chore(consistency): renormalize lifecycle state [auto] Updated by fix-consistency on 2026-06-23: - workplan status: proposed → active	2026-06-23 14:17:06 +02:00
tegwick	a83b117f60	feat(ACTIVITY-WP-0014): explicit run-miss recovery policies (T02, T04) Set Temporal catchup_window on cron schedules so a fire missed during a worker/Temporal outage is no longer silently dropped. Redefine misfire_policy into three explicit modes — skip, catchup_all, catchup_latest — mapping to (catchup_window, overlap) pairs; legacy catchup/compress aliased. Add catchup_window_seconds override. Remove the ad-hoc upsert-time 1h backfill in favour of native catchup. Apply catchup_latest to daily-statehub-wsjf-triage in the Railiance runtime manifest and document run-miss policies in the runbook. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 14:15:45 +02:00
tegwick	ffc0ee2cb7	feat(ACTIVITY-WP-0014): plan schedule misfire robustness & run-miss options Cron fires are silently dropped: _build_schedule() sets SchedulePolicy(overlap=) but never catchup_window, so a brief worker/Temporal outage at trigger time drops the fire with no recovery and no signal (root cause of missing 06-22/06-23 daily triage runs). Define three explicit run-miss policies: skip, catchup_all, catchup_latest, plus missed-fire detection. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 13:46:19 +02:00
tegwick	59b3b73061	ui rules established	2026-06-22 23:03:40 +02:00
tegwick	4bc5111dfd	chore(consistency): apply state_hub_workstream_id writeback Sync archived workplan frontmatter from State Hub fix-consistency.	2026-06-22 17:43:32 +02:00
tegwick	e9a6029ded	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-22: - update .custodian-brief.md for activity-core	2026-06-22 16:50:01 +02:00
tegwick	bf4e61f0bf	feat(ACTIVITY-WP-0012): complete live admin-sync no-restart smoke Ran Railiance01 cluster validation for POST /admin/sync without restarting actcore-worker, added a repeatable smoke script, and closed the workplan.	2026-06-22 16:25:26 +02:00
tegwick	40fa851ec0	fix(bridge): use /state/health for readiness probe The actcore-state-hub-bridge readiness probe hit /state/summary through the tunnel proxy chain. Cold-cache summary requests and intermittent tunnel stalls routinely exceeded the 5s probe timeout (1584 failures over 17h), leaving the pod 0/1 Ready and breaking hourly/triage sinks. Use /state/health instead — same signal the ops inventory already expects, and completes in ~30ms through the bridge.	2026-06-22 14:03:57 +02:00
tegwick	e0742d18d7	Mark .repo-classification.yaml human-reviewed (CUST-WP-0050 T02) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 11:40:43 +02:00
tegwick	ccac285b0a	Reclassify as tooling (CUST-WP-0050 T02) Apply the new 'tooling' category (reusable internal tooling/infrastructure) from the Repo Classification Standard. First-pass agent classification. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 03:06:01 +02:00
tegwick	a0dcc52353	Add repo classification (CUST-WP-0050 T02) First-pass agent classification per the Repo Classification Standard v1.0 (canon-repo-classification); pending human review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 02:44:46 +02:00
tegwick	faf5d60ae8	feat(STATE-WP-0064): enable cluster consistency sweep schedule Enable the definition in k8s projection and pass activity-core source tags.	2026-06-21 21:46:43 +02:00
tegwick	adfd1a9067	fix(STATE-WP-0064): allow 360s POST timeout on state-hub bridge proxy Consistency sweeps exceed the previous 30s urllib timeout when triggered from Railiance01 activity-core through actcore-state-hub-bridge.	2026-06-21 20:56:35 +02:00
tegwick	44987457c1	chore: add make sync-schedules target for Temporal schedule reconcile Wraps python -m activity_core.sync_schedules for operator discoverability.	2026-06-21 20:28:04 +02:00
tegwick	3a981cc98f	feat(STATE-WP-0064): wire consistency_sweep_remote_all state-hub query Add POST /consistency/sweep/remote-all resolver support with a 330s timeout and k8s projection for the consistency sweep definition.	2026-06-21 20:19:22 +02:00
tegwick	dbd2fbb11c	docs(workplan): record railiance01 llm-connect smoke evidence Note the 2026-06-19 live reconciliation on railiance01: llm-connect deployed, worker restarted with LLM_CONNECT_URL, fixture smoke passed. Manual daily triage still blocked on actcore-state-hub-bridge reachability.	2026-06-19 15:58:04 +02:00
tegwick	c938b80503	chore(kaizen): demote coach/optimization to weekly operate cadence After coulomb-loop bootstrap E2E (3/3 cycles on 2026-06-18), revert activity-core from experimental daily crons to weekly Monday schedules so discover_kaizen_scheduled_repos(cadence=weekly) matches the operate-phase ActivityDefinitions. Drop the disabled tdd-workflow stub.	2026-06-19 11:32:36 +02:00
tegwick	3e93567a53	Add admin sync hot reload path	2026-06-19 01:54:13 +02:00
tegwick	6f68f8f9ec	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-19: - update .custodian-brief.md for activity-core	2026-06-19 01:52:52 +02:00
tegwick	f05c56e202	fix(issue-sink): stringify triggering_event_id before JSON encode IssueCoreRestSink.emit() passed task_spec.triggering_event_id straight into the httpx json= payload. When the field is a UUID object (rather than a string), httpx's JSON encoder raised "TypeError: Object of type UUID is not JSON serializable", failing the emission. Guard with str(), preserving None for optional event ids. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 00:15:03 +02:00
tegwick	200ec0c97a	Add credential routing instructions for all agent runtimes Propagate shared credential-routing section (Codex, Claude, Grok, llm-connect) from state-hub template via scripts/propagate_credential_routing.py.	2026-06-18 22:48:37 +02:00
tegwick	42e5ef725c	Document issue-core emission contract in AGENTS.md Add ISSUE_CORE_URL, ISSUE_CORE_API_KEY, and ISSUE_SINK_TYPE guidance so agents pair keys locally or via OpenBao instead of requesting them from ops-warden.	2026-06-18 22:34:59 +02:00
tegwick	a08bd1684f	Add ISSUE_CORE_API_KEY auth to IssueCoreRestSink Issue-core requires a shared ingestion key on POST /issues/. The REST sink now sends Authorization: Bearer using ISSUE_CORE_API_KEY and fails fast when the key is missing under ISSUE_SINK_TYPE=rest. Updates .env.example, emission boundary docs, and unit tests for the header contract and missing-key error.	2026-06-18 22:30:13 +02:00
tegwick	2078915854	Add reuse-surface report gaps resolver	2026-06-18 17:58:00 +02:00
tegwick	23f4956b68	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-18: - update .custodian-brief.md for activity-core	2026-06-18 17:52:38 +02:00
tegwick	764339e490	chore(consistency): renormalize lifecycle state [auto] Updated by fix-consistency on 2026-06-18: - workplan status: ready → active	2026-06-18 17:52:33 +02:00
tegwick	17e2e39165	Track definition schedule hot reload	2026-06-18 15:21:59 +02:00
tegwick	6518ecefce	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-18: - update .custodian-brief.md for activity-core	2026-06-18 15:20:03 +02:00
tegwick	727868a245	Finish event payload resolver workplan	2026-06-18 15:15:07 +02:00
tegwick	a279d59f73	Add kaizen agent project assets	2026-06-18 15:14:20 +02:00
tegwick	23e2316dff	Harden coding retro resolver selection	2026-06-18 15:13:08 +02:00
tegwick	206bb336d2	Wire llm-connect runtime for daily triage	2026-06-18 15:12:31 +02:00
tegwick	977a3bd97f	Align activity-core scope boundaries	2026-06-18 15:11:48 +02:00
tegwick	78eed5f942	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-18: - update .custodian-brief.md for activity-core	2026-06-18 15:09:20 +02:00
tegwick	717535b62d	Close event-payload live smoke handoff	2026-06-18 14:26:27 +02:00
tegwick	b2816d9776	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-18: - update .custodian-brief.md for activity-core	2026-06-18 14:05:59 +02:00
tegwick	0554014083	Add event-payload context resolver	2026-06-18 14:01:11 +02:00
tegwick	b84e474ac5	chore(consistency): sync task status from DB [auto] Updated by fix-consistency on 2026-06-18: - update .custodian-brief.md for activity-core	2026-06-18 13:16:24 +02:00
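
Commit 88fe359385 above says every State Hub write sends a stable Idempotency-Key header derived from run_id:instruction_id:event_type. The following is a minimal sketch of that derivation, not the code in activity_core/state_hub_write. The function names, the plain colon-joined (unhashed) key, and the Content-Type header are assumptions made for illustration.

```python
from typing import Dict


def idempotency_key(run_id: str, instruction_id: str, event_type: str) -> str:
    """Join the identifiers in the order the commit message names them.

    Assumption: the key is the plain colon-joined string; the real module
    may hash or otherwise normalize it.
    """
    return f"{run_id}:{instruction_id}:{event_type}"


def state_hub_headers(run_id: str, instruction_id: str, event_type: str) -> Dict[str, str]:
    """Build request headers for one State Hub write (hypothetical helper)."""
    return {
        "Content-Type": "application/json",
        "Idempotency-Key": idempotency_key(run_id, instruction_id, event_type),
    }


# The same logical write always yields the same key, so a buffered or
# replayed request can be recognized as a duplicate by the receiver.
first = state_hub_headers("run-1", "instr-7", "schedule_miss")
replay = state_hub_headers("run-1", "instr-7", "schedule_miss")
```

Because the key depends only on the three identifiers and not on wall-clock time or a random value, a replay after a connection failure carries the same header as the original attempt.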

1807 Commits
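
Commit f05c56e202 in the log describes guarding triggering_event_id with str() before JSON encoding while preserving None. The sketch below illustrates that guard with the standard-library json module rather than httpx. The helper names and payload shape are hypothetical and are not taken from IssueCoreRestSink.

```python
import json
import uuid
from typing import Any, Dict, Optional, Union


def stringify_event_id(event_id: Optional[Union[uuid.UUID, str]]) -> Optional[str]:
    """Return the id as a string, keeping None for optional event ids."""
    return None if event_id is None else str(event_id)


def build_payload(event_id: Optional[Union[uuid.UUID, str]]) -> Dict[str, Any]:
    """Assemble a JSON-safe payload fragment (hypothetical shape)."""
    return {"triggering_event_id": stringify_event_id(event_id)}


event_id = uuid.UUID("12345678-1234-5678-1234-567812345678")

# Without the guard, the encoder rejects the UUID object.
try:
    json.dumps({"triggering_event_id": event_id})
    raw_encode_failed = False
except TypeError:
    raw_encode_failed = True

# With the guard, both a UUID and a missing id encode cleanly.
encoded = json.dumps(build_payload(event_id))
encoded_none = json.dumps(build_payload(None))
```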