Scheduler
A cron-like rule engine embedded in the crew dashboard for periodic tasks and agent lifecycle automation. Rules are declarative, one per line, stored in a plain text file per crew.
When To Use It
- Keep a specific agent alive automatically (re-summon on crash or release)
- Send periodic messages (daily narrative prompts, status checks)
- Release idle agents after a timeout
- Run shell commands on a schedule (test suites, maintenance scripts)
- Replace system crontab entries for crew-level operations
Quick Start
Create a rules file at .crew/scheduler.rules in the crew directory:
[keep-forge] every 1m if not alive(forge) => summon forge
[daily-report] at 0 0 * * * => msg beacon "Write daily narrative for today"
The dashboard picks up the file automatically on the next tick (within 10 seconds). No restart required.
Verify with a dry run:
$ metateam crew scheduler test
Rule Format
[<name>] <trigger> [<condition>] => <action>
- Name (optional): enclosed in
[brackets], must be unique. If omitted, an auto-generated name is derived from the trigger and action. - Trigger: when the rule fires (interval or cron).
- Condition (optional): a predicate that must be true for the action to execute.
- Action: what to do when the rule fires.
- Parts are separated by
=>(space-arrow-space). - Empty lines and
#comments are ignored.
Triggers
Interval
every <duration>
Fires repeatedly at fixed intervals. Duration format: <number><unit> where unit is s (seconds), m (minutes), h (hours), or d (days).
On startup or rule creation, the first fire waits one full interval period. This prevents a storm of actions on dashboard restart.
every 30s
every 5m
every 1h
Cron
Standard 5-field crontab expression, evaluated in the server's local timezone:
at <minute> <hour> <day-of-month> <month> <day-of-week>
Fields support *, */N, N,M, and N-M ranges.
at 0 0 * * * # midnight daily
at */30 * * * * # every 30 minutes
at 0 9,17 * * 1-5 # 9am and 5pm on weekdays
Cron Shorthands
at 14:30 # daily at 14:30
at Sun 00:00 # weekly on Sunday at midnight
Conditions
Optional. If omitted, the action fires unconditionally on trigger. One condition per rule.
alive / not alive
if alive(<agent>)
if not alive(<agent>)
True when the specified agent is or is not present in the communicator agent registry.
Multiple agents use AND semantics:
| Expression | Meaning |
|---|---|
alive(forge) |
Forge is registered |
alive(forge,beacon) |
Forge AND Beacon are both registered |
not alive(forge,beacon) |
Forge OR Beacon is not registered |
idle
if idle(<target>) > <duration>
True when agents have had no real terminal activity for longer than the given duration.
| Target | Meaning |
|---|---|
all |
Every registered agent is idle |
any |
At least one registered agent is idle |
<name> |
Specific agent is idle |
<name1>,<name2> |
All listed agents are idle |
When an idle condition matches, two template variables become available in the action's message:
{idle}-- comma-separated names of idle agents{idle_duration}-- formatted duration of the longest idle agent
Actions
msg
msg <target> "<message>"
Send a message via the communicator. Target: all, <name>, or <name1>,<name2>.
Messages support template variables: {timestamp} (ISO timestamp), {crew} (crew name), {idle} and {idle_duration} (when an idle condition is present).
Messages are suppressed when dashboard silence mode is active.
summon
summon <name> ["<message>"]
Summon a persona. Uses the same path as metateam crew summon -- reads KB persona entry for client, model, and effort defaults. Message is optional.
Summon dedup prevents overlapping summons for the same persona. If a summon is already pending (within a 60-second window), subsequent summon actions for that persona are skipped.
release
release <target>
Release an agent. Target: <name>, <name1>,<name2>, or all.
Not suppressed by silence mode.
run
run "<shell command>"
Execute a shell command asynchronously. Output is logged, not sent to agents.
- Working directory: crew root (repository root)
- Default timeout: 5 minutes (SIGTERM, then SIGKILL after 10 seconds)
- Single-instance per rule: skipped if a previous run is still active
- Last 20 lines of stdout/stderr are captured and logged with the exit code
Complete Examples
# === Agent Lifecycle ===
# Keep an agent alive -- re-summon if it disappears
[keep-forge] every 1m if not alive(forge) => summon forge
# Release agents idle for 4 hours
[idle-release] every 5m if idle(any) > 4h => release {idle}
# === Scheduled Tasks ===
# Daily narrative prompt at midnight
[daily-narrative] at 0 0 * * * => msg beacon "Write daily narrative for today"
# Weekly summary on Sunday at 1am
[weekly-summary] at 0 1 * * 0 => summon beacon "Build weekly narrative"
# === Health Checks ===
# Run tests every hour during work hours on weekdays
[api-tests] at 0 9-17 * * 1-5 => run "dotnet test api/Tests -v minimal"
# === Notifications ===
# Nudge crew lead when everyone is idle
[idle-nudge] every 30m if idle(all) > 10m => msg drift "All agents idle for {idle_duration}"
Dry Run
Parse rules and evaluate triggers and conditions against the current crew state without executing any actions:
$ metateam crew scheduler test
Use --json for structured output.
Dashboard Integration
Status Bar
The scheduler shows a compact indicator in the dashboard status bar:
| Indicator | Meaning |
|---|---|
SCH 3 |
3 rules loaded, all healthy |
SCH 3 !1 |
3 rules, 1 in failure backoff (yellow) |
SCH X |
Scheduler disabled after repeated crashes (red) |
SCH -- |
No rules file or empty |
Notifications
Events that need attention surface as dashboard notification toasts:
- Rule enters backoff after consecutive failures
- Scheduler disabled (crash threshold exceeded)
- Rules file parse errors on reload
runaction timeout or non-zero exit
Normal successful fires are silent.
/scheduler Overlay
The /scheduler slash command opens a full state table:
Name Trigger Condition Last Status
keep-forge every 1m not alive(forge) 30s ago ok
daily-narrative at 0 0 * * * -- 8h ago ok
idle-nudge every 30m idle(all) > 10m 2m ago skip
api-tests at 0 9-17 -- 1h ago !3 bk
Status values: ok (last success), skip (condition was false), !N bk (in backoff with N failures), run (currently executing), abandoned (from unclean restart).
Runtime Behavior
- The scheduler runs inside the dashboard communicator server, ticking every 10 seconds.
- Actions are dispatched asynchronously and never block the tick loop.
- All conditions in a single tick evaluate against the same immutable snapshot of agent state.
- On consecutive failures for the same rule, backoff increases exponentially: 1m, 2m, 4m, up to 1 hour max. Backoff resets when the action succeeds or the relevant agent's state changes.
- Execution state is persisted to
.crew/scheduler.state.jsonand survives dashboard restarts. - If the dashboard was down, missed intervals are not replayed. The scheduler resumes from the next future trigger.
Hot Reload
The scheduler checks .crew/scheduler.rules for changes every tick. On change:
- The file is re-parsed entirely.
- Rules matched by name carry forward their execution state.
- New or changed rules start fresh (next interval fires after one full period).
- Parse errors are logged with line numbers; valid rules continue to run.
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Rule never fires | Condition always false | Run metateam crew scheduler test to see current evaluation |
| Rule fires repeatedly | Condition flaps (agent keeps dying) | Check agent health; backoff will engage automatically |
summon action skipped |
Another summon for that agent is pending | Wait for the 60-second dedup window to expire |
Status shows !N bk |
N consecutive failures triggered backoff | Fix the underlying issue; backoff resets on success |
Status shows abandoned |
Dashboard crashed mid-execution | State resets on next successful fire |
| Parse error on reload | Syntax error in rules file | Check logs for the line number and raw text; valid rules still run |
Guarantees vs Best-Effort
| Aspect | Level |
|---|---|
| Rule evaluation order | Guaranteed: file order, top to bottom |
| At-most-once per interval/minute | Guaranteed: dedup prevents double-fire |
| Exact fire time | Best-effort: evaluated every 10 seconds, so up to 10s late |
| Missed interval catch-up | No: missed windows are not replayed after downtime |
| Action completion | Best-effort: actions can fail; failures trigger backoff |
| State persistence across restart | Guaranteed: .crew/scheduler.state.json survives restarts |
See Also
- Crew (Command Reference) -- summon, release, message commands
- The Terminal Dashboard -- dashboard overview
- Background Monitor -- session monitoring