diff --git a/AGENTS.md b/AGENTS.md index 54f897f..58acfa4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -28,23 +28,114 @@ Example: git commit -m "Add logout button" --prompt "Add a logout button to the header. When clicked it should clear the session and redirect to /login" ``` -View prompts with: +When `--prompt` is used, zagi automatically stores metadata in git notes: +- `refs/notes/agent` - detected AI agent (claude, opencode, cursor, windsurf, vscode, terminal) +- `refs/notes/prompt` - the user prompt text +- `refs/notes/session` - full session transcript (for Claude Code, OpenCode) + +View metadata with: ```bash -git log --prompts +git log --prompts # show prompts (truncated to 200 chars) +git log --agent # show which AI agent made each commit +git log --session # show session transcript (paginated, first 20k bytes) +git log --session --session-offset=20000 # continue from byte 20000 ``` -### Environment Setup +### Agent Mode -Set `ZAGI_AGENT` to enable prompt enforcement: +Agent mode is automatically enabled when running inside AI tools: +- Claude Code (sets `CLAUDECODE=1`) +- OpenCode (sets `OPENCODE=1`) +- VS Code, Cursor, Windsurf (detected from terminal environment) +You can also enable it manually: ```bash -export ZAGI_AGENT=claude-code +export ZAGI_AGENT=my-agent ``` -When this is set: +When agent mode is active: 1. `git commit` will fail without `--prompt`, ensuring all AI-generated commits have their prompts recorded 2. Destructive commands are blocked to prevent data loss +## Agent Subcommands + +zagi provides two agent subcommands for autonomous task execution using the RALPH pattern (Recursive Agent Loop Pattern for Humans). + +### zagi agent plan + +Starts an interactive planning session where an AI agent collaborates with you to design and create tasks. + +```bash +# Start an interactive session (agent will ask what you want to build) +zagi agent plan + +# Start with initial context +zagi agent plan "Add user authentication with JWT" + +# Preview the prompt without executing +zagi agent plan --dry-run +``` + +The planning agent follows an interactive protocol: +1. **Explore codebase**: Reads AGENTS.md and relevant code to understand architecture +2. **Ask clarifying questions**: Asks about scope, constraints, and preferences before drafting any plan +3. **Propose plan**: Presents a numbered implementation plan for your review +4. **Create tasks**: Only creates tasks after you explicitly approve the plan + +This collaborative approach ensures the agent gathers all necessary context before committing to a task breakdown. The agent will ask 2-4 focused questions at a time about: +- **Scope**: What's included/excluded, edge cases, MVP vs nice-to-haves +- **Constraints**: Performance requirements, dependencies, compatibility +- **Preferences**: Approach/patterns, integration with existing code, testing expectations +- **Acceptance criteria**: How we know it's done, what success looks like + +### zagi agent run + +Executes the RALPH loop to automatically complete pending tasks. + +```bash +# Run until all tasks complete (or fail 3x) +zagi agent run + +# Run only one task then exit +zagi agent run --once + +# Preview what would run without executing +zagi agent run --dry-run + +# Set delay between tasks (default: 2 seconds) +zagi agent run --delay 5 + +# Safety limit - stop after N tasks +zagi agent run --max-tasks 10 +``` + +The run loop will: +1. Pick the next pending task +2. Execute it with the configured agent +3. Mark it done on success (agent calls `zagi tasks done `) +4. Skip tasks that fail 3 consecutive times +5. Continue until all tasks complete + +### Executor Configuration + +Control which AI agent executes tasks using environment variables: + +```bash +# Use Claude Code (default) +ZAGI_AGENT=claude zagi agent run + +# Use opencode +ZAGI_AGENT=opencode zagi agent run + +# Use custom command with auto mode flags +ZAGI_AGENT=claude ZAGI_AGENT_CMD="myclaude --flag" zagi agent run + +# Use completely custom tool (no auto flags) +ZAGI_AGENT_CMD="aider --yes" zagi agent run +``` + +See [docs/setup.md](docs/setup.md) for full configuration details. + ### Blocked Commands (in agent mode) These commands cause unrecoverable data loss and are blocked when `ZAGI_AGENT` is set: diff --git a/CONTEXT.md b/CONTEXT.md new file mode 100644 index 0000000..9f687a3 --- /dev/null +++ b/CONTEXT.md @@ -0,0 +1,89 @@ +# Context + +Ephemeral working context for this PR. Delete before merging to main. + +## Mission + +Build git-native task management and autonomous agent execution for zagi. + +## What is zagi? + +A Zig + libgit2 wrapper that makes git output concise and agent-friendly: +- Smaller output (agents pay per token) +- Guardrails block destructive commands when `ZAGI_AGENT` is set +- Prompt provenance via `git commit --prompt "why this change"` +- Task management via `zagi tasks` (stored in `refs/tasks/`) + +## Current Focus + +Implementing RALPH-driven development for autonomous agent execution. + +**RALPH**: https://lukeparker.dev/stop-chatting-with-ai-start-loops-ralph-driven-development + +The loop: +1. `zagi agent plan` - Interactive planning session with user +2. `zagi agent run` - Autonomous execution of tasks +3. Tasks stored as git objects (`refs/tasks/`) +4. Agent picks pending task, completes it, marks done +5. Loop continues until all tasks complete + +## Work Streams + +### 1. Agent Execution (tasks 001-012) - DONE +- Subcommand refactor (plan/run) +- Executor config (ZAGI_AGENT, ZAGI_AGENT_CMD) +- Validation and bug fixes + +### 2. Cleanup & Polish (tasks 013-023) +- Memory leaks in tasks.zig +- Hardcoded paths +- Documentation updates +- Style conformance + +### 3. Testing (tasks 024-030) +- Agent plan/run tests +- Error condition coverage +- Full test suite pass + +### 4. git edit Feature (tasks 037-062) +- jj-style mid-stack editing +- Lets agents fix commits from earlier in history +- Auto-rebases descendants after edit + +### 5. Interactive Planning (tasks 032-036) +- Make `zagi agent plan` interactive (stdin/stdout passthrough) +- Agent explores codebase, asks questions, builds plan with user +- Convert approved plan to tasks + +### 6. Observability (tasks 064-067) +- Streaming JSON output for debugging +- CONTEXT.md generation during planning + +## Key Files + +- `src/cmds/tasks.zig` - Task CRUD operations +- `src/cmds/agent.zig` - Agent plan/run subcommands +- `start.sh` - Independent RALPH loop runner +- `friction.md` - Issues encountered during development + +## Constraints + +- No external dependencies (everything in git) +- Concise output (agents pay per token) +- No emojis in code or output +- Agents cannot edit/delete tasks (guardrail) +- Always use `--prompt` when committing +- Never `git push` (only commit) + +## Build & Test + +```bash +zig build # Build +zig build test # Zig unit tests +cd test && bun run test # Integration tests +``` + +## Environment + +- `ZAGI_AGENT=claude|opencode` - Executor, enables guardrails +- `ZAGI_AGENT_CMD` - Custom command override diff --git a/README.md b/README.md index 45131cc..8e90f7e 100644 --- a/README.md +++ b/README.md @@ -69,29 +69,45 @@ git fork --delete-all ### Agent mode -Set `ZAGI_AGENT` to enable agent-specific features. +Agent mode is automatically enabled when running inside AI tools (Claude Code, OpenCode, Cursor, Windsurf, VS Code). You can also enable it manually: ```bash -export ZAGI_AGENT=claude-code +export ZAGI_AGENT=my-agent ``` -The value can be any string describing your agent (e.g. `claude-code`, `cursor`, `opencode`) - this will be used in future features for agent-specific behavior. - This enables: - **Prompt tracking**: `git commit` requires `--prompt` to record the user request that created the commit +- **AI attribution**: Automatically detects and stores which AI agent made the commit - **Guardrails**: Blocks destructive commands (`reset --hard`, `checkout .`, `clean -f`, `push --force`) to prevent data loss ```bash -git commit -m "Add feature" --prompt "Add a logout button to the header.." -git log --prompts # view prompts +git commit -m "Add feature" --prompt "Add a logout button to the header" +git log --prompts # view prompts +git log --agent # view which AI agent made commits +git log --session # view full session transcript (with pagination) ``` -To prevent child processes from overriding `ZAGI_AGENT`, make it readonly: +Metadata is stored in git notes (`refs/notes/agent`, `refs/notes/prompt`, `refs/notes/session`) which are local by default and don't affect commit history. + +### Environment variables + +| Variable | Description | Default | Valid values | +|----------|-------------|---------|--------------| +| `ZAGI_AGENT` | Manually enable agent mode. Auto-detected from `CLAUDECODE`, `OPENCODE`, or IDE environment. | (auto) | Any string enables agent mode. For executors: `claude`, `opencode` | +| `ZAGI_AGENT_CMD` | Custom executor command override. When set, the prompt is appended as the final argument. | (unset) | Any shell command (e.g., `aider --yes`) | +| `ZAGI_STRIP_COAUTHORS` | Strips `Co-Authored-By:` lines from commit messages. | (unset) | `1` to enable | + +**Agent detection**: Agent mode is automatically enabled when `CLAUDECODE=1` or `OPENCODE=1` is set (by Claude Code or OpenCode), or when running in VS Code/Cursor/Windsurf terminals. ```bash -# bash/zsh -export ZAGI_AGENT=claude-code -readonly ZAGI_AGENT +# Use Claude Code (default) +ZAGI_AGENT=claude zagi agent run + +# Use opencode +ZAGI_AGENT=opencode zagi agent run + +# Use a custom command +ZAGI_AGENT_CMD="aider --yes" zagi agent run ``` ### Strip co-authors diff --git a/build.zig b/build.zig index 8980e10..3202e5a 100644 --- a/build.zig +++ b/build.zig @@ -93,6 +93,14 @@ pub fn build(b: *std.Build) void { }); diff_tests.root_module.linkLibrary(libgit2_dep.artifact("git2")); + const agent_tests = b.addTest(.{ + .root_module = b.createModule(.{ + .root_source_file = b.path("src/cmds/agent.zig"), + .target = target, + .optimize = optimize, + }), + }); + const run_exe_unit_tests = b.addRunArtifact(exe_unit_tests); const run_log_tests = b.addRunArtifact(log_tests); const run_git_tests = b.addRunArtifact(git_tests); @@ -100,6 +108,7 @@ pub fn build(b: *std.Build) void { const run_add_tests = b.addRunArtifact(add_tests); const run_commit_tests = b.addRunArtifact(commit_tests); const run_diff_tests = b.addRunArtifact(diff_tests); + const run_agent_tests = b.addRunArtifact(agent_tests); const test_step = b.step("test", "Run unit tests"); test_step.dependOn(&run_exe_unit_tests.step); @@ -109,4 +118,5 @@ pub fn build(b: *std.Build) void { test_step.dependOn(&run_add_tests.step); test_step.dependOn(&run_commit_tests.step); test_step.dependOn(&run_diff_tests.step); + test_step.dependOn(&run_agent_tests.step); } diff --git a/docs/features.md b/docs/features.md index efec194..f45233e 100644 --- a/docs/features.md +++ b/docs/features.md @@ -39,6 +39,11 @@ Options: - `--grep=` - filter by commit message - `--since=` - commits after date (e.g. "2025-01-01", "1 week ago") - `--until=` - commits before date +- `--prompts` - show AI prompts attached to commits +- `--agent` - show which AI agent made the commit +- `--session` - show session transcript (first 20k bytes) +- `--session-offset=N` - start session display at byte N +- `--session-limit=N` - limit session display to N bytes - `-- ...` - filter to commits affecting paths ### git diff @@ -110,34 +115,48 @@ Forks are git worktrees. The `.forks/` directory is auto-added to `.gitignore`. - `--pick` performs a proper git merge, preserving both base and fork history - `--promote` moves HEAD to the fork's commit, discarding any base-only commits (stash uncommitted changes first) -### --prompt +### --prompt (AI Attribution) -Store the user prompt that created a commit: +Store the user prompt and AI metadata with a commit: ```bash git commit -m "Add feature" --prompt "Add a logout button to the header" -git log --prompts # view prompts in log output ``` -### ZAGI_AGENT +When `--prompt` is used, zagi stores metadata in git notes: +- `refs/notes/agent` - detected AI agent (claude, opencode, cursor, etc.) +- `refs/notes/prompt` - the user prompt text +- `refs/notes/session` - full session transcript (Claude Code, OpenCode) -Set `ZAGI_AGENT` to enable agent-specific features. The value can be any string describing your agent (e.g. `claude-code`, `cursor`, `aider`) - this will be used in future features for agent-specific behavior. +View with log flags: +```bash +git log --prompts # show prompts (truncated to 200 chars) +git log --agent # show agent name +git log --session # show session transcript (paginated) +git log --session --session-limit=1000 # first 1000 bytes +git log --session --session-offset=1000 # start at byte 1000 +``` + +Git notes are local by default and don't modify commit history. +### Agent Mode + +Agent mode is automatically enabled when running inside AI tools: +- Claude Code (`CLAUDECODE=1`) +- OpenCode (`OPENCODE=1`) +- VS Code, Cursor, Windsurf (detected from `VSCODE_GIT_ASKPASS_NODE`) + +You can also enable it manually: ```bash -export ZAGI_AGENT=claude-code -git commit -m "x" # error: --prompt required +export ZAGI_AGENT=my-agent ``` -When `ZAGI_AGENT` is set: +When agent mode is active: - `git commit` requires `--prompt` to record the user request - Destructive commands are blocked (guardrails) -To prevent child processes from overriding `ZAGI_AGENT`, make it readonly: - ```bash -# bash/zsh -export ZAGI_AGENT=claude-code -readonly ZAGI_AGENT +git commit -m "x" # error: --prompt required in agent mode ``` ### ZAGI_STRIP_COAUTHORS diff --git a/docs/setup.md b/docs/setup.md new file mode 100644 index 0000000..2603ca6 --- /dev/null +++ b/docs/setup.md @@ -0,0 +1,84 @@ +# Setup + +## Requirements + +- Zig 0.15+ +- Bun (for integration tests) + +## Building + +```bash +zig build +``` + +The binary will be at `./zig-out/bin/zagi`. + +## Executor Configuration + +zagi agent commands (`agent plan`, `agent run`) use AI agents to execute tasks. Configure which agent to use with environment variables. + +### ZAGI_AGENT + +Select a built-in executor for `zagi agent` commands: + +```bash +# Use Claude Code (default) +ZAGI_AGENT=claude zagi agent run + +# Use opencode +ZAGI_AGENT=opencode zagi agent run +``` + +Valid values for executors: `claude`, `opencode` + +Note: Agent mode (guardrails, `--prompt` requirement) is auto-detected from the environment. Setting `ZAGI_AGENT` also enables agent mode, but this is primarily for selecting the executor. + +Built-in executors automatically handle mode flags: +- `claude`: adds `-p` for headless mode (`agent run`) +- `opencode`: adds `run` for headless mode (`agent run`) + +### ZAGI_AGENT_CMD + +Override the command used to invoke the agent: + +```bash +# Use a custom Claude binary with extra flags +ZAGI_AGENT=claude +ZAGI_AGENT_CMD="~/my-claude --dangerously-skip-permissions" +zagi agent run +# → Executes: ~/my-claude --dangerously-skip-permissions -p "" +``` + +When both `ZAGI_AGENT` and `ZAGI_AGENT_CMD` are set: +- `ZAGI_AGENT_CMD` provides the base command +- `ZAGI_AGENT` determines what mode flags to add (`-p` for claude, `run` for opencode) + +### Custom Tools + +For tools that aren't claude or opencode, just set `ZAGI_AGENT_CMD`: + +```bash +# Use aider (no auto flags added) +ZAGI_AGENT_CMD="aider --yes" zagi agent run +# → Executes: aider --yes "" +``` + +When only `ZAGI_AGENT_CMD` is set (no `ZAGI_AGENT`), the command is used as-is with no automatic flags. + +### Examples + +| ZAGI_AGENT | ZAGI_AGENT_CMD | agent run executes | +|------------|----------------|-------------------| +| `claude` | (not set) | `claude -p ""` | +| `opencode` | (not set) | `opencode run ""` | +| `claude` | `myclaude --flag` | `myclaude --flag -p ""` | +| `opencode` | `myopencode --flag` | `myopencode --flag run ""` | +| (not set) | `aider --yes` | `aider --yes ""` | + +### Agent Mode Safety + +Agent mode is automatically enabled when running inside AI tools (Claude Code, OpenCode, Cursor, Windsurf, VS Code) or when `ZAGI_AGENT` is set. When active, destructive git commands are blocked to prevent data loss. See [AGENTS.md](../AGENTS.md#blocked-commands-in-agent-mode) for the full list. + +## Log Files + +Task execution logs are written to `/tmp/zagi//.log`. Output is streamed in real-time to both the console and log file. diff --git a/friction.md b/friction.md new file mode 100644 index 0000000..85b2464 --- /dev/null +++ b/friction.md @@ -0,0 +1,79 @@ +# Friction Log + +Issues encountered while developing zagi agent functionality. + +## Critical Issues + +### 1. Segfault in consecutive_failures hashmap +**Status:** Fixed in commit b12ab5c +**Issue:** Use-after-free bug in `agent.zig` when storing task IDs in hashmap +**Root cause:** `task.id` memory was freed in defer block but hashmap kept reference +**Fix:** Duplicate the key before storing in hashmap +**Location:** `src/cmds/agent.zig:429-438` + +### 2. ZAGI_AGENT=1 treated as executor name +**Status:** Fixed +**Issue:** When ZAGI_AGENT is set to "1" (boolean-style), it's used literally as executor name +**Fix:** Added `getValidatedExecutor()` that validates against known values ("claude", "opencode") +**Location:** `src/cmds/agent.zig:74-89` + +### 3. Agent run internally uses relative path `./zig-out/bin/zagi` +**Status:** Fixed (task-012) +**Issue:** `getPendingTasks()` shells out to `./zig-out/bin/zagi tasks list --json` with relative path +**Impact:** Agent can only run from repo root directory +**Impact:** Integration tests fail when running from fixture directories +**Fix:** Used `std.fs.selfExePath()` to get absolute path to current executable +**Location:** `src/cmds/agent.zig:503-506` + +## Memory Management Issues + +### 4. Memory leaks in tasks.zig +**Status:** Open (task-013) +**Issue:** Memory leaks in `runAdd` when allocating task content +**Location:** `src/cmds/tasks.zig` - allocations via `toOwnedSlice` not properly freed +**Impact:** Memory leaks during repeated task creation + +### 5. Hashmap key ownership +**Status:** Fixed +**Issue:** Keys stored in hashmap must outlive their usage +**Learning:** In Zig, always `allocator.dupe()` strings before storing in containers that outlive the source + +## API Design Issues + +### 6. No validation of ZAGI_AGENT values +**Status:** Fixed (see #2) +**Issue:** Invalid executor values like "1" silently used as custom command +**Fix:** `getValidatedExecutor()` validates and returns clear error for invalid values + +### 7. Agent prompt path hardcoded +**Status:** Open (task-014) +**Issue:** Task completion instruction uses `./zig-out/bin/zagi tasks done` +**Impact:** Won't work if binary is installed elsewhere +**Location:** `src/cmds/agent.zig:108,118,518` +**Also:** Planning prompt at line 108, 118 + +### 8. JSON escaping was broken +**Status:** Fixed in commit be3156a +**Issue:** Task content with special characters (quotes, newlines) broke JSON output +**Fix:** Added `escapeJsonString()` function in tasks.zig +**Location:** `src/cmds/tasks.zig:7-32` + +## Test Infrastructure Issues + +### 9. Tests referenced removed features +**Status:** Fixed +**Issue:** Tests still referenced `--after` flag and `ready` command which were removed +**Fix:** Updated tests to match current implementation + +### 10. Slow bun test performance +**Status:** Open (task-064) +**Issue:** Test suite takes too long - process spawning overhead +**Suggestion:** Migrate more tests to native Zig tests (task-065) + +## Suggestions for Future Work + +1. **Use absolute paths:** Resolve binary path at startup using `std.fs.selfExePath()` +2. **Validate env vars:** Check ZAGI_AGENT is "claude", "opencode", or empty +3. **Better error messages:** Indicate what went wrong with task loading +4. **Memory management audit:** Review all `toOwnedSlice` calls for proper cleanup +5. **Integration test isolation:** Run tests in temp directories with absolute paths diff --git a/plan.md b/plan.md index 1e91e0e..037e155 100644 --- a/plan.md +++ b/plan.md @@ -11,7 +11,7 @@ - [x] Implement `git tasks ready` - list pending tasks (no blockers) - [x] Add --after flag to `git tasks add` for dependencies - [x] Implement `git tasks pr` - export markdown for PR description -- [ ] Add --json flag to all commands -- [ ] Block edit/delete when ZAGI_AGENT is set +- [x] Add --json flag to all commands +- [x] Block edit/delete when ZAGI_AGENT is set - [x] Add routing in main.zig for tasks command -- [ ] Write integration tests in test/src/tasks.test.ts +- [x] Write integration tests in test/src/tasks.test.ts diff --git a/src/cmds/agent.zig b/src/cmds/agent.zig new file mode 100644 index 0000000..be75b31 --- /dev/null +++ b/src/cmds/agent.zig @@ -0,0 +1,940 @@ +const std = @import("std"); +const git = @import("git.zig"); +const c = git.c; + +pub const help = + \\usage: git agent [options] + \\ + \\AI agent for automated task execution. + \\ + \\Commands: + \\ run Execute RALPH loop to complete tasks + \\ plan Start planning session to create tasks + \\ + \\Run 'git agent --help' for command-specific options. + \\ + \\Environment: + \\ ZAGI_AGENT Executor: claude (default) or opencode + \\ ZAGI_AGENT_CMD Custom command override (e.g., "aider --yes") + \\ +; + +const run_help = + \\usage: git agent run [options] + \\ + \\Execute RALPH loop to automatically complete tasks. + \\ + \\Options: + \\ --once Run only one task, then exit + \\ --dry-run Show what would run without executing + \\ --delay Delay between tasks (default: 2) + \\ --max-tasks Stop after n tasks (safety limit) + \\ -h, --help Show this help message + \\ + \\Examples: + \\ git agent run + \\ git agent run --once + \\ git agent run --dry-run + \\ ZAGI_AGENT=opencode git agent run + \\ +; + +const plan_help = + \\usage: git agent plan [options] [description] + \\ + \\Start an interactive planning session with an AI agent. + \\ + \\The agent will: + \\1. Ask questions to understand your requirements + \\2. Explore the codebase to understand the architecture + \\3. Present a detailed plan for your approval + \\4. Create tasks only after you confirm + \\ + \\Options: + \\ --dry-run Show prompt without executing + \\ -h, --help Show this help message + \\ + \\Examples: + \\ git agent plan # Start interactive session + \\ git agent plan "Add user authentication" # Start with initial context + \\ +; + +pub const Error = git.Error || error{ + InvalidCommand, + AllocationError, + OutOfMemory, + SpawnFailed, + TaskLoadFailed, + InvalidExecutor, +}; + +/// Valid executor values for ZAGI_AGENT +const valid_executors = [_][]const u8{ "claude", "opencode" }; + +/// Builds command arguments for the specified executor. +/// Returns an ArrayList that the caller must deinit. +/// +/// The `interactive` parameter controls whether the executor runs in interactive +/// mode (user can converse with agent) or headless mode (non-interactive, for +/// autonomous task execution): +/// - interactive=true: Claude runs without -p, opencode uses plain mode +/// - interactive=false: Claude runs with -p (print mode), opencode uses run mode +fn buildExecutorArgs( + allocator: std.mem.Allocator, + executor: []const u8, + agent_cmd: ?[]const u8, + prompt: []const u8, + interactive: bool, +) !std.ArrayList([]const u8) { + var args = std.ArrayList([]const u8){}; + errdefer args.deinit(allocator); + + if (agent_cmd) |cmd| { + // Custom command as base, but use executor to determine mode flags + var parts = std.mem.splitScalar(u8, cmd, ' '); + while (parts.next()) |part| { + if (part.len > 0) try args.append(allocator, part); + } + // Add mode flags based on executor type (if known) + if (!interactive) { + if (std.mem.eql(u8, executor, "claude")) { + try args.append(allocator, "-p"); + } else if (std.mem.eql(u8, executor, "opencode")) { + try args.append(allocator, "run"); + } + // Unknown executor: no auto flags added + } + try args.append(allocator, prompt); + } else if (std.mem.eql(u8, executor, "claude")) { + try args.append(allocator, "claude"); + if (!interactive) { + try args.append(allocator, "-p"); + } + try args.append(allocator, prompt); + } else if (std.mem.eql(u8, executor, "opencode")) { + try args.append(allocator, "opencode"); + if (!interactive) { + try args.append(allocator, "run"); + } + try args.append(allocator, prompt); + } else { + var parts = std.mem.splitScalar(u8, executor, ' '); + while (parts.next()) |part| { + if (part.len > 0) try args.append(allocator, part); + } + try args.append(allocator, prompt); + } + + return args; +} + +/// Formats the executor command for dry-run display. +/// The `interactive` parameter mirrors the buildExecutorArgs behavior. +fn formatExecutorCommand(executor: []const u8, agent_cmd: ?[]const u8, interactive: bool) []const u8 { + // Custom command - shown as-is, user is responsible for including flags + if (agent_cmd) |cmd| return cmd; + if (std.mem.eql(u8, executor, "claude")) { + return if (interactive) "claude" else "claude -p"; + } + if (std.mem.eql(u8, executor, "opencode")) { + return if (interactive) "opencode" else "opencode run"; + } + return executor; +} + +/// Updates the failure count for a task in the consecutive_failures tracking map. +/// +/// Called after each task execution attempt: +/// - On success: pass new_count = 0 to reset the counter (task proved it can work) +/// - On failure: pass the incremented count (previous + 1) +/// +/// The map uses duplicated keys because task_id strings are freed after each +/// loop iteration. If the key doesn't exist yet, we allocate a copy. +fn updateFailureCount(allocator: std.mem.Allocator, map: *std.StringHashMap(u32), task_id: []const u8, new_count: u32) void { + const gop = map.getOrPut(task_id) catch return; + if (!gop.found_existing) { + gop.key_ptr.* = allocator.dupe(u8, task_id) catch task_id; + } + gop.value_ptr.* = new_count; +} + +/// Creates a log path in /tmp/zagi//.log +/// Creates the directory structure but not the file (tee will create it). +fn createLogPath(path_buf: *[512]u8) ?[]const u8 { + // Get current working directory name for the log subdirectory + var cwd_buf: [std.fs.max_path_bytes]u8 = undefined; + const cwd = std.fs.cwd().realpath(".", &cwd_buf) catch return null; + const repo_name = std.fs.path.basename(cwd); + + // Generate a random ID for this run + var random_bytes: [8]u8 = undefined; + std.crypto.random.bytes(&random_bytes); + + // Format as hex string + var hex_buf: [16]u8 = undefined; + const hex_chars = "0123456789abcdef"; + for (random_bytes, 0..) |byte, i| { + hex_buf[i * 2] = hex_chars[byte >> 4]; + hex_buf[i * 2 + 1] = hex_chars[byte & 0xf]; + } + + // Build directory path and create it + var dir_buf: [256]u8 = undefined; + const dir_path = std.fmt.bufPrint(&dir_buf, "/tmp/zagi/{s}", .{repo_name}) catch return null; + std.fs.cwd().makePath(dir_path) catch {}; + + // Build full file path + const path = std.fmt.bufPrint(path_buf, "/tmp/zagi/{s}/{s}.log", .{ repo_name, hex_buf }) catch return null; + + return path; +} + +/// Validates ZAGI_AGENT env var. Returns validated executor or error. +/// If not set, returns "claude" as default. +/// If set to invalid value (like "1"), returns error. +fn getValidatedExecutor(stdout: anytype) Error![]const u8 { + const env_value = std.posix.getenv("ZAGI_AGENT") orelse return "claude"; + + // Check if it's a valid executor + for (valid_executors) |valid| { + if (std.mem.eql(u8, env_value, valid)) { + return env_value; + } + } + + // Invalid value - show error with valid options + stdout.print("error: invalid ZAGI_AGENT value '{s}'\n", .{env_value}) catch {}; + stdout.print(" valid values: claude, opencode (or unset for default)\n", .{}) catch {}; + stdout.print(" note: use ZAGI_AGENT_CMD for custom executors\n", .{}) catch {}; + return Error.InvalidExecutor; +} + +pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) Error!void { + const stdout = std.fs.File.stdout().deprecatedWriter(); + + // Need at least "zagi agent " + if (args.len < 3) { + stdout.print("{s}", .{help}) catch {}; + return; + } + + const subcommand = std.mem.sliceTo(args[2], 0); + + if (std.mem.eql(u8, subcommand, "run")) { + return runRun(allocator, args); + } else if (std.mem.eql(u8, subcommand, "plan")) { + return runPlan(allocator, args); + } else if (std.mem.eql(u8, subcommand, "-h") or std.mem.eql(u8, subcommand, "--help")) { + stdout.print("{s}", .{help}) catch {}; + return; + } else { + stdout.print("error: unknown command '{s}'\n\n{s}", .{ subcommand, help }) catch {}; + return Error.InvalidCommand; + } +} + +/// Planning prompt template for the `zagi agent plan` subcommand. +/// +/// This prompt instructs an AI agent to conduct an INTERACTIVE planning session +/// where it explores the codebase first, then asks clarifying questions about +/// scope, constraints, and preferences before drafting any plan. The session is +/// collaborative - the agent understands the architecture, asks targeted questions +/// to gather requirements, and only creates tasks after user approval. +/// +/// Template placeholders: +/// - {0s}: Optional initial context from the user (may be empty) +/// - {1s}: Absolute path to the zagi binary (for task creation commands) +/// +/// The planning agent follows a strict protocol: +/// 1. EXPLORE: Read the codebase to understand architecture FIRST +/// 2. ASK: Ask clarifying questions about scope, constraints, preferences +/// 3. PROPOSE: Present a numbered plan referencing specific files/patterns +/// 4. CONFIRM: Only create tasks after explicit approval +const planning_prompt_template = + \\You are an interactive planning agent. Your job is to collaboratively design + \\an implementation plan with the user through conversation. + \\ + \\INITIAL CONTEXT: {0s} + \\ + \\=== INTERACTIVE PLANNING PROTOCOL === + \\ + \\PHASE 1: EXPLORE CODEBASE (do this FIRST, before asking questions) + \\Before asking any questions, silently explore the codebase to understand: + \\- Read AGENTS.md for project conventions, build commands, and patterns + \\- Examine the directory structure to understand the project layout + \\- Identify key files and their purposes + \\- Understand the current architecture and patterns in use + \\- Find existing code related to the initial context (if provided) + \\- Note any relevant tests, configs, or documentation + \\ + \\This exploration helps you ask informed questions and propose realistic plans. + \\ + \\PHASE 2: ASK CLARIFYING QUESTIONS (critical - do not skip) + \\DO NOT draft a plan yet. First, engage the user with clarifying questions. + \\Ask about these areas before proposing any implementation: + \\ + \\SCOPE questions: + \\- What specific functionality should be included/excluded? + \\- Are there edge cases or error scenarios to consider? + \\- What's the minimum viable version vs nice-to-haves? + \\ + \\CONSTRAINTS questions: + \\- Are there performance requirements? + \\- Any dependencies or compatibility concerns? + \\- Time/effort budget considerations? + \\ + \\PREFERENCES questions: + \\- Preferred approach or patterns? (e.g., "should this use X or Y?") + \\- How should this integrate with existing code? + \\- Testing requirements or coverage expectations? + \\ + \\ACCEPTANCE CRITERIA questions: + \\- How will we know when this is done? + \\- What does success look like? + \\ + \\Guidelines: + \\- Ask 2-4 focused questions at a time, not a wall of questions + \\- Reference what you found in the codebase to make questions specific + \\- Keep asking until you have clarity on scope, constraints, and preferences + \\- If context was provided, acknowledge it and ask clarifying follow-ups + \\- If no context, start by asking "What would you like to build?" + \\ + \\PHASE 3: PROPOSE PLAN (only after clarifying questions answered) + \\Present a detailed, numbered implementation plan: + \\- Reference specific files and patterns discovered in Phase 1 + \\- Break work into small, self-contained tasks + \\- Each task should be completable in one session + \\- Include acceptance criteria for each task + \\- Order tasks by dependencies (foundations first) + \\- Format as a numbered list the user can review + \\ + \\Example plan format: + \\ "Based on our discussion and my exploration, here's my proposed plan: + \\ + \\ 1. Add user model - create src/models/user.zig following the struct + \\ patterns I found in src/cmds/git.zig + \\ 2. Add auth endpoint - POST /api/login, integrating with your existing + \\ error handling in src/cmds/git.zig + \\ 3. Add middleware - JWT validation following your existing patterns + \\ 4. Add tests - unit tests matching your test/ structure + \\ + \\ Does this look good? Should I adjust anything before creating tasks?" + \\ + \\PHASE 4: CREATE TASKS (only after approval) + \\Wait for explicit user confirmation before creating any tasks. + \\When approved, you have two options: + \\ + \\Option A - Write plan to file, then import: + \\ 1. Write plan to plan.md with numbered list (1. Task one, 2. Task two, ...) + \\ 2. Run: {1s} tasks import plan.md + \\ This creates all tasks at once from the markdown file. + \\ + \\Option B - Add tasks individually: + \\ {1s} tasks add "" + \\ + \\After creating all tasks, show the final list: + \\ {1s} tasks list + \\ + \\=== RULES === + \\- ALWAYS explore the codebase BEFORE asking questions + \\- ALWAYS ask clarifying questions BEFORE drafting a plan + \\- Ask informed questions that reference what you found + \\- NEVER create tasks without explicit user approval + \\- NEVER git push (only commit) + \\- Keep the conversation focused and productive + \\- If the user wants to change the plan, update it and confirm again + \\ +; + +fn runPlan(allocator: std.mem.Allocator, args: [][:0]u8) Error!void { + const stdout = std.fs.File.stdout().deprecatedWriter(); + const stderr = std.fs.File.stderr().deprecatedWriter(); + + var dry_run = false; + var initial_context: ?[]const u8 = null; + + var i: usize = 3; // Start after "zagi agent plan" + while (i < args.len) { + const arg = std.mem.sliceTo(args[i], 0); + + if (std.mem.eql(u8, arg, "--dry-run")) { + dry_run = true; + } else if (std.mem.eql(u8, arg, "-h") or std.mem.eql(u8, arg, "--help")) { + stdout.print("{s}", .{plan_help}) catch {}; + return; + } else if (arg[0] == '-') { + stdout.print("error: unknown option '{s}'\n", .{arg}) catch {}; + return Error.InvalidCommand; + } else { + initial_context = arg; + } + i += 1; + } + + // Check ZAGI_AGENT_CMD for custom command override + const agent_cmd = std.posix.getenv("ZAGI_AGENT_CMD"); + const executor = if (agent_cmd != null) + std.posix.getenv("ZAGI_AGENT") orelse "" // No auto flags when only ZAGI_AGENT_CMD is set + else + try getValidatedExecutor(stdout); + + // Get absolute path to current executable + var exe_path_buf: [std.fs.max_path_bytes]u8 = undefined; + const exe_path = std.fs.selfExePath(&exe_path_buf) catch { + stderr.print("error: failed to resolve executable path\n", .{}) catch {}; + return Error.SpawnFailed; + }; + + // Use provided context or indicate none was given + const context_str = initial_context orelse "(none - start by asking what the user wants to build)"; + + // Create the planning prompt with dynamic path + const prompt = std.fmt.allocPrint(allocator, planning_prompt_template, .{ context_str, exe_path }) catch return Error.OutOfMemory; + defer allocator.free(prompt); + + if (dry_run) { + stdout.print("=== Interactive Planning Session (dry-run) ===\n\n", .{}) catch {}; + if (initial_context) |ctx| { + stdout.print("Initial context: {s}\n\n", .{ctx}) catch {}; + } else { + stdout.print("Initial context: (none - will ask user)\n\n", .{}) catch {}; + } + stdout.print("Would execute:\n", .{}) catch {}; + stdout.print(" {s} \"\"\n", .{formatExecutorCommand(executor, agent_cmd, true)}) catch {}; + stdout.print("\n--- Prompt Preview ---\n{s}\n", .{prompt}) catch {}; + return; + } + + stdout.print("=== Starting Interactive Planning Session ===\n", .{}) catch {}; + if (initial_context) |ctx| { + stdout.print("Initial context: {s}\n", .{ctx}) catch {}; + } + stdout.print("Executor: {s}\n", .{executor}) catch {}; + stdout.print("\nThe agent will ask questions to understand your requirements,\n", .{}) catch {}; + stdout.print("then propose a plan for your approval before creating tasks.\n\n", .{}) catch {}; + + // Build and execute command in interactive mode (user converses with agent) + var runner_args = buildExecutorArgs(allocator, executor, agent_cmd, prompt, true) catch return Error.OutOfMemory; + defer runner_args.deinit(allocator); + + var child = std.process.Child.init(runner_args.items, allocator); + child.stdin_behavior = .Inherit; + child.stdout_behavior = .Inherit; + child.stderr_behavior = .Inherit; + + const term = child.spawnAndWait() catch |err| { + stderr.print("Error executing agent: {s}\n", .{@errorName(err)}) catch {}; + return Error.SpawnFailed; + }; + + const success = term == .Exited and term.Exited == 0; + + if (success) { + stdout.print("\n=== Planning session completed ===\n", .{}) catch {}; + stdout.print("Run 'zagi tasks list' to see created tasks\n", .{}) catch {}; + stdout.print("Run 'zagi agent run' to execute tasks\n", .{}) catch {}; + } else { + stdout.print("\n=== Planning session ended ===\n", .{}) catch {}; + } +} + +/// Executes the RALPH (Recursive Agent Loop Pattern for Humans) loop. +/// +/// The RALPH loop is an autonomous task execution pattern: +/// +/// ``` +/// ┌─────────────────────────────────────────────────────────┐ +/// │ RALPH Loop Algorithm │ +/// ├─────────────────────────────────────────────────────────┤ +/// │ 1. Load pending tasks from git refs │ +/// │ 2. Find next task with < 3 consecutive failures │ +/// │ 3. If no eligible task found → exit loop │ +/// │ 4. Execute task with configured AI agent │ +/// │ 5. On success: reset failure counter, increment count │ +/// │ On failure: increment failure counter │ +/// │ 6. If --once flag set → exit loop │ +/// │ 7. Wait delay seconds, goto step 1 │ +/// └─────────────────────────────────────────────────────────┘ +/// ``` +/// +/// Key behaviors: +/// - **Failure tolerance**: Tasks are skipped after 3 consecutive failures +/// to prevent infinite loops on broken tasks +/// - **Safety limits**: Optional --max-tasks prevents runaway execution +/// - **Observability**: Output streamed to /tmp/zagi//.log +/// - **Graceful completion**: Exits when all tasks done or all remaining +/// tasks have exceeded failure threshold +fn runRun(allocator: std.mem.Allocator, args: [][:0]u8) Error!void { + const stdout = std.fs.File.stdout().deprecatedWriter(); + const stderr = std.fs.File.stderr().deprecatedWriter(); + + // Parse command options + var once = false; + var dry_run = false; + var delay: u32 = 2; + var max_tasks: ?u32 = null; + + var i: usize = 3; // Start after "zagi agent run" + while (i < args.len) { + const arg = std.mem.sliceTo(args[i], 0); + + if (std.mem.eql(u8, arg, "--once")) { + once = true; + } else if (std.mem.eql(u8, arg, "--dry-run")) { + dry_run = true; + } else if (std.mem.eql(u8, arg, "--delay")) { + i += 1; + if (i >= args.len) { + stdout.print("error: --delay requires a number of seconds\n", .{}) catch {}; + return Error.InvalidCommand; + } + const delay_str = std.mem.sliceTo(args[i], 0); + delay = std.fmt.parseInt(u32, delay_str, 10) catch { + stdout.print("error: invalid delay value '{s}'\n", .{delay_str}) catch {}; + return Error.InvalidCommand; + }; + } else if (std.mem.eql(u8, arg, "--max-tasks")) { + i += 1; + if (i >= args.len) { + stdout.print("error: --max-tasks requires a number\n", .{}) catch {}; + return Error.InvalidCommand; + } + const max_str = std.mem.sliceTo(args[i], 0); + max_tasks = std.fmt.parseInt(u32, max_str, 10) catch { + stdout.print("error: invalid max-tasks value '{s}'\n", .{max_str}) catch {}; + return Error.InvalidCommand; + }; + } else if (std.mem.eql(u8, arg, "-h") or std.mem.eql(u8, arg, "--help")) { + stdout.print("{s}", .{run_help}) catch {}; + return; + } else { + stdout.print("error: unknown option '{s}'\n", .{arg}) catch {}; + return Error.InvalidCommand; + } + i += 1; + } + + // Check ZAGI_AGENT_CMD for custom command override + const agent_cmd = std.posix.getenv("ZAGI_AGENT_CMD"); + const executor = if (agent_cmd != null) + std.posix.getenv("ZAGI_AGENT") orelse "" // No auto flags when only ZAGI_AGENT_CMD is set + else + try getValidatedExecutor(stdout); + + // Get absolute path to current executable + var exe_path_buf: [std.fs.max_path_bytes]u8 = undefined; + const exe_path = std.fs.selfExePath(&exe_path_buf) catch { + stderr.print("error: failed to resolve executable path\n", .{}) catch {}; + return Error.SpawnFailed; + }; + + // Initialize libgit2 + if (c.git_libgit2_init() < 0) { + return git.Error.InitFailed; + } + defer _ = c.git_libgit2_shutdown(); + + // Open repository + var repo: ?*c.git_repository = null; + if (c.git_repository_open_ext(&repo, ".", 0, null) < 0) { + return git.Error.NotARepository; + } + defer c.git_repository_free(repo); + + // Create log path in temp directory: /tmp/zagi//.log + var log_path_buf: [512]u8 = undefined; + const log_path = createLogPath(&log_path_buf); + + if (log_path) |p| { + stdout.print("Log file: {s}\n", .{p}) catch {}; + } + + var tasks_completed: u32 = 0; + + // Consecutive failure tracking map: task_id -> failure_count + // + // This map tracks how many times each task has failed IN A ROW. The key + // insight is "consecutive" - a task that succeeds resets its counter to 0. + // + // Why track consecutive failures instead of total failures? + // - Transient errors (network issues, race conditions) shouldn't permanently + // disqualify a task + // - If a task succeeds once, it proves the task CAN work + // - 3 consecutive failures strongly suggests the task itself is broken + // + // Memory management: Keys are duplicated because task IDs come from + // getPendingTasks() which frees its memory after each iteration. The + // deferred cleanup frees all duplicated keys before map deinit. + var consecutive_failures = std.StringHashMap(u32).init(allocator); + defer { + // Free all the duplicated keys before deiniting the map + var key_iter = consecutive_failures.keyIterator(); + while (key_iter.next()) |key_ptr| { + allocator.free(key_ptr.*); + } + consecutive_failures.deinit(); + } + + stdout.print("Starting RALPH loop...\n", .{}) catch {}; + if (dry_run) { + stdout.print("(dry-run mode - no commands will be executed)\n", .{}) catch {}; + } + stdout.print("Executor: {s}\n\n", .{executor}) catch {}; + + while (true) { + if (max_tasks) |max| { + if (tasks_completed >= max) { + stdout.print("Reached maximum task limit ({})\n", .{max}) catch {}; + break; + } + } + + const pending = getPendingTasks(allocator) catch { + stderr.print("error: failed to load tasks\n", .{}) catch {}; + return Error.TaskLoadFailed; + }; + defer allocator.free(pending.tasks); + defer for (pending.tasks) |t| { + allocator.free(t.id); + allocator.free(t.content); + }; + + if (pending.tasks.len == 0) { + stdout.print("No pending tasks remaining. All tasks complete!\n", .{}) catch {}; + stdout.print("Run: zagi tasks pr\n", .{}) catch {}; + break; + } + + var next_task: ?PendingTask = null; + for (pending.tasks) |task| { + const failure_count = consecutive_failures.get(task.id) orelse 0; + if (failure_count < 3) { + next_task = task; + break; + } + } + + if (next_task == null) { + stdout.print("All remaining tasks have failed 3+ times. Stopping.\n", .{}) catch {}; + break; + } + + const task = next_task.?; + stdout.print("Starting task: {s}\n", .{task.id}) catch {}; + stdout.print(" {s}\n\n", .{task.content}) catch {}; + + if (dry_run) { + stdout.print("Would execute:\n", .{}) catch {}; + if (agent_cmd) |cmd| { + // Show custom command with mode flags based on executor + if (std.mem.eql(u8, executor, "claude")) { + stdout.print(" {s} -p \"\"\n", .{cmd}) catch {}; + } else if (std.mem.eql(u8, executor, "opencode")) { + stdout.print(" {s} run \"\"\n", .{cmd}) catch {}; + } else { + stdout.print(" {s} \"\"\n", .{cmd}) catch {}; + } + } else { + stdout.print(" {s} \"\"\n", .{formatExecutorCommand(executor, agent_cmd, false)}) catch {}; + } + stdout.print("\n", .{}) catch {}; + tasks_completed += 1; + } else { + const success = executeTask(allocator, executor, agent_cmd, exe_path, task.id, task.content, log_path) catch false; + + if (success) { + updateFailureCount(allocator, &consecutive_failures, task.id, 0); + tasks_completed += 1; + stdout.print("Task completed successfully\n\n", .{}) catch {}; + } else { + const new_failures = (consecutive_failures.get(task.id) orelse 0) + 1; + updateFailureCount(allocator, &consecutive_failures, task.id, new_failures); + stdout.print("Task failed ({} consecutive failures)\n", .{new_failures}) catch {}; + if (new_failures >= 3) { + stdout.print("Skipping task after 3 consecutive failures\n", .{}) catch {}; + } + stdout.print("\n", .{}) catch {}; + } + } + + if (once) { + stdout.print("Exiting after one task (--once flag set)\n", .{}) catch {}; + break; + } + + if (!dry_run and delay > 0) { + stdout.print("Waiting {} seconds before next task...\n\n", .{delay}) catch {}; + std.Thread.sleep(delay * std.time.ns_per_s); + } + } + + stdout.print("RALPH loop completed. {} tasks processed.\n", .{tasks_completed}) catch {}; +} + +const PendingTask = struct { + id: []const u8, + content: []const u8, +}; + +const PendingTasks = struct { + tasks: []PendingTask, +}; + +fn getPendingTasks(allocator: std.mem.Allocator) !PendingTasks { + // Get absolute path to current executable to avoid relative path issues + var exe_path_buf: [std.fs.max_path_bytes]u8 = undefined; + const exe_path = std.fs.selfExePath(&exe_path_buf) catch return error.SpawnFailed; + + const result = std.process.Child.run(.{ + .allocator = allocator, + .argv = &.{ exe_path, "tasks", "list", "--json" }, + }) catch return error.SpawnFailed; + defer allocator.free(result.stdout); + defer allocator.free(result.stderr); + + const parsed = std.json.parseFromSlice(struct { + tasks: []const struct { + id: []const u8, + content: []const u8, + status: []const u8, + created: i64, + completed: ?i64, + }, + }, allocator, result.stdout, .{}) catch { + return PendingTasks{ .tasks = &.{} }; + }; + defer parsed.deinit(); + + var pending = std.ArrayList(PendingTask){}; + for (parsed.value.tasks) |task| { + if (!std.mem.eql(u8, task.status, "completed")) { + const id_dupe = allocator.dupe(u8, task.id) catch continue; + const content_dupe = allocator.dupe(u8, task.content) catch { + allocator.free(id_dupe); // Free id if content alloc fails + continue; + }; + pending.append(allocator, .{ + .id = id_dupe, + .content = content_dupe, + }) catch { + allocator.free(id_dupe); + allocator.free(content_dupe); + continue; + }; + } + } + + return PendingTasks{ .tasks = pending.toOwnedSlice(allocator) catch &.{} }; +} + +fn createPrompt(allocator: std.mem.Allocator, exe_path: []const u8, task_id: []const u8, task_content: []const u8) ![]u8 { + return std.fmt.allocPrint(allocator, + \\Task ID: {0s} + \\Task: {1s} + \\ + \\Instructions: + \\1. Read AGENTS.md if it exists for project context + \\2. Complete this ONE task only + \\3. Verify your work (run tests if applicable) + \\4. Commit changes: git commit -m "" + \\5. Mark done: {2s} tasks done {0s} + \\6. Output the COMPLETION PROMISE below + \\ + \\Knowledge Persistence: + \\If you discover important insights during this task, update AGENTS.md: + \\- Build commands that work (or gotchas that don't) + \\- Key file locations and their purposes + \\- Project conventions not documented elsewhere + \\- Common errors and their solutions + \\Only add genuinely useful operational knowledge, not task-specific details. + \\ + \\COMPLETION PROMISE (required - output this exactly when done): + \\ + \\COMPLETION PROMISE: I confirm that: + \\- Tests pass: [which tests ran, or "N/A" if no tests] + \\- Build succeeds: [build command, or "N/A" if no build] + \\- Changes committed: [commit hash and message] + \\- Task completed: [brief summary of what was done] + \\-- I have not taken any shortcuts or skipped verification. + \\ + \\Rules: + \\- NEVER git push + \\- Only work on this task + \\- Must output the completion promise when done + , .{ task_id, task_content, exe_path }); +} + +fn executeTask(allocator: std.mem.Allocator, executor: []const u8, agent_cmd: ?[]const u8, exe_path: []const u8, task_id: []const u8, task_content: []const u8, log_path: ?[]const u8) !bool { + const stderr_writer = std.fs.File.stderr().deprecatedWriter(); + + const prompt = try createPrompt(allocator, exe_path, task_id, task_content); + defer allocator.free(prompt); + + // Use headless mode (interactive=false) for autonomous task execution + var runner_args = try buildExecutorArgs(allocator, executor, agent_cmd, prompt, false); + defer runner_args.deinit(allocator); + + // Build command string for shell execution with tee + var cmd_buf: [4096]u8 = undefined; + var cmd_len: usize = 0; + + // Quote each argument and join + for (runner_args.items) |arg| { + if (cmd_len > 0) { + cmd_buf[cmd_len] = ' '; + cmd_len += 1; + } + // Add single quotes around arg, escaping any single quotes in it + cmd_buf[cmd_len] = '\''; + cmd_len += 1; + for (arg) |ch| { + if (ch == '\'') { + // End quote, add escaped quote, restart quote: '\'' + if (cmd_len + 4 < cmd_buf.len) { + cmd_buf[cmd_len] = '\''; + cmd_buf[cmd_len + 1] = '\\'; + cmd_buf[cmd_len + 2] = '\''; + cmd_buf[cmd_len + 3] = '\''; + cmd_len += 4; + } + } else { + if (cmd_len < cmd_buf.len) { + cmd_buf[cmd_len] = ch; + cmd_len += 1; + } + } + } + cmd_buf[cmd_len] = '\''; + cmd_len += 1; + } + + // Add tee to log file if we have a log path + if (log_path) |lp| { + const tee_suffix = std.fmt.bufPrint(cmd_buf[cmd_len..], " 2>&1 | tee '{s}'", .{lp}) catch return false; + cmd_len += tee_suffix.len; + } + + const shell_cmd = cmd_buf[0..cmd_len]; + + // Run via shell to get tee piping + var child = std.process.Child.init(&.{ "/bin/sh", "-c", shell_cmd }, allocator); + child.stdin_behavior = .Inherit; + child.stdout_behavior = .Inherit; + child.stderr_behavior = .Inherit; + + const term = child.spawnAndWait() catch |err| { + stderr_writer.print("Error executing runner: {s}\n", .{@errorName(err)}) catch {}; + return false; + }; + + const exited_ok = term == .Exited and term.Exited == 0; + + // Check log file for completion promise + var found_completion = false; + if (log_path) |lp| { + const log_content = std.fs.cwd().readFileAlloc(allocator, lp, 10 * 1024 * 1024) catch null; + if (log_content) |content| { + defer allocator.free(content); + const promise_start = "COMPLETION PROMISE: I confirm that:"; + const promise_end = "-- I have not taken any shortcuts or skipped verification."; + const found_start = std.mem.indexOf(u8, content, promise_start) != null; + const found_end = std.mem.indexOf(u8, content, promise_end) != null; + found_completion = found_start and found_end; + } + } + + if (!found_completion) { + stderr_writer.print("Task did not output completion promise\n", .{}) catch {}; + } + + return exited_ok and found_completion; +} + +// Tests for buildExecutorArgs +const testing = std.testing; + +test "buildExecutorArgs - claude headless includes -p" { + var args = try buildExecutorArgs(testing.allocator, "claude", null, "test prompt", false); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 3), args.items.len); + try testing.expectEqualStrings("claude", args.items[0]); + try testing.expectEqualStrings("-p", args.items[1]); + try testing.expectEqualStrings("test prompt", args.items[2]); +} + +test "buildExecutorArgs - claude interactive no -p" { + var args = try buildExecutorArgs(testing.allocator, "claude", null, "test prompt", true); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 2), args.items.len); + try testing.expectEqualStrings("claude", args.items[0]); + try testing.expectEqualStrings("test prompt", args.items[1]); +} + +test "buildExecutorArgs - opencode headless includes run" { + var args = try buildExecutorArgs(testing.allocator, "opencode", null, "test prompt", false); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 3), args.items.len); + try testing.expectEqualStrings("opencode", args.items[0]); + try testing.expectEqualStrings("run", args.items[1]); + try testing.expectEqualStrings("test prompt", args.items[2]); +} + +test "buildExecutorArgs - opencode interactive no run" { + var args = try buildExecutorArgs(testing.allocator, "opencode", null, "test prompt", true); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 2), args.items.len); + try testing.expectEqualStrings("opencode", args.items[0]); + try testing.expectEqualStrings("test prompt", args.items[1]); +} + +test "buildExecutorArgs - custom cmd with claude executor gets -p" { + // Custom command + ZAGI_AGENT=claude → adds -p for headless + var args = try buildExecutorArgs(testing.allocator, "claude", "my-claude --flag", "test prompt", false); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 4), args.items.len); + try testing.expectEqualStrings("my-claude", args.items[0]); + try testing.expectEqualStrings("--flag", args.items[1]); + try testing.expectEqualStrings("-p", args.items[2]); + try testing.expectEqualStrings("test prompt", args.items[3]); +} + +test "buildExecutorArgs - custom cmd with opencode executor gets run" { + // Custom command + ZAGI_AGENT=opencode → adds run for headless + var args = try buildExecutorArgs(testing.allocator, "opencode", "my-opencode --flag", "test prompt", false); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 4), args.items.len); + try testing.expectEqualStrings("my-opencode", args.items[0]); + try testing.expectEqualStrings("--flag", args.items[1]); + try testing.expectEqualStrings("run", args.items[2]); + try testing.expectEqualStrings("test prompt", args.items[3]); +} + +test "buildExecutorArgs - custom cmd with unknown executor no auto flags" { + // Custom command + unknown executor → no auto flags + var args = try buildExecutorArgs(testing.allocator, "aider", "aider --yes", "test prompt", false); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 3), args.items.len); + try testing.expectEqualStrings("aider", args.items[0]); + try testing.expectEqualStrings("--yes", args.items[1]); + try testing.expectEqualStrings("test prompt", args.items[2]); +} + +test "buildExecutorArgs - custom cmd interactive no flags added" { + // Interactive mode never adds mode flags + var args = try buildExecutorArgs(testing.allocator, "claude", "my-claude --flag", "test prompt", true); + defer args.deinit(testing.allocator); + + try testing.expectEqual(@as(usize, 3), args.items.len); + try testing.expectEqualStrings("my-claude", args.items[0]); + try testing.expectEqualStrings("--flag", args.items[1]); + try testing.expectEqualStrings("test prompt", args.items[2]); +} + diff --git a/src/cmds/commit.zig b/src/cmds/commit.zig index 5a0b921..f367f9b 100644 --- a/src/cmds/commit.zig +++ b/src/cmds/commit.zig @@ -1,5 +1,6 @@ const std = @import("std"); const git = @import("git.zig"); +const detect = @import("detect.zig"); const c = git.c; pub const help = @@ -14,9 +15,10 @@ pub const help = \\ --prompt

Store the complete user prompt that created this commit \\ \\Environment: - \\ ZAGI_AGENT= Declare agent operator (requires --prompt) \\ ZAGI_STRIP_COAUTHORS=1 Remove Co-Authored-By lines from message \\ + \\Agent mode requires --prompt when agent is detected. + \\ ; pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) git.Error!void { @@ -69,9 +71,9 @@ pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) git.Error!void { return git.Error.UsageError; } - // Check if prompt is required (when running as an AI agent) - if (std.posix.getenv("ZAGI_AGENT") != null and prompt == null) { - stdout.print("error: --prompt required (ZAGI_AGENT is set)\n", .{}) catch {}; + // Check if prompt is required (when running in agent mode) + if (detect.isAgentMode() and prompt == null) { + stdout.print("error: --prompt required in agent mode\n", .{}) catch {}; stdout.print("hint: use --prompt to record the prompt that created this commit\n", .{}) catch {}; return git.Error.UsageError; } @@ -272,26 +274,66 @@ pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) git.Error!void { } } - // Store prompt as git note if provided + // Store metadata in separate git notes (plain text, not JSON) if (prompt) |p| { - // Copy prompt to null-terminated buffer - var prompt_buf: [8192]u8 = undefined; - if (p.len < prompt_buf.len) { - @memcpy(prompt_buf[0..p.len], p); - prompt_buf[p.len] = 0; + // Detect agent + const agent = detect.detectAgent(); + var note_oid: c.git_oid = undefined; - var note_oid: c.git_oid = undefined; + // 1. Store agent name in refs/notes/agent + _ = c.git_note_create( + ¬e_oid, + repo, + "refs/notes/agent", + signature, + signature, + &commit_oid, + agent.name().ptr, + 0, + ); + + // 2. Store prompt in refs/notes/prompt + const prompt_z = allocator.allocSentinel(u8, p.len, 0) catch null; + if (prompt_z) |pz| { + defer allocator.free(pz); + @memcpy(pz, p); _ = c.git_note_create( ¬e_oid, repo, - "refs/notes/prompts", // Custom namespace for AI prompts + "refs/notes/prompt", signature, signature, &commit_oid, - &prompt_buf, - 0, // Don't force overwrite + pz.ptr, + 0, ); } + + // 3. Store session transcript in refs/notes/session (if available) + var cwd_buf: [std.fs.max_path_bytes]u8 = undefined; + const cwd = std.fs.cwd().realpath(".", &cwd_buf) catch null; + if (cwd) |c_path| { + if (detect.readCurrentSession(allocator, agent, c_path)) |session| { + defer allocator.free(session.path); + defer allocator.free(session.transcript); + + const session_z = allocator.allocSentinel(u8, session.transcript.len, 0) catch null; + if (session_z) |sz| { + defer allocator.free(sz); + @memcpy(sz, session.transcript); + _ = c.git_note_create( + ¬e_oid, + repo, + "refs/notes/session", + signature, + signature, + &commit_oid, + sz.ptr, + 0, + ); + } + } + } } // Format output diff --git a/src/cmds/detect.zig b/src/cmds/detect.zig new file mode 100644 index 0000000..da95931 --- /dev/null +++ b/src/cmds/detect.zig @@ -0,0 +1,259 @@ +const std = @import("std"); + +/// Known AI agents/tools +pub const Agent = enum { + claude, + opencode, + windsurf, + cursor, + vscode, + vscode_fork, + terminal, + + /// Returns the string name for this agent + pub fn name(self: Agent) []const u8 { + return switch (self) { + .claude => "claude", + .opencode => "opencode", + .windsurf => "windsurf", + .cursor => "cursor", + .vscode => "vscode", + .vscode_fork => "vscode-fork", + .terminal => "terminal", + }; + } +}; + +/// Agent mode detection (for guardrails + --prompt requirement) +/// Checks signals that are set by parent process and hard to bypass +pub fn isAgentMode() bool { + // Native agent signals (set by IDE/CLI parent process) + if (std.posix.getenv("CLAUDECODE") != null) return true; + if (std.posix.getenv("OPENCODE") != null) return true; + // Custom agent signal (must be non-empty to enable agent mode) + if (std.posix.getenv("ZAGI_AGENT")) |v| { + if (v.len > 0) return true; + } + return false; +} + +/// Detect the AI agent/tool from environment +pub fn detectAgent() Agent { + // CLI tools - most specific signals + if (std.posix.getenv("CLAUDECODE") != null) return .claude; + if (std.posix.getenv("OPENCODE") != null) return .opencode; + + // VSCode forks - check app path in VSCODE_GIT_ASKPASS_NODE + if (std.posix.getenv("VSCODE_GIT_ASKPASS_NODE")) |path| { + if (std.mem.indexOf(u8, path, "Windsurf") != null) return .windsurf; + if (std.mem.indexOf(u8, path, "Cursor") != null) return .cursor; + if (std.mem.indexOf(u8, path, "Code") != null) return .vscode; + return .vscode_fork; + } + + return .terminal; +} + +// TODO: Extract model from session transcript when surfacing agent metadata +// The model info is available in the session JSONL files and could be parsed +// from there for display in `git log --prompts` + +/// Session data for transcript storage +pub const Session = struct { + path: []const u8, + transcript: []const u8, +}; + +/// Read current session transcript +pub fn readCurrentSession(allocator: std.mem.Allocator, agent: Agent, cwd: []const u8) ?Session { + return switch (agent) { + .claude => readClaudeCodeSession(allocator, cwd), + .opencode => readOpenCodeSession(allocator), + else => null, + }; +} + +/// Read Claude Code session from ~/.claude/projects/{project-hash}/ +fn readClaudeCodeSession(allocator: std.mem.Allocator, cwd: []const u8) ?Session { + const home = std.posix.getenv("HOME") orelse return null; + + // Convert cwd to project hash (replace / with -) + // e.g., /Users/matt/Documents/Github/zagi -> -Users-matt-Documents-Github-zagi + var project_hash_buf: [512]u8 = undefined; + var hash_len: usize = 0; + for (cwd) |char| { + if (hash_len >= project_hash_buf.len) break; + project_hash_buf[hash_len] = if (char == '/') '-' else char; + hash_len += 1; + } + const project_hash = project_hash_buf[0..hash_len]; + + // Build project directory path + const project_dir = std.fmt.allocPrint(allocator, "{s}/.claude/projects/{s}", .{ home, project_hash }) catch return null; + defer allocator.free(project_dir); + + // Find most recent .jsonl file + var dir = std.fs.cwd().openDir(project_dir, .{ .iterate = true }) catch return null; + defer dir.close(); + + var most_recent_path: ?[]const u8 = null; + var most_recent_mtime: i128 = 0; + + var iter = dir.iterate(); + while (iter.next() catch null) |entry| { + if (entry.kind != .file) continue; + if (!std.mem.endsWith(u8, entry.name, ".jsonl")) continue; + + // Get file stat for modification time + const stat = dir.statFile(entry.name) catch continue; + const mtime = stat.mtime; + + if (most_recent_path == null or mtime > most_recent_mtime) { + if (most_recent_path) |old| allocator.free(old); + most_recent_path = std.fmt.allocPrint(allocator, "{s}/{s}", .{ project_dir, entry.name }) catch continue; + most_recent_mtime = mtime; + } + } + + if (most_recent_path) |path| { + // Read file content + const file = std.fs.cwd().openFile(path, .{}) catch { + allocator.free(path); + return null; + }; + defer file.close(); + + // Read up to 10MB of transcript + const content = file.readToEndAlloc(allocator, 10 * 1024 * 1024) catch { + allocator.free(path); + return null; + }; + + // Convert JSONL to JSON array + const transcript = convertJsonlToArray(allocator, content) catch { + allocator.free(content); + allocator.free(path); + return null; + }; + allocator.free(content); + + return Session{ + .path = path, + .transcript = transcript, + }; + } + + return null; +} + +/// Convert JSONL (newline-delimited JSON) to a JSON array +fn convertJsonlToArray(allocator: std.mem.Allocator, jsonl: []const u8) ![]const u8 { + var result = std.array_list.Managed(u8).init(allocator); + errdefer result.deinit(); + + try result.append('['); + + var first = true; + var lines = std.mem.splitScalar(u8, jsonl, '\n'); + while (lines.next()) |line| { + const trimmed = std.mem.trim(u8, line, " \t\r"); + if (trimmed.len == 0) continue; + + if (!first) { + try result.append(','); + } + first = false; + try result.appendSlice(trimmed); + } + + try result.append(']'); + return result.toOwnedSlice(); +} + +/// Read OpenCode session +fn readOpenCodeSession(allocator: std.mem.Allocator) ?Session { + const home = std.posix.getenv("HOME") orelse return null; + + // OpenCode stores sessions in ~/.local/share/opencode/storage/session/ + const base_dir = std.fmt.allocPrint(allocator, "{s}/.local/share/opencode/storage/message", .{home}) catch return null; + defer allocator.free(base_dir); + + // Find most recent session directory + var dir = std.fs.cwd().openDir(base_dir, .{ .iterate = true }) catch return null; + defer dir.close(); + + var most_recent_dir: ?[]const u8 = null; + var most_recent_mtime: i128 = 0; + + var iter = dir.iterate(); + while (iter.next() catch null) |entry| { + if (entry.kind != .directory) continue; + + const stat = dir.statFile(entry.name) catch continue; + const mtime = stat.mtime; + + if (most_recent_dir == null or mtime > most_recent_mtime) { + if (most_recent_dir) |old| allocator.free(old); + most_recent_dir = allocator.dupe(u8, entry.name) catch continue; + most_recent_mtime = mtime; + } + } + + if (most_recent_dir) |session_id| { + defer allocator.free(session_id); + + // Read all message files in this session + const session_dir = std.fmt.allocPrint(allocator, "{s}/{s}", .{ base_dir, session_id }) catch return null; + + var messages_dir = std.fs.cwd().openDir(session_dir, .{ .iterate = true }) catch { + allocator.free(session_dir); + return null; + }; + defer messages_dir.close(); + + // Collect all messages into an array + var messages = std.array_list.Managed(u8).init(allocator); + errdefer messages.deinit(); + + messages.append('[') catch return null; + + var first = true; + var msg_iter = messages_dir.iterate(); + while (msg_iter.next() catch null) |entry| { + if (entry.kind != .file) continue; + if (!std.mem.endsWith(u8, entry.name, ".json")) continue; + + const msg_content = messages_dir.readFileAlloc(allocator, entry.name, 1024 * 1024) catch continue; + defer allocator.free(msg_content); + + if (!first) { + messages.append(',') catch continue; + } + first = false; + messages.appendSlice(msg_content) catch continue; + } + + messages.append(']') catch return null; + + return Session{ + .path = session_dir, + .transcript = messages.toOwnedSlice() catch return null, + }; + } + + return null; +} + +// Tests +test "isAgentMode returns false when no env vars set" { + // Note: This test assumes env vars are not set in test environment + // In practice, we can't easily unset env vars in Zig tests + const result = isAgentMode(); + _ = result; // Just verify it compiles and runs +} + +test "detectAgent returns based on env vars" { + const agent = detectAgent(); + // Without mocking, this will return based on actual env + _ = agent.name(); +} diff --git a/src/cmds/diff.zig b/src/cmds/diff.zig index e7e49a5..cc4c976 100644 --- a/src/cmds/diff.zig +++ b/src/cmds/diff.zig @@ -138,8 +138,19 @@ pub fn run(_: std.mem.Allocator, args: [][:0]u8) DiffError!void { // Unknown flag - passthrough to git return git.Error.UnsupportedFlag; } else if (!std.mem.startsWith(u8, a, "-")) { - // Non-flag argument is a revision spec - rev_spec = a; + // Non-flag argument: could be revision spec or path + // Check if it's an existing path first (file or directory) + const stat = std.fs.cwd().statFile(a) catch null; + if (stat != null) { + // It's an existing path - treat as pathspec + if (pathspec_count < MAX_PATHSPECS) { + pathspecs[pathspec_count] = @constCast(arg.ptr); + pathspec_count += 1; + } + } else { + // Not a path - treat as revision spec + rev_spec = a; + } } } diff --git a/src/cmds/log.zig b/src/cmds/log.zig index 55c9747..fad69cd 100644 --- a/src/cmds/log.zig +++ b/src/cmds/log.zig @@ -4,7 +4,8 @@ const c = git.c; pub const help = \\usage: git log [-n ] [--author=] [--grep=] - \\ [--since=] [--until=] [--prompts] [-- ...] + \\ [--since=] [--until=] [--prompts] [--agent] + \\ [--session] [-- ...] \\ \\Show commit history. \\ @@ -15,6 +16,10 @@ pub const help = \\ --since= Show commits after date (e.g. 2025-01-01, "1 week ago") \\ --until= Show commits before date \\ --prompts Show AI prompts attached to commits + \\ --agent Show AI agent that made the commit + \\ --session Show session transcript (first 20k chars) + \\ --session-offset=N Start session display at byte N + \\ --session-limit=N Limit session display to N bytes (default: 20000) \\ -- ... Show commits affecting paths \\ ; @@ -30,6 +35,10 @@ const Options = struct { pathspecs: [MAX_PATHSPECS][*c]u8 = undefined, pathspec_count: usize = 0, show_prompts: bool = false, + show_agent: bool = false, + show_session: bool = false, + session_offset: usize = 0, + session_limit: usize = 20000, }; pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) (git.Error || error{OutOfMemory})!void { @@ -89,6 +98,14 @@ pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) (git.Error || error{Out // Already one-line format by default, ignore } else if (std.mem.eql(u8, arg, "--prompts")) { opts.show_prompts = true; + } else if (std.mem.eql(u8, arg, "--agent")) { + opts.show_agent = true; + } else if (std.mem.eql(u8, arg, "--session")) { + opts.show_session = true; + } else if (std.mem.startsWith(u8, arg, "--session-offset=")) { + opts.session_offset = std.fmt.parseInt(usize, arg[17..], 10) catch 0; + } else if (std.mem.startsWith(u8, arg, "--session-limit=")) { + opts.session_limit = std.fmt.parseInt(usize, arg[16..], 10) catch 20000; } else if (std.mem.startsWith(u8, arg, "-") or std.mem.startsWith(u8, arg, "--")) { // Unknown flag - passthrough to git return git.Error.UnsupportedFlag; @@ -196,16 +213,43 @@ fn printCommit( try writer.print("{s} {s}\n", .{ sha[0..7], subject }); } - // Show prompt note if requested + const repo = c.git_commit_owner(commit); + var note: ?*c.git_note = null; + + // Show agent if requested + if (opts.show_agent) { + if (c.git_note_read(¬e, repo, "refs/notes/agent", oid) == 0) { + defer c.git_note_free(note); + const note_msg = c.git_note_message(note); + if (note_msg) |msg| { + const agent_name = std.mem.sliceTo(msg, 0); + try writer.print(" agent: {s}\n", .{agent_name}); + } + } + note = null; + } + + // Show prompt if requested if (opts.show_prompts) { - const repo = c.git_commit_owner(commit); - var note: ?*c.git_note = null; - if (c.git_note_read(¬e, repo, "refs/notes/prompts", oid) == 0) { + if (c.git_note_read(¬e, repo, "refs/notes/prompt", oid) == 0) { + defer c.git_note_free(note); + const note_msg = c.git_note_message(note); + if (note_msg) |msg| { + const prompt_text = std.mem.sliceTo(msg, 0); + const max_len: usize = 200; + if (prompt_text.len > max_len) { + try writer.print(" prompt: {s}...\n", .{prompt_text[0..max_len]}); + } else { + try writer.print(" prompt: {s}\n", .{prompt_text}); + } + } + } + // Fallback to legacy refs/notes/prompts (will be removed in future) + else if (c.git_note_read(¬e, repo, "refs/notes/prompts", oid) == 0) { defer c.git_note_free(note); const note_msg = c.git_note_message(note); if (note_msg) |msg| { const prompt_text = std.mem.sliceTo(msg, 0); - // Truncate long prompts for display const max_len: usize = 200; if (prompt_text.len > max_len) { try writer.print(" prompt: {s}...\n", .{prompt_text[0..max_len]}); @@ -214,6 +258,42 @@ fn printCommit( } } } + note = null; + } + + // Show session transcript if requested (with offset/limit pagination) + if (opts.show_session) { + if (c.git_note_read(¬e, repo, "refs/notes/session", oid) == 0) { + defer c.git_note_free(note); + const note_msg = c.git_note_message(note); + if (note_msg) |msg| { + const session_text = std.mem.sliceTo(msg, 0); + const total_len = session_text.len; + + // Apply offset and limit + if (opts.session_offset >= total_len) { + try writer.print(" session: (offset {d} beyond end, total {d} bytes)\n", .{ opts.session_offset, total_len }); + } else { + const start = opts.session_offset; + const remaining = total_len - start; + const display_len = @min(remaining, opts.session_limit); + const end = start + display_len; + + if (start > 0 or end < total_len) { + // Show range info when using offset or truncated + try writer.print(" session [{d}-{d} of {d} bytes]:\n ", .{ start, end, total_len }); + } else { + try writer.print(" session:\n ", .{}); + } + try writer.print("{s}", .{session_text[start..end]}); + if (end < total_len) { + try writer.print("\n ... ({d} more bytes, use --session-offset={d})\n", .{ total_len - end, end }); + } else { + try writer.print("\n", .{}); + } + } + } + } } } diff --git a/src/cmds/tasks.zig b/src/cmds/tasks.zig index cef5d37..c4a238b 100644 --- a/src/cmds/tasks.zig +++ b/src/cmds/tasks.zig @@ -3,6 +3,34 @@ const git = @import("git.zig"); const c = git.c; const json = std.json; +/// Escape a string for JSON output (escapes quotes, backslashes, newlines, etc.) +fn escapeJsonString(allocator: std.mem.Allocator, input: []const u8) ![]u8 { + var result = std.ArrayList(u8){}; + errdefer result.deinit(allocator); + + for (input) |char| { + switch (char) { + '"' => try result.appendSlice(allocator, "\\\""), + '\\' => try result.appendSlice(allocator, "\\\\"), + '\n' => try result.appendSlice(allocator, "\\n"), + '\r' => try result.appendSlice(allocator, "\\r"), + '\t' => try result.appendSlice(allocator, "\\t"), + else => { + if (char < 0x20) { + // Control characters - output as \u00XX + var buf: [6]u8 = undefined; + const len = std.fmt.bufPrint(&buf, "\\u{x:0>4}", .{char}) catch unreachable; + try result.appendSlice(allocator, len); + } else { + try result.append(allocator, char); + } + }, + } + } + + return result.toOwnedSlice(allocator); +} + pub const help = \\usage: git tasks [options] \\ @@ -12,27 +40,26 @@ pub const help = \\ add Add a new task \\ list List all tasks \\ show Show task details - \\ edit Edit task content + \\ edit Replace task content (blocked in agent mode) + \\ append Append to task content \\ delete Delete a task \\ done Mark task as complete - \\ ready List tasks ready to work on (no blockers) \\ pr Export tasks as markdown for PR description + \\ import Import tasks from a plan file (markdown) \\ \\Options: - \\ --after Add task dependency (use with 'add') \\ --json Output in JSON format \\ -h, --help Show this help message \\ \\Examples: \\ git tasks add "Fix authentication bug" - \\ git tasks add "Add tests" --after task-001 \\ git tasks list \\ git tasks show task-001 \\ git tasks edit task-001 "Fix authentication and authorization bug" \\ git tasks delete task-001 \\ git tasks done task-001 - \\ git tasks ready \\ git tasks pr + \\ git tasks import plan.md \\ ; @@ -49,6 +76,9 @@ pub const Error = git.Error || error{ JsonParseError, JsonWriteError, OutOfMemory, + FileNotFound, + FileReadError, + NoTasksFound, }; /// Represents a single task @@ -58,7 +88,6 @@ const Task = struct { status: []const u8 = "pending", created: i64, completed: ?i64 = null, - after: ?[]const u8 = null, // ID of task this depends on const Self = @This(); @@ -66,9 +95,6 @@ const Task = struct { allocator.free(self.id); allocator.free(self.content); allocator.free(self.status); - if (self.after) |after_id| { - allocator.free(after_id); - } } }; @@ -99,6 +125,57 @@ const TaskList = struct { return id; } + /// Escape newlines in content as literal backslash-n for line-based storage + fn escapeNewlines(allocator: std.mem.Allocator, content: []const u8) ![]u8 { + var count: usize = 0; + for (content) |ch| { + if (ch == '\n') count += 1; + } + if (count == 0) return allocator.dupe(u8, content); + + var result = try allocator.alloc(u8, content.len + count); // each \n becomes \\n (+1 char) + var j: usize = 0; + for (content) |ch| { + if (ch == '\n') { + result[j] = '\\'; + result[j + 1] = 'n'; + j += 2; + } else { + result[j] = ch; + j += 1; + } + } + return result[0..j]; + } + + /// Unescape literal backslash-n to newlines + fn unescapeNewlines(allocator: std.mem.Allocator, content: []const u8) ![]u8 { + var count: usize = 0; + var i: usize = 0; + while (i < content.len) : (i += 1) { + if (i + 1 < content.len and content[i] == '\\' and content[i + 1] == 'n') { + count += 1; + i += 1; // skip the 'n' + } + } + if (count == 0) return allocator.dupe(u8, content); + + var result = try allocator.alloc(u8, content.len - count); // each \\n becomes \n (-1 char) + var j: usize = 0; + i = 0; + while (i < content.len) : (i += 1) { + if (i + 1 < content.len and content[i] == '\\' and content[i + 1] == 'n') { + result[j] = '\n'; + j += 1; + i += 1; // skip the 'n' + } else { + result[j] = content[i]; + j += 1; + } + } + return result[0..j]; + } + /// Serialize TaskList to simple text format pub fn toJson(self: Self, allocator: std.mem.Allocator) Error![]u8 { // Use a simple line-based format for now to avoid JSON complexity @@ -114,13 +191,17 @@ const TaskList = struct { const next_id_line = std.fmt.allocPrint(allocator, "next_id:{}", .{self.next_id}) catch return Error.OutOfMemory; lines.append(allocator, next_id_line) catch return Error.OutOfMemory; - // Task lines: id|content|status|created|completed|after + // Task lines: id|content|status|created|completed + // Content has newlines escaped as \\n to preserve line-based format for (self.tasks.items) |task| { const completed_str = if (task.completed) |comp_time| std.fmt.allocPrint(allocator, "{}", .{comp_time}) catch return Error.OutOfMemory else allocator.dupe(u8, "") catch return Error.OutOfMemory; - const after_str = if (task.after) |a| a else ""; - const task_line = std.fmt.allocPrint(allocator, "task:{s}|{s}|{s}|{}|{s}|{s}", - .{ task.id, task.content, task.status, task.created, completed_str, after_str } + // Escape newlines in content + const escaped_content = escapeNewlines(allocator, task.content) catch return Error.OutOfMemory; + defer allocator.free(escaped_content); + + const task_line = std.fmt.allocPrint(allocator, "task:{s}|{s}|{s}|{}|{s}", + .{ task.id, escaped_content, task.status, task.created, completed_str } ) catch return Error.OutOfMemory; lines.append(allocator, task_line) catch return Error.OutOfMemory; @@ -177,7 +258,7 @@ const TaskList = struct { const content = parts.next() orelse continue; // Skip malformed lines without content task.id = allocator.dupe(u8, id) catch return Error.AllocationError; - task.content = allocator.dupe(u8, content) catch { + task.content = unescapeNewlines(allocator, content) catch { allocator.free(task.id); return Error.AllocationError; }; @@ -212,11 +293,8 @@ const TaskList = struct { task.completed = std.fmt.parseInt(i64, completed, 10) catch null; } } - if (parts.next()) |after| { - if (after.len > 0) { - task.after = allocator.dupe(u8, after) catch return Error.AllocationError; - } - } + // Skip any remaining fields (legacy 'after' field for backwards compatibility) + _ = parts.next(); task_list.tasks.append(allocator, task) catch { allocator.free(task.id); @@ -418,14 +496,16 @@ pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) Error!void { try runShow(allocator, args, repo); } else if (std.mem.eql(u8, subcommand, "edit")) { try runEdit(allocator, args, repo); + } else if (std.mem.eql(u8, subcommand, "append")) { + try runAppend(allocator, args, repo); } else if (std.mem.eql(u8, subcommand, "delete")) { try runDelete(allocator, args, repo); } else if (std.mem.eql(u8, subcommand, "done")) { try runDone(allocator, args, repo); - } else if (std.mem.eql(u8, subcommand, "ready")) { - try runReady(allocator, args, repo); } else if (std.mem.eql(u8, subcommand, "pr")) { try runPr(allocator, args, repo); + } else if (std.mem.eql(u8, subcommand, "import")) { + try runImport(allocator, args, repo); } else { stdout.print("error: unknown command '{s}'\n\n{s}", .{ subcommand, help }) catch {}; return Error.InvalidCommand; @@ -441,24 +521,16 @@ fn runAdd(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository return Error.MissingTaskContent; } - // Parse arguments for content, --after flag, and --json flag + // Parse arguments for content and --json flag var content: ?[]const u8 = null; - var after_id: ?[]const u8 = null; + var content_allocated = false; // Track if content was allocated by us var use_json = false; var i: usize = 3; // Start after "git tasks add" while (i < args.len) { const arg = std.mem.sliceTo(args[i], 0); - if (std.mem.eql(u8, arg, "--after")) { - // Next argument should be the task ID - i += 1; - if (i >= args.len) { - stdout.print("error: --after requires a task ID\n", .{}) catch {}; - return Error.InvalidTaskId; - } - after_id = std.mem.sliceTo(args[i], 0); - } else if (std.mem.eql(u8, arg, "--json")) { + if (std.mem.eql(u8, arg, "--json")) { use_json = true; } else if (content == null) { // First non-flag argument is the content @@ -467,11 +539,19 @@ fn runAdd(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository // Multiple content arguments - concatenate with spaces const existing = content.?; const combined = std.fmt.allocPrint(allocator, "{s} {s}", .{ existing, arg }) catch return Error.AllocationError; - // Note: we're not tracking these allocations, but they're short-lived + // Free previous allocation if we made one + if (content_allocated) { + allocator.free(@constCast(existing)); + } content = combined; + content_allocated = true; } i += 1; } + // Clean up content allocation on early returns or at end of function + defer if (content_allocated) { + if (content) |content_ptr| allocator.free(@constCast(content_ptr)); + }; if (content == null or content.?.len == 0) { stdout.print("error: task content cannot be empty\n", .{}) catch {}; @@ -508,7 +588,6 @@ fn runAdd(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository .content = task_content, .status = task_status, .created = now, - .after = if (after_id) |aid| allocator.dupe(u8, aid) catch return Error.AllocationError else null, }; // Add task to list - after this, task_list owns the allocations @@ -529,12 +608,9 @@ fn runAdd(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository // Output confirmation if (use_json) { // Manually construct JSON output - const after_str = if (after_id) |a| try std.fmt.allocPrint(allocator, "\"{s}\"", .{a}) else allocator.dupe(u8, "null") catch return Error.AllocationError; - defer allocator.free(after_str); - const json_output = try std.fmt.allocPrint(allocator, - "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"pending\",\"created\":{},\"completed\":null,\"after\":{s}}}", - .{ task_id, content.?, now, after_str } + "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"pending\",\"created\":{},\"completed\":null}}", + .{ task_id, content.?, now } ); defer allocator.free(json_output); @@ -572,7 +648,6 @@ fn runList(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor status: []const u8, created: i64, completed: ?i64, - after: ?[]const u8, }; @@ -587,7 +662,6 @@ fn runList(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor .status = task.status, .created = task.created, .completed = task.completed, - .after = task.after, }) catch return Error.AllocationError; } @@ -603,12 +677,13 @@ fn runList(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor const completed_str = if (task.completed) |comp| try std.fmt.allocPrint(allocator, "{}", .{comp}) else allocator.dupe(u8, "null") catch return Error.AllocationError; defer allocator.free(completed_str); - const after_str = if (task.after) |a| try std.fmt.allocPrint(allocator, "\"{s}\"", .{a}) else allocator.dupe(u8, "null") catch return Error.AllocationError; - defer allocator.free(after_str); + // Escape content for JSON (handles quotes, newlines, etc.) + const escaped_content = escapeJsonString(allocator, task.content) catch return Error.AllocationError; + defer allocator.free(escaped_content); const task_json = try std.fmt.allocPrint(allocator, - "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s},\"after\":{s}}}", - .{ task.id, task.content, task.status, task.created, completed_str, after_str } + "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s}}}", + .{ task.id, escaped_content, task.status, task.created, completed_str } ); defer allocator.free(task_json); @@ -644,15 +719,8 @@ fn runList(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor // List all tasks with compact format for (task_list.tasks.items) |task| { const status_mark = if (std.mem.eql(u8, task.status, "completed")) "✓" else " "; - - // Show dependency if present - if (task.after) |after_id| { - stdout.print("[{s}] {s} (after {s})\n {s}\n", - .{ status_mark, task.id, after_id, task.content }) catch {}; - } else { - stdout.print("[{s}] {s}\n {s}\n", - .{ status_mark, task.id, task.content }) catch {}; - } + stdout.print("[{s}] {s}\n {s}\n", + .{ status_mark, task.id, task.content }) catch {}; } } @@ -731,12 +799,9 @@ fn runShow(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor const completed_str = if (task.completed) |comp| try std.fmt.allocPrint(allocator, "{}", .{comp}) else allocator.dupe(u8, "null") catch return Error.AllocationError; defer allocator.free(completed_str); - const after_str = if (task.after) |a| try std.fmt.allocPrint(allocator, "\"{s}\"", .{a}) else allocator.dupe(u8, "null") catch return Error.AllocationError; - defer allocator.free(after_str); - const json_output = try std.fmt.allocPrint(allocator, - "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s},\"after\":{s}}}", - .{ task.id, task.content, task.status, task.created, completed_str, after_str } + "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s}}}", + .{ task.id, task.content, task.status, task.created, completed_str } ); defer allocator.free(json_output); @@ -764,10 +829,6 @@ fn runShow(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor if (completed_time) |ct| { stdout.print("completed: {s}\n", .{ct}) catch {}; } - - if (task.after) |after_id| { - stdout.print("depends on: {s}\n", .{after_id}) catch {}; - } } fn runDone(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) Error!void { @@ -826,9 +887,18 @@ fn runDone(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor } fn runReady(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) Error!void { - _ = args; // No additional args needed for ready const stdout = std.fs.File.stdout().deprecatedWriter(); + // Parse --json flag + var use_json = false; + for (args[3..]) |arg| { + const a = std.mem.sliceTo(arg, 0); + if (std.mem.eql(u8, a, "--json")) { + use_json = true; + break; + } + } + // Load task list from git ref var task_list = loadTaskList(repo, allocator) catch |err| { stdout.print("error: failed to load tasks: {}\n", .{err}) catch {}; @@ -842,57 +912,53 @@ fn runReady(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_reposito return; } - // Build a map of task ID -> completion status for dependency checking - var completed_tasks = std.HashMap([]const u8, bool, std.hash_map.StringContext, std.hash_map.default_max_load_percentage).init(allocator); - defer completed_tasks.deinit(); - - for (task_list.tasks.items) |task| { - const is_completed = std.mem.eql(u8, task.status, "completed"); - completed_tasks.put(task.id, is_completed) catch return Error.AllocationError; - } - - // Find tasks that are ready (pending and no unmet dependencies) + // Find pending tasks (all pending tasks are ready without dependencies) var ready_tasks = std.ArrayList(Task){}; defer ready_tasks.deinit(allocator); for (task_list.tasks.items) |task| { - // Skip completed tasks - if (std.mem.eql(u8, task.status, "completed")) { - continue; - } - - // Check if this task is ready - var is_ready = true; - - // If task has a dependency, check if it's completed - if (task.after) |dependency_id| { - if (completed_tasks.get(dependency_id)) |is_dep_completed| { - if (!is_dep_completed) { - is_ready = false; - } - } else { - // Dependency doesn't exist - this is an error state, but we'll treat it as not ready - is_ready = false; - } - } - - if (is_ready) { + if (!std.mem.eql(u8, task.status, "completed")) { ready_tasks.append(allocator, task) catch return Error.AllocationError; } } // If no ready tasks, show empty state if (ready_tasks.items.len == 0) { - stdout.print("no ready tasks (all tasks are either completed or waiting on dependencies)\n", .{}) catch {}; + if (use_json) { + stdout.print("[]\n", .{}) catch {}; + } else { + stdout.print("no pending tasks\n", .{}) catch {}; + } return; } - // Display ready task count - stdout.print("ready: {} task{s}\n\n", .{ ready_tasks.items.len, if (ready_tasks.items.len == 1) "" else "s" }) catch {}; + if (use_json) { + // JSON output - array of tasks + stdout.print("[", .{}) catch {}; + for (ready_tasks.items, 0..) |task, i| { + const completed_str = if (task.completed) |comp| try std.fmt.allocPrint(allocator, "{}", .{comp}) else allocator.dupe(u8, "null") catch return Error.AllocationError; + defer allocator.free(completed_str); + + const json_output = try std.fmt.allocPrint(allocator, + "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s}}}", + .{ task.id, task.content, task.status, task.created, completed_str } + ); + defer allocator.free(json_output); - // List ready tasks with compact format - for (ready_tasks.items) |task| { - stdout.print("[ ] {s}\n {s}\n", .{ task.id, task.content }) catch {}; + if (i > 0) { + stdout.print(",", .{}) catch {}; + } + stdout.print("{s}", .{json_output}) catch {}; + } + stdout.print("]\n", .{}) catch {}; + } else { + // Display ready task count + stdout.print("ready: {} task{s}\n\n", .{ ready_tasks.items.len, if (ready_tasks.items.len == 1) "" else "s" }) catch {}; + + // List ready tasks with compact format + for (ready_tasks.items) |task| { + stdout.print("[ ] {s}\n {s}\n", .{ task.id, task.content }) catch {}; + } } } @@ -913,22 +979,17 @@ fn runPr(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) return; } - // Separate tasks by status and build dependency map + // Separate tasks by status var completed_tasks = std.ArrayList(Task){}; var pending_tasks = std.ArrayList(Task){}; defer completed_tasks.deinit(allocator); defer pending_tasks.deinit(allocator); - var completed_map = std.HashMap([]const u8, bool, std.hash_map.StringContext, std.hash_map.default_max_load_percentage).init(allocator); - defer completed_map.deinit(); - for (task_list.tasks.items) |task| { if (std.mem.eql(u8, task.status, "completed")) { completed_tasks.append(allocator, task) catch return Error.AllocationError; - completed_map.put(task.id, true) catch return Error.AllocationError; } else { pending_tasks.append(allocator, task) catch return Error.AllocationError; - completed_map.put(task.id, false) catch return Error.AllocationError; } } @@ -939,76 +1000,29 @@ fn runPr(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) if (completed_tasks.items.len > 0) { stdout.print("### Completed\n\n", .{}) catch {}; for (completed_tasks.items) |task| { - if (task.after) |after_id| { - stdout.print("- [x] {s} (after {s})\n", .{ task.content, after_id }) catch {}; - } else { - stdout.print("- [x] {s}\n", .{ task.content }) catch {}; - } + stdout.print("- [x] {s}\n", .{task.content}) catch {}; } stdout.print("\n", .{}) catch {}; } - // Show pending tasks, grouped by ready vs blocked + // Show pending tasks if (pending_tasks.items.len > 0) { - var ready_tasks = std.ArrayList(Task){}; - var blocked_tasks = std.ArrayList(Task){}; - defer ready_tasks.deinit(allocator); - defer blocked_tasks.deinit(allocator); - + stdout.print("### Pending\n\n", .{}) catch {}; for (pending_tasks.items) |task| { - var is_ready = true; - - // Check if this task is blocked by dependencies - if (task.after) |dependency_id| { - if (completed_map.get(dependency_id)) |is_dep_completed| { - if (!is_dep_completed) { - is_ready = false; - } - } else { - // Dependency doesn't exist - treat as blocked - is_ready = false; - } - } - - if (is_ready) { - ready_tasks.append(allocator, task) catch return Error.AllocationError; - } else { - blocked_tasks.append(allocator, task) catch return Error.AllocationError; - } - } - - // Show ready tasks first - if (ready_tasks.items.len > 0) { - stdout.print("### Ready\n\n", .{}) catch {}; - for (ready_tasks.items) |task| { - stdout.print("- [ ] {s}\n", .{ task.content }) catch {}; - } - stdout.print("\n", .{}) catch {}; - } - - // Show blocked tasks - if (blocked_tasks.items.len > 0) { - stdout.print("### Blocked\n\n", .{}) catch {}; - for (blocked_tasks.items) |task| { - if (task.after) |after_id| { - stdout.print("- [ ] {s} (after {s})\n", .{ task.content, after_id }) catch {}; - } else { - stdout.print("- [ ] {s}\n", .{ task.content }) catch {}; - } - } - stdout.print("\n", .{}) catch {}; + stdout.print("- [ ] {s}\n", .{task.content}) catch {}; } + stdout.print("\n", .{}) catch {}; } } fn runEdit(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) Error!void { const stdout = std.fs.File.stdout().deprecatedWriter(); - // Check if we should block this operation in agent mode - const guardrails = @import("../guardrails.zig"); - if (guardrails.isAgentMode()) { + // Block edit in agent mode - agents should use append instead + const detect = @import("detect.zig"); + if (detect.isAgentMode()) { stdout.print("error: edit command blocked (ZAGI_AGENT is set)\n", .{}) catch {}; - stdout.print("reason: modifying tasks could cause data loss\n", .{}) catch {}; + stdout.print("hint: use 'tasks append' to add notes to a task\n", .{}) catch {}; return Error.InvalidCommand; } @@ -1083,8 +1097,8 @@ fn runEdit(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor const task = found_task.?; - // Update task content - allocator.free(task.content); // Free old content + // Replace task content + allocator.free(task.content); task.content = allocator.dupe(u8, new_content) catch return Error.AllocationError; // Save updated task list @@ -1099,12 +1113,9 @@ fn runEdit(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor const completed_str = if (task.completed) |comp| try std.fmt.allocPrint(allocator, "{}", .{comp}) else allocator.dupe(u8, "null") catch return Error.AllocationError; defer allocator.free(completed_str); - const after_str = if (task.after) |a| try std.fmt.allocPrint(allocator, "\"{s}\"", .{a}) else allocator.dupe(u8, "null") catch return Error.AllocationError; - defer allocator.free(after_str); - const json_output = try std.fmt.allocPrint(allocator, - "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s},\"after\":{s}}}", - .{ task.id, task.content, task.status, task.created, completed_str, after_str } + "{{\"id\":\"{s}\",\"content\":\"{s}\",\"status\":\"{s}\",\"created\":{},\"completed\":{s}}}", + .{ task.id, task.content, task.status, task.created, completed_str } ); defer allocator.free(json_output); @@ -1114,12 +1125,88 @@ fn runEdit(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repositor } } +fn runAppend(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) Error!void { + const stdout = std.fs.File.stdout().deprecatedWriter(); + + // Need at least: tasks append + if (args.len < 5) { + stdout.print("error: missing task ID or content\n\nusage: git tasks append \n", .{}) catch {}; + return Error.InvalidTaskId; + } + + const task_id = std.mem.sliceTo(args[3], 0); + + // Parse content arguments (everything from args[4] onwards) + var content_parts = std.ArrayList([]const u8){}; + defer content_parts.deinit(allocator); + + for (args[4..]) |arg| { + const arg_str = std.mem.sliceTo(arg, 0); + content_parts.append(allocator, arg_str) catch return Error.AllocationError; + } + + if (content_parts.items.len == 0) { + stdout.print("error: content cannot be empty\n", .{}) catch {}; + return Error.MissingTaskContent; + } + + // Join content parts with spaces + var content_buffer = std.ArrayList(u8){}; + defer content_buffer.deinit(allocator); + + for (content_parts.items, 0..) |part, i| { + if (i > 0) { + content_buffer.append(allocator, ' ') catch return Error.AllocationError; + } + content_buffer.appendSlice(allocator, part) catch return Error.AllocationError; + } + + const new_content = content_buffer.toOwnedSlice(allocator) catch return Error.AllocationError; + defer allocator.free(new_content); + + // Load task list + var task_list = loadTaskList(repo, allocator) catch |err| { + stdout.print("error: failed to load tasks: {}\n", .{err}) catch {}; + return err; + }; + defer task_list.deinit(allocator); + + // Find the task by ID + var found_task: ?*Task = null; + for (task_list.tasks.items) |*task| { + if (std.mem.eql(u8, task.id, task_id)) { + found_task = task; + break; + } + } + + if (found_task == null) { + stdout.print("error: task '{s}' not found\n", .{task_id}) catch {}; + return Error.TaskNotFound; + } + + const task = found_task.?; + + // Append new content to existing with newline separator + const appended = std.fmt.allocPrint(allocator, "{s}\n{s}", .{ task.content, new_content }) catch return Error.AllocationError; + allocator.free(task.content); + task.content = appended; + + // Save updated task list + saveTaskList(repo, task_list, allocator) catch |err| { + stdout.print("error: failed to save tasks: {}\n", .{err}) catch {}; + return err; + }; + + stdout.print("appended: {s}\n {s}\n", .{ task_id, new_content }) catch {}; +} + fn runDelete(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) Error!void { const stdout = std.fs.File.stdout().deprecatedWriter(); // Check if we should block this operation in agent mode - const guardrails = @import("../guardrails.zig"); - if (guardrails.isAgentMode()) { + const detect = @import("detect.zig"); + if (detect.isAgentMode()) { stdout.print("error: delete command blocked (ZAGI_AGENT is set)\n", .{}) catch {}; stdout.print("reason: deleting tasks causes permanent data loss\n", .{}) catch {}; return Error.InvalidCommand; @@ -1166,19 +1253,212 @@ fn runDelete(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_reposit return Error.TaskNotFound; } - // Check if any other tasks depend on this one - for (task_list.tasks.items) |task| { - if (task.after) |dependency_id| { - if (std.mem.eql(u8, dependency_id, task_id)) { - stdout.print("error: task '{s}' cannot be deleted (task '{s}' depends on it)\n", .{ task_id, task.id }) catch {}; - return Error.InvalidCommand; + // Remove the task + var removed_task = task_list.tasks.swapRemove(found_index.?); + + // Save updated task list + saveTaskList(repo, task_list, allocator) catch |err| { + stdout.print("error: failed to save tasks: {}\n", .{err}) catch {}; + removed_task.deinit(allocator); + return err; + }; + + // Output confirmation (must happen before deinit frees task_content) + if (use_json) { + stdout.print("{{\"deleted\":\"{s}\"}}\n", .{task_id}) catch {}; + } else { + stdout.print("deleted: {s}\n {s}\n", .{ task_id, task_content }) catch {}; + } + + removed_task.deinit(allocator); +} + +/// Parse markdown content and extract task items from numbered lists. +/// Supports formats like: +/// - "1. Task description" +/// - "1) Task description" +/// - "- [ ] Task description" (checkbox format) +/// - "- Task description" (bullet format) +fn parseTasksFromMarkdown(allocator: std.mem.Allocator, content: []const u8) !std.ArrayList([]const u8) { + var tasks = std.ArrayList([]const u8){}; + errdefer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + var lines = std.mem.splitScalar(u8, content, '\n'); + while (lines.next()) |line| { + const trimmed = std.mem.trim(u8, line, " \t\r"); + if (trimmed.len == 0) continue; + + var task_content: ?[]const u8 = null; + + // Check for numbered list: "1. " or "1) " + if (trimmed.len > 2) { + var i: usize = 0; + // Skip digits + while (i < trimmed.len and std.ascii.isDigit(trimmed[i])) { + i += 1; + } + // Check for ". " or ") " after digits + if (i > 0 and i < trimmed.len - 1) { + if ((trimmed[i] == '.' or trimmed[i] == ')') and trimmed[i + 1] == ' ') { + task_content = std.mem.trim(u8, trimmed[i + 2 ..], " \t"); + } + } + } + + // Check for checkbox format: "- [ ] " or "- [x] " + if (task_content == null and trimmed.len > 5) { + if (std.mem.startsWith(u8, trimmed, "- [ ] ") or std.mem.startsWith(u8, trimmed, "- [x] ") or std.mem.startsWith(u8, trimmed, "- [X] ")) { + task_content = std.mem.trim(u8, trimmed[6..], " \t"); + } + } + + // Check for bullet format: "- " + if (task_content == null and trimmed.len > 2) { + if (std.mem.startsWith(u8, trimmed, "- ")) { + task_content = std.mem.trim(u8, trimmed[2..], " \t"); + } + } + + // Add non-empty task content + if (task_content) |tc| { + if (tc.len > 0) { + const duped = try allocator.dupe(u8, tc); + try tasks.append(allocator, duped); } } } - // Remove the task - var removed_task = task_list.tasks.swapRemove(found_index.?); - removed_task.deinit(allocator); + return tasks; +} + +fn runImport(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_repository) Error!void { + const stdout = std.fs.File.stdout().deprecatedWriter(); + + // Need at least: tasks import + if (args.len < 4) { + stdout.print("error: missing file path\n\nusage: git tasks import [--dry-run]\n", .{}) catch {}; + return Error.InvalidCommand; + } + + // Parse arguments + var file_path: ?[]const u8 = null; + var dry_run = false; + var use_json = false; + + for (args[3..]) |arg| { + const a = std.mem.sliceTo(arg, 0); + if (std.mem.eql(u8, a, "--dry-run")) { + dry_run = true; + } else if (std.mem.eql(u8, a, "--json")) { + use_json = true; + } else if (file_path == null) { + file_path = a; + } + } + + if (file_path == null) { + stdout.print("error: missing file path\n\nusage: git tasks import [--dry-run]\n", .{}) catch {}; + return Error.InvalidCommand; + } + + // Read file content + const file = std.fs.cwd().openFile(file_path.?, .{}) catch { + stdout.print("error: cannot open file '{s}'\n", .{file_path.?}) catch {}; + return Error.FileNotFound; + }; + defer file.close(); + + const content = file.readToEndAlloc(allocator, 1024 * 1024) catch { // 1MB limit + stdout.print("error: failed to read file '{s}'\n", .{file_path.?}) catch {}; + return Error.FileReadError; + }; + defer allocator.free(content); + + // Parse tasks from markdown + var parsed_tasks = parseTasksFromMarkdown(allocator, content) catch { + stdout.print("error: failed to parse file\n", .{}) catch {}; + return Error.AllocationError; + }; + defer { + for (parsed_tasks.items) |t| allocator.free(t); + parsed_tasks.deinit(allocator); + } + + if (parsed_tasks.items.len == 0) { + stdout.print("error: no tasks found in '{s}'\n", .{file_path.?}) catch {}; + stdout.print("hint: tasks should be formatted as numbered or bulleted list items\n", .{}) catch {}; + return Error.NoTasksFound; + } + + // Preview mode + if (dry_run) { + stdout.print("preview: {} task{s} found in '{s}'\n\n", .{ + parsed_tasks.items.len, + if (parsed_tasks.items.len == 1) "" else "s", + file_path.?, + }) catch {}; + for (parsed_tasks.items, 0..) |task_content, i| { + stdout.print("{}: {s}\n", .{ i + 1, task_content }) catch {}; + } + stdout.print("\nrun without --dry-run to create these tasks\n", .{}) catch {}; + return; + } + + // Load existing task list + var task_list = loadTaskList(repo, allocator) catch |err| { + stdout.print("error: failed to load tasks: {}\n", .{err}) catch {}; + return err; + }; + defer task_list.deinit(allocator); + + // Create tasks + const now = std.time.timestamp(); + var created_ids = std.ArrayList([]const u8){}; + defer { + for (created_ids.items) |id| allocator.free(id); + created_ids.deinit(allocator); + } + + for (parsed_tasks.items) |task_content| { + // Generate new task ID + const task_id = task_list.generateId(allocator) catch return Error.AllocationError; + + // Duplicate content for task storage + const content_dupe = allocator.dupe(u8, task_content) catch { + allocator.free(task_id); + return Error.AllocationError; + }; + + const status_dupe = allocator.dupe(u8, "pending") catch { + allocator.free(task_id); + allocator.free(content_dupe); + return Error.AllocationError; + }; + + const new_task = Task{ + .id = task_id, + .content = content_dupe, + .status = status_dupe, + .created = now, + }; + + task_list.tasks.append(allocator, new_task) catch { + allocator.free(task_id); + allocator.free(content_dupe); + allocator.free(status_dupe); + return Error.AllocationError; + }; + + // Track ID for output (make a copy since task_list owns the original) + const id_copy = allocator.dupe(u8, task_id) catch return Error.AllocationError; + created_ids.append(allocator, id_copy) catch { + allocator.free(id_copy); + return Error.AllocationError; + }; + } // Save updated task list saveTaskList(repo, task_list, allocator) catch |err| { @@ -1188,9 +1468,21 @@ fn runDelete(allocator: std.mem.Allocator, args: [][:0]u8, repo: ?*c.git_reposit // Output confirmation if (use_json) { - stdout.print("{{\"deleted\":\"{s}\"}}\n", .{task_id}) catch {}; + stdout.print("{{\"imported\":{},\"ids\":[", .{created_ids.items.len}) catch {}; + for (created_ids.items, 0..) |id, i| { + if (i > 0) stdout.print(",", .{}) catch {}; + stdout.print("\"{s}\"", .{id}) catch {}; + } + stdout.print("]}}\n", .{}) catch {}; } else { - stdout.print("deleted: {s}\n {s}\n", .{ task_id, task_content }) catch {}; + stdout.print("imported: {} task{s} from '{s}'\n\n", .{ + created_ids.items.len, + if (created_ids.items.len == 1) "" else "s", + file_path.?, + }) catch {}; + for (created_ids.items, 0..) |id, i| { + stdout.print(" {s}: {s}\n", .{ id, parsed_tasks.items[i] }) catch {}; + } } } @@ -1328,4 +1620,154 @@ test "TaskList.generateId - continues from loaded state" { try testing.expectEqualStrings("task-042", new_id); try testing.expectEqual(@as(u32, 43), task_list.next_id); +} + +test "parseTasksFromMarkdown - numbered list with periods" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + const content = + \\1. First task + \\2. Second task + \\3. Third task + ; + + var tasks = try parseTasksFromMarkdown(allocator, content); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 3), tasks.items.len); + try testing.expectEqualStrings("First task", tasks.items[0]); + try testing.expectEqualStrings("Second task", tasks.items[1]); + try testing.expectEqualStrings("Third task", tasks.items[2]); +} + +test "parseTasksFromMarkdown - numbered list with parentheses" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + const content = + \\1) First task + \\2) Second task + ; + + var tasks = try parseTasksFromMarkdown(allocator, content); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 2), tasks.items.len); + try testing.expectEqualStrings("First task", tasks.items[0]); + try testing.expectEqualStrings("Second task", tasks.items[1]); +} + +test "parseTasksFromMarkdown - bullet list" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + const content = + \\- Task one + \\- Task two + ; + + var tasks = try parseTasksFromMarkdown(allocator, content); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 2), tasks.items.len); + try testing.expectEqualStrings("Task one", tasks.items[0]); + try testing.expectEqualStrings("Task two", tasks.items[1]); +} + +test "parseTasksFromMarkdown - checkbox format" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + const content = + \\- [ ] Pending task + \\- [x] Completed task + \\- [X] Another completed + ; + + var tasks = try parseTasksFromMarkdown(allocator, content); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 3), tasks.items.len); + try testing.expectEqualStrings("Pending task", tasks.items[0]); + try testing.expectEqualStrings("Completed task", tasks.items[1]); + try testing.expectEqualStrings("Another completed", tasks.items[2]); +} + +test "parseTasksFromMarkdown - mixed content ignores non-task lines" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + const content = + \\# Plan + \\ + \\Some description text here. + \\ + \\1. First task + \\2. Second task + \\ + \\More explanation. + ; + + var tasks = try parseTasksFromMarkdown(allocator, content); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 2), tasks.items.len); + try testing.expectEqualStrings("First task", tasks.items[0]); + try testing.expectEqualStrings("Second task", tasks.items[1]); +} + +test "parseTasksFromMarkdown - empty content" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + var tasks = try parseTasksFromMarkdown(allocator, ""); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 0), tasks.items.len); +} + +test "parseTasksFromMarkdown - handles indentation" { + var arena = std.heap.ArenaAllocator.init(testing.allocator); + defer arena.deinit(); + const allocator = arena.allocator(); + + const content = + \\ 1. Indented task + \\ - Bullet with indent + ; + + var tasks = try parseTasksFromMarkdown(allocator, content); + defer { + for (tasks.items) |t| allocator.free(t); + tasks.deinit(allocator); + } + + try testing.expectEqual(@as(usize, 2), tasks.items.len); + try testing.expectEqualStrings("Indented task", tasks.items[0]); + try testing.expectEqualStrings("Bullet with indent", tasks.items[1]); } \ No newline at end of file diff --git a/src/guardrails.zig b/src/guardrails.zig index 70e3732..d95bafd 100644 --- a/src/guardrails.zig +++ b/src/guardrails.zig @@ -1,6 +1,6 @@ const std = @import("std"); -/// Guardrails for agent mode (ZAGI_AGENT). +/// Guardrails for agent mode. /// Blocks commands that can cause actual data loss. /// /// Philosophy: Only block commands where data is UNRECOVERABLE. @@ -184,11 +184,6 @@ fn hasArg(args: []const [:0]const u8, target: []const u8) bool { return false; } -/// Check if guardrails should be enforced. -pub fn isAgentMode() bool { - return std.posix.getenv("ZAGI_AGENT") != null; -} - // Tests const testing = std.testing; diff --git a/src/main.zig b/src/main.zig index 3d63949..42bc81b 100644 --- a/src/main.zig +++ b/src/main.zig @@ -8,6 +8,7 @@ const commit = @import("cmds/commit.zig"); const diff = @import("cmds/diff.zig"); const fork = @import("cmds/fork.zig"); const tasks = @import("cmds/tasks.zig"); +const agent = @import("cmds/agent.zig"); const git = @import("cmds/git.zig"); const version = "0.1.0"; @@ -21,6 +22,7 @@ const Command = enum { diff_cmd, fork_cmd, tasks_cmd, + agent_cmd, other, }; @@ -99,6 +101,9 @@ fn run(allocator: std.mem.Allocator, args: [][:0]u8) !void { } else if (std.mem.eql(u8, cmd, "tasks")) { current_command = .tasks_cmd; try tasks.run(allocator, args); + } else if (std.mem.eql(u8, cmd, "agent")) { + current_command = .agent_cmd; + try agent.run(allocator, args); } else { // Unknown command: pass through to git current_command = .other; @@ -121,6 +126,7 @@ fn printHelp(stdout: anytype) !void { \\ commit Create a commit \\ fork Manage parallel worktrees \\ tasks Task management for git repositories + \\ agent Execute RALPH loop to complete tasks \\ alias Create an alias to git \\ \\options: @@ -209,6 +215,7 @@ fn printUsageHelp(stderr: anytype, cmd: Command) void { .diff_cmd => diff.help, .fork_cmd => fork.help, .tasks_cmd => tasks.help, + .agent_cmd => agent.help, .other => "usage: git [args...]\n", }; diff --git a/src/passthrough.zig b/src/passthrough.zig index 9432896..09f8840 100644 --- a/src/passthrough.zig +++ b/src/passthrough.zig @@ -1,12 +1,13 @@ const std = @import("std"); const guardrails = @import("guardrails.zig"); +const detect = @import("cmds/detect.zig"); /// Pass through a command to git CLI pub fn run(allocator: std.mem.Allocator, args: [][:0]u8) !void { const stderr = std.fs.File.stderr().deprecatedWriter(); // Check guardrails in agent mode - if (guardrails.isAgentMode()) { + if (detect.isAgentMode()) { // Cast to const for checkBlocked const const_args: []const [:0]const u8 = @ptrCast(args); if (guardrails.checkBlocked(const_args)) |reason| { diff --git a/start.sh b/start.sh index 4eba6d2..825fee6 100755 --- a/start.sh +++ b/start.sh @@ -1,52 +1,175 @@ #!/bin/bash -# RALPH loop - run Claude Code over tasks in plan.md +# RALPH loop - run Claude over zagi tasks +# +# Usage: ./start.sh [options] +# Options: +# --delay Delay between tasks (default: 2) +# --once Run only one task then exit +# --dry-run Show what would run without executing +# --help Show this help -MODEL="${MODEL:-claude-sonnet-4-20250514}" +set -e + +# Defaults DELAY="${DELAY:-2}" -PLAN="${PLAN:-plan.md}" +ONCE=false +DRY_RUN=false + +# Auto-detect git tasks command +if [ -x "./zig-out/bin/zagi" ]; then + ZAGI="./zig-out/bin/zagi" +else + echo "error: ./zig-out/bin/zagi not found. Build with 'zig build' first." + exit 1 +fi + +# Parse arguments +while [[ $# -gt 0 ]]; do + case $1 in + --delay) + DELAY="$2" + shift 2 + ;; + --once) + ONCE=true + shift + ;; + --dry-run) + DRY_RUN=true + shift + ;; + --help|-h) + sed -n '2,10p' "$0" | sed 's/^# //' | sed 's/^#//' + exit 0 + ;; + *) + echo "Unknown option: $1" + exit 1 + ;; + esac +done + +# Check if we're in a git repo +if ! git rev-parse --git-dir > /dev/null 2>&1; then + echo "error: not a git repository" + exit 1 +fi echo "RALPH loop starting..." -echo "Plan: $PLAN" -echo "Model: $MODEL" +if [ "$DRY_RUN" = true ]; then + echo "(dry-run mode)" +fi +echo "" + +# Show current tasks +$ZAGI tasks list 2>/dev/null || echo "No tasks found." echo "" while true; do - # Get first unchecked task from plan.md - TASK=$(grep -m1 '^\- \[ \]' "$PLAN" 2>/dev/null | sed 's/^- \[ \] //') + # Get all tasks as JSON and find first pending + TASKS_JSON=$($ZAGI tasks list --json 2>/dev/null || echo '{"tasks":[]}') + + # Extract first pending task using basic parsing + # Look for pending tasks in the JSON + TASK_LINE=$(echo "$TASKS_JSON" | tr ',' '\n' | grep -A5 '"status":"pending"' | head -6) + + if [ -z "$TASK_LINE" ]; then + echo "" + echo "=== All tasks complete! ===" + exit 0 + fi - if [ -z "$TASK" ]; then + # Extract task ID and content + TASK_ID=$(echo "$TASKS_JSON" | python3 -c " +import sys, json +data = json.load(sys.stdin) +for t in data.get('tasks', []): + if t.get('status') == 'pending': + print(t['id']) + break +" 2>/dev/null || echo "") + + TASK_CONTENT=$(echo "$TASKS_JSON" | python3 -c " +import sys, json +data = json.load(sys.stdin) +for t in data.get('tasks', []): + if t.get('status') == 'pending': + print(t['content']) + break +" 2>/dev/null || echo "") + + if [ -z "$TASK_ID" ]; then echo "" echo "=== All tasks complete! ===" exit 0 fi - echo "=== Next task ===" - echo "$TASK" + echo "=== Working on $TASK_ID ===" + echo "$TASK_CONTENT" echo "" - PROMPT="You are working through plan.md one task at a time. + # Build the prompt + PROMPT="You are working on: $TASK_ID -Current task: $TASK +Task: $TASK_CONTENT Instructions: -1. Read AGENTS.md for project context and build instructions -2. Complete this ONE task only -3. Verify your work (run tests, check build) -4. Commit your changes with: git commit -m \"\" --prompt \"$TASK\" -5. Mark the task done in plan.md by changing [ ] to [x] -6. If you learn critical operational details (e.g. how to build), update AGENTS.md +1. Read CONTEXT.md for mission context and current focus +2. Read AGENTS.md for build instructions and conventions +3. Complete this ONE task only +4. Verify your work (run tests, check build) +5. Commit your changes with: git commit -m \"\" +6. Output a COMPLETION PROMISE (see below) +7. Mark the task done: $ZAGI tasks done $TASK_ID + +COMPLETION PROMISE (required before marking task done): +Before calling \`$ZAGI tasks done\`, you MUST output the following confirmation: + +COMPLETION PROMISE: I confirm that: +- Tests pass: [which tests ran, summary of results] +- Build succeeds: [build command used, confirmation of no errors] +- Changes committed: [commit hash, commit message] +- Only this task was modified: [list of files changed, confirm no scope creep] +-- I have not taken any shortcuts or skipped any verification steps. + +Do NOT mark the task done without outputting this promise first. Rules: - NEVER git push (only commit) - ONLY work on this one task - Exit when done so the next task can start" - # Run Claude Code in headless mode with permissions bypassed - claude --print --dangerously-skip-permissions --model "$MODEL" "$PROMPT" + if [ "$DRY_RUN" = true ]; then + echo "Would execute:" + echo " claude -p \"\"" + echo "" + echo "Prompt preview:" + echo "$PROMPT" | head -10 + echo "..." + echo "" + else + # Run Claude in headless mode with streaming JSON output + export ZAGI_AGENT=claude + TASK_LOG="logs/${TASK_ID}.json" + mkdir -p logs + echo "Streaming to: $TASK_LOG" + + # Use CC_CMD if set, otherwise default to claude with skip-permissions + # Note: can't use 'cc' alias since shell aliases don't work in scripts + CC="${CC_CMD:-claude --dangerously-skip-permissions}" + $CC -p --verbose --output-format stream-json "$PROMPT" 2>&1 | tee "$TASK_LOG" + fi echo "" echo "=== Task iteration complete ===" echo "" - sleep "$DELAY" + if [ "$ONCE" = true ]; then + echo "Exiting after one task (--once flag)" + exit 0 + fi + + if [ "$DRY_RUN" = false ]; then + sleep "$DELAY" + fi done diff --git a/test/agent.log b/test/agent.log new file mode 100644 index 0000000..aa3d7a5 --- /dev/null +++ b/test/agent.log @@ -0,0 +1 @@ +=== Interactive planning session started === diff --git a/test/fixtures/setup.ts b/test/fixtures/setup.ts index 50e42ff..da44fa1 100644 --- a/test/fixtures/setup.ts +++ b/test/fixtures/setup.ts @@ -3,7 +3,8 @@ import { existsSync, mkdirSync, writeFileSync, rmSync, readFileSync } from "fs"; import { resolve } from "path"; const FIXTURES_BASE = resolve(__dirname, "repos"); -const COMMIT_COUNT = 100; +// Reduced from 100 to 20 - enough for testing pagination/filtering but much faster +const COMMIT_COUNT = 20; // Generate unique IDs for parallel safety function uid() { diff --git a/test/src/add.test.ts b/test/src/add.test.ts index 012ce88..d06fc07 100644 --- a/test/src/add.test.ts +++ b/test/src/add.test.ts @@ -1,18 +1,20 @@ import { describe, test, expect, beforeEach, afterEach } from "vitest"; -import { rmSync } from "fs"; -import { createFixtureRepo } from "../fixtures/setup"; -import { zagi, git } from "./shared"; +import { writeFileSync, mkdirSync } from "fs"; +import { resolve } from "path"; +import { zagi, git, createTestRepo, cleanupTestRepo } from "./shared"; let REPO_DIR: string; +// Use lightweight repo - these tests don't need multiple commits beforeEach(() => { - REPO_DIR = createFixtureRepo(); + REPO_DIR = createTestRepo(); + // Create an untracked file for testing + mkdirSync(resolve(REPO_DIR, "src"), { recursive: true }); + writeFileSync(resolve(REPO_DIR, "src/new-file.ts"), "// New file\n"); }); afterEach(() => { - if (REPO_DIR) { - rmSync(REPO_DIR, { recursive: true, force: true }); - } + cleanupTestRepo(REPO_DIR); }); describe("zagi add", () => { diff --git a/test/src/agent.test.ts b/test/src/agent.test.ts new file mode 100644 index 0000000..860b415 --- /dev/null +++ b/test/src/agent.test.ts @@ -0,0 +1,1240 @@ +import { describe, test, expect, beforeEach, afterEach } from "vitest"; +import { writeFileSync, chmodSync, existsSync, readFileSync } from "fs"; +import { resolve } from "path"; +import { zagi, createTestRepo, cleanupTestRepo } from "./shared"; + +let REPO_DIR: string; + +// Use lightweight repo - these tests don't need multiple commits +beforeEach(() => { + REPO_DIR = createTestRepo(); +}); + +afterEach(() => { + cleanupTestRepo(REPO_DIR); +}); + +// ============================================================================ +// Helper: Create mock executor scripts +// ============================================================================ + +/** + * Creates a mock executor script that always succeeds. + * Returns the path to the script. + */ +function createSuccessExecutor(repoDir: string): string { + const scriptPath = resolve(repoDir, "mock-success.sh"); + const script = `#!/bin/bash +# Output completion promise (required for agent run to succeed) +echo "" +echo "COMPLETION PROMISE: I confirm that:" +echo "- Tests pass: N/A" +echo "- Build succeeds: N/A" +echo "- Changes committed: N/A" +echo "- Task completed: Mock task" +echo "-- I have not taken any shortcuts or skipped verification." +exit 0 +`; + writeFileSync(scriptPath, script); + chmodSync(scriptPath, 0o755); + return scriptPath; +} + +/** + * Creates a mock executor script that always fails. + * Returns the path to the script. + */ +function createFailureExecutor(repoDir: string): string { + const scriptPath = resolve(repoDir, "mock-failure.sh"); + writeFileSync(scriptPath, "#!/bin/bash\nexit 1\n"); + chmodSync(scriptPath, 0o755); + return scriptPath; +} + +/** + * Creates a mock executor script that fails N times, then succeeds. + * Uses a counter file to track invocations. + */ +function createFlakeyExecutor(repoDir: string, failCount: number): string { + const scriptPath = resolve(repoDir, "mock-flakey.sh"); + const counterPath = resolve(repoDir, "invoke-counter.txt"); + + // Initialize counter + writeFileSync(counterPath, "0"); + + // Script increments counter and fails if count <= failCount + const script = `#!/bin/bash +COUNTER_FILE="${counterPath}" +COUNT=$(cat "$COUNTER_FILE") +COUNT=$((COUNT + 1)) +echo "$COUNT" > "$COUNTER_FILE" +if [ "$COUNT" -le ${failCount} ]; then + exit 1 +fi +# Output completion promise (required for agent run to succeed) +echo "" +echo "COMPLETION PROMISE: I confirm that:" +echo "- Tests pass: N/A" +echo "- Build succeeds: N/A" +echo "- Changes committed: N/A" +echo "- Task completed: Mock task" +echo "-- I have not taken any shortcuts or skipped verification." +exit 0 +`; + writeFileSync(scriptPath, script); + chmodSync(scriptPath, 0o755); + return scriptPath; +} + +/** + * Creates a mock executor that marks the task as done. + * This simulates a real agent completing its work. + */ +function createTaskCompletingExecutor(repoDir: string, zagiPath: string): string { + const scriptPath = resolve(repoDir, "mock-complete.sh"); + // The prompt contains the task ID - extract and mark done + // Format: "Task ID: task-XXX\n..." + const script = `#!/bin/bash +PROMPT="$1" +TASK_ID=$(echo "$PROMPT" | head -1 | sed 's/Task ID: //') + +# Mark task as done (let errors show for debugging) +${zagiPath} tasks done "$TASK_ID" + +# Output completion promise (required for agent run to succeed) +echo "" +echo "COMPLETION PROMISE: I confirm that:" +echo "- Tests pass: N/A" +echo "- Build succeeds: N/A" +echo "- Changes committed: N/A" +echo "- Task completed: Mock task" +echo "-- I have not taken any shortcuts or skipped verification." + +exit 0 +`; + writeFileSync(scriptPath, script); + chmodSync(scriptPath, 0o755); + return scriptPath; +} + +/** + * Creates a mock executor that logs all arguments to a file. + * This allows us to verify exactly what arguments were passed. + */ +function createArgLoggingExecutor(repoDir: string): { script: string; logFile: string } { + const scriptPath = resolve(repoDir, "mock-log-args.sh"); + const logFile = resolve(repoDir, "args.log"); + + const script = `#!/bin/bash +# Log each argument on a separate line +for arg in "$@"; do + echo "$arg" >> "${logFile}" +done +echo "---END---" >> "${logFile}" + +# Output completion promise to stdout (required for agent run to succeed) +echo "" +echo "COMPLETION PROMISE: I confirm that:" +echo "- Tests pass: N/A" +echo "- Build succeeds: N/A" +echo "- Changes committed: N/A" +echo "- Task completed: Mock task" +echo "-- I have not taken any shortcuts or skipped verification." + +exit 0 +`; + writeFileSync(scriptPath, script); + chmodSync(scriptPath, 0o755); + + return { script: scriptPath, logFile }; +} + +/** + * Creates an executor with multiple arguments that logs them. + * Returns both the base command path and the args log file. + */ +function createMultiArgExecutor(repoDir: string): { script: string; logFile: string } { + const scriptPath = resolve(repoDir, "mock-multi-arg.sh"); + const logFile = resolve(repoDir, "multi-args.log"); + + // Script logs: the script name ($0), all args ($@), and arg count ($#) + const script = `#!/bin/bash +echo "ARG_COUNT: $#" >> "${logFile}" +for arg in "$@"; do + echo "ARG: $arg" >> "${logFile}" +done + +# Output completion promise to stdout (required for agent run to succeed) +echo "" +echo "COMPLETION PROMISE: I confirm that:" +echo "- Tests pass: N/A" +echo "- Build succeeds: N/A" +echo "- Changes committed: N/A" +echo "- Task completed: Mock task" +echo "-- I have not taken any shortcuts or skipped verification." + +exit 0 +`; + writeFileSync(scriptPath, script); + chmodSync(scriptPath, 0o755); + + return { script: scriptPath, logFile }; +} + +// ============================================================================ +// Subcommand Routing +// ============================================================================ + +describe("zagi agent subcommand routing", () => { + test("shows help when no subcommand provided", () => { + const result = zagi(["agent"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent "); + expect(result).toContain("Commands:"); + expect(result).toContain("run"); + expect(result).toContain("plan"); + }); + + test("-h flag shows help", () => { + const result = zagi(["agent", "-h"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent "); + expect(result).toContain("Commands:"); + }); + + test("--help flag shows help", () => { + const result = zagi(["agent", "--help"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent "); + expect(result).toContain("Commands:"); + }); + + test("unknown subcommand shows error with help", () => { + const result = zagi(["agent", "unknown"], { cwd: REPO_DIR }); + + expect(result).toContain("error: unknown command 'unknown'"); + expect(result).toContain("usage: git agent "); + }); + + test("unknown subcommand with special characters shows error", () => { + const result = zagi(["agent", "--invalid-flag"], { cwd: REPO_DIR }); + + expect(result).toContain("error: unknown command '--invalid-flag'"); + }); + + test("help mentions environment variables", () => { + const result = zagi(["agent"], { cwd: REPO_DIR }); + + expect(result).toContain("ZAGI_AGENT"); + expect(result).toContain("ZAGI_AGENT_CMD"); + expect(result).toContain("claude"); + expect(result).toContain("opencode"); + }); +}); + +// ============================================================================ +// Plan Args: --help, --dry-run, description handling +// ============================================================================ + +describe("zagi agent plan args", () => { + test("-h flag shows plan help", () => { + const result = zagi(["agent", "plan", "-h"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent plan"); + expect(result).toContain("--dry-run"); + expect(result).toContain("-h, --help"); + }); + + test("--help flag shows plan help", () => { + const result = zagi(["agent", "plan", "--help"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent plan"); + expect(result).toContain("description"); + }); + + test("description is optional (interactive mode)", () => { + const { script } = createArgLoggingExecutor(REPO_DIR); + + // Plan without description should work + const result = zagi(["agent", "plan"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: script } + }); + + expect(result).toContain("Starting Interactive Planning Session"); + expect(result).not.toContain("error"); + }); + + test("unknown option shows error", () => { + const result = zagi(["agent", "plan", "--unknown-option"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("error: unknown option '--unknown-option'"); + }); + + test("help shows examples", () => { + const result = zagi(["agent", "plan", "--help"], { cwd: REPO_DIR }); + + expect(result).toContain("Examples:"); + expect(result).toContain("git agent plan"); + }); +}); + +// ============================================================================ +// Run Args: --help, --delay, --max-tasks +// ============================================================================ + +describe("zagi agent run args", () => { + test("-h flag shows run help", () => { + const result = zagi(["agent", "run", "-h"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent run"); + expect(result).toContain("--once"); + expect(result).toContain("--dry-run"); + expect(result).toContain("--delay"); + expect(result).toContain("--max-tasks"); + }); + + test("--help flag shows run help", () => { + const result = zagi(["agent", "run", "--help"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent run"); + expect(result).toContain("Options:"); + }); + + test("--delay flag requires a value", () => { + const result = zagi(["agent", "run", "--delay"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("error: --delay requires a number of seconds"); + }); + + test("--delay flag validates numeric input", () => { + const result = zagi(["agent", "run", "--delay", "abc"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("error: invalid delay value 'abc'"); + }); + + test("--delay accepts valid numeric value", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--delay", "5", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("dry-run mode"); + expect(result).not.toContain("error"); + }); + + test("--max-tasks flag requires a value", () => { + const result = zagi(["agent", "run", "--max-tasks"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("error: --max-tasks requires a number"); + }); + + test("--max-tasks flag validates numeric input", () => { + const result = zagi(["agent", "run", "--max-tasks", "not-a-number"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("error: invalid max-tasks value 'not-a-number'"); + }); + + test("--max-tasks accepts valid numeric value", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--max-tasks", "10", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("dry-run mode"); + expect(result).not.toContain("error"); + }); + + test("unknown option shows error", () => { + const result = zagi(["agent", "run", "--unknown-flag"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("error: unknown option '--unknown-flag'"); + }); + + test("help shows examples", () => { + const result = zagi(["agent", "run", "--help"], { cwd: REPO_DIR }); + + expect(result).toContain("Examples:"); + expect(result).toContain("git agent run"); + expect(result).toContain("git agent run --once"); + }); + + test("multiple flags can be combined", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--once", "--dry-run", "--delay", "0", "--max-tasks", "5"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("dry-run mode"); + expect(result).toContain("Starting task:"); + expect(result).not.toContain("error"); + }); +}); + +// ============================================================================ +// Executor Paths: claude default, opencode, ZAGI_AGENT_CMD override +// ============================================================================ + +describe("zagi agent executor paths", () => { + test("claude is the default executor", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + // Without ZAGI_AGENT set, should default to claude + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR + }); + + expect(result).toContain("Executor: claude"); + expect(result).toContain("Would execute:"); + expect(result).toContain("claude -p"); + }); + + test("ZAGI_AGENT=claude uses claude executor", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain("Executor: claude"); + expect(result).toContain("claude -p"); + }); + + test("ZAGI_AGENT=opencode uses opencode executor", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "opencode" } + }); + + expect(result).toContain("Executor: opencode"); + expect(result).toContain("Would execute:"); + expect(result).toContain("opencode run"); + }); + + test("ZAGI_AGENT_CMD overrides default executor", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-custom-agent --flag" } + }); + + expect(result).toContain("Would execute:"); + expect(result).toContain("my-custom-agent --flag"); + }); + + test("ZAGI_AGENT_CMD overrides ZAGI_AGENT when both set", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { + ZAGI_AGENT: "opencode", + ZAGI_AGENT_CMD: "custom-cmd" + } + }); + + // Custom command should win + expect(result).toContain("Would execute:"); + expect(result).toContain("custom-cmd"); + expect(result).not.toContain("opencode run"); + }); + + test("invalid ZAGI_AGENT value shows error", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "invalid-value" } + }); + + expect(result).toContain("error: invalid ZAGI_AGENT value 'invalid-value'"); + expect(result).toContain("valid values: claude, opencode"); + }); + + test("ZAGI_AGENT=1 is invalid (not a valid executor name)", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "1" } + }); + + expect(result).toContain("error: invalid ZAGI_AGENT value"); + }); + + test("plan subcommand uses claude in interactive mode (no -p flag)", () => { + const result = zagi(["agent", "plan", "--dry-run"], { + cwd: REPO_DIR + }); + + // In plan mode, claude runs interactively (no -p flag) + expect(result).toContain("Would execute:"); + expect(result).toContain("claude"); + expect(result).not.toMatch(/claude -p/); + }); + + test("plan subcommand uses opencode in interactive mode (no run subcommand)", () => { + const result = zagi(["agent", "plan", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "opencode" } + }); + + // In plan mode, opencode runs interactively (no run subcommand) + expect(result).toContain("Would execute:"); + expect(result).toContain("opencode"); + // Should NOT contain "opencode run" since that's for headless mode + expect(result).not.toMatch(/opencode run/); + }); +}); + +// ============================================================================ +// Agent Run: Basic RALPH Loop Behavior +// ============================================================================ + +describe("zagi agent run RALPH loop", () => { + test("exits immediately when no pending tasks", () => { + const executor = createSuccessExecutor(REPO_DIR); + + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("No pending tasks remaining"); + expect(result).toContain("All tasks complete"); + }); + + test("runs single task with --once flag", () => { + const executor = createSuccessExecutor(REPO_DIR); + + // Add a task + zagi(["tasks", "add", "Test task one"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("Starting task: task-001"); + expect(result).toContain("Task completed successfully"); + expect(result).toContain("Exiting after one task (--once flag set)"); + }); + + test("processes multiple tasks in sequence", () => { + // Use an executor that marks tasks done + const zagiPath = resolve(__dirname, "../../zig-out/bin/zagi"); + const executor = createTaskCompletingExecutor(REPO_DIR, zagiPath); + + // Add multiple tasks + zagi(["tasks", "add", "Task one"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task two"], { cwd: REPO_DIR }); + + // Tasks will be marked done, so both should be processed + const result = zagi(["agent", "run", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("Starting task: task-001"); + expect(result).toContain("Starting task: task-002"); + expect(result).toContain("2 tasks processed"); + }); + + test("respects --max-tasks safety limit", () => { + const executor = createSuccessExecutor(REPO_DIR); + + // Add more tasks than max + zagi(["tasks", "add", "Task one"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task two"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task three"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--max-tasks", "2", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("Reached maximum task limit (2)"); + expect(result).toContain("2 tasks processed"); + }); +}); + +// ============================================================================ +// Agent Run: Consecutive Failure Tracking +// ============================================================================ + +describe("zagi agent run consecutive failure counting", () => { + test("tracks consecutive failures for same task", () => { + const executor = createFailureExecutor(REPO_DIR); + + zagi(["tasks", "add", "Failing task"], { cwd: REPO_DIR }); + + // Run with --max-tasks to limit iterations + const result = zagi(["agent", "run", "--max-tasks", "5", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + // Should show increasing failure counts + expect(result).toContain("Task failed (1 consecutive failures)"); + expect(result).toContain("Task failed (2 consecutive failures)"); + expect(result).toContain("Task failed (3 consecutive failures)"); + expect(result).toContain("Skipping task after 3 consecutive failures"); + }); + + test("increments failure counter on each failure", () => { + const executor = createFailureExecutor(REPO_DIR); + + zagi(["tasks", "add", "Will fail"], { cwd: REPO_DIR }); + + // Run with enough iterations to see 3 failures + const result = zagi(["agent", "run", "--max-tasks", "4", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + // Count failure messages + const failureMatches = result.match(/Task failed \(\d+ consecutive failures\)/g); + expect(failureMatches).toBeTruthy(); + expect(failureMatches!.length).toBe(3); + }); + + test("resets failure counter on success", () => { + // Create a flakey executor that fails twice, then succeeds + const executor = createFlakeyExecutor(REPO_DIR, 2); + + zagi(["tasks", "add", "Flakey task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--max-tasks", "4", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + // Should fail twice, then succeed + expect(result).toContain("Task failed (1 consecutive failures)"); + expect(result).toContain("Task failed (2 consecutive failures)"); + expect(result).toContain("Task completed successfully"); + + // Should NOT show 3 failures - it recovered + expect(result).not.toContain("Task failed (3 consecutive failures)"); + }); +}); + +// ============================================================================ +// Agent Run: Max Failures Exit Condition +// ============================================================================ + +describe("zagi agent run max failures exit condition", () => { + test("skips task after 3 consecutive failures", () => { + const executor = createFailureExecutor(REPO_DIR); + + zagi(["tasks", "add", "Broken task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("Skipping task after 3 consecutive failures"); + expect(result).toContain("All remaining tasks have failed 3+ times"); + }); + + test("exits when all tasks exceed failure threshold", () => { + const executor = createFailureExecutor(REPO_DIR); + + // Add multiple tasks - all will fail + zagi(["tasks", "add", "Task one"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task two"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + // Each task should fail 3 times + expect(result).toContain("All remaining tasks have failed 3+ times"); + expect(result).toContain("RALPH loop completed"); + }); + + test("continues with other tasks when one exceeds failure threshold", () => { + // Create a script that fails for task-001 but succeeds for task-002 + const smartScript = resolve(REPO_DIR, "mock-smart.sh"); + writeFileSync(smartScript, `#!/bin/bash +PROMPT="$1" +if echo "$PROMPT" | grep -q "task-001"; then + exit 1 +fi +# Output completion promise for task-002 +echo "" +echo "COMPLETION PROMISE: I confirm that:" +echo "- Tests pass: N/A" +echo "- Build succeeds: N/A" +echo "- Changes committed: N/A" +echo "- Task completed: Mock task" +echo "-- I have not taken any shortcuts or skipped verification." +exit 0 +`); + chmodSync(smartScript, 0o755); + + zagi(["tasks", "add", "Will always fail"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Will succeed"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--max-tasks", "10", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: smartScript } + }); + + // First task should fail 3 times + expect(result).toContain("Skipping task after 3 consecutive failures"); + + // Second task should eventually be attempted and succeed + expect(result).toContain("Starting task: task-002"); + expect(result).toContain("Task completed successfully"); + }); + + test("uses exactly 3 as the failure threshold", () => { + // Executor fails exactly twice, then succeeds + const executor = createFlakeyExecutor(REPO_DIR, 2); + + zagi(["tasks", "add", "Recovers after 2 failures"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--max-tasks", "4", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + // Should succeed on third attempt (2 failures is below threshold) + expect(result).toContain("Task completed successfully"); + expect(result).not.toContain("Skipping task after 3 consecutive failures"); + }); +}); + +// ============================================================================ +// Agent Run: Dry Run Mode +// ============================================================================ + +describe("zagi agent run --dry-run", () => { + test("shows what would run without executing", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + // Use ZAGI_AGENT_CMD to avoid trying to run actual claude command + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("dry-run mode"); + expect(result).toContain("Starting task: task-001"); + expect(result).toContain("Would execute:"); + }); + + test("dry-run shows custom executor command", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "aider --yes" } + }); + + expect(result).toContain("Would execute:"); + expect(result).toContain("aider --yes"); + }); + + test("dry-run respects --max-tasks", () => { + zagi(["tasks", "add", "Task one"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task two"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task three"], { cwd: REPO_DIR }); + + // Use ZAGI_AGENT_CMD to avoid validation issues + const result = zagi(["agent", "run", "--dry-run", "--max-tasks", "2"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + // In dry-run mode without marking tasks done, it will keep looping on the same task + // until max-tasks is reached + expect(result).toContain("Starting task: task-001"); + expect(result).toContain("Reached maximum task limit (2)"); + expect(result).toContain("2 tasks processed"); + }); +}); + +// ============================================================================ +// Agent Run: Task Completion Integration +// ============================================================================ + +describe("zagi agent run task completion", () => { + test("loops until tasks are marked done", () => { + // Get the zagi binary path + const zagiPath = resolve(__dirname, "../../zig-out/bin/zagi"); + const executor = createTaskCompletingExecutor(REPO_DIR, zagiPath); + + zagi(["tasks", "add", "Complete me"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("Starting task: task-001"); + expect(result).toContain("Task completed successfully"); + expect(result).toContain("No pending tasks remaining"); + expect(result).toContain("All tasks complete"); + + // Verify task is actually marked done (uses checkmark symbol) + const listResult = zagi(["tasks", "list"], { cwd: REPO_DIR }); + expect(listResult).toContain("[✓] task-001"); + expect(listResult).toContain("(0 pending, 1 completed)"); + }); + + test("processes all tasks until completion", () => { + const zagiPath = resolve(__dirname, "../../zig-out/bin/zagi"); + const executor = createTaskCompletingExecutor(REPO_DIR, zagiPath); + + zagi(["tasks", "add", "First task"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Second task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("Starting task: task-001"); + expect(result).toContain("Starting task: task-002"); + expect(result).toContain("No pending tasks remaining"); + expect(result).toContain("2 tasks processed"); + + // Verify both tasks done + const listResult = zagi(["tasks", "list"], { cwd: REPO_DIR }); + expect(listResult).toContain("(0 pending, 2 completed)"); + }); +}); + +// ============================================================================ +// Agent Run: Error Handling +// ============================================================================ + +describe("zagi agent run error handling", () => { + test("invalid ZAGI_AGENT value shows error", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "invalid-executor" } + }); + + expect(result).toContain("error: invalid ZAGI_AGENT value"); + expect(result).toContain("valid values: claude, opencode"); + expect(result).toContain("use ZAGI_AGENT_CMD for custom executors"); + }); + + test("ZAGI_AGENT_CMD bypasses ZAGI_AGENT validation", () => { + const executor = createSuccessExecutor(REPO_DIR); + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + // Even with invalid ZAGI_AGENT, custom cmd should work + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { + ZAGI_AGENT: "invalid", + ZAGI_AGENT_CMD: executor + } + }); + + expect(result).toContain("Task completed successfully"); + expect(result).not.toContain("error: invalid ZAGI_AGENT"); + }); +}); + +// ============================================================================ +// Agent Run: ZAGI_AGENT_CMD Override +// ============================================================================ + +describe("zagi agent run ZAGI_AGENT_CMD override", () => { + test("uses custom command instead of default executor", () => { + const { script, logFile } = createArgLoggingExecutor(REPO_DIR); + + zagi(["tasks", "add", "Test custom command"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: script } + }); + + expect(result).toContain("Task completed successfully"); + + // Verify the custom script was called (log file exists and has content) + const { readFileSync } = require("fs"); + const logContent = readFileSync(logFile, "utf-8"); + expect(logContent).toContain("---END---"); // Script was executed + + // The prompt should be passed as the argument + expect(logContent).toContain("Task ID: task-001"); + }); + + test("handles command with spaces and multiple arguments", () => { + const { script, logFile } = createMultiArgExecutor(REPO_DIR); + + zagi(["tasks", "add", "Test multi-arg command"], { cwd: REPO_DIR }); + + // Command with multiple space-separated arguments + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: `${script} --yes --model gpt-4` } + }); + + expect(result).toContain("Task completed successfully"); + + // Verify all arguments were passed correctly + const { readFileSync } = require("fs"); + const logContent = readFileSync(logFile, "utf-8"); + + // Should have: --yes, --model, gpt-4, and the prompt = 4 args + expect(logContent).toContain("ARG_COUNT: 4"); + expect(logContent).toContain("ARG: --yes"); + expect(logContent).toContain("ARG: --model"); + expect(logContent).toContain("ARG: gpt-4"); + // The prompt is the last argument + expect(logContent).toMatch(/ARG:.*Task ID: task-001/); + }); + + test("prompt is appended as final argument", () => { + const { script, logFile } = createMultiArgExecutor(REPO_DIR); + + zagi(["tasks", "add", "Test prompt positioning"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: `${script} --first --second` } + }); + + expect(result).toContain("Task completed successfully"); + + // Parse log to verify argument order + const { readFileSync } = require("fs"); + const logContent = readFileSync(logFile, "utf-8"); + const lines = logContent.split("\n").filter((l: string) => l.startsWith("ARG: ")); + + // Arguments should be: --first, --second, + expect(lines.length).toBe(3); + expect(lines[0]).toBe("ARG: --first"); + expect(lines[1]).toBe("ARG: --second"); + expect(lines[2]).toContain("Task ID: task-001"); // Prompt is last + }); + + test("dry-run shows ZAGI_AGENT_CMD in output", () => { + zagi(["tasks", "add", "Test dry-run display"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-custom-agent --verbose --timeout 30" } + }); + + expect(result).toContain("dry-run mode"); + expect(result).toContain("Would execute:"); + expect(result).toContain("my-custom-agent --verbose --timeout 30"); + }); + + test("custom command overrides ZAGI_AGENT completely", () => { + const { script, logFile } = createArgLoggingExecutor(REPO_DIR); + + zagi(["tasks", "add", "Test override"], { cwd: REPO_DIR }); + + // Set both ZAGI_AGENT and ZAGI_AGENT_CMD - CMD should win + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { + ZAGI_AGENT: "opencode", + ZAGI_AGENT_CMD: script + } + }); + + expect(result).toContain("Task completed successfully"); + + // Verify our custom script was used (not opencode) + const { readFileSync } = require("fs"); + const logContent = readFileSync(logFile, "utf-8"); + expect(logContent).toContain("---END---"); // Our script ran + expect(logContent).toContain("Task ID:"); // Got the prompt + }); +}); + +// ============================================================================ +// Agent Plan: ZAGI_AGENT_CMD Override +// ============================================================================ + +describe("zagi agent plan ZAGI_AGENT_CMD override", () => { + test("dry-run shows custom command", () => { + const result = zagi(["agent", "plan", "--dry-run", "Test planning"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-planner --interactive" } + }); + + expect(result).toContain("Interactive Planning Session (dry-run)"); + expect(result).toContain("Would execute:"); + expect(result).toContain("my-planner --interactive"); + }); + + test("dry-run shows custom command with multiple args", () => { + const result = zagi(["agent", "plan", "--dry-run", "Build feature X"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "aider --yes --model claude-3" } + }); + + expect(result).toContain("Would execute:"); + expect(result).toContain("aider --yes --model claude-3"); + }); + + test("custom command is used instead of default", () => { + const { script, logFile } = createArgLoggingExecutor(REPO_DIR); + + // This will actually execute the mock script + const result = zagi(["agent", "plan", "Test planning execution"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: script } + }); + + expect(result).toContain("Starting Interactive Planning Session"); + expect(result).toContain("Planning session completed"); + + // Verify script received the planning prompt + const { readFileSync } = require("fs"); + const logContent = readFileSync(logFile, "utf-8"); + expect(logContent).toContain("interactive planning agent"); + expect(logContent).toContain("Test planning execution"); // The description + }); +}); + +// ============================================================================ +// Agent Plan: Interactive Mode (no description required) +// ============================================================================ + +describe("zagi agent plan interactive mode", () => { + test("works without initial description", () => { + const { script, logFile } = createArgLoggingExecutor(REPO_DIR); + + // No description provided - should start interactive session + const result = zagi(["agent", "plan"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: script } + }); + + expect(result).toContain("Starting Interactive Planning Session"); + expect(result).toContain("Planning session completed"); + + // Verify script received the interactive prompt + const { readFileSync } = require("fs"); + const logContent = readFileSync(logFile, "utf-8"); + expect(logContent).toContain("interactive planning agent"); + expect(logContent).toContain("start by asking what the user wants to build"); + }); + + test("dry-run without description shows will ask user", () => { + const result = zagi(["agent", "plan", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-agent" } + }); + + expect(result).toContain("Interactive Planning Session (dry-run)"); + expect(result).toContain("Initial context: (none - will ask user)"); + expect(result).toContain("Would execute:"); + }); + + test("dry-run with description shows the context", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add auth feature"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-agent" } + }); + + expect(result).toContain("Interactive Planning Session (dry-run)"); + expect(result).toContain("Initial context: Add auth feature"); + }); + + test("prompt includes interactive protocol phases", () => { + const result = zagi(["agent", "plan", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-agent" } + }); + + // Verify the prompt includes the 4-phase protocol + expect(result).toContain("PHASE 1: EXPLORE CODEBASE"); + expect(result).toContain("PHASE 2: ASK CLARIFYING QUESTIONS"); + expect(result).toContain("PHASE 3: PROPOSE PLAN"); + expect(result).toContain("PHASE 4: CREATE TASKS"); + expect(result).toContain("NEVER create tasks without explicit user approval"); + }); + + test("prompt includes clarifying question categories", () => { + const result = zagi(["agent", "plan", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-agent" } + }); + + // Verify the prompt includes question categories for scope, constraints, preferences + expect(result).toContain("SCOPE questions:"); + expect(result).toContain("CONSTRAINTS questions:"); + expect(result).toContain("PREFERENCES questions:"); + expect(result).toContain("ACCEPTANCE CRITERIA questions:"); + }); + + test("prompt emphasizes asking questions before drafting plan", () => { + const result = zagi(["agent", "plan", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "my-agent" } + }); + + // Verify the prompt emphasizes asking questions first + expect(result).toContain("DO NOT draft a plan yet"); + expect(result).toContain("ALWAYS ask clarifying questions BEFORE drafting a plan"); + }); +}); + +// ============================================================================ +// Error Conditions: Executor Not Found +// ============================================================================ + +describe("error conditions: executor not found", () => { + test("agent run shows error when executor command not found", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + // Use a non-existent command + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "/nonexistent/command/that/does/not/exist" } + }); + + // Should show error about execution failure + expect(result).toMatch(/error|fail|unable/i); + }); + + test("agent plan shows error when executor command not found", () => { + // Use a non-existent command + const result = zagi(["agent", "plan", "Test plan"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "/nonexistent/command/that/does/not/exist" } + }); + + // Should show error about execution failure + expect(result).toMatch(/error|fail/i); + }); + + test("agent run handles executor that exits with error code", () => { + const failScript = createFailureExecutor(REPO_DIR); + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: failScript } + }); + + expect(result).toContain("Task failed (1 consecutive failures)"); + }); + + test("agent run continues after executor failure with remaining tasks", () => { + const failScript = createFailureExecutor(REPO_DIR); + zagi(["tasks", "add", "Task 1"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Task 2"], { cwd: REPO_DIR }); + + const result = zagi(["agent", "run", "--max-tasks", "4", "--delay", "0"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: failScript } + }); + + // Should attempt multiple tasks even when failing + expect(result).toContain("Starting task:"); + expect(result).toContain("Task failed"); + }); + + test("dry-run mode works even with non-existent executor", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + // Dry-run should succeed without trying to execute + const result = zagi(["agent", "run", "--dry-run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "/nonexistent/command" } + }); + + expect(result).toContain("dry-run mode"); + expect(result).toContain("Would execute:"); + expect(result).toContain("Starting task: task-001"); + expect(result).not.toContain("error"); + }); +}); + +// ============================================================================ +// Error Conditions: No Tasks Exist (Agent Run) +// ============================================================================ + +describe("error conditions: no tasks exist (agent run)", () => { + test("agent run shows helpful message when no tasks", () => { + const executor = createSuccessExecutor(REPO_DIR); + + const result = zagi(["agent", "run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("No pending tasks remaining"); + expect(result).toContain("All tasks complete"); + }); + + test("agent run with --once shows clear message when no tasks", () => { + const executor = createSuccessExecutor(REPO_DIR); + + const result = zagi(["agent", "run", "--once"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + expect(result).toContain("No pending tasks remaining"); + expect(result).toContain("All tasks complete"); + }); + + test("agent run suggests next action when no tasks", () => { + const executor = createSuccessExecutor(REPO_DIR); + + const result = zagi(["agent", "run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: executor } + }); + + // Should suggest viewing tasks with pr command + expect(result).toContain("zagi tasks pr"); + }); + + test("agent run dry-run shows no tasks message", () => { + const result = zagi(["agent", "run", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "echo" } + }); + + expect(result).toContain("No pending tasks remaining"); + }); +}); diff --git a/test/src/commit.test.ts b/test/src/commit.test.ts index 770876f..ccb43db 100644 --- a/test/src/commit.test.ts +++ b/test/src/commit.test.ts @@ -1,7 +1,6 @@ import { describe, test, expect, beforeEach, afterEach } from "vitest"; import { resolve } from "path"; -import { writeFileSync, appendFileSync, rmSync } from "fs"; -import { createFixtureRepo } from "../fixtures/setup"; +import { writeFileSync, appendFileSync } from "fs"; import { zagi, git, createTestRepo, cleanupTestRepo } from "./shared"; let REPO_DIR: string; @@ -12,14 +11,13 @@ function stageTestFile() { git(["add", "commit-test.txt"], { cwd: REPO_DIR }); } +// Use lightweight repo - these tests don't need multiple commits beforeEach(() => { - REPO_DIR = createFixtureRepo(); + REPO_DIR = createTestRepo(); }); afterEach(() => { - if (REPO_DIR) { - rmSync(REPO_DIR, { recursive: true, force: true }); - } + cleanupTestRepo(REPO_DIR); }); describe("zagi commit", () => { @@ -101,10 +99,14 @@ describe("zagi commit --prompt", () => { "My test prompt text", ], { cwd: REPO_DIR }); - // Read the note using git notes command - const noteResult = git(["notes", "--ref=prompts", "show", "HEAD"], { cwd: REPO_DIR }); + // Agent name stored in refs/notes/agent (plain text) + const agentNote = git(["notes", "--ref=agent", "show", "HEAD"], { cwd: REPO_DIR }); + // Agent will be "terminal" in test env (CLAUDECODE cleared), or actual agent if set + expect(agentNote.trim().length).toBeGreaterThan(0); - expect(noteResult).toContain("My test prompt text"); + // Prompt stored in refs/notes/prompt (plain text) + const promptNote = git(["notes", "--ref=prompt", "show", "HEAD"], { cwd: REPO_DIR }); + expect(promptNote).toContain("My test prompt text"); }); test("prompt shown with --prompts in log", () => { @@ -150,7 +152,7 @@ describe("ZAGI_AGENT", () => { ); expect(result).toContain("--prompt required"); - expect(result).toContain("ZAGI_AGENT"); + expect(result).toContain("agent mode"); }); test("ZAGI_AGENT succeeds with --prompt", () => { diff --git a/test/src/diff.test.ts b/test/src/diff.test.ts index a7efc89..5ebfa2f 100644 --- a/test/src/diff.test.ts +++ b/test/src/diff.test.ts @@ -1,19 +1,24 @@ import { describe, test, expect, beforeEach, afterEach } from "vitest"; import { resolve } from "path"; -import { rmSync, writeFileSync, readFileSync } from "fs"; -import { createFixtureRepo } from "../fixtures/setup"; -import { zagi, git } from "./shared"; +import { writeFileSync, readFileSync, mkdirSync, appendFileSync } from "fs"; +import { zagi, git, createTestRepo, cleanupTestRepo } from "./shared"; let REPO_DIR: string; +// Use lightweight repo - these tests don't need multiple commits beforeEach(() => { - REPO_DIR = createFixtureRepo(); + REPO_DIR = createTestRepo(); + // Create test files for diff operations - needs enough lines to test deletions + mkdirSync(resolve(REPO_DIR, "src"), { recursive: true }); + writeFileSync(resolve(REPO_DIR, "src/main.ts"), 'export function main() {\n console.log("hello");\n // line 3\n // line 4\n // line 5\n // line 6\n}\n'); + git(["add", "src/main.ts"], { cwd: REPO_DIR }); + git(["commit", "-m", "Add main.ts"], { cwd: REPO_DIR }); + // Create uncommitted changes for diff tests + appendFileSync(resolve(REPO_DIR, "src/main.ts"), "\n// Modified\n"); }); afterEach(() => { - if (REPO_DIR) { - rmSync(REPO_DIR, { recursive: true, force: true }); - } + cleanupTestRepo(REPO_DIR); }); describe("zagi diff", () => { @@ -328,3 +333,34 @@ describe("zagi diff --name-only", () => { expect(result).toContain("src/main.ts"); }); }); + +describe("zagi diff file path handling", () => { + test("file path without -- is treated as path, not revision", () => { + // This was a bug: `git diff path/to/file` would fail with + // "failed to walk commits" because the path was treated as a revision + const result = zagi(["diff", "src/main.ts"], { cwd: REPO_DIR }); + + // Should show diff for that file, not error + expect(result).toContain("src/main.ts"); + expect(result).not.toContain("failed to walk"); + }); + + test("non-existent path is treated as revision spec", () => { + // If path doesn't exist, treat as revision (will fail appropriately) + const result = zagi(["diff", "nonexistent-branch"], { cwd: REPO_DIR }); + + // Should fail because it's not a valid revision + expect(result).toContain("failed to walk commits"); + }); + + test("multiple file paths work without --", () => { + // Create another file with changes + writeFileSync(resolve(REPO_DIR, "README.md"), "# Updated\n"); + + const result = zagi(["diff", "src/main.ts", "README.md"], { cwd: REPO_DIR }); + + // Should show diff for both files + expect(result).toContain("src/main.ts"); + expect(result).toContain("README.md"); + }); +}); diff --git a/test/src/shared.ts b/test/src/shared.ts index 5cc0eff..03ae3b5 100644 --- a/test/src/shared.ts +++ b/test/src/shared.ts @@ -21,13 +21,24 @@ export function zagi(args: string[], options: ZagiOptions = {}): string { // Create isolated env - start with current env const env = { ...process.env }; - // By default, remove zagi env vars unless explicitly set + // By default, remove agent mode env vars unless explicitly set + // This ensures tests run outside agent mode by default if (!("ZAGI_AGENT" in envOverrides)) { delete env.ZAGI_AGENT; } + if (!("ZAGI_AGENT_CMD" in envOverrides)) { + delete env.ZAGI_AGENT_CMD; + } if (!("ZAGI_STRIP_COAUTHORS" in envOverrides)) { delete env.ZAGI_STRIP_COAUTHORS; } + // Also clear CLAUDECODE and OPENCODE which trigger agent mode + if (!("CLAUDECODE" in envOverrides)) { + delete env.CLAUDECODE; + } + if (!("OPENCODE" in envOverrides)) { + delete env.OPENCODE; + } // Apply overrides (undefined removes the key) for (const [key, value] of Object.entries(envOverrides)) { diff --git a/test/src/status.test.ts b/test/src/status.test.ts index b7be4bb..86d93aa 100644 --- a/test/src/status.test.ts +++ b/test/src/status.test.ts @@ -1,18 +1,25 @@ import { describe, test, expect, beforeEach, afterEach } from "vitest"; -import { rmSync } from "fs"; -import { createFixtureRepo } from "../fixtures/setup"; -import { zagi, git } from "./shared"; +import { writeFileSync, mkdirSync, appendFileSync } from "fs"; +import { resolve } from "path"; +import { zagi, git, createTestRepo, cleanupTestRepo } from "./shared"; let REPO_DIR: string; +// Use lightweight repo - these tests don't need multiple commits beforeEach(() => { - REPO_DIR = createFixtureRepo(); + REPO_DIR = createTestRepo(); + // Create test files similar to createFixtureRepo + mkdirSync(resolve(REPO_DIR, "src"), { recursive: true }); + writeFileSync(resolve(REPO_DIR, "src/main.ts"), 'export function main() {\n console.log("hello");\n}\n'); + git(["add", "src/main.ts"], { cwd: REPO_DIR }); + git(["commit", "-m", "Add main.ts"], { cwd: REPO_DIR }); + // Create uncommitted changes for status tests + writeFileSync(resolve(REPO_DIR, "src/new-file.ts"), "// New file\n"); + appendFileSync(resolve(REPO_DIR, "src/main.ts"), "\n// Modified\n"); }); afterEach(() => { - if (REPO_DIR) { - rmSync(REPO_DIR, { recursive: true, force: true }); - } + cleanupTestRepo(REPO_DIR); }); describe("zagi status", () => { diff --git a/test/src/tasks.test.ts b/test/src/tasks.test.ts index f1cbea5..c2221ea 100644 --- a/test/src/tasks.test.ts +++ b/test/src/tasks.test.ts @@ -1,20 +1,21 @@ import { describe, test, expect, beforeEach, afterEach } from "vitest"; -import { rmSync } from "fs"; -import { createFixtureRepo } from "../fixtures/setup"; -import { zagi, git } from "./shared"; +import { zagi, git, createTestRepo, cleanupTestRepo } from "./shared"; let REPO_DIR: string; +// Use lightweight repo - these tests don't need multiple commits beforeEach(() => { - REPO_DIR = createFixtureRepo(); + REPO_DIR = createTestRepo(); }); afterEach(() => { - if (REPO_DIR) { - rmSync(REPO_DIR, { recursive: true, force: true }); - } + cleanupTestRepo(REPO_DIR); }); +// ============================================================================ +// Help and Basic Usage +// ============================================================================ + describe("zagi tasks help", () => { test("shows help message", () => { const result = zagi(["tasks"], { cwd: REPO_DIR }); @@ -34,6 +35,10 @@ describe("zagi tasks help", () => { }); }); +// ============================================================================ +// Task CRUD Operations +// ============================================================================ + describe("zagi tasks add", () => { test("creates new task with generated ID", () => { const result = zagi(["tasks", "add", "Fix authentication bug"], { cwd: REPO_DIR }); @@ -48,16 +53,6 @@ describe("zagi tasks add", () => { expect(result).toContain("Add user authentication system"); }); - test("creates task with dependency using --after", () => { - // First create a task to depend on - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - - const result = zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); - - expect(result).toMatch(/created: task-\d{3}/); - expect(result).toContain("Dependent task"); - }); - test("outputs JSON when --json flag is used", () => { const result = zagi(["tasks", "add", "Test task", "--json"], { cwd: REPO_DIR }); @@ -69,32 +64,17 @@ describe("zagi tasks add", () => { expect(parsed).toHaveProperty("completed", null); }); - test("creates task with dependency in JSON format", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - - const result = zagi(["tasks", "add", "Dependent task", "--after", "task-001", "--json"], { cwd: REPO_DIR }); - - const parsed = JSON.parse(result.trim()); - expect(parsed).toHaveProperty("after", "task-001"); - }); - - test("error for missing content", () => { + test("shows error for missing content", () => { const result = zagi(["tasks", "add"], { cwd: REPO_DIR }); expect(result).toContain("error: missing task content"); }); - test("error for empty content", () => { + test("shows error for empty content", () => { const result = zagi(["tasks", "add", ""], { cwd: REPO_DIR }); expect(result).toContain("error: task content cannot be empty"); }); - - test("error for --after without task ID", () => { - const result = zagi(["tasks", "add", "Test task", "--after"], { cwd: REPO_DIR }); - - expect(result).toContain("error: --after requires a task ID"); - }); }); describe("zagi tasks list", () => { @@ -126,15 +106,6 @@ describe("zagi tasks list", () => { expect(result).toContain("[ ] task-002"); }); - test("shows task dependencies", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); - - const result = zagi(["tasks", "list"], { cwd: REPO_DIR }); - - expect(result).toContain("[ ] task-002 (after task-001)"); - }); - test("outputs JSON when --json flag is used", () => { zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); @@ -149,22 +120,21 @@ describe("zagi tasks list", () => { }); test("JSON output includes all task fields", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "First task"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Second task"], { cwd: REPO_DIR }); zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); const result = zagi(["tasks", "list", "--json"], { cwd: REPO_DIR }); const parsed = JSON.parse(result.trim()); - const baseTask = parsed.tasks.find((t: any) => t.id === "task-001"); - const depTask = parsed.tasks.find((t: any) => t.id === "task-002"); + const firstTask = parsed.tasks.find((t: any) => t.id === "task-001"); + const secondTask = parsed.tasks.find((t: any) => t.id === "task-002"); - expect(baseTask).toHaveProperty("status", "completed"); - expect(baseTask).toHaveProperty("completed"); - expect(baseTask.completed).not.toBeNull(); + expect(firstTask).toHaveProperty("status", "completed"); + expect(firstTask).toHaveProperty("completed"); + expect(firstTask.completed).not.toBeNull(); - expect(depTask).toHaveProperty("after", "task-001"); - expect(depTask).toHaveProperty("status", "pending"); + expect(secondTask).toHaveProperty("status", "pending"); }); }); @@ -190,15 +160,6 @@ describe("zagi tasks show", () => { expect(result).toContain("completed:"); }); - test("shows task dependency", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); - - const result = zagi(["tasks", "show", "task-002"], { cwd: REPO_DIR }); - - expect(result).toContain("depends on: task-001"); - }); - test("outputs JSON when --json flag is used", () => { zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); @@ -210,13 +171,13 @@ describe("zagi tasks show", () => { expect(parsed).toHaveProperty("status", "pending"); }); - test("error for missing task ID", () => { + test("shows error for missing task ID", () => { const result = zagi(["tasks", "show"], { cwd: REPO_DIR }); expect(result).toContain("error: missing task ID"); }); - test("error for non-existent task", () => { + test("shows error for non-existent task", () => { const result = zagi(["tasks", "show", "task-999"], { cwd: REPO_DIR }); expect(result).toContain("error: task 'task-999' not found"); @@ -243,7 +204,7 @@ describe("zagi tasks done", () => { expect(result).toContain("(0 pending, 1 completed)"); }); - test("already completed task message", () => { + test("shows message for already completed task", () => { zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); @@ -252,77 +213,100 @@ describe("zagi tasks done", () => { expect(result).toContain("task 'task-001' already completed"); }); - test("error for missing task ID", () => { + test("shows error for missing task ID", () => { const result = zagi(["tasks", "done"], { cwd: REPO_DIR }); expect(result).toContain("error: missing task ID"); }); - test("error for non-existent task", () => { + test("shows error for non-existent task", () => { const result = zagi(["tasks", "done", "task-999"], { cwd: REPO_DIR }); expect(result).toContain("error: task 'task-999' not found"); }); }); -describe("zagi tasks ready", () => { - test("shows no ready tasks when empty", () => { - const result = zagi(["tasks", "ready"], { cwd: REPO_DIR }); +describe("zagi tasks edit", () => { + test("is blocked in agent mode", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + const result = zagi(["tasks", "edit", "task-001", "New content"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - expect(result).toContain("no tasks found"); + // Edit is blocked in agent mode - agents should use append + expect(result).toContain("error: edit command blocked"); + expect(result).toContain("tasks append"); }); - test("shows single ready task", () => { + test("replaces when not in agent mode", () => { zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); - const result = zagi(["tasks", "ready"], { cwd: REPO_DIR }); + // Explicitly unset ZAGI_AGENT to test non-agent mode + const result = zagi(["tasks", "edit", "task-001", "Updated content"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "" } + }); - expect(result).toContain("ready: 1 task"); - expect(result).toContain("[ ] task-001"); - expect(result).toContain("Test task"); + expect(result).toContain("updated: task-001"); + expect(result).toContain("Updated content"); }); +}); - test("excludes blocked tasks", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Independent task"], { cwd: REPO_DIR }); +describe("zagi tasks append", () => { + test("appends to task content", () => { + zagi(["tasks", "add", "Original task"], { cwd: REPO_DIR }); - const result = zagi(["tasks", "ready"], { cwd: REPO_DIR }); + const result = zagi(["tasks", "append", "task-001", "Additional notes"], { cwd: REPO_DIR }); - expect(result).toContain("ready: 2 tasks"); - expect(result).toContain("[ ] task-001"); - expect(result).toContain("[ ] task-003"); - expect(result).not.toContain("task-002"); + expect(result).toContain("appended: task-001"); + expect(result).toContain("Additional notes"); + + // Verify content was appended + const showResult = zagi(["tasks", "show", "task-001"], { cwd: REPO_DIR }); + expect(showResult).toContain("Original task"); + expect(showResult).toContain("Additional notes"); }); - test("includes tasks with completed dependencies", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); - zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); + test("works in agent mode", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); - const result = zagi(["tasks", "ready"], { cwd: REPO_DIR }); + const result = zagi(["tasks", "append", "task-001", "Agent notes"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - expect(result).toContain("ready: 1 task"); - expect(result).toContain("[ ] task-002"); - expect(result).toContain("Dependent task"); + expect(result).toContain("appended: task-001"); + expect(result).toContain("Agent notes"); }); +}); - test("excludes completed tasks", () => { +describe("zagi tasks delete", () => { + test("is blocked in agent mode", () => { zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); - zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); - const result = zagi(["tasks", "ready"], { cwd: REPO_DIR }); + const result = zagi(["tasks", "delete", "task-001"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - expect(result).toContain("no ready tasks"); + expect(result).toContain("error: delete command blocked"); + expect(result).toContain("ZAGI_AGENT is set"); + expect(result).toContain("permanent data loss"); }); - test("handles plural form correctly", () => { - zagi(["tasks", "add", "First task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Second task"], { cwd: REPO_DIR }); + test("succeeds when not in agent mode", () => { + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); - const result = zagi(["tasks", "ready"], { cwd: REPO_DIR }); + // Explicitly unset ZAGI_AGENT to test non-agent mode + const result = zagi(["tasks", "delete", "task-001"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "" } + }); - expect(result).toContain("ready: 2 tasks"); + expect(result).toContain("deleted: task-001"); + expect(result).toContain("Test task"); }); }); @@ -345,112 +329,293 @@ describe("zagi tasks pr", () => { expect(result).toContain("- [x] Test task"); }); - test("shows ready tasks section", () => { - zagi(["tasks", "add", "Ready task"], { cwd: REPO_DIR }); + test("shows pending tasks section", () => { + zagi(["tasks", "add", "Pending task"], { cwd: REPO_DIR }); const result = zagi(["tasks", "pr"], { cwd: REPO_DIR }); - expect(result).toContain("### Ready"); - expect(result).toContain("- [ ] Ready task"); + expect(result).toContain("### Pending"); + expect(result).toContain("- [ ] Pending task"); }); - test("shows blocked tasks section", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Blocked task", "--after", "task-001"], { cwd: REPO_DIR }); + test("shows both completed and pending", () => { + zagi(["tasks", "add", "First task"], { cwd: REPO_DIR }); + zagi(["tasks", "add", "Second task"], { cwd: REPO_DIR }); + zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); const result = zagi(["tasks", "pr"], { cwd: REPO_DIR }); - expect(result).toContain("### Ready"); - expect(result).toContain("- [ ] Base task"); - expect(result).toContain("### Blocked"); - expect(result).toContain("- [ ] Blocked task (after task-001)"); + expect(result).toContain("### Completed"); + expect(result).toContain("- [x] First task"); + expect(result).toContain("### Pending"); + expect(result).toContain("- [ ] Second task"); }); +}); - test("shows dependencies in completed tasks", () => { - zagi(["tasks", "add", "Base task"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Dependent task", "--after", "task-001"], { cwd: REPO_DIR }); - zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); - zagi(["tasks", "done", "task-002"], { cwd: REPO_DIR }); +// ============================================================================ +// Agent Commands +// ============================================================================ - const result = zagi(["tasks", "pr"], { cwd: REPO_DIR }); +describe("zagi agent", () => { + test("shows help when no subcommand", () => { + const result = zagi(["agent"], { cwd: REPO_DIR }); - expect(result).toContain("- [x] Base task"); - expect(result).toContain("- [x] Dependent task (after task-001)"); + expect(result).toContain("usage: git agent "); + expect(result).toContain("Commands:"); + expect(result).toContain("run"); + expect(result).toContain("plan"); }); - test("comprehensive mixed state", () => { - // Create complex task hierarchy - zagi(["tasks", "add", "Foundation"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Build on foundation", "--after", "task-001"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Independent work"], { cwd: REPO_DIR }); - zagi(["tasks", "add", "Final step", "--after", "task-002"], { cwd: REPO_DIR }); + test("agent --help shows help", () => { + const result = zagi(["agent", "--help"], { cwd: REPO_DIR }); - // Complete foundation - zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); + expect(result).toContain("usage: git agent "); + }); - const result = zagi(["tasks", "pr"], { cwd: REPO_DIR }); + test("agent run --help shows run help", () => { + const result = zagi(["agent", "run", "--help"], { cwd: REPO_DIR }); - // Should have all sections - expect(result).toContain("### Completed"); - expect(result).toContain("- [x] Foundation"); + expect(result).toContain("usage: git agent run"); + expect(result).toContain("--once"); + expect(result).toContain("--dry-run"); + expect(result).toContain("--max-tasks"); + }); + + test("agent plan --help shows plan help", () => { + const result = zagi(["agent", "plan", "--help"], { cwd: REPO_DIR }); + + expect(result).toContain("usage: git agent plan"); + expect(result).toContain("[description]"); // Optional, hence brackets + expect(result).toContain("--dry-run"); + }); - expect(result).toContain("### Ready"); - expect(result).toContain("- [ ] Build on foundation"); - expect(result).toContain("- [ ] Independent work"); + test("agent plan --dry-run shows planning prompt", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add user auth"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - expect(result).toContain("### Blocked"); - expect(result).toContain("- [ ] Final step (after task-002)"); + expect(result).toContain("Interactive Planning Session (dry-run)"); + expect(result).toContain("Initial context: Add user auth"); + expect(result).toContain("Would execute:"); + expect(result).toContain("claude"); // No -p flag for interactive mode + expect(result).toContain("Prompt Preview"); + expect(result).toContain("INITIAL CONTEXT: Add user auth"); + }); + + test("agent plan without description starts interactive session", () => { + // agent plan without description starts interactive mode (asks user what to build) + const result = zagi(["agent", "plan", "--dry-run"], { cwd: REPO_DIR }); + + expect(result).toContain("Interactive Planning Session"); + expect(result).toContain("Initial context: (none - will ask user)"); + }); + + test("agent unknown subcommand shows error", () => { + const result = zagi(["agent", "invalid"], { cwd: REPO_DIR }); + + expect(result).toContain("error: unknown command 'invalid'"); + expect(result).toContain("usage: git agent "); }); }); -describe("zagi tasks edit", () => { - test("blocked in agent mode", () => { - zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); +// ============================================================================ +// Agent Plan --dry-run (Prompt Generation) +// ============================================================================ - const result = zagi(["tasks", "edit", "task-001", "Updated content"], { +describe("zagi agent plan --dry-run", () => { + test("generates correct prompt structure", () => { + const result = zagi(["agent", "plan", "--dry-run", "Build a REST API"], { cwd: REPO_DIR, - env: { ZAGI_AGENT: "claude-code" } + env: { ZAGI_AGENT: "claude" } }); - expect(result).toContain("error: edit command blocked"); - expect(result).toContain("ZAGI_AGENT is set"); + // Verify all prompt sections are present + expect(result).toContain("=== Interactive Planning Session (dry-run) ==="); + expect(result).toContain("Initial context: Build a REST API"); + expect(result).toContain("Would execute:"); + expect(result).toContain("--- Prompt Preview ---"); + expect(result).toContain("You are an interactive planning agent"); + expect(result).toContain("INITIAL CONTEXT: Build a REST API"); + expect(result).toContain("PHASE 1: EXPLORE CODEBASE"); + expect(result).toContain("Read AGENTS.md"); + expect(result).toContain("PHASE 4: CREATE TASKS"); + expect(result).toContain("tasks add"); + expect(result).toContain("=== RULES ==="); + expect(result).toContain("NEVER git push"); + }); + + test("includes absolute path to zagi binary", () => { + const result = zagi(["agent", "plan", "--dry-run", "Test task"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + // The prompt should include the absolute path for task commands + expect(result).toMatch(/\/.*\/zagi tasks add/); + expect(result).toMatch(/\/.*\/zagi tasks list/); }); - test("works when not in agent mode", () => { - zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + test("handles goal with double quotes", () => { + const result = zagi(["agent", "plan", "--dry-run", 'Add "login" button'], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - const result = zagi(["tasks", "edit", "task-001", "Updated content"], { cwd: REPO_DIR }); + expect(result).toContain('Initial context: Add "login" button'); + expect(result).toContain('INITIAL CONTEXT: Add "login" button'); + }); - expect(result).toContain("updated: task-001"); - expect(result).toContain("Updated content"); + test("handles goal with single quotes", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add 'logout' feature"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain("Initial context: Add 'logout' feature"); + expect(result).toContain("INITIAL CONTEXT: Add 'logout' feature"); }); -}); -describe("zagi tasks delete", () => { - test("blocked in agent mode", () => { - zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + test("handles goal with backticks", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add `code` formatting"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - const result = zagi(["tasks", "delete", "task-001"], { + expect(result).toContain("Initial context: Add `code` formatting"); + expect(result).toContain("INITIAL CONTEXT: Add `code` formatting"); + }); + + test("handles goal with shell special characters", () => { + const result = zagi(["agent", "plan", "--dry-run", "Fix $PATH & ENV vars"], { cwd: REPO_DIR, - env: { ZAGI_AGENT: "claude-code" } + env: { ZAGI_AGENT: "claude" } }); - expect(result).toContain("error: delete command blocked"); - expect(result).toContain("ZAGI_AGENT is set"); - expect(result).toContain("permanent data loss"); + expect(result).toContain("Initial context: Fix $PATH & ENV vars"); + expect(result).toContain("INITIAL CONTEXT: Fix $PATH & ENV vars"); }); - test("works when not in agent mode", () => { - zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + test("handles goal with angle brackets", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add validation"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); - const result = zagi(["tasks", "delete", "task-001"], { cwd: REPO_DIR }); + expect(result).toContain("Initial context: Add validation"); + expect(result).toContain("INITIAL CONTEXT: Add validation"); + }); - expect(result).toContain("deleted: task-001"); + test("handles goal with parentheses and brackets", () => { + const result = zagi(["agent", "plan", "--dry-run", "Refactor function(args) and array[0]"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain("Initial context: Refactor function(args) and array[0]"); + expect(result).toContain("INITIAL CONTEXT: Refactor function(args) and array[0]"); + }); + + test("handles goal with unicode characters", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add emoji support"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain("Initial context: Add emoji support"); + expect(result).toContain("INITIAL CONTEXT: Add emoji support"); + }); + + test("handles goal with newline in content", () => { + // Note: Shell typically doesn't pass literal newlines in args, but we test the goal is preserved + const goal = "Line one\\nLine two"; + const result = zagi(["agent", "plan", "--dry-run", goal], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain(`Initial context: ${goal}`); + expect(result).toContain(`INITIAL CONTEXT: ${goal}`); + }); + + test("handles long goal description", () => { + const longGoal = "Implement a comprehensive user authentication system with OAuth2 support, including Google, GitHub, and Microsoft providers, plus email/password fallback with rate limiting and account lockout protection"; + const result = zagi(["agent", "plan", "--dry-run", longGoal], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain(`Initial context: ${longGoal}`); + expect(result).toContain(`INITIAL CONTEXT: ${longGoal}`); + }); + + test("handles goal with mixed special characters", () => { + const result = zagi(["agent", "plan", "--dry-run", "Add 'auth' with & \"refresh\" tokens"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain("Initial context: Add 'auth' with & \"refresh\" tokens"); + expect(result).toContain("INITIAL CONTEXT: Add 'auth' with & \"refresh\" tokens"); + }); + + test("shows opencode executor when ZAGI_AGENT=opencode", () => { + const result = zagi(["agent", "plan", "--dry-run", "Test task"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "opencode" } + }); + + expect(result).toContain("Would execute:"); + expect(result).toContain("opencode"); // No "run" subcommand for interactive mode + }); + + test("shows custom executor when ZAGI_AGENT_CMD is set", () => { + const result = zagi(["agent", "plan", "--dry-run", "Test task"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT_CMD: "aider --yes" } + }); + + expect(result).toContain("Would execute:"); + expect(result).toContain("aider --yes"); + }); + + test("uses claude as default executor", () => { + const result = zagi(["agent", "plan", "--dry-run", "Test task"], { + cwd: REPO_DIR + }); + + expect(result).toContain("Would execute:"); + expect(result).toContain("claude"); // No -p flag for interactive mode + }); + + test("goal with only whitespace shows error", () => { + const result = zagi(["agent", "plan", "--dry-run", " "], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + // Whitespace-only is still valid input from the shell perspective + // The command treats it as valid content + expect(result).toContain("Initial context:"); + }); + + test("--dry-run flag position after goal works", () => { + const result = zagi(["agent", "plan", "Build feature", "--dry-run"], { + cwd: REPO_DIR, + env: { ZAGI_AGENT: "claude" } + }); + + expect(result).toContain("Interactive Planning Session (dry-run)"); + expect(result).toContain("Initial context: Build feature"); }); }); +// ============================================================================ +// Error Handling +// ============================================================================ + describe("error handling", () => { - test("invalid subcommand shows error and help", () => { + test("shows error and help for invalid subcommand", () => { const result = zagi(["tasks", "invalid"], { cwd: REPO_DIR }); expect(result).toContain("error: unknown command 'invalid'"); @@ -469,7 +634,7 @@ describe("error handling", () => { expect(result).toContain("task-003"); }); - test("works in non-git directory", () => { + test("shows error in non-git directory", () => { const result = zagi(["tasks", "add", "Test task"], { cwd: "/tmp" }); // libgit2 outputs "fatal:" for non-repo errors @@ -477,38 +642,9 @@ describe("error handling", () => { }); }); -describe("performance", () => { - test("zagi tasks operations are reasonably fast", () => { - // Create several tasks - const start = Date.now(); - - for (let i = 1; i <= 10; i++) { - zagi(["tasks", "add", `Task ${i}`], { cwd: REPO_DIR }); - } - - zagi(["tasks", "list"], { cwd: REPO_DIR }); - - const elapsed = Date.now() - start; - - // Should complete 10 add operations + 1 list in under 5 seconds - // This is quite generous, but we want to catch major performance regressions - expect(elapsed).toBeLessThan(5000); - }); - - test("task storage persists across commands", () => { - zagi(["tasks", "add", "Persistent task"], { cwd: REPO_DIR }); - - const firstList = zagi(["tasks", "list"], { cwd: REPO_DIR }); - expect(firstList).toContain("Persistent task"); - - zagi(["tasks", "add", "Another task"], { cwd: REPO_DIR }); - - const secondList = zagi(["tasks", "list"], { cwd: REPO_DIR }); - expect(secondList).toContain("Persistent task"); - expect(secondList).toContain("Another task"); - expect(secondList).toContain("tasks: 2 total"); - }); -}); +// ============================================================================ +// Git Integration +// ============================================================================ describe("integration with git", () => { test("tasks are stored in git refs", () => { @@ -545,4 +681,131 @@ describe("integration with git", () => { expect(featureTasks).toContain("Feature branch task"); expect(featureTasks).not.toContain("Main branch task"); }); -}); \ No newline at end of file +}); + +// ============================================================================ +// Error Conditions: No Tasks Exist +// ============================================================================ + +describe("error conditions: no tasks exist", () => { + test("tasks list shows helpful message when empty", () => { + const result = zagi(["tasks", "list"], { cwd: REPO_DIR }); + + expect(result).toContain("no tasks found"); + }); + + test("tasks list --json returns empty array when no tasks", () => { + const result = zagi(["tasks", "list", "--json"], { cwd: REPO_DIR }); + + const parsed = JSON.parse(result.trim()); + expect(parsed).toHaveProperty("tasks"); + expect(parsed.tasks).toHaveLength(0); + }); + + test("tasks pr shows helpful message when empty", () => { + const result = zagi(["tasks", "pr"], { cwd: REPO_DIR }); + + expect(result).toContain("No tasks found"); + }); + + test("tasks show gives clear error for non-existent task", () => { + const result = zagi(["tasks", "show", "task-001"], { cwd: REPO_DIR }); + + expect(result).toContain("error: task 'task-001' not found"); + }); + + test("tasks done gives clear error for non-existent task", () => { + const result = zagi(["tasks", "done", "task-001"], { cwd: REPO_DIR }); + + expect(result).toContain("error: task 'task-001' not found"); + }); +}); + +// ============================================================================ +// Error Conditions: Corrupted Task Data +// ============================================================================ + +describe("error conditions: corrupted task data", () => { + test("handles corrupted task ref gracefully", () => { + // First, create a valid task to establish refs/tasks/main + zagi(["tasks", "add", "Initial task"], { cwd: REPO_DIR }); + + // Corrupt the task ref by writing garbage data directly + // The ref points to a blob - we can create a new blob with garbage + // and update the ref to point to it + git(["update-ref", "refs/tasks/main", "HEAD"], { cwd: REPO_DIR }); + + // Now tasks list should handle this gracefully + const result = zagi(["tasks", "list"], { cwd: REPO_DIR }); + + // Should not crash - either shows "no tasks" or handles error gracefully + // The Zig code's fromJson handles malformed data by skipping invalid lines + expect(result).not.toContain("panic"); + expect(result).not.toContain("SIGSEGV"); + }); + + test("recovers from malformed task data by showing empty list", () => { + // Create a task then corrupt by pointing ref to a tree object + zagi(["tasks", "add", "Test task"], { cwd: REPO_DIR }); + + // Create an empty blob with partial/invalid task data + const blobOid = git(["hash-object", "-w", "--stdin"], { + cwd: REPO_DIR + }).trim(); + + // Note: This test verifies the system doesn't crash on read errors + // The actual behavior depends on what git returns for corrupted refs + const result = zagi(["tasks", "list"], { cwd: REPO_DIR }); + + // Main verification: system should not crash + expect(typeof result).toBe("string"); + }); + + test("task add works even after corrupted read", () => { + // Point ref to HEAD (a commit, not a blob) - simulates corruption + git(["update-ref", "-d", "refs/tasks/main"], { cwd: REPO_DIR }).trim(); + + // Adding a new task should work (creates fresh task list) + const result = zagi(["tasks", "add", "Fresh task after corruption"], { cwd: REPO_DIR }); + + expect(result).toContain("created: task-001"); + expect(result).toContain("Fresh task after corruption"); + }); +}); + +// ============================================================================ +// Performance +// ============================================================================ + +describe("performance", () => { + test("zagi tasks operations are reasonably fast", () => { + // Create several tasks + const start = Date.now(); + + for (let i = 1; i <= 10; i++) { + zagi(["tasks", "add", `Task ${i}`], { cwd: REPO_DIR }); + } + + zagi(["tasks", "list"], { cwd: REPO_DIR }); + + const elapsed = Date.now() - start; + + // Should complete 10 add operations + 1 list in under 5 seconds + // This is quite generous, but we want to catch major performance regressions + expect(elapsed).toBeLessThan(5000); + }); + + test("task storage persists across commands", () => { + zagi(["tasks", "add", "Persistent task"], { cwd: REPO_DIR }); + + const firstList = zagi(["tasks", "list"], { cwd: REPO_DIR }); + expect(firstList).toContain("Persistent task"); + + zagi(["tasks", "add", "Another task"], { cwd: REPO_DIR }); + + const secondList = zagi(["tasks", "list"], { cwd: REPO_DIR }); + expect(secondList).toContain("Persistent task"); + expect(secondList).toContain("Another task"); + expect(secondList).toContain("tasks: 2 total"); + }); +});