
v1.11.x: Agentic Workflow Implementation Plan

Created: 2025-12-20
Updated: 2025-12-26
Status: Phases 1-4 Complete, Phase 5 In Progress

Phase Status

| Phase | Description | Status | Version |
|-------|-------------|--------|---------|
| 1 | File editing tools + consent | ✅ Complete | v1.11.0 |
| 2 | @git context provider | ✅ Complete | v1.11.4 |
| 3 | @tree context provider | ✅ Complete | v1.11.4 |
| 4 | Manual testing & refinement | ✅ Complete | v1.11.5-v1.11.7 |
| 5 | /agent loop implementation | 🔧 In Progress | v1.11.8 |
| 6 | Testing & documentation | ⏳ Pending | v1.11.8 |

Note: v1.11.7 removed all legacy code (AIClient, PerplexityClientPromptTools, tool_manager). EngineClient is now the only client interface.

Branch: feature/adding-agent-loop (Phase 5 work)

User Guide: AGENT_MODE_GUIDE.md - Practical examples for research and development workflows


Workflow Diagrams

Current Turn-Based Flow (Pre-v1.11.8)

[Diagram: Current Non-Agentic Flow]

The current workflow is turn-based:

  1. User sends a message
  2. AI processes and optionally calls tools
  3. AI returns a single response
  4. User must manually direct the next action

Autonomous Agent Flow (v1.11.8+)

[Diagram: Future Agentic Flow]

The /agent command enables autonomous execution:

  1. User issues /agent <task> command
  2. AI enters an autonomous loop (max 5 iterations)
  3. Each iteration: Plan → Execute tools → Check completion
  4. Loop continues until TASK_COMPLETE: signal or max iterations
  5. User can interrupt with Ctrl-C at any time


Overview

This release transforms ppxai from a turn-based chatbot into an autonomous developer agent capable of multi-step task execution. The implementation follows an incremental approach, building and testing individual components before combining them into the autonomous /agent loop.


System Semantics & Coherence

Current Command/Tool Landscape (v1.10.8)

1. Read-Only Tools (always available when tools enabled):

  • search_files - Find files with glob patterns
  • read_file - Read file contents
  • list_directory - List directory contents
  • Purpose: Discovery and inspection of codebase

2. Code Generation Commands (TUI slash commands, VSCode commands):

  • /generate, /test, /docs, /implement, /debug, /explain, /convert
  • Output: Generated code to chat (markdown)
  • User Action: Manual copy/paste to apply changes
  • Mode: Consultative - AI suggests, user applies

3. Utility Shell Tool (risky for code editing):

  • execute_shell_command - Run any shell command
  • Risk: Escaping issues, no atomicity, no validation
  • Use Case: Tests, builds, git commands (NOT file editing)

Semantic Enhancement (v1.11.0)

4. NEW: File Editing Tools (Phase 1 - enables autonomous mode):

  • apply_patch, replace_block, insert_text, delete_lines
  • Output: Direct file modification with atomic operations
  • Safety: Validation, dry-run, rollback capability
  • Mode: Agentic - AI applies changes autonomously

5. NEW: Context Providers (Phases 2-3):

  • @git - Auto-inject git diff for code review
  • @tree - Auto-inject project structure
  • Purpose: Automatic context awareness

6. NEW: Agent Loop (Phase 5):

  • /agent <task> - Autonomous multi-step execution
  • Flow: Plan → Execute tools → Verify → Repeat (max 5 iterations)
  • Uses: All tools above in combination

Coherent User Workflows

Workflow A: Consultative (Tools Disabled)

User: "/test utils.py"
AI: [Generates test code in chat]
User: [Copies code to test_utils.py manually]

Workflow B: Semi-Autonomous (Tools Enabled, Manual Direction)

User: "Fix the bug in auth.py line 42"
AI: [Uses read_file → Analyzes → Uses replace_block to fix]
User: [Sees confirmation, change applied]

Workflow C: Fully Autonomous (Tools + /agent)

User: "/agent implement user authentication with tests"
AI: [Loop 1] Plan: Create auth.py, write tests, verify
AI: [Loop 2] Executes: insert_text to create files
AI: [Loop 3] Executes: execute_shell_command to run tests
AI: [Loop 4] Tests fail → Uses replace_block to fix
AI: [Loop 5] Tests pass → Returns success

No Conflicts, Only Enhancement

| Feature | Before v1.11.0 | After v1.11.0 |
|---------|----------------|---------------|
| Code generation | ✅ Output to chat | ✅ Output to chat (unchanged) |
| File reading | ✅ Via tools | ✅ Via tools (unchanged) |
| File editing | ⚠️ Manual or risky shell | ✅ Safe atomic tools |
| Context awareness | Manual @file | ✅ Manual @file + Auto @git/@tree |
| Multi-step tasks | Manual direction | ✅ Autonomous /agent loop |

Conclusion: Phase 1 tools enable a new autonomous mode while preserving the existing consultative mode. Users choose their workflow based on trust level and task complexity.


Architecture Goals

Current State (v1.10.8):

  • Turn-based interaction: User → AI → User
  • Tools available but require explicit user direction
  • Manual context injection via @file references

Target State (v1.11.0):

  • Autonomous multi-step execution with /agent command
  • Safe file editing without shell escaping risks
  • Automatic context injection via @git and @tree
  • AI can plan, execute, and verify tasks independently


Implementation Phases

Phase 1: Native File Editing Tools (6-8 hours)

Goal: Create safe, atomic file editing tools with per-file session consent in ppxai/engine/tools/builtin/editor.py

Context: Existing Tools & Commands

Existing Tools (Read-Only):

  • search_files(pattern, directory) - Find files with glob patterns
  • read_file(filepath, max_lines) - Read file contents
  • list_directory(path, format) - List directory contents
  • execute_shell_command(command, working_dir) - Run shell commands (⚠️ unsafe for file editing)

Existing Code Generation Slash Commands (Output to Chat):

  • /generate <desc> - Generate code from description
  • /test <file> - Generate unit tests for file
  • /docs <file> - Generate documentation for file
  • /implement <spec> - Implement feature from specification
  • /debug <error> - Analyze and fix errors
  • /explain <file> - Explain code logic
  • /convert <lang1> <lang2> <file> - Convert code between languages
  • /show <file> - Display file content (read-only)

Semantic Consistency: The current system has a clear pattern:

  • Manual Mode: Slash commands generate code → User copies/applies manually
  • Autonomous Mode (Phase 1 enables): AI uses tools → Direct file modification

This creates two coherent workflows:

  1. Consultative (tools disabled): AI suggests changes, user applies
  2. Agentic (tools enabled): AI applies changes autonomously

Phase 1 tools complement existing commands by enabling automation of previously manual workflows. No conflicts or redundancy.

Why Not Use execute_shell_command?

  • ❌ Shell escaping issues with complex code
  • ❌ No atomic operations or rollback
  • ❌ No dry-run validation
  • ❌ Risk of partial writes on failure
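The dry-run validation that a raw shell command lacks can be approximated with the standard library. The following is a minimal sketch, assuming a hypothetical helper `preview_edit` (not part of ppxai) that returns a unified diff of a proposed change without touching the file:

```python
import difflib


def preview_edit(original: str, updated: str, file_path: str = "file") -> str:
    """Return a unified diff of a proposed edit without writing anything.

    Hypothetical helper for illustration; an empty string means no change.
    """
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"a/{file_path}",
        tofile=f"b/{file_path}",
    )
    return "".join(diff)


if __name__ == "__main__":
    print(preview_edit("a\nb\n", "a\nc\n", "example.py"))
```

A tool could show this preview in its confirmation message, or return it to the AI in place of applying the change when a dry-run flag is set.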

Design Decision: Use per-file consent instead of tool-level configuration for better UX and safety.

How It Works:

User: "/tools enable"  → All tools available (including edit tools)

[AI attempts to edit file_1.py]
→ Prompt: "⚠️  AI wants to edit file_1.py. Allow? (y/n/always/never)"
→ User: "y"
→ file_1.py added to session's allowed files
→ Edit proceeds

[AI attempts to edit file_1.py again]
→ No prompt (already consented)
→ Edit proceeds

[AI attempts to edit file_2.py]
→ Prompt: "⚠️  AI wants to edit file_2.py. Allow? (y/n/always/never)"
→ User: "always"
→ All future edits allowed (no more prompts this session)
→ Edit proceeds

Consent Options:

  • y (yes) - Allow editing this specific file (this session)
  • n (no) - Deny this edit (tool returns error to AI)
  • always - Allow all file edits without prompting (autonomous mode)
  • never - Deny all file edits (disables edit tools for session)

Implementation:

# Session state (ppxai/engine/session.py)
from pathlib import Path
from typing import Set

class Session:
    def __init__(self):
        self.allowed_files: Set[Path] = set()  # files approved with "y" this session
        self.edit_consent_mode: str = "ask"  # "ask", "always", "never"

# Edit tools base helper (ppxai/engine/tools/builtin/editor.py)
# Defined as a method on the edit tools; self.session is the active Session.
def _check_edit_consent(self, file_path: str) -> bool:
    """Check if user consents to editing this file.

    Returns:
        True if edit is allowed, False otherwise
    """
    # Resolve to an absolute path so "./a.py" and "a.py" share one consent entry
    path = Path(file_path).resolve()

    # Check global consent mode
    if self.session.edit_consent_mode == "always":
        return True
    if self.session.edit_consent_mode == "never":
        return False

    # Check if already consented for this file
    if path in self.session.allowed_files:
        return True

    # Prompt user (TUI or VSCode dialog)
    response = self._prompt_edit_consent(path)

    if response == "y":
        self.session.allowed_files.add(path)
        return True
    elif response == "always":
        self.session.edit_consent_mode = "always"
        return True
    elif response == "never":
        self.session.edit_consent_mode = "never"
        return False
    else:  # "n"
        return False

def _prompt_edit_consent(self, file_path: Path) -> str:
    """Prompt user for edit consent (TUI or VSCode)."""
    # TUI: Use prompt_toolkit input
    # VSCode: Use vscode.window.showWarningMessage with buttons
    pass

VSCode Extension Handling:

// Show modal dialog in VSCode
const response = await vscode.window.showWarningMessage(
    `AI wants to edit ${filePath}. Allow this change?`,
    { modal: true },
    'Yes',
    'Always (this session)',
    'No',
    'Never (this session)'
);

Benefits:

  • ✅ Granular safety - Per-file consent instead of all-or-nothing
  • ✅ Transparent - User sees exactly which files AI wants to edit
  • ✅ Flexible - "always" option enables fully autonomous mode
  • ✅ Session-scoped - Fresh consent each session for safety
  • ✅ No upfront config - Works immediately with /tools enable
  • ✅ Compatible with /agent - Only interrupts first time per file

Trade-offs:

  • ⚠️ Interrupts /agent - First edit per file pauses for consent
  • Mitigated by: "always" option for uninterrupted autonomous workflow

Additional Time: +2 hours for consent mechanism implementation

Architectural Impact & Backward Compatibility

Analysis: Phase 1 integrates with existing architecture with zero breaking changes.

Current Architecture (v1.10.8):

  • ✅ Event-based communication (EventType enum)
  • ✅ Async tool execution (async def execute())
  • ✅ UI-agnostic engine (EngineClient)
  • ✅ TUI uses prompt_toolkit for input
  • ✅ VSCode uses HTTP + SSE for communication
  • ✅ Existing tools: read_file, search_files, list_directory, execute_shell_command

Changes Required (all additive, no modifications):

| Component | File | Change | Lines | Breaking? |
|-----------|------|--------|-------|-----------|
| Session state | ppxai/engine/session.py | Add consent fields | ~20 | ❌ No |
| Event types | ppxai/engine/types.py | Add CONSENT_REQUEST | ~5 | ❌ No |
| Engine client | ppxai/engine/client.py | Add consent method | ~40 | ❌ No |
| Edit tools | ppxai/engine/tools/builtin/editor.py | NEW FILE | ~400 | ❌ No |
| Tool registration | ppxai/engine/tools/builtin/__init__.py | Import editor | ~2 | ❌ No |
| TUI | ppxai/main.py or commands.py | Add consent callback | ~30 | ❌ No |
| VSCode | vscode-extension/src/chatPanel.ts | Handle consent events | ~50 | ❌ No |
| HTTP server | ppxai/server/http.py | Add consent endpoint | ~30 | ❌ No |
| TOTAL | | All additive | ~577 | ❌ No |

Consent Implementation Pattern:

# Engine provides consent callback (both sync TUI and async VSCode work)
from typing import Callable, Optional

from prompt_toolkit import prompt

class EngineClient:
    def __init__(self, consent_callback: Optional[Callable] = None):
        self.consent_callback = consent_callback  # UI provides this

# Tools use the callback
class EditTool(BaseTool):
    async def execute(self, file_path: str, **kwargs):
        # Request consent (awaits response from UI)
        if not await self._check_consent(file_path):
            return "Error: Edit permission denied by user"
        # Proceed...

# TUI: async wrapper around a blocking terminal prompt
async def tui_consent(file_path: str) -> tuple[bool, str]:
    response = prompt(f"⚠️  Edit {file_path}? (y/n/always/never): ").strip().lower()
    return (response in ['y', 'always'], response)

# VSCode: Event-based via HTTP
# 1. Server emits CONSENT_REQUEST SSE event
# 2. Extension shows modal dialog
# 3. Extension POSTs response to /consent/respond
# 4. Server resolves Future, tool proceeds

Why This Works:

  • ✅ Callback pattern is flexible (works for both TUI and VSCode)
  • ✅ Async-friendly (tools can await consent)
  • ✅ Event-based for VSCode (SSE + HTTP endpoint)
  • ✅ No changes to existing tools or commands
  • ✅ Default behavior: If no callback, auto-approve (backward compatible)

Foundations for Future Phases

Phase 1 establishes critical foundations needed by Phases 2-6:

For Phase 2 (@git context) & Phase 3 (@tree context):

  • ✅ No dependencies on Phase 1
  • ✅ Context injection system already exists (ContextInjector)
  • ✅ Working directory tracking already exists
  • Phase 1 provides: Examples of clean tool implementation

For Phase 5 (/agent loop) - CRITICAL DEPENDENCIES:

  • ✅ Consent mechanism - Agent will make multiple tool calls
  • ✅ "always" mode - Essential for uninterrupted autonomous execution
  • ✅ Session-scoped state - Consent persists across iterations
  • ✅ Async callback pattern - Agent loop is async
  • ✅ Event-based coordination - VSCode agent needs non-blocking consent

Phase 1 Consent Design Explicitly Supports /agent Loop:

# Agent loop scenario (Phase 5)
User: "/agent implement user auth with tests"

# Iteration 1: Create auth.py
→ Consent prompt: "Edit auth.py? (y/n/always/never)"
→ User: "always"  ← Critical for autonomous mode
→ edit_consent_mode = "always"

# Iteration 2: Create test_auth.py
→ NO prompt (always mode)
→ Tool executes immediately

# Iteration 3-5: Fix tests, refactor
→ NO prompts (always mode)
→ Fully autonomous execution

If Phase 1 consent were blocking or not session-scoped:

  • ❌ Agent would interrupt on every file edit
  • ❌ Autonomous mode would be unusable
  • ❌ Would need to redesign in Phase 5

Phase 1 Design Decisions That Enable Phase 5:

  1. ✅ Async consent callback (doesn't block event loop)
  2. ✅ Session-scoped consent state (persists across tool calls)
  3. ✅ "always" mode (enables true autonomy)
  4. ✅ Event-based for VSCode (non-blocking UI)
  5. ✅ Future-based coordination (tool waits for user response)

New Event Type Required (Phase 1):

# ppxai/engine/types.py
class EventType(Enum):
    # ... existing events ...
    CONSENT_REQUEST = "consent_request"  # NEW - Phase 1

New HTTP Endpoint Required (Phase 1):

# ppxai/server/http.py
@app.post("/consent/respond")
async def respond_consent(request: ConsentResponse):
    """VSCode extension responds to consent request."""
    # Resolve Future that edit tool is awaiting
    return {"ok": True}
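The Future-based coordination between the edit tool and the /consent/respond endpoint can be sketched with plain asyncio. This is an illustration only: `ConsentBroker`, its method names, the request ID scheme, and the deny-on-timeout behavior are all assumptions, not the ppxai implementation.

```python
import asyncio
import uuid
from typing import Dict


class ConsentBroker:
    """Hypothetical registry pairing consent requests with UI responses."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def create_request(self) -> str:
        """Register a pending request; the ID would travel in the SSE event."""
        request_id = uuid.uuid4().hex
        self._pending[request_id] = asyncio.get_running_loop().create_future()
        return request_id

    async def wait_for(self, request_id: str, timeout: float = 60.0) -> str:
        """Await the user's answer; a timeout is treated as a denial ("n")."""
        future = self._pending[request_id]
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return "n"
        finally:
            self._pending.pop(request_id, None)

    def respond(self, request_id: str, answer: str) -> bool:
        """Resolve a pending request; would be called by the HTTP handler."""
        future = self._pending.get(request_id)
        if future is None or future.done():
            return False
        future.set_result(answer)
        return True


async def _demo() -> str:
    broker = ConsentBroker()
    request_id = broker.create_request()
    # Simulate the extension POSTing its answer shortly after the SSE event.
    asyncio.get_running_loop().call_later(0.01, broker.respond, request_id, "always")
    return await broker.wait_for(request_id, timeout=1.0)


if __name__ == "__main__":
    print(asyncio.run(_demo()))
```

Because the tool only awaits a Future, the event loop stays free to keep streaming SSE events while the modal dialog is open.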

Validation Checklist:

  • [ ] Consent callback pattern supports both TUI and VSCode
  • [ ] Async design doesn't block /agent loop (Phase 5)
  • [ ] Session state persists across multiple tool calls
  • [ ] "always" mode truly bypasses all prompts
  • [ ] Event-based consent works with SSE streaming
  • [ ] Backward compatible (no callback = auto-approve)

Tools to Implement

  1. apply_patch(file_path: str, unified_diff: str)
     • Apply standard unified diff patches
     • Validate patch format before applying
     • Atomic operation with rollback on failure
     • Return success/failure with line numbers affected

  2. replace_block(file_path: str, search: str, replace: str)
     • Search for exact text block and replace
     • Case-sensitive by default
     • Fail if search text not found or found multiple times
     • Return matched location and new content

  3. insert_text(file_path: str, line_number: int, text: str)
     • Insert text at specific line number
     • Preserve indentation context
     • Support multiple lines
     • Return confirmation with line range

  4. delete_lines(file_path: str, start_line: int, end_line: int)
     • Delete range of lines (inclusive)
     • Validate line numbers exist
     • Return deleted content for undo capability
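The core rules for three of these tools can be sketched as pure functions on in-memory text, which keeps them easy to test before any file I/O is added. This is a sketch under stated assumptions: line numbers are 1-based, insert_text places text before the given line, and the functions raise ValueError rather than returning error strings. None of these choices are confirmed by the plan.

```python
from typing import List, Tuple


def replace_block(content: str, search: str, replace: str) -> str:
    """Replace exactly one occurrence of `search`; raise if absent or ambiguous."""
    count = content.count(search)
    if count == 0:
        raise ValueError("search text not found")
    if count > 1:
        raise ValueError(f"search text is ambiguous ({count} matches)")
    return content.replace(search, replace, 1)


def insert_text(lines: List[str], line_number: int, text: str) -> List[str]:
    """Insert `text` (possibly multi-line) before 1-based `line_number`."""
    if not 1 <= line_number <= len(lines) + 1:
        raise ValueError(f"line {line_number} out of range")
    new_lines = text.splitlines(keepends=True)
    # Make sure the inserted text does not run into the following line.
    if new_lines and not new_lines[-1].endswith("\n"):
        new_lines[-1] += "\n"
    return lines[: line_number - 1] + new_lines + lines[line_number - 1 :]


def delete_lines(
    lines: List[str], start_line: int, end_line: int
) -> Tuple[List[str], List[str]]:
    """Delete a 1-based inclusive range; return (remaining, deleted) for undo."""
    if not 1 <= start_line <= end_line <= len(lines):
        raise ValueError(f"invalid range {start_line}-{end_line}")
    deleted = lines[start_line - 1 : end_line]
    return lines[: start_line - 1] + lines[end_line:], deleted


if __name__ == "__main__":
    source = ["one\n", "two\n", "three\n"]
    print(insert_text(source, 2, "new"))
    print(delete_lines(source, 1, 2))
```

Rejecting ambiguous matches in replace_block means the AI has to supply enough surrounding context to pin down a single location, which matches the "fail if found multiple times" rule above.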

Implementation Details

File: ppxai/engine/tools/builtin/editor.py

from ppxai.engine.tools.base import BaseTool
from typing import Dict, Any
import difflib
from pathlib import Path

class ApplyPatchTool(BaseTool):
    """Apply unified diff patch to a file."""

    def name(self) -> str:
        return "apply_patch"

    def description(self) -> str:
        return "Apply a unified diff patch to a file"

    def parameters(self) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "file_path": {
                    "type": "string",
                    "description": "Path to file to patch"
                },
                "unified_diff": {
                    "type": "string",
                    "description": "Unified diff format patch"
                }
            },
            "required": ["file_path", "unified_diff"]
        }

    async def execute(self, file_path: str, unified_diff: str) -> str:
        # Implementation with safety checks
        # - Validate file exists
        # - Parse diff
        # - Apply atomically
        # - Rollback on failure
        pass

# Similar for ReplaceBlockTool, InsertTextTool, DeleteLinesTool
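The "apply atomically, rollback on failure" steps in the stub above could be built on a write-to-temp-then-rename pattern. Here is a minimal sketch, assuming a hypothetical helper `atomic_write` (the name and signature are illustrative, not taken from ppxai):

```python
import os
import tempfile
from pathlib import Path


def atomic_write(file_path: str, new_content: str) -> None:
    """Write `new_content` so the file is either fully updated or untouched."""
    target = Path(file_path)
    # Create the temp file next to the target so the final rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name, suffix=".tmp"
    )
    try:
        # newline="" writes line endings exactly as given (no translation).
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
            tmp.write(new_content)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)  # atomic rename on POSIX
    except BaseException:
        # Roll back: discard the temp file and leave the original as it was.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_file = os.path.join(demo_dir, "demo.txt")
    Path(demo_file).write_text("old\n", encoding="utf-8")
    atomic_write(demo_file, "new\n")
    print(Path(demo_file).read_text(encoding="utf-8"))
```

One caveat of this approach: the replaced file takes on the temp file's permissions, so a real tool would probably copy the original mode across before the rename.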

Manual Testing (Phase 1)

Test 1: Tool Functionality (test each tool individually without /agent)

# Test apply_patch
User: "Create a patch to fix the typo in README.md line 42 and apply it"
→ Consent prompt: "⚠️  AI wants to edit README.md. Allow? (y/n/always/never)"
→ User: "y"
AI: [generates diff, calls apply_patch]
→ Verify: Typo fixed in README.md

# Test replace_block
User: "In auth.py, replace the old login function with this new implementation: [code]"
→ Consent prompt: "⚠️  AI wants to edit auth.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls replace_block with search/replace]
→ Verify: Function replaced correctly

# Test insert_text
User: "Add error handling at line 100 in client.py: [code]"
→ Consent prompt: "⚠️  AI wants to edit client.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls insert_text]
→ Verify: Code inserted at correct line

# Test delete_lines
User: "Delete the deprecated code from lines 50-75 in utils.py"
→ Consent prompt: "⚠️  AI wants to edit utils.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls delete_lines]
→ Verify: Lines deleted correctly

Test 2: Consent Flow

# Test per-file consent (already allowed)
User: "Fix another typo in README.md line 100"
→ NO consent prompt (already allowed in this session)
AI: [calls apply_patch]
→ Verify: No interruption for same file

# Test consent denial
User: "Edit config.json to change port to 8080"
→ Consent prompt: "⚠️  AI wants to edit config.json. Allow? (y/n/always/never)"
→ User: "n"
AI: [receives error from tool]
→ Verify: AI reports edit was denied, file unchanged

# Test "always" mode
User: "Refactor server.py to use async/await"
→ Consent prompt: "⚠️  AI wants to edit server.py. Allow? (y/n/always/never)"
→ User: "always"
AI: [calls replace_block multiple times]
→ Verify: No more prompts for any file

# Test "never" mode (new session)
User: "/clear" (start fresh session)
User: "/tools enable"
User: "Fix bug in database.py"
→ Consent prompt: "⚠️  AI wants to edit database.py. Allow? (y/n/always/never)"
→ User: "never"
AI: [receives error from tool]
→ Verify: All edit attempts denied for entire session

Test 3: VSCode Extension

  • Test modal dialog appears in VSCode
  • Test all consent options work
  • Test consent state persists across multiple edits

Success Criteria:

  • ✅ All 4 tools work reliably
  • ✅ Proper error messages on failure
  • ✅ Atomic operations (no partial edits)
  • ✅ Clear confirmation messages
  • ✅ Consent prompts appear for first edit of each file
  • ✅ Consent state persists within session
  • ✅ "y" allows specific file, "always" allows all files, "never" blocks all
  • ✅ Denied edits return clear error to AI
  • ✅ VSCode modal dialog works correctly

Phase 1 Deliverables Summary

What Gets Built:

  1. ✅ 4 file editing tools (apply_patch, replace_block, insert_text, delete_lines)
  2. ✅ Per-file session consent mechanism
  3. ✅ Session state fields (allowed_files, edit_consent_mode)
  4. ✅ Async consent callback pattern
  5. ✅ TUI consent prompt integration
  6. ✅ VSCode consent dialog integration
  7. ✅ New EventType.CONSENT_REQUEST
  8. ✅ New HTTP endpoint /consent/respond
  9. ✅ Tool registration in builtin system

What This Enables:

  • ✅ Immediate Value: Safe autonomous file editing
  • ✅ Phase 2/3 Foundation: Example of clean tool implementation
  • ✅ Phase 5 Critical: Consent mechanism for /agent loop
  • ✅ Backward Compatible: Both TUI and VSCode keep working
  • ✅ Future-Proof: Designed for autonomous multi-step workflows

What Doesn't Break:

  • ✅ Existing tools (read_file, search_files, etc.) unchanged
  • ✅ Existing commands unchanged
  • ✅ Existing TUI workflows unchanged
  • ✅ Existing VSCode workflows unchanged
  • ✅ No configuration changes required

Files Created/Modified:

NEW:     ppxai/engine/tools/builtin/editor.py        (~400 lines)

MODIFIED: ppxai/engine/session.py                    (+20 lines)
MODIFIED: ppxai/engine/types.py                      (+5 lines)
MODIFIED: ppxai/engine/client.py                     (+40 lines)
MODIFIED: ppxai/engine/tools/builtin/__init__.py     (+2 lines)
MODIFIED: ppxai/main.py or commands.py               (+30 lines)
MODIFIED: vscode-extension/src/chatPanel.ts          (+50 lines)
MODIFIED: ppxai/server/http.py                       (+30 lines)

TOTAL: ~577 lines added (all additive, no deletions)

Time Investment: 6-8 hours
Risk Level: Low (additive changes only, well-architected consent)
Validation: Manual testing (Test 1-3 above)
Next Phase Dependency: None (Phases 2-3 can proceed independently)


Phase 2: @git Context Provider (2-3 hours)

Goal: Automatically inject git diff context when user references @git

Implementation

File: ppxai/engine/context.py (extend existing ContextInjector)

class ContextInjector:
    # ... existing code ...

    GIT_PATTERN = r'@git\b'

    def inject_git_context(self, working_dir: str) -> Optional[InjectedContext]:
        """
        Inject git diff (staged + unstaged) as context.

        Returns:
            InjectedContext with git diff or None if not in git repo
        """
        import subprocess

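        try:
            # Get unstaged changes. check=True makes a non-zero git exit
            # (e.g. working_dir is not a repository) raise CalledProcessError;
            # without it the except clause below can never trigger.
            unstaged = subprocess.run(
                ['git', 'diff'],
                cwd=working_dir,
                capture_output=True,
                text=True,
                check=True
            )

            # Get staged changes
            staged = subprocess.run(
                ['git', 'diff', '--staged'],
                cwd=working_dir,
                capture_output=True,
                text=True,
                check=True
            )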

            # Combine with headers
            content = ""
            if staged.stdout.strip():
                content += "=== Staged Changes ===\n"
                content += staged.stdout + "\n"

            if unstaged.stdout.strip():
                content += "=== Unstaged Changes ===\n"
                content += unstaged.stdout + "\n"

            if not content:
                return InjectedContext(
                    source="@git",
                    content="No changes in working directory",
                    language="text",
                    truncated=False,
                    size=0
                )

            return InjectedContext(
                source="@git",
                content=content,
                language="diff",
                truncated=len(content) > self.MAX_FILE_SIZE,
                size=len(content)
            )

        except (subprocess.CalledProcessError, FileNotFoundError):
            # Not a git repository, or git / working_dir does not exist
            return None
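The code above sets the `truncated` flag when the diff exceeds MAX_FILE_SIZE but passes the full content through. One possible way to actually shorten a large diff is to cut on a line boundary so no diff line is split. This is a sketch only; `truncate_diff` and its marker text are hypothetical names, not part of ContextInjector.

```python
from typing import Tuple


def truncate_diff(content: str, max_size: int) -> Tuple[str, bool]:
    """Cut `content` to at most `max_size` characters on a line boundary.

    Returns (text, truncated). A short marker line is appended after the
    kept portion so the reader knows the diff is incomplete.
    """
    if len(content) <= max_size:
        return content, False
    cut = content.rfind("\n", 0, max_size)
    # Fall back to a hard cut if the first line alone exceeds the limit.
    kept = content[: cut + 1] if cut != -1 else content[:max_size]
    return kept + "[... diff truncated ...]\n", True


if __name__ == "__main__":
    print(truncate_diff("line1\nline2\nline3\n", 8))
```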

Update message enhancement:

def enhance_message(self, message: str, working_dir: Optional[str] = None) -> Tuple[str, List[InjectedContext]]:
    """Enhanced to support @git pattern."""
    # ... existing @file logic ...

    # Check for @git pattern
    if re.search(self.GIT_PATTERN, message):
        git_ctx = self.inject_git_context(working_dir or self.working_dir)
        if git_ctx:
            contexts.append(git_ctx)

    # ... rest of enhancement logic ...
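The `\b` word boundary in these patterns is worth checking in isolation, since it decides which messages trigger injection. A small standalone sketch follows; `detect_providers` is a hypothetical helper for illustration, and the patterns are copied from the ContextInjector snippets in this plan.

```python
import re
from typing import List

GIT_PATTERN = r"@git\b"
TREE_PATTERN = r"@tree\b"


def detect_providers(message: str) -> List[str]:
    """Return the context providers referenced in `message`, in a fixed order."""
    providers = []
    if re.search(GIT_PATTERN, message):
        providers.append("@git")
    if re.search(TREE_PATTERN, message):
        providers.append("@tree")
    return providers


if __name__ == "__main__":
    for text in ("Review @git and @tree", "see @github", "user@git.example.com"):
        print(text, "->", detect_providers(text))
```

The boundary keeps "@github" from matching, but an address such as "user@git.example.com" still does. Requiring whitespace or start-of-string before the "@" would be one way to tighten this if it causes trouble in practice.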

Manual Testing (Phase 2)

# Test basic git context
User: "What changes did I make @git"
AI: [sees diff context, summarizes changes]

# Test with no changes
User: "Review my changes @git"
AI: [sees "No changes", responds appropriately]

# Test code review workflow
User: "Review my authentication changes for security issues @git"
AI: [analyzes diff, provides security review]

# Test combined with file
User: "Compare my changes @git with the original design in DESIGN.md"
AI: [sees both git diff and DESIGN.md content]

Success Criteria:

  • @git detected in messages
  • Staged and unstaged changes both captured
  • Proper formatting in context block
  • Graceful handling of non-git directories
  • Works alongside @file references


Phase 3: @tree Context Provider (2-3 hours)

Goal: Inject visual project structure when user references @tree

Implementation

File: ppxai/engine/context.py (extend existing ContextInjector)

class ContextInjector:
    # ... existing code ...

    TREE_PATTERN = r'@tree\b'

    def inject_tree_context(self, working_dir: str, max_depth: int = 3) -> InjectedContext:
        """
        Inject directory tree structure as context.

        Args:
            working_dir: Root directory to tree
            max_depth: Maximum depth to traverse

        Returns:
            InjectedContext with tree structure
        """
        from pathlib import Path

        def build_tree(path: Path, prefix: str = "", depth: int = 0) -> str:
            """Recursively build tree structure."""
            # depth starts at 0, so >= limits output to max_depth levels
            if depth >= max_depth:
                return ""

            # Respect .gitignore
            gitignore_patterns = self._load_gitignore(path)

            # Filter ignored entries first so the last *visible* item gets "└── "
            items = [
                item
                for item in sorted(path.iterdir(), key=lambda x: (not x.is_dir(), x.name))
                if not self._is_ignored(item, gitignore_patterns)
            ]
            output = []

            for i, item in enumerate(items):
                is_last = i == len(items) - 1
                current_prefix = "└── " if is_last else "├── "
                next_prefix = "    " if is_last else "│   "

                if item.is_dir():
                    output.append(f"{prefix}{current_prefix}{item.name}/")
                    output.append(build_tree(item, prefix + next_prefix, depth + 1))
                else:
                    output.append(f"{prefix}{current_prefix}{item.name}")

            return "\n".join(filter(None, output))

        tree = build_tree(Path(working_dir))

        # Add header with stats (directory lines end with "/")
        tree_lines = tree.splitlines()
        total_dirs = sum(1 for line in tree_lines if line.endswith('/'))
        total_files = len(tree_lines) - total_dirs

        content = f"Project: {Path(working_dir).name}\n"
        content += f"Directories: {total_dirs}, Files: {total_files}\n"
        content += f"Max depth: {max_depth}\n\n"
        content += tree

        return InjectedContext(
            source="@tree",
            content=content,
            language="text",
            truncated=False,
            size=len(content)
        )

    def _load_gitignore(self, path: Path) -> List[str]:
        """Load .gitignore patterns."""
        gitignore_file = path / '.gitignore'
        if gitignore_file.exists():
            with open(gitignore_file) as f:
                return [line.strip() for line in f if line.strip() and not line.startswith('#')]
        return []

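    def _is_ignored(self, path: Path, patterns: List[str]) -> bool:
        """Check if path matches gitignore patterns (simplified name matching)."""
        import fnmatch
        name = path.name
        # Strip the trailing "/" of directory patterns such as "build/" so they match by name
        return any(fnmatch.fnmatch(name, pattern.rstrip('/')) for pattern in patterns)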

Manual Testing (Phase 3)

# Test basic tree
User: "Show me the project structure @tree"
AI: [sees tree, describes structure]

# Test architectural planning
User: "Where should I add the new caching layer @tree"
AI: [analyzes structure, suggests location]

# Test refactoring guidance
User: "I want to reorganize the API modules. Current structure: @tree"
AI: [sees structure, provides refactoring suggestions]

# Test combined contexts
User: "Review my changes @git in the context of the project structure @tree"
AI: [sees both, provides comprehensive review]

# Test depth control (future enhancement)
User: "Show a shallow tree @tree depth=1"
AI: [shows only top-level structure]
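The depth control in the last test is listed as a future enhancement, so its syntax is not settled. Assuming the `@tree depth=N` form shown in that test, parsing could look like the sketch below; `parse_tree_reference` is a hypothetical helper, and the default of 3 mirrors `max_depth` in inject_tree_context.

```python
import re
from typing import Optional

# Matches "@tree" optionally followed by " depth=<digits>" (assumed syntax).
TREE_DEPTH_PATTERN = re.compile(r"@tree\b(?:\s+depth=(\d+))?")


def parse_tree_reference(message: str, default_depth: int = 3) -> Optional[int]:
    """Return the requested depth, the default if none is given,
    or None when the message does not reference @tree at all."""
    match = TREE_DEPTH_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)) if match.group(1) else default_depth


if __name__ == "__main__":
    print(parse_tree_reference("Show a shallow tree @tree depth=1"))
    print(parse_tree_reference("Show me the project structure @tree"))
```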

Success Criteria:

  • @tree detected in messages
  • Directory structure displayed correctly
  • .gitignore patterns respected
  • Reasonable depth limit (3 levels)
  • File/directory counts included
  • Works with @git and @file


Phase 4: Manual Testing & Refinement (3-4 hours)

Goal: Dogfood all new features to validate design and find issues

Comprehensive Test Scenarios

Scenario 1: File Editing Workflow

1. User: "Read test_commands.py and find the failing test"
   AI: [uses read_file tool]

2. User: "The test at line 185 is checking the wrong value. Fix it using replace_block"
   AI: [uses editor.replace_block]

3. User: "Run pytest to verify the fix"
   AI: [uses shell.execute_shell_command]

4. User: "If it still fails, apply this patch: [diff]"
   AI: [uses editor.apply_patch if needed]

Scenario 2: Code Review Workflow

1. User: "What changes are in my working directory @git"
   AI: [sees staged + unstaged diffs, summarizes]

2. User: "Are there any bugs or security issues in these changes @git"
   AI: [analyzes diff, provides review]

3. User: "Fix the SQL injection vulnerability you found"
   AI: [uses editor tool to fix]

4. User: "Show me the updated diff @git"
   AI: [shows new diff with fix applied]

Scenario 3: Architecture Planning

1. User: "Show me the current structure @tree"
   AI: [displays tree]

2. User: "I need to add caching. Where should the cache module go @tree"
   AI: [analyzes structure, suggests location]

3. User: "Create cache.py in that location with basic structure"
   AI: [uses editor.insert_text or creates file]

4. User: "Show updated structure @tree"
   AI: [displays tree with new file]

Scenario 4: Combined Workflow (Ultimate Test)

1. User: "Review my authentication changes @git against the project structure @tree"
   AI: [uses both contexts for comprehensive analysis]

2. User: "You suggested moving auth.py to src/auth/. Do that and update imports"
   AI: [uses multiple editor tools for refactoring]

3. User: "Verify no broken imports @tree"
   AI: [checks structure, may run tests]

Refinement Checklist

Based on testing, refine:

  • [ ] Truncation Limits: Adjust MAX_FILE_SIZE if needed
  • [ ] Error Messages: Ensure clear, actionable errors
  • [ ] Safety Confirmations: Add warnings for destructive operations
  • [ ] Context Formatting: Polish how diffs/trees display
  • [ ] Performance: Optimize tree generation for large projects
  • [ ] Edge Cases: Handle empty repos, binary files, symlinks
  • [ ] User Feedback: Add progress indicators for slow operations

Issues to Watch For

  1. File Editing:
     • Encoding issues (UTF-8 vs ASCII)
     • Line ending differences (CRLF vs LF)
     • Indentation preservation
     • Partial edit failures

  2. Git Context:
     • Large diffs truncation strategy
     • Binary file diffs
     • Merge conflicts in diff
     • Submodule handling

  3. Tree Context:
     • Symlink loops
     • Hidden files strategy
     • Large directory structures
     • .gitignore edge cases
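For the line-ending issue in particular, one approach is to detect the file's dominant newline before editing and rewrite inserted text to match. A minimal sketch, with hypothetical helper names that are not part of ppxai:

```python
def detect_newline(content: str) -> str:
    """Return the dominant line ending in `content` (CRLF or LF)."""
    crlf = content.count("\r\n")
    lf = content.count("\n") - crlf  # bare LF only
    return "\r\n" if crlf > lf else "\n"


def normalize_newlines(text: str, newline: str) -> str:
    """Rewrite every line ending in `text` to `newline`."""
    return text.replace("\r\n", "\n").replace("\n", newline)


if __name__ == "__main__":
    existing = "first\r\nsecond\r\n"
    inserted = normalize_newlines("new line\n", detect_newline(existing))
    print(repr(inserted))
```

Detection only works if the file is read without newline translation, for example with `open(path, newline="")`. In default text mode Python converts CRLF to LF on read and the original style is lost.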

Phase 5: /agent Loop Implementation (6-8 hours)

Goal: Build autonomous execution loop using the proven tools

Now that we have confidence in the tools, implement the agent loop:

Command Handler

File: ppxai/commands.py (add new handler)

def handle_agent(self, args: str):
    """Handle /agent command for autonomous task execution."""
    if not args.strip():
        console.print("[red]Usage: /agent <task description>[/red]\n")
        return

    task = args.strip()
    max_iterations = 5  # Configurable

    console.print(f"[cyan]🤖 Starting autonomous task:[/cyan] {task}\n")
    console.print(f"[dim]Max iterations: {max_iterations}[/dim]\n")

    iteration = 0
    task_complete = False

    while iteration < max_iterations and not task_complete:
        iteration += 1
        console.print(f"[yellow]--- Iteration {iteration}/{max_iterations} ---[/yellow]\n")

        # Construct prompt for iteration
        if iteration == 1:
            prompt = f"""Task: {task}

Please work on this task autonomously. You have access to tools for:
- File editing (apply_patch, replace_block, insert_text, delete_lines)
- File reading and searching
- Shell commands
- Git context (@git)
- Project structure (@tree)

After each action, assess if the task is complete. If complete, respond with:
TASK_COMPLETE: <summary of what was done>

If you need to continue, explain what you're doing and call the appropriate tools."""
        else:
            prompt = f"""Continue working on the task: {task}

Previous iteration completed. Assess the current state and continue if needed.
If the task is complete, respond with:
TASK_COMPLETE: <summary>"""

        # Send to AI
        try:
            response = self.client.chat(prompt)

            # Check for completion signal
            if "TASK_COMPLETE:" in response:
                task_complete = True
                # maxsplit=1 keeps the whole summary even if the marker appears again
                summary = response.split("TASK_COMPLETE:", 1)[1].strip()
                console.print(f"\n[green]✅ Task completed:[/green] {summary}\n")

        except KeyboardInterrupt:
            console.print("\n[yellow]Agent loop interrupted by user[/yellow]\n")
            break
        except Exception as e:
            console.print(f"[red]Error in iteration {iteration}: {e}[/red]\n")
            break

    if not task_complete and iteration >= max_iterations:
        console.print(f"[yellow]⚠️  Task incomplete after {max_iterations} iterations[/yellow]\n")
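
The termination behaviour of the handler above can be checked in isolation with a reduced sketch. Here `run_agent_loop` and `scripted` are hypothetical names, the prompts are shortened, and the `chat` callable stands in for `self.client.chat`; console output is left out so that only the three exit paths (completion, interrupt, iteration limit) remain.

```python
from typing import Callable, Iterable, Optional, Tuple

COMPLETE_MARKER = "TASK_COMPLETE:"


def run_agent_loop(
    chat: Callable[[str], str],
    task: str,
    max_iterations: int = 5,
) -> Tuple[bool, Optional[str], int]:
    """Run the loop and return (completed, summary, iterations_used)."""
    for iteration in range(1, max_iterations + 1):
        if iteration == 1:
            prompt = f"Task: {task}"
        else:
            prompt = f"Continue working on the task: {task}"
        try:
            response = chat(prompt)
        except KeyboardInterrupt:
            # Ctrl-C ends the loop without marking the task complete.
            return False, None, iteration
        if COMPLETE_MARKER in response:
            summary = response.split(COMPLETE_MARKER, 1)[1].strip()
            return True, summary, iteration
    # The model never signalled completion within the limit.
    return False, None, max_iterations


def scripted(responses: Iterable[str]) -> Callable[[str], str]:
    """Build a fake chat function that replays canned responses in order."""
    replies = iter(responses)
    return lambda prompt: next(replies)
```

A scripted client such as `scripted(["working", "TASK_COMPLETE: fixed test"])` makes the loop deterministic, which is convenient for the Phase 6 unit tests.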

Engine Support

File: ppxai/engine/client.py (add agent mode)

def chat_agent(self, task: str, max_iterations: int = 5) -> Generator[Event, None, None]:
    """
    Autonomous agent mode - AI loops until task complete or max iterations.

    Args:
        task: High-level task description
        max_iterations: Max number of AI turns

    Yields:
        Events for each iteration
    """
    for iteration in range(1, max_iterations + 1):
        # Emit iteration start event
        yield Event(
            type=EventType.AGENT_ITERATION,
            data={"iteration": iteration, "max": max_iterations}
        )

        # Construct iteration prompt
        prompt = self._build_agent_prompt(task, iteration)

        # Stream chat response
        response_text = ""
        for event in self.chat(prompt):
            yield event
            if event.type == EventType.STREAM_CHUNK:
                response_text += event.data

        # Check for completion
        if self._task_complete(response_text):
            yield Event(
                type=EventType.AGENT_COMPLETE,
                data={"iterations": iteration, "summary": self._extract_summary(response_text)}
            )
            break
    else:
        # Max iterations reached
        yield Event(
            type=EventType.AGENT_MAX_ITERATIONS,
            data={"iterations": max_iterations}
        )

def _build_agent_prompt(self, task: str, iteration: int) -> str:
    """Build prompt for agent iteration."""
    # Implementation
    pass

def _task_complete(self, response: str) -> bool:
    """Check if AI signaled task completion."""
    return "TASK_COMPLETE:" in response

def _extract_summary(self, response: str) -> str:
    """Extract completion summary from response."""
    if "TASK_COMPLETE:" in response:
        # maxsplit=1 keeps the whole summary even if the marker appears again
        return response.split("TASK_COMPLETE:", 1)[1].strip()
    return ""
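
The `_build_agent_prompt` stub above is left unimplemented in the plan. One possible shape, assuming it simply reuses the two prompts from the command handler, is sketched below as a free function; the name `build_agent_prompt` and the `COMPLETE_MARKER` constant are illustrative, and the tool list from the handler prompt is omitted for brevity.

```python
COMPLETE_MARKER = "TASK_COMPLETE:"


def build_agent_prompt(task: str, iteration: int) -> str:
    """Return the first-iteration prompt or the shorter follow-up prompt."""
    if iteration == 1:
        return (
            f"Task: {task}\n\n"
            "Please work on this task autonomously. After each action, "
            "assess if the task is complete. If complete, respond with:\n"
            f"{COMPLETE_MARKER} <summary of what was done>\n\n"
            "If you need to continue, explain what you're doing and call "
            "the appropriate tools."
        )
    return (
        f"Continue working on the task: {task}\n\n"
        "Previous iteration completed. Assess the current state and "
        "continue if needed.\n"
        "If the task is complete, respond with:\n"
        f"{COMPLETE_MARKER} <summary>"
    )
```

Because both prompts contain the marker text themselves, the completion check should only ever inspect the model's response, never a transcript that includes the prompt.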

Safety Features

  1. Max Iterations: Prevent infinite loops
  2. Interrupt Support: Ctrl-C breaks agent loop
  3. Confirmation Prompts: For destructive operations
  4. Rollback Capability: Track edits for undo
  5. Progress Logging: Clear visibility into agent actions
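
The rollback item in the list above could be approached by snapshotting each file before it is edited. The `EditJournal` class below is a hypothetical sketch, not an existing ppxai component: it keeps snapshots in memory, so it suits small files within a single session only.

```python
from pathlib import Path
from typing import List, Optional, Tuple


class EditJournal:
    """Records each file's content before an edit so edits can be undone."""

    def __init__(self) -> None:
        # Each entry is (path, previous bytes), or None if the file did not exist.
        self._entries: List[Tuple[Path, Optional[bytes]]] = []

    def record(self, path: Path) -> None:
        """Snapshot `path`; call this just before modifying the file."""
        previous = path.read_bytes() if path.exists() else None
        self._entries.append((path, previous))

    def undo_last(self) -> Optional[Path]:
        """Restore the most recently recorded file and return its path."""
        if not self._entries:
            return None
        path, previous = self._entries.pop()
        if previous is None:
            path.unlink(missing_ok=True)  # the edit created this file
        else:
            path.write_bytes(previous)
        return path

    def undo_all(self) -> int:
        """Undo every recorded edit, newest first; return how many were undone."""
        count = 0
        while self.undo_last() is not None:
            count += 1
        return count
```

Undoing newest-first matters when the same file is edited several times in one agent run, since each snapshot only captures the state immediately before its own edit.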

Manual Testing (Phase 5)

# Test simple task
/agent Fix the failing test in test_commands.py

# Test multi-step task
/agent Review my changes @git and fix any issues you find

# Test planning task
/agent Reorganize the auth module based on @tree structure

# Test complex task
/agent Implement caching for the API, add tests, and update docs

# Test interrupt
/agent <long task>
[Press Ctrl-C during execution]

# Test max iterations
/agent <task that takes 10+ steps>
[Should stop at 5 iterations with clear message]

Success Criteria:

  • Agent completes simple tasks autonomously
  • Proper iteration tracking
  • Clean interrupt handling
  • Clear completion/failure messaging
  • Doesn't exceed max iterations


Phase 6: Testing & Documentation (4-5 hours)

Goal: Comprehensive testing and documentation for v1.11.0

Unit Tests

File: tests/test_editor_tools.py

def test_apply_patch():
    """Test unified diff patch application."""
    # Create temp file
    # Apply valid patch
    # Verify result
    # Test invalid patch
    # Test non-existent file

def test_replace_block():
    """Test block replacement."""
    # Test exact match
    # Test no match
    # Test multiple matches (should fail)
    # Test case sensitivity

def test_insert_text():
    """Test text insertion."""
    # Test at line number
    # Test out of bounds
    # Test indentation preservation

def test_delete_lines():
    """Test line deletion."""
    # Test valid range
    # Test invalid range
    # Test beyond file end
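
To make the `test_replace_block` outline above concrete, the sketch below pairs it with a string-level stand-in. This `replace_block` is hypothetical: the real tool works on files and its signature may differ. Only the matching rules from the outline (exact match, no match, multiple matches fail, case sensitivity) are modelled.

```python
def replace_block(text: str, old: str, new: str) -> str:
    """Replace the single occurrence of `old` in `text` with `new`.

    Hypothetical stand-in; raises ValueError when `old` is empty,
    missing, or matches more than once.
    """
    if not old:
        raise ValueError("search block must not be empty")
    matches = text.count(old)  # counts non-overlapping occurrences
    if matches == 0:
        raise ValueError("block not found")
    if matches > 1:
        raise ValueError(f"block is ambiguous ({matches} matches)")
    return text.replace(old, new, 1)


def fails(text: str, old: str, new: str) -> bool:
    """Return True when replace_block rejects the edit."""
    try:
        replace_block(text, old, new)
    except ValueError:
        return True
    return False


def test_replace_block() -> None:
    source = "a = 1\nb = 2\n"
    # Exact match is replaced and the rest of the text is untouched.
    assert replace_block(source, "b = 2", "b = 3") == "a = 1\nb = 3\n"
    assert fails(source, "c = 3", "c = 4")   # no match
    assert fails("x\nx\n", "x", "y")         # multiple matches are ambiguous
    assert fails(source, "B = 2", "b = 3")   # matching is case-sensitive
```

Rejecting ambiguous matches, rather than replacing the first one, keeps an edit from landing in the wrong place without the user noticing.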

File: tests/test_context_providers.py

def test_git_context_injection():
    """Test @git context provider."""
    # Create temp git repo
    # Make changes
    # Test @git detection
    # Verify diff in context
    # Test no changes
    # Test non-git directory

def test_tree_context_injection():
    """Test @tree context provider."""
    # Create temp directory structure
    # Test @tree detection
    # Verify tree in context
    # Test depth limiting
    # Test .gitignore respect

def test_combined_contexts():
    """Test multiple context providers together."""
    # Test @git + @tree
    # Test @file + @git
    # Test all three

File: tests/test_agent_loop.py

def test_agent_basic_task():
    """Test simple agent task completion."""
    # Mock AI that completes in 1 iteration
    # Verify completion event
    # Verify summary extracted

def test_agent_max_iterations():
    """Test max iteration limit."""
    # Mock AI that never completes
    # Verify stops at max iterations
    # Verify appropriate event

def test_agent_interrupt():
    """Test agent interrupt handling."""
    # Start agent task
    # Interrupt during iteration
    # Verify clean shutdown
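
The three outlines above could be filled in along the following lines. This is a sketch under assumptions: `Event` and `chat_agent` here are simplified stand-ins that copy the control flow of the engine method (string event types instead of `EventType`, a plain `chat` callable instead of the streaming client), so the real tests would import the engine classes instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterator


@dataclass
class Event:
    """Stand-in for the engine's Event type."""
    type: str
    data: Any


def chat_agent(chat: Callable[[str], str], task: str,
               max_iterations: int = 5) -> Iterator[Event]:
    """Stand-in generator with the same for/else structure as the engine."""
    for iteration in range(1, max_iterations + 1):
        yield Event("agent_iteration", {"iteration": iteration, "max": max_iterations})
        response = chat(task)
        if "TASK_COMPLETE:" in response:
            summary = response.split("TASK_COMPLETE:", 1)[1].strip()
            yield Event("agent_complete", {"iterations": iteration, "summary": summary})
            break
    else:
        # Runs only when the loop finished without break.
        yield Event("agent_max_iterations", {"iterations": max_iterations})


def test_agent_basic_task() -> None:
    events = list(chat_agent(lambda prompt: "TASK_COMPLETE: done", "task"))
    assert [e.type for e in events] == ["agent_iteration", "agent_complete"]
    assert events[-1].data == {"iterations": 1, "summary": "done"}


def test_agent_max_iterations() -> None:
    events = list(chat_agent(lambda prompt: "still working", "task", max_iterations=3))
    assert [e.type for e in events].count("agent_iteration") == 3
    assert events[-1].type == "agent_max_iterations"


def test_agent_interrupt() -> None:
    def interrupted(prompt: str) -> str:
        raise KeyboardInterrupt

    gen = chat_agent(interrupted, "task")
    assert next(gen).type == "agent_iteration"
    try:
        next(gen)
        raised = False
    except KeyboardInterrupt:
        raised = True
    assert raised
    assert list(gen) == []  # the generator is finished after the interrupt
```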

Integration Tests

End-to-end workflows:

def test_code_review_workflow():
    """Test full code review workflow with @git."""
    # Setup git repo with changes
    # Ask for review with @git
    # Verify diff in context
    # Verify AI response references changes

def test_refactoring_workflow():
    """Test refactoring with @tree."""
    # Setup project structure
    # Ask for refactoring suggestions with @tree
    # Verify tree in context
    # Verify AI considers structure

def test_autonomous_debugging():
    """Test agent finding and fixing a bug."""
    # Create file with known bug
    # /agent fix the bug
    # Verify bug found
    # Verify fix applied
    # Verify tests pass

Documentation Updates

File: CLAUDE.md

## Current Version: v1.11.0 (Agentic Workflow)

**What's New in v1.11.0:**
- **Autonomous Execution**: New `/agent` command for multi-step task execution
- **Native File Editing**: Safe, atomic file editing tools (apply_patch, replace_block, insert_text, delete_lines)
- **Git Context**: `@git` automatically injects diffs for code review workflows
- **Project Structure**: `@tree` provides architectural awareness
- **Combined Workflows**: Mix @git, @tree, and @file for comprehensive context

**Examples:**
```bash
/agent Fix all failing tests
/agent Review my changes @git
/agent Refactor the auth module based on @tree
```

File: README.md

Update "What's New" section and add examples:

### v1.11.0 - Agentic Workflow

Transform ppxai into an autonomous developer agent:

- `/agent <task>` - Autonomous multi-step task execution
- `@git` - Automatic diff injection for code review
- `@tree` - Project structure awareness
- Native file editing tools (apply_patch, replace_block, insert_text, delete_lines)

**Example Workflows:**
```bash
# Autonomous debugging
/agent Fix the failing test in test_auth.py

# Code review with context
Review my authentication changes @git

# Architecture planning
Where should I add caching given this structure @tree

# Combined contexts
/agent Refactor the API based on @tree and review changes @git
```

File: docs/AGENTIC_WORKFLOW.md (New)

Create comprehensive guide:

# Agentic Workflow Guide

## Overview

The `/agent` command enables ppxai to work autonomously on multi-step tasks...

## File Editing Tools

### apply_patch
### replace_block
### insert_text
### delete_lines

## Context Providers

### @git - Git Diff Context
### @tree - Project Structure
### @file - File Content (existing)

## Agent Loop

### How It Works
### Best Practices
### Limitations
### Troubleshooting

## Example Workflows

### Debugging
### Code Review
### Refactoring
### Feature Implementation

File: vscode-extension/README.md

Add section on agentic features and context providers.


Timeline Summary

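| Phase | Component | Hours | Dependencies |
|-------|-----------|-------|--------------|
| 1 | File editing tools + consent | 6-8 | None |
| 2 | @git context | 2-3 | None |
| 3 | @tree context | 2-3 | None |
| 4 | Manual testing | 3-4 | Phases 1-3 |
| 5 | /agent loop | 6-8 | Phases 1-4 |
| 6 | Testing & docs | 4-5 | All phases |
| Total | | 23-31 | |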

Success Criteria

Phase 1 Complete When:

  • [ ] All 4 file editing tools implemented
  • [ ] Per-file session consent mechanism working
  • [ ] Each tool tested manually
  • [ ] Consent flow tested (y/n/always/never)
  • [ ] Atomic operations confirmed
  • [ ] Error handling validated
  • [ ] VSCode modal dialog working

Phase 2 Complete When:

  • [ ] @git pattern detected correctly
  • [ ] Staged and unstaged diffs captured
  • [ ] Works in git and non-git directories
  • [ ] Formatted appropriately in context

Phase 3 Complete When:

  • [ ] @tree pattern detected correctly
  • [ ] Directory structure displayed
  • [ ] .gitignore patterns respected
  • [ ] Reasonable depth limits enforced

Phase 4 Complete When:

  • [ ] All tools tested in real workflows
  • [ ] Edge cases identified and handled
  • [ ] Context formatting refined
  • [ ] Performance acceptable

Phase 5 Complete When:

  • [ ] /agent command implemented
  • [ ] Multi-iteration tasks work
  • [ ] Proper termination conditions
  • [ ] Interrupt handling works

Phase 6 Complete When:

  • [ ] All unit tests passing
  • [ ] Integration tests passing
  • [ ] Documentation complete
  • [ ] Examples validated

Benefits of Incremental Approach

  1. Lower Risk: Test each component in isolation
  2. Faster Feedback: Validate design decisions early
  3. Better Debugging: Isolate issues to specific components
  4. Incremental Value: Each phase delivers usable features
  5. Confidence Building: Know tools work before building automation
  6. User Input: Can incorporate feedback before /agent implementation

Next Steps

  1. Create feature branch: feature/agentic-workflow-v1.11.0
  2. Start with Phase 1: Implement file editing tools
  3. Test manually after each phase
  4. Proceed to next phase only when previous is validated
  5. Build /agent loop last with full confidence

Notes

  • All code must maintain backward compatibility
  • Security review required for file editing operations
  • Performance testing needed for large projects
  • Documentation must include real-world examples
  • Tests must cover edge cases and error conditions

Last Updated: 2025-12-20 Next Review: After Phase 1 completion