v1.11.x: Agentic Workflow Implementation Plan

Created: 2025-12-20
Updated: 2025-12-26
Status: Phases 1-4 Complete, Phase 5 In Progress

Phase Status

Phase	Description	Status	Version
1	File editing tools + consent	✅ Complete	v1.11.0
2	@git context provider	✅ Complete	v1.11.4
3	@tree context provider	✅ Complete	v1.11.4
4	Manual testing & refinement	✅ Complete	v1.11.5-v1.11.7
5	`/agent` loop implementation	🔧 In Progress	v1.11.8
6	Testing & documentation	⏳ Pending	v1.11.8

Note: v1.11.7 removed all legacy code (AIClient, PerplexityClientPromptTools, tool_manager). EngineClient is now the only client interface.

Branch: feature/adding-agent-loop (Phase 5 work)

User Guide: AGENT_MODE_GUIDE.md - Practical examples for research and development workflows

Workflow Diagrams

Current Turn-Based Flow (Pre-v1.11.8)

[Diagram: Current Non-Agentic Flow]

The current workflow is turn-based:
1. User sends a message
2. AI processes and optionally calls tools
3. AI returns a single response
4. User must manually direct the next action

Autonomous Agent Flow (v1.11.8+)

[Diagram: Future Agentic Flow]

The /agent command enables autonomous execution:
1. User issues /agent <task> command
2. AI enters an autonomous loop (max 5 iterations)
3. Each iteration: Plan → Execute tools → Check completion
4. Loop continues until TASK_COMPLETE: signal or max iterations
5. User can interrupt with Ctrl-C at any time

Overview

This release transforms ppxai from a turn-based chatbot into an autonomous developer agent capable of multi-step task execution. The implementation follows an incremental approach, building and testing individual components before combining them into the autonomous /agent loop.

System Semantics & Coherence

Current Command/Tool Landscape (v1.10.8)

1. Read-Only Tools (always available when tools enabled):
   - search_files - Find files with glob patterns
   - read_file - Read file contents
   - list_directory - List directory contents
   - Purpose: Discovery and inspection of codebase

2. Code Generation Commands (TUI slash commands, VSCode commands):
   - /generate, /test, /docs, /implement, /debug, /explain, /convert
   - Output: Generated code to chat (markdown)
   - User Action: Manual copy/paste to apply changes
   - Mode: Consultative - AI suggests, user applies

3. Utility Shell Tool (risky for code editing):
   - execute_shell_command - Run any shell command
   - Risk: Escaping issues, no atomicity, no validation
   - Use Case: Tests, builds, git commands (NOT file editing)

Semantic Enhancement (v1.11.0)

4. NEW: File Editing Tools (Phase 1 - enables autonomous mode):
   - apply_patch, replace_block, insert_text, delete_lines
   - Output: Direct file modification with atomic operations
   - Safety: Validation, dry-run, rollback capability
   - Mode: Agentic - AI applies changes autonomously

5. NEW: Context Providers (Phases 2-3):
   - @git - Auto-inject git diff for code review
   - @tree - Auto-inject project structure
   - Purpose: Automatic context awareness

6. NEW: Agent Loop (Phase 5):
   - /agent <task> - Autonomous multi-step execution
   - Flow: Plan → Execute tools → Verify → Repeat (max 5 iterations)
   - Uses: All tools above in combination

Coherent User Workflows

Workflow A: Consultative (Tools Disabled)

User: "/test utils.py"
AI: [Generates test code in chat]
User: [Copies code to test_utils.py manually]

Workflow B: Semi-Autonomous (Tools Enabled, Manual Direction)

User: "Fix the bug in auth.py line 42"
AI: [Uses read_file → Analyzes → Uses replace_block to fix]
User: [Sees confirmation, change applied]

Workflow C: Fully Autonomous (Tools + /agent)

User: "/agent implement user authentication with tests"
AI: [Loop 1] Plan: Create auth.py, write tests, verify
AI: [Loop 2] Executes: insert_text to create files
AI: [Loop 3] Executes: execute_shell_command to run tests
AI: [Loop 4] Tests fail → Uses replace_block to fix
AI: [Loop 5] Tests pass → Returns success

No Conflicts, Only Enhancement

Feature	Before v1.11.0	After v1.11.0
Code generation	✅ Output to chat	✅ Output to chat (unchanged)
File reading	✅ Via tools	✅ Via tools (unchanged)
File editing	⚠️ Manual or risky shell	✅ Safe atomic tools
Context awareness	Manual @file	✅ Manual @file + Auto @git/@tree
Multi-step tasks	Manual direction	✅ Autonomous /agent loop

Conclusion: Phase 1 tools enable a new autonomous mode while preserving the existing consultative mode. Users choose their workflow based on trust level and task complexity.

Architecture Goals

Current State (v1.10.8):
- Turn-based interaction: User → AI → User
- Tools available but require explicit user direction
- Manual context injection via @file references

Target State (v1.11.0):
- Autonomous multi-step execution with /agent command
- Safe file editing without shell escaping risks
- Automatic context injection via @git and @tree
- AI can plan, execute, and verify tasks independently

Implementation Phases

Phase 1: Native File Editing Tools (6-8 hours)

Goal: Create safe, atomic file editing tools with per-file session consent in ppxai/engine/tools/builtin/editor.py

Context: Existing Tools & Commands

Existing Tools (read-only, plus the shell tool):
- search_files(pattern, directory) - Find files with glob patterns
- read_file(filepath, max_lines) - Read file contents
- list_directory(path, format) - List directory contents
- execute_shell_command(command, working_dir) - Run shell commands (⚠️ not read-only; unsafe for file editing)

Existing Code Generation Slash Commands (Output to Chat):
- /generate <desc> - Generate code from description
- /test <file> - Generate unit tests for file
- /docs <file> - Generate documentation for file
- /implement <spec> - Implement feature from specification
- /debug <error> - Analyze and fix errors
- /explain <file> - Explain code logic
- /convert <lang1> <lang2> <file> - Convert code between languages
- /show <file> - Display file content (read-only)

Semantic Consistency: The current system has a clear pattern:
- Manual Mode: Slash commands generate code → User copies/applies manually
- Autonomous Mode (enabled by Phase 1): AI uses tools → Direct file modification

This creates two coherent workflows:
1. Consultative (tools disabled): AI suggests changes, user applies
2. Agentic (tools enabled): AI applies changes autonomously

Phase 1 tools complement existing commands by automating previously manual workflows, without conflicts or redundancy.

Why Not Use execute_shell_command?
- ❌ Shell escaping issues with complex code
- ❌ No atomic operations or rollback
- ❌ No dry-run validation
- ❌ Risk of partial writes on failure
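
The atomicity and partial-write concerns above can be illustrated with a small sketch. It assumes edits are staged in a temporary file in the same directory and then swapped into place with os.replace; the helper name atomic_write_text is hypothetical and is not taken from editor.py.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, new_content: str, encoding: str = "utf-8") -> None:
    """Write new_content to path in one step, or leave the original untouched on failure.

    Hypothetical helper: the plan does not specify how atomicity is achieved.
    """
    path = Path(path)
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        # newline="" disables newline translation so content is written byte-for-byte.
        with os.fdopen(fd, "w", encoding=encoding, newline="") as tmp:
            tmp.write(new_content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites the destination; on POSIX the rename is atomic.
        os.replace(tmp_name, path)
    except BaseException:
        # The original file was never opened for writing, so only the temp file needs cleanup.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A shell redirect such as `echo ... > file` truncates the target before writing, which is where partial writes come from; staging the full content first avoids that window.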

Design Decision: Use per-file consent instead of tool-level configuration for better UX and safety.

How It Works:

User: "/tools enable"  → All tools available (including edit tools)

[AI attempts to edit file_1.py]
→ Prompt: "⚠️  AI wants to edit file_1.py. Allow? (y/n/always/never)"
→ User: "y"
→ file_1.py added to session's allowed files
→ Edit proceeds

[AI attempts to edit file_1.py again]
→ No prompt (already consented)
→ Edit proceeds

[AI attempts to edit file_2.py]
→ Prompt: "⚠️  AI wants to edit file_2.py. Allow? (y/n/always/never)"
→ User: "always"
→ All future edits allowed (no more prompts this session)
→ Edit proceeds

Consent Options:
- y (yes) - Allow editing this specific file (this session)
- n (no) - Deny this edit (tool returns error to AI)
- always - Allow all file edits without prompting (autonomous mode)
- never - Deny all file edits (disables edit tools for session)

Implementation:

# Session state (ppxai/engine/session.py)
from pathlib import Path
from typing import Set

class Session:
    def __init__(self):
        self.allowed_files: Set[Path] = set()  # files approved with "y" this session
        self.edit_consent_mode: str = "ask"  # "ask", "always", "never"

# Edit tools base helper (ppxai/engine/tools/builtin/editor.py)
def _check_edit_consent(self, file_path: str) -> bool:
    """Check if user consents to editing this file.

    Returns:
        True if edit is allowed, False otherwise
    """
    from pathlib import Path

    path = Path(file_path).resolve()

    # Check global consent mode
    if self.session.edit_consent_mode == "always":
        return True
    if self.session.edit_consent_mode == "never":
        return False

    # Check if already consented for this file
    if path in self.session.allowed_files:
        return True

    # Prompt user (TUI or VSCode dialog)
    response = self._prompt_edit_consent(path)

    if response == "y":
        self.session.allowed_files.add(path)
        return True
    elif response == "always":
        self.session.edit_consent_mode = "always"
        return True
    elif response == "never":
        self.session.edit_consent_mode = "never"
        return False
    else:  # "n"
        return False

def _prompt_edit_consent(self, file_path: Path) -> str:
    """Prompt user for edit consent (TUI or VSCode)."""
    # TUI: Use prompt_toolkit input
    # VSCode: Use vscode.window.showWarningMessage with buttons
    pass

VSCode Extension Handling:

// Show modal dialog in VSCode
const response = await vscode.window.showWarningMessage(
    `AI wants to edit ${filePath}. Allow this change?`,
    { modal: true },
    'Yes',
    'Always (this session)',
    'No',
    'Never (this session)'
);

Benefits:
- ✅ Granular safety - Per-file consent instead of all-or-nothing
- ✅ Transparent - User sees exactly which files AI wants to edit
- ✅ Flexible - "always" option enables fully autonomous mode
- ✅ Session-scoped - Fresh consent each session for safety
- ✅ No upfront config - Works immediately with /tools enable
- ✅ Compatible with /agent - Only interrupts first time per file

Trade-offs:
- ⚠️ Interrupts /agent - First edit per file pauses for consent
  - Mitigated by: "always" option for uninterrupted autonomous workflow

Additional Time: +2 hours for consent mechanism implementation

Architectural Impact & Backward Compatibility

Analysis: Phase 1 integrates with existing architecture with zero breaking changes.

Current Architecture (v1.10.8):
- ✅ Event-based communication (EventType enum)
- ✅ Async tool execution (async def execute())
- ✅ UI-agnostic engine (EngineClient)
- ✅ TUI uses prompt_toolkit for input
- ✅ VSCode uses HTTP + SSE for communication
- ✅ Existing tools: read_file, search_files, list_directory, execute_shell_command

Changes Required (all additive, no modifications):

Component	File	Change	Lines	Breaking?
Session state	`ppxai/engine/session.py`	Add consent fields	~20	❌ No
Event types	`ppxai/engine/types.py`	Add CONSENT_REQUEST	~5	❌ No
Engine client	`ppxai/engine/client.py`	Add consent method	~40	❌ No
Edit tools	`ppxai/engine/tools/builtin/editor.py`	NEW FILE	~400	❌ No
Tool registration	`ppxai/engine/tools/builtin/__init__.py`	Import editor	~2	❌ No
TUI	`ppxai/main.py` or `commands.py`	Add consent callback	~30	❌ No
VSCode	`vscode-extension/src/chatPanel.ts`	Handle consent events	~50	❌ No
HTTP server	`ppxai/server/http.py`	Add consent endpoint	~30	❌ No
TOTAL		All additive	~577	❌ No

Consent Implementation Pattern:

# Engine accepts an optional async consent callback supplied by the UI
from typing import Awaitable, Callable, Optional

ConsentCallback = Callable[[str], Awaitable[tuple[bool, str]]]

class EngineClient:
    def __init__(self, consent_callback: Optional[ConsentCallback] = None):
        self.consent_callback = consent_callback  # UI provides this

# Tools use the callback
class EditTool(BaseTool):
    async def execute(self, file_path: str, **kwargs) -> str:
        # Request consent (awaits response from UI)
        if not await self._check_consent(file_path):
            return "Error: Edit permission denied by user"
        # Proceed...

# TUI: blocking prompt wrapped in an async callback
async def tui_consent(file_path: str) -> tuple[bool, str]:
    # prompt() is a blocking call; the event loop stalls until the user answers
    response = prompt(f"⚠️  Edit {file_path}? (y/n/always/never): ").strip().lower()
    return (response in ['y', 'always'], response)

# VSCode: Event-based via HTTP
# 1. Server emits CONSENT_REQUEST SSE event
# 2. Extension shows modal dialog
# 3. Extension POSTs response to /consent/respond
# 4. Server resolves Future, tool proceeds

Why This Works:
- ✅ Callback pattern is flexible (works for both TUI and VSCode)
- ✅ Async-friendly (tools can await consent)
- ✅ Event-based for VSCode (SSE + HTTP endpoint)
- ✅ No changes to existing tools or commands
- ✅ Default behavior: If no callback, auto-approve (backward compatible)

Foundations for Future Phases

Phase 1 establishes critical foundations needed by Phases 2-6:

For Phase 2 (@git context) & Phase 3 (@tree context):
- ✅ No dependencies on Phase 1
- ✅ Context injection system already exists (ContextInjector)
- ✅ Working directory tracking already exists
- Phase 1 provides: Examples of clean tool implementation

For Phase 5 (/agent loop) - CRITICAL DEPENDENCIES:
- ✅ Consent mechanism - Agent will make multiple tool calls
- ✅ "always" mode - Essential for uninterrupted autonomous execution
- ✅ Session-scoped state - Consent persists across iterations
- ✅ Async callback pattern - Agent loop is async
- ✅ Event-based coordination - VSCode agent needs non-blocking consent

Phase 1 Consent Design Explicitly Supports /agent Loop:

# Agent loop scenario (Phase 5)
User: "/agent implement user auth with tests"

# Iteration 1: Create auth.py
→ Consent prompt: "Edit auth.py? (y/n/always/never)"
→ User: "always"  ← Critical for autonomous mode
→ edit_consent_mode = "always"

# Iteration 2: Create test_auth.py
→ NO prompt (always mode)
→ Tool executes immediately

# Iteration 3-5: Fix tests, refactor
→ NO prompts (always mode)
→ Fully autonomous execution

If Phase 1 consent were blocking or not session-scoped:
- ❌ Agent would interrupt on every file edit
- ❌ Autonomous mode would be unusable
- ❌ Would need to redesign in Phase 5

Phase 1 Design Decisions That Enable Phase 5:
1. ✅ Async consent callback (doesn't block event loop)
2. ✅ Session-scoped consent state (persists across tool calls)
3. ✅ "always" mode (enables true autonomy)
4. ✅ Event-based for VSCode (non-blocking UI)
5. ✅ Future-based coordination (tool waits for user response)

New Event Type Required (Phase 1):

# ppxai/engine/types.py
class EventType(Enum):
    # ... existing events ...
    CONSENT_REQUEST = "consent_request"  # NEW - Phase 1

New HTTP Endpoint Required (Phase 1):

# ppxai/server/http.py
@app.post("/consent/respond")
async def respond_consent(request: ConsentResponse):
    """VSCode extension responds to consent request."""
    # Resolve Future that edit tool is awaiting
    return {"ok": True}
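
The Future-based coordination described above (the tool awaits, the endpoint resolves) could look roughly like the following. This is a sketch under assumptions: the ConsentBroker class, its method names, the request id scheme, and the 60-second timeout are all hypothetical and do not come from the ppxai codebase.

```python
import asyncio
import uuid
from typing import Dict


class ConsentBroker:
    """Hypothetical helper that pairs consent requests with UI responses."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def open_request(self) -> str:
        """Register a pending request and return its id (to be sent in the SSE event)."""
        request_id = uuid.uuid4().hex
        self._pending[request_id] = asyncio.get_running_loop().create_future()
        return request_id

    async def wait(self, request_id: str, timeout: float = 60.0) -> str:
        """Await the user's answer; a timeout is treated as a denial ("n")."""
        future = self._pending[request_id]
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return "n"
        finally:
            self._pending.pop(request_id, None)

    def resolve(self, request_id: str, answer: str) -> bool:
        """Called by the /consent/respond handler; returns False for unknown ids."""
        future = self._pending.get(request_id)
        if future is None or future.done():
            return False
        future.set_result(answer)
        return True


async def demo() -> str:
    broker = ConsentBroker()
    request_id = broker.open_request()
    # Simulate the extension POSTing its answer shortly afterwards.
    asyncio.get_running_loop().call_later(0.01, broker.resolve, request_id, "always")
    return await broker.wait(request_id)
```

Treating a timeout as "n" is one possible design choice: it keeps an abandoned dialog from leaving the edit tool suspended indefinitely, at the cost of failing the edit.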

Validation Checklist:
- [ ] Consent callback pattern supports both TUI and VSCode
- [ ] Async design doesn't block /agent loop (Phase 5)
- [ ] Session state persists across multiple tool calls
- [ ] "always" mode truly bypasses all prompts
- [ ] Event-based consent works with SSE streaming
- [ ] Backward compatible (no callback = auto-approve)

Tools to Implement

apply_patch(file_path: str, unified_diff: str)
- Apply standard unified diff patches
- Validate patch format before applying
- Atomic operation with rollback on failure
- Return success/failure with line numbers affected

replace_block(file_path: str, search: str, replace: str)
- Search for exact text block and replace
- Case-sensitive by default
- Fail if search text not found or found multiple times
- Return matched location and new content

insert_text(file_path: str, line_number: int, text: str)
- Insert text at specific line number
- Preserve indentation context
- Support multiple lines
- Return confirmation with line range

delete_lines(file_path: str, start_line: int, end_line: int)
- Delete range of lines (inclusive)
- Validate line numbers exist
- Return deleted content for undo capability
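
The replace_block rule of failing when the search text is missing or ambiguous can be sketched as a pure function over the file's text. This is an illustration only: it works on an in-memory string instead of a file path, and the error messages are assumptions.

```python
def replace_block(text: str, search: str, replace: str) -> tuple[str, int]:
    """Return (new_text, line_number) when `search` matches exactly once.

    The match is case-sensitive. Raises ValueError if `search` is empty,
    not found, or found more than once.
    """
    if not search:
        raise ValueError("search text must not be empty")
    first = text.find(search)
    if first == -1:
        raise ValueError("search text not found")
    # Searching again from first + 1 also catches overlapping second matches.
    if text.find(search, first + 1) != -1:
        raise ValueError("search text is ambiguous: more than one match")
    line_number = text.count("\n", 0, first) + 1  # 1-based line where the match starts
    new_text = text[:first] + replace + text[first + len(search):]
    return new_text, line_number
```

Refusing ambiguous matches pushes the model to supply enough surrounding context to identify one location, which is generally safer than silently editing the first occurrence.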

Implementation Details

File: ppxai/engine/tools/builtin/editor.py

from ppxai.engine.tools.base import BaseTool
from typing import Dict, Any
import difflib
from pathlib import Path

class ApplyPatchTool(BaseTool):
    """Apply unified diff patch to a file."""

    def name(self) -> str:
        return "apply_patch"

    def description(self) -> str:
        return "Apply a unified diff patch to a file"

    def parameters(self) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "file_path": {
                    "type": "string",
                    "description": "Path to file to patch"
                },
                "unified_diff": {
                    "type": "string",
                    "description": "Unified diff format patch"
                }
            },
            "required": ["file_path", "unified_diff"]
        }

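    async def execute(self, file_path: str, unified_diff: str) -> str:
        # Async to match the engine's `async def execute()` tool interface.
        # Implementation with safety checks
        # - Check edit consent for file_path
        # - Validate file exists
        # - Parse diff
        # - Apply atomically
        # - Rollback on failure
        raise NotImplementedError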

# Similar for ReplaceBlockTool, InsertTextTool, DeleteLinesTool
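
The line-based tools can be sketched as pure functions over a list of lines. Several things here are assumptions the plan does not state: insert_text inserts before the given 1-based line number, lines keep their trailing newline, and indentation handling is left out entirely.

```python
from typing import List, Tuple


def insert_text(lines: List[str], line_number: int, text: str) -> Tuple[List[str], Tuple[int, int]]:
    """Insert `text` before 1-based `line_number`; return new lines and the inserted range."""
    if not text:
        raise ValueError("text must not be empty")
    if not 1 <= line_number <= len(lines) + 1:
        raise ValueError(f"line_number {line_number} out of range 1..{len(lines) + 1}")
    new_lines = text.splitlines(keepends=True)
    if not new_lines[-1].endswith("\n"):
        new_lines[-1] += "\n"  # keep the following original line on its own line
    index = line_number - 1
    result = lines[:index] + new_lines + lines[index:]
    return result, (line_number, line_number + len(new_lines) - 1)


def delete_lines(lines: List[str], start_line: int, end_line: int) -> Tuple[List[str], List[str]]:
    """Delete the 1-based inclusive range; return (remaining_lines, deleted_lines)."""
    if not 1 <= start_line <= end_line <= len(lines):
        raise ValueError(f"invalid range {start_line}-{end_line} for {len(lines)} lines")
    deleted = lines[start_line - 1:end_line]
    remaining = lines[:start_line - 1] + lines[end_line:]
    return remaining, deleted
```

Returning the deleted lines makes an undo straightforward: feeding them back into insert_text at start_line restores the original content.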

Manual Testing (Phase 1)

Test 1: Tool Functionality (test each tool individually without /agent)

# Test apply_patch
User: "Create a patch to fix the typo in README.md line 42 and apply it"
→ Consent prompt: "⚠️  AI wants to edit README.md. Allow? (y/n/always/never)"
→ User: "y"
AI: [generates diff, calls apply_patch]
→ Verify: Typo fixed in README.md

# Test replace_block
User: "In auth.py, replace the old login function with this new implementation: [code]"
→ Consent prompt: "⚠️  AI wants to edit auth.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls replace_block with search/replace]
→ Verify: Function replaced correctly

# Test insert_text
User: "Add error handling at line 100 in client.py: [code]"
→ Consent prompt: "⚠️  AI wants to edit client.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls insert_text]
→ Verify: Code inserted at correct line

# Test delete_lines
User: "Delete the deprecated code from lines 50-75 in utils.py"
→ Consent prompt: "⚠️  AI wants to edit utils.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls delete_lines]
→ Verify: Lines deleted correctly

Test 2: Consent Flow

# Test per-file consent (already allowed)
User: "Fix another typo in README.md line 100"
→ NO consent prompt (already allowed in this session)
AI: [calls apply_patch]
→ Verify: No interruption for same file

# Test consent denial
User: "Edit config.json to change port to 8080"
→ Consent prompt: "⚠️  AI wants to edit config.json. Allow? (y/n/always/never)"
→ User: "n"
AI: [receives error from tool]
→ Verify: AI reports edit was denied, file unchanged

# Test "always" mode
User: "Refactor server.py to use async/await"
→ Consent prompt: "⚠️  AI wants to edit server.py. Allow? (y/n/always/never)"
→ User: "always"
AI: [calls replace_block multiple times]
→ Verify: No more prompts for any file

# Test "never" mode (new session)
User: "/clear" (start fresh session)
User: "/tools enable"
User: "Fix bug in database.py"
→ Consent prompt: "⚠️  AI wants to edit database.py. Allow? (y/n/always/never)"
→ User: "never"
AI: [receives error from tool]
→ Verify: All edit attempts denied for entire session

Test 3: VSCode Extension

Test modal dialog appears in VSCode
Test all consent options work
Test consent state persists across multiple edits

Success Criteria:
- ✅ All 4 tools work reliably
- ✅ Proper error messages on failure
- ✅ Atomic operations (no partial edits)
- ✅ Clear confirmation messages
- ✅ Consent prompts appear for first edit of each file
- ✅ Consent state persists within session
- ✅ "y" allows specific file, "always" allows all files, "never" blocks all
- ✅ Denied edits return clear error to AI
- ✅ VSCode modal dialog works correctly

Phase 1 Deliverables Summary

What Gets Built:
1. ✅ 4 file editing tools (apply_patch, replace_block, insert_text, delete_lines)
2. ✅ Per-file session consent mechanism
3. ✅ Session state fields (allowed_files, edit_consent_mode)
4. ✅ Async consent callback pattern
5. ✅ TUI consent prompt integration
6. ✅ VSCode consent dialog integration
7. ✅ New EventType.CONSENT_REQUEST
8. ✅ New HTTP endpoint /consent/respond
9. ✅ Tool registration in builtin system

What This Enables:
- ✅ Immediate Value: Safe autonomous file editing
- ✅ Phase 2/3 Foundation: Example of clean tool implementation
- ✅ Phase 5 Critical: Consent mechanism for /agent loop
- ✅ Backward Compatible: Both TUI and VSCode keep working
- ✅ Future-Proof: Designed for autonomous multi-step workflows

What Doesn't Break:
- ✅ Existing tools (read_file, search_files, etc.) unchanged
- ✅ Existing commands unchanged
- ✅ Existing TUI workflows unchanged
- ✅ Existing VSCode workflows unchanged
- ✅ No configuration changes required

Files Created/Modified:

NEW:     ppxai/engine/tools/builtin/editor.py        (~400 lines)

MODIFIED: ppxai/engine/session.py                    (+20 lines)
MODIFIED: ppxai/engine/types.py                      (+5 lines)
MODIFIED: ppxai/engine/client.py                     (+40 lines)
MODIFIED: ppxai/engine/tools/builtin/__init__.py     (+2 lines)
MODIFIED: ppxai/main.py or commands.py               (+30 lines)
MODIFIED: vscode-extension/src/chatPanel.ts          (+50 lines)
MODIFIED: ppxai/server/http.py                       (+30 lines)

TOTAL: ~577 lines added (all additive, no deletions)

Time Investment: 6-8 hours
Risk Level: Low (additive changes only, well-architected consent)
Validation: Manual testing (Test 1-3 above)
Next Phase Dependency: None (Phases 2-3 can proceed independently)

Phase 2: `@git` Context Provider (2-3 hours)

Goal: Automatically inject git diff context when user references @git

Implementation

File: ppxai/engine/context.py (extend existing ContextInjector)

class ContextInjector:
    # ... existing code ...

    GIT_PATTERN = r'@git\b'

    def inject_git_context(self, working_dir: str) -> Optional[InjectedContext]:
        """
        Inject git diff (staged + unstaged) as context.

        Returns:
            InjectedContext with git diff or None if not in git repo
        """
        import subprocess

        try:
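            # Get unstaged changes (check=True raises CalledProcessError on a non-zero exit,
            # which is what happens outside a git repository)
            unstaged = subprocess.run(
                ['git', 'diff'],
                cwd=working_dir,
                capture_output=True,
                text=True,
                check=True
            )

            # Get staged changes
            staged = subprocess.run(
                ['git', 'diff', '--staged'],
                cwd=working_dir,
                capture_output=True,
                text=True,
                check=True
            )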

            # Combine with headers
            content = ""
            if staged.stdout.strip():
                content += "=== Staged Changes ===\n"
                content += staged.stdout + "\n"

            if unstaged.stdout.strip():
                content += "=== Unstaged Changes ===\n"
                content += unstaged.stdout + "\n"

            if not content:
                return InjectedContext(
                    source="@git",
                    content="No changes in working directory",
                    language="text",
                    truncated=False,
                    size=0
                )

            return InjectedContext(
                source="@git",
                content=content,
                language="diff",
                truncated=len(content) > self.MAX_FILE_SIZE,
                size=len(content)
            )

        except (subprocess.CalledProcessError, FileNotFoundError):
            return None  # Not a git repository, or git / working_dir is missing

Update message enhancement:

def enhance_message(self, message: str, working_dir: Optional[str] = None) -> Tuple[str, List[InjectedContext]]:
    """Enhanced to support @git pattern."""
    # ... existing @file logic ...

    # Check for @git pattern
    if re.search(self.GIT_PATTERN, message):
        git_ctx = self.inject_git_context(working_dir or self.working_dir)
        if git_ctx:
            contexts.append(git_ctx)

    # ... rest of enhancement logic ...
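
How the `@git\b` pattern behaves on typical messages can be checked in isolation. The sketch below uses the pattern from the plan and adds a stricter variant that is purely hypothetical: it requires whitespace or start-of-message before the `@`, so addresses such as me@git.example.com do not trigger injection.

```python
import re

GIT_PATTERN = re.compile(r"@git\b")                # pattern from the plan
STRICT_GIT_PATTERN = re.compile(r"(?<!\S)@git\b")  # hypothetical stricter variant


def mentions_git(message: str, strict: bool = False) -> bool:
    """Return True if the message references the @git context provider."""
    pattern = STRICT_GIT_PATTERN if strict else GIT_PATTERN
    return pattern.search(message) is not None


# The word boundary keeps "@github" from matching, because "t" and "h" are both word characters.
print(mentions_git("Review my changes @git"))
print(mentions_git("see @github for details"))
# The plan's pattern also matches inside an address; the strict variant does not.
print(mentions_git("mail me@git.example.com"))
print(mentions_git("mail me@git.example.com", strict=True))
```

The strict variant has its own cost: a reference wrapped in punctuation, such as "(@git)", would no longer be detected.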

Manual Testing (Phase 2)

# Test basic git context
User: "What changes did I make @git"
AI: [sees diff context, summarizes changes]

# Test with no changes
User: "Review my changes @git"
AI: [sees "No changes", responds appropriately]

# Test code review workflow
User: "Review my authentication changes for security issues @git"
AI: [analyzes diff, provides security review]

# Test combined with file
User: "Compare my changes @git with the original design in DESIGN.md"
AI: [sees both git diff and DESIGN.md content]

Success Criteria:
- @git detected in messages
- Staged and unstaged changes both captured
- Proper formatting in context block
- Graceful handling of non-git directories
- Works alongside @file references

Phase 3: `@tree` Context Provider (2-3 hours)

Goal: Inject visual project structure when user references @tree

Implementation

File: ppxai/engine/context.py (extend existing ContextInjector)

class ContextInjector:
    # ... existing code ...

    TREE_PATTERN = r'@tree\b'

    def inject_tree_context(self, working_dir: str, max_depth: int = 3) -> InjectedContext:
        """
        Inject directory tree structure as context.

        Args:
            working_dir: Root directory to tree
            max_depth: Maximum depth to traverse

        Returns:
            InjectedContext with tree structure
        """
        from pathlib import Path

        def build_tree(path: Path, prefix: str = "", depth: int = 0) -> str:
            """Recursively build tree structure."""
            # >= so that max_depth=3 renders exactly 3 levels
            if depth >= max_depth:
                return ""

            # Respect .gitignore (only the file in this directory is consulted)
            gitignore_patterns = self._load_gitignore(path)

            # Filter first so the "last item" connector is chosen among visible entries;
            # .git is skipped because it is rarely listed in .gitignore
            items = [
                item
                for item in sorted(path.iterdir(), key=lambda x: (not x.is_dir(), x.name))
                if item.name != ".git" and not self._is_ignored(item, gitignore_patterns)
            ]
            output = []

            for i, item in enumerate(items):
                is_last = i == len(items) - 1
                current_prefix = "└── " if is_last else "├── "
                next_prefix = "    " if is_last else "│   "

                if item.is_dir():
                    output.append(f"{prefix}{current_prefix}{item.name}/")
                    output.append(build_tree(item, prefix + next_prefix, depth + 1))
                else:
                    output.append(f"{prefix}{current_prefix}{item.name}")

            return "\n".join(filter(None, output))

        tree = build_tree(Path(working_dir))

        # Add header with stats (directory lines end with "/")
        tree_lines = tree.splitlines()
        total_dirs = sum(1 for line in tree_lines if line.endswith('/'))
        total_files = len(tree_lines) - total_dirs

        content = f"Project: {Path(working_dir).name}\n"
        content += f"Directories: {total_dirs}, Files: {total_files}\n"
        content += f"Max depth: {max_depth}\n\n"
        content += tree

        return InjectedContext(
            source="@tree",
            content=content,
            language="text",
            truncated=False,
            size=len(content)
        )

    def _load_gitignore(self, path: Path) -> List[str]:
        """Load .gitignore patterns."""
        gitignore_file = path / '.gitignore'
        if gitignore_file.exists():
            with open(gitignore_file) as f:
                return [line.strip() for line in f if line.strip() and not line.startswith('#')]
        return []

    def _is_ignored(self, path: Path, patterns: List[str]) -> bool:
        """Check if path matches gitignore patterns."""
        import fnmatch
        name = path.name
        return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)
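
The name-only fnmatch check above misses common .gitignore forms such as directory patterns with a trailing slash (for example node_modules/). A slightly more tolerant matcher might look like the sketch below. It is still a simplification and not real gitignore semantics: negation, `**`, and patterns containing inner slashes are not handled, and the function names are hypothetical.

```python
import fnmatch
from typing import Iterable, List


def parse_gitignore(lines: Iterable[str]) -> List[str]:
    """Keep usable patterns: drop blanks, comments, and negations (unsupported here)."""
    patterns = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        patterns.append(line)
    return patterns


def is_ignored(name: str, is_dir: bool, patterns: Iterable[str]) -> bool:
    """Match a single path component against simplified gitignore patterns."""
    for pattern in patterns:
        dir_only = pattern.endswith("/")  # "build/" applies to directories only
        core = pattern.strip("/")
        if dir_only and not is_dir:
            continue
        if fnmatch.fnmatchcase(name, core):
            return True
    return False
```

If exact gitignore behavior matters, delegating to git itself (for example via `git check-ignore`) is likely more reliable than reimplementing the rules.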

Manual Testing (Phase 3)

# Test basic tree
User: "Show me the project structure @tree"
AI: [sees tree, describes structure]

# Test architectural planning
User: "Where should I add the new caching layer @tree"
AI: [analyzes structure, suggests location]

# Test refactoring guidance
User: "I want to reorganize the API modules. Current structure: @tree"
AI: [sees structure, provides refactoring suggestions]

# Test combined contexts
User: "Review my changes @git in the context of the project structure @tree"
AI: [sees both, provides comprehensive review]

# Test depth control (future enhancement)
User: "Show a shallow tree @tree depth=1"
AI: [shows only top-level structure]

Success Criteria:
- @tree detected in messages
- Directory structure displayed correctly
- .gitignore patterns respected
- Reasonable depth limit (3 levels)
- File/directory counts included
- Works with @git and @file

Phase 4: Manual Testing & Refinement

Goal: Dogfood all new features to validate design and find issues

Comprehensive Test Scenarios

Scenario 1: File Editing Workflow

1. User: "Read test_commands.py and find the failing test"
   AI: [uses read_file tool]

2. User: "The test at line 185 is checking the wrong value. Fix it using replace_block"
   AI: [uses editor.replace_block]

3. User: "Run pytest to verify the fix"
   AI: [uses execute_shell_command]

4. User: "If it still fails, apply this patch: [diff]"
   AI: [uses editor.apply_patch if needed]

Scenario 2: Code Review Workflow

1. User: "What changes are in my working directory @git"
   AI: [sees staged + unstaged diffs, summarizes]

2. User: "Are there any bugs or security issues in these changes @git"
   AI: [analyzes diff, provides review]

3. User: "Fix the SQL injection vulnerability you found"
   AI: [uses editor tool to fix]

4. User: "Show me the updated diff @git"
   AI: [shows new diff with fix applied]

Scenario 3: Architecture Planning

1. User: "Show me the current structure @tree"
   AI: [displays tree]

2. User: "I need to add caching. Where should the cache module go @tree"
   AI: [analyzes structure, suggests location]

3. User: "Create cache.py in that location with basic structure"
   AI: [uses editor.insert_text or creates file]

4. User: "Show updated structure @tree"
   AI: [displays tree with new file]

Scenario 4: Combined Workflow (Ultimate Test)

1. User: "Review my authentication changes @git against the project structure @tree"
   AI: [uses both contexts for comprehensive analysis]

2. User: "You suggested moving auth.py to src/auth/. Do that and update imports"
   AI: [uses multiple editor tools for refactoring]

3. User: "Verify no broken imports @tree"
   AI: [checks structure, may run tests]

Based on testing, refine:

[ ] Truncation Limits: Adjust MAX_FILE_SIZE if needed
[ ] Error Messages: Ensure clear, actionable errors
[ ] Safety Confirmations: Add warnings for destructive operations
[ ] Context Formatting: Polish how diffs/trees display
[ ] Performance: Optimize tree generation for large projects
[ ] Edge Cases: Handle empty repos, binary files, symlinks
[ ] User Feedback: Add progress indicators for slow operations

Issues to Watch For

File Editing:
- Encoding issues (UTF-8 vs ASCII)
- Line ending differences (CRLF vs LF)
- Indentation preservation
- Partial edit failures

Git Context:
- Large diffs truncation strategy
- Binary file diffs
- Merge conflicts in diff
- Submodule handling

Tree Context:
- Symlink loops
- Hidden files strategy
- Large directory structures
- .gitignore edge cases
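
For the line-ending item under File Editing, one possible approach is to detect the file's dominant newline before editing and re-apply it to inserted text. The sketch below assumes the file was read with newline="" so CRLF sequences are still visible; both function names are hypothetical.

```python
def detect_newline(text: str, default: str = "\n") -> str:
    """Return the dominant line ending in text (CRLF or LF), or default if there is none."""
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf  # bare LF endings only
    if crlf == 0 and lf == 0:
        return default
    return "\r\n" if crlf > lf else "\n"


def normalize_newlines(text: str, newline: str) -> str:
    """Rewrite every line ending in text to the given newline."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified if newline == "\n" else unified.replace("\n", newline)
```

An edit tool could call detect_newline on the original content and normalize_newlines on the model-supplied text, so that a CRLF file does not end up with mixed endings after an insert.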

Phase 5: `/agent` Loop Implementation (6-8 hours)

Goal: Build autonomous execution loop using the proven tools

Now that we have confidence in the tools, implement the agent loop:

Command Handler

File: ppxai/commands.py (add new handler)

def handle_agent(self, args: str):
    """Handle /agent command for autonomous task execution."""
    if not args.strip():
        console.print("[red]Usage: /agent <task description>[/red]\n")
        return

    task = args.strip()
    max_iterations = 5  # Configurable

    console.print(f"[cyan]🤖 Starting autonomous task:[/cyan] {task}\n")
    console.print(f"[dim]Max iterations: {max_iterations}[/dim]\n")

    iteration = 0
    task_complete = False

    while iteration < max_iterations and not task_complete:
        iteration += 1
        console.print(f"[yellow]--- Iteration {iteration}/{max_iterations} ---[/yellow]\n")

        # Construct prompt for iteration
        if iteration == 1:
            prompt = f"""Task: {task}

Please work on this task autonomously. You have access to tools for:
- File editing (apply_patch, replace_block, insert_text, delete_lines)
- File reading and searching
- Shell commands
- Git context (@git)
- Project structure (@tree)

After each action, assess if the task is complete. If complete, respond with:
TASK_COMPLETE: <summary of what was done>

If you need to continue, explain what you're doing and call the appropriate tools."""
        else:
            prompt = f"""Continue working on the task: {task}

Previous iteration completed. Assess the current state and continue if needed.
If the task is complete, respond with:
TASK_COMPLETE: <summary>"""

        # Send to AI
        try:
            response = self.client.chat(prompt)

            # Check for completion signal
            if "TASK_COMPLETE:" in response:
                task_complete = True
                # maxsplit=1 keeps the whole summary even if the marker appears again later
                summary = response.split("TASK_COMPLETE:", 1)[1].strip()
                console.print(f"\n[green]✅ Task completed:[/green] {summary}\n")

        except KeyboardInterrupt:
            console.print("\n[yellow]Agent loop interrupted by user[/yellow]\n")
            break
        except Exception as e:
            console.print(f"[red]Error in iteration {iteration}: {e}[/red]\n")
            break

    if not task_complete and iteration >= max_iterations:
        console.print(f"[yellow]⚠️  Task incomplete after {max_iterations} iterations[/yellow]\n")

Engine Support¶

File: ppxai/engine/client.py (add agent mode)

def chat_agent(self, task: str, max_iterations: int = 5) -> Generator[Event, None, None]:
    """
    Autonomous agent mode - AI loops until task complete or max iterations.

    Args:
        task: High-level task description
        max_iterations: Max number of AI turns

    Yields:
        Events for each iteration
    """
    for iteration in range(1, max_iterations + 1):
        # Emit iteration start event
        yield Event(
            type=EventType.AGENT_ITERATION,
            data={"iteration": iteration, "max": max_iterations}
        )

        # Construct iteration prompt
        prompt = self._build_agent_prompt(task, iteration)

        # Stream chat response
        response_text = ""
        for event in self.chat(prompt):
            yield event
            if event.type == EventType.STREAM_CHUNK:
                response_text += event.data

        # Check for completion
        if self._task_complete(response_text):
            yield Event(
                type=EventType.AGENT_COMPLETE,
                data={"iterations": iteration, "summary": self._extract_summary(response_text)}
            )
            break
    else:
        # Max iterations reached
        yield Event(
            type=EventType.AGENT_MAX_ITERATIONS,
            data={"iterations": max_iterations}
        )

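def _build_agent_prompt(self, task: str, iteration: int) -> str:
    """Build prompt for agent iteration."""
    # Sketch only: mirrors the prompts used by handle_agent in ppxai/commands.py.
    if iteration == 1:
        return (
            f"Task: {task}\n\n"
            "Please work on this task autonomously. After each action, assess "
            "if the task is complete. If complete, respond with:\n"
            "TASK_COMPLETE: <summary of what was done>"
        )
    return (
        f"Continue working on the task: {task}\n\n"
        "Previous iteration completed. Assess the current state and continue if needed.\n"
        "If the task is complete, respond with:\n"
        "TASK_COMPLETE: <summary>"
    )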

def _task_complete(self, response: str) -> bool:
    """Check if AI signaled task completion."""
    return "TASK_COMPLETE:" in response

def _extract_summary(self, response: str) -> str:
    """Extract completion summary from response."""
    if "TASK_COMPLETE:" in response:
        # maxsplit=1 keeps everything after the first marker
        return response.split("TASK_COMPLETE:", 1)[1].strip()
    return ""
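
The substring check above also fires when the model merely mentions the marker mid-sentence, for example while restating its instructions. A stricter variant, sketched here as an assumption and not as the shipped behaviour, accepts `TASK_COMPLETE:` only at the start of a line; `parse_completion` is a hypothetical helper name:

```python
import re
from typing import Optional

# The marker must begin a line (leading spaces or tabs allowed);
# the summary runs from the marker to the end of the response.
_COMPLETE_RE = re.compile(
    r"^[ \t]*TASK_COMPLETE:[ \t]*(.*)", re.MULTILINE | re.DOTALL
)


def parse_completion(response: str) -> Optional[str]:
    """Return the completion summary, or None if completion was not signalled."""
    match = _COMPLETE_RE.search(response)
    if match is None:
        return None
    return match.group(1).strip()
```

Callers should compare the result against `None` instead of relying on truthiness, because a bare marker yields an empty summary string.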

Safety Features¶

Max Iterations: Prevent infinite loops
Interrupt Support: Ctrl-C breaks agent loop
Confirmation Prompts: For destructive operations
Rollback Capability: Track edits for undo
Progress Logging: Clear visibility into agent actions
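
The rollback item could be backed by a small in-memory journal that snapshots each file before its first edit in an agent run. This is a minimal sketch under that assumption; `EditJournal` and its methods are hypothetical names, not an existing ppxai API:

```python
from pathlib import Path
from typing import Dict, Optional


class EditJournal:
    """Remember each file's original bytes so an agent run can be undone."""

    def __init__(self) -> None:
        # path -> original content, or None if the file did not exist before the run
        self._snapshots: Dict[Path, Optional[bytes]] = {}

    def record(self, path: Path) -> None:
        """Snapshot path before editing; only the first call per file is kept."""
        path = path.resolve()
        if path in self._snapshots:
            return
        self._snapshots[path] = path.read_bytes() if path.exists() else None

    def rollback(self) -> int:
        """Restore every recorded file and return how many were restored."""
        restored = 0
        for path, original in self._snapshots.items():
            if original is None:
                path.unlink(missing_ok=True)  # file was created during the run
            else:
                path.write_bytes(original)
            restored += 1
        self._snapshots.clear()
        return restored
```

Working on bytes keeps encodings and line endings untouched on restore. An editing tool would call `record(path)` immediately before writing, and the agent loop would call `rollback()` if the user asks to undo the run.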

Manual Testing (Phase 5)¶

# Test simple task
/agent Fix the failing test in test_commands.py

# Test multi-step task
/agent Review my changes @git and fix any issues you find

# Test planning task
/agent Reorganize the auth module based on @tree structure

# Test complex task
/agent Implement caching for the API, add tests, and update docs

# Test interrupt
/agent <long task>
[Press Ctrl-C during execution]

# Test max iterations
/agent <task that takes 10+ steps>
[Should stop at 5 iterations with clear message]

Success Criteria:
Agent completes simple tasks autonomously
Proper iteration tracking
Clean interrupt handling
Clear completion/failure messaging
Doesn't exceed max iterations

Phase 6: Testing & Documentation (4-5 hours)¶

Goal: Comprehensive testing and documentation for v1.11.0

Unit Tests¶

File: tests/test_editor_tools.py

def test_apply_patch():
    """Test unified diff patch application."""
    # Create temp file
    # Apply valid patch
    # Verify result
    # Test invalid patch
    # Test non-existent file

def test_replace_block():
    """Test block replacement."""
    # Test exact match
    # Test no match
    # Test multiple matches (should fail)
    # Test case sensitivity

def test_insert_text():
    """Test text insertion."""
    # Test at line number
    # Test out of bounds
    # Test indentation preservation

def test_delete_lines():
    """Test line deletion."""
    # Test valid range
    # Test invalid range
    # Test beyond file end

File: tests/test_context_providers.py

def test_git_context_injection():
    """Test @git context provider."""
    # Create temp git repo
    # Make changes
    # Test @git detection
    # Verify diff in context
    # Test no changes
    # Test non-git directory

def test_tree_context_injection():
    """Test @tree context provider."""
    # Create temp directory structure
    # Test @tree detection
    # Verify tree in context
    # Test depth limiting
    # Test .gitignore respect

def test_combined_contexts():
    """Test multiple context providers together."""
    # Test @git + @tree
    # Test @file + @git
    # Test all three

File: tests/test_agent_loop.py

def test_agent_basic_task():
    """Test simple agent task completion."""
    # Mock AI that completes in 1 iteration
    # Verify completion event
    # Verify summary extracted

def test_agent_max_iterations():
    """Test max iteration limit."""
    # Mock AI that never completes
    # Verify stops at max iterations
    # Verify appropriate event

def test_agent_interrupt():
    """Test agent interrupt handling."""
    # Start agent task
    # Interrupt during iteration
    # Verify clean shutdown
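
The agent-loop stubs above could be filled in without a live model by scripting the responses. The sketch below is an assumption about test shape, not the project's real test code: `run_agent` is a simplified stand-in for `chat_agent` that yields plain tuples instead of `Event` objects, and `scripted_chat` is a hypothetical fake for the AI client.

```python
from typing import Callable, Iterator, List, Tuple


def run_agent(
    chat: Callable[[str], str], task: str, max_iterations: int = 5
) -> Iterator[Tuple[str, int]]:
    """Simplified stand-in for chat_agent: yields (event_name, iteration)."""
    for iteration in range(1, max_iterations + 1):
        yield ("iteration", iteration)
        response = chat(f"Task: {task} (iteration {iteration})")
        if "TASK_COMPLETE:" in response:
            yield ("complete", iteration)
            break
    else:
        # Runs only when the loop finished without hitting break.
        yield ("max_iterations", max_iterations)


def scripted_chat(responses: List[str]) -> Callable[[str], str]:
    """Fake client that replays canned responses, repeating the last one."""
    calls = iter(responses)
    last = responses[-1]

    def chat(prompt: str) -> str:
        return next(calls, last)

    return chat


def test_agent_basic_task():
    events = list(run_agent(scripted_chat(["TASK_COMPLETE: done"]), "fix bug"))
    assert events == [("iteration", 1), ("complete", 1)]


def test_agent_max_iterations():
    chat = scripted_chat(["still working"])
    events = list(run_agent(chat, "fix bug", max_iterations=3))
    assert events[-1] == ("max_iterations", 3)
    assert [e[1] for e in events if e[0] == "iteration"] == [1, 2, 3]
```

The same fake can raise `KeyboardInterrupt` from inside `chat` to exercise the interrupt path.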

Integration Tests¶

End-to-end workflows:

def test_code_review_workflow():
    """Test full code review workflow with @git."""
    # Setup git repo with changes
    # Ask for review with @git
    # Verify diff in context
    # Verify AI response references changes

def test_refactoring_workflow():
    """Test refactoring with @tree."""
    # Setup project structure
    # Ask for refactoring suggestions with @tree
    # Verify tree in context
    # Verify AI considers structure

def test_autonomous_debugging():
    """Test agent finding and fixing a bug."""
    # Create file with known bug
    # /agent fix the bug
    # Verify bug found
    # Verify fix applied
    # Verify tests pass

Documentation Updates¶

File: CLAUDE.md

## Current Version: v1.11.0 (Agentic Workflow)

**What's New in v1.11.0:**
- **Autonomous Execution**: New `/agent` command for multi-step task execution
- **Native File Editing**: Safe, atomic file editing tools (apply_patch, replace_block, insert_text, delete_lines)
- **Git Context**: `@git` automatically injects diffs for code review workflows
- **Project Structure**: `@tree` provides architectural awareness
- **Combined Workflows**: Mix @git, @tree, and @file for comprehensive context

**Examples:**
```bash
/agent Fix all failing tests
/agent Review my changes @git
/agent Refactor the auth module based on @tree
```

File: README.md

Update the "What's New" section and add examples:

### v1.11.0 - Agentic Workflow

Transform ppxai into an autonomous developer agent:

- `/agent <task>` - Autonomous multi-step task execution
- `@git` - Automatic diff injection for code review
- `@tree` - Project structure awareness
- Native file editing tools (apply_patch, replace_block, insert_text, delete_lines)

**Example Workflows:**
```bash
# Autonomous debugging
/agent Fix the failing test in test_auth.py

# Code review with context
Review my authentication changes @git

# Architecture planning
Where should I add caching given this structure @tree

# Combined contexts
/agent Refactor the API based on @tree and review changes @git
```

File: docs/AGENTIC_WORKFLOW.md (New)

Create a comprehensive guide:

# Agentic Workflow Guide

## Overview

The `/agent` command enables ppxai to work autonomously on multi-step tasks...

## File Editing Tools

### apply_patch
### replace_block
### insert_text
### delete_lines

## Context Providers

### @git - Git Diff Context
### @tree - Project Structure
### @file - File Content (existing)

## Agent Loop

### How It Works
### Best Practices
### Limitations
### Troubleshooting

## Example Workflows

### Debugging
### Code Review
### Refactoring
### Feature Implementation

File: vscode-extension/README.md

Add section on agentic features and context providers.

Timeline Summary¶

Phase	Component	Hours	Dependencies
1	File editing tools + consent	6-8	None
2	@git context	2-3	None
3	@tree context	2-3	None
4	Manual testing	3-4	Phases 1-3
5	/agent loop	6-8	Phases 1-4
6	Testing & docs	4-5	All phases
Total		23-31 hours

Success Criteria¶

Phase 1 Complete When:¶

[ ] All 4 file editing tools implemented
[ ] Per-file session consent mechanism working
[ ] Each tool tested manually
[ ] Consent flow tested (y/n/always/never)
[ ] Atomic operations confirmed
[ ] Error handling validated
[ ] VSCode modal dialog working

Phase 2 Complete When:¶

[ ] @git pattern detected correctly
[ ] Staged and unstaged diffs captured
[ ] Works in git and non-git directories
[ ] Formatted appropriately in context

Phase 3 Complete When:¶

[ ] @tree pattern detected correctly
[ ] Directory structure displayed
[ ] .gitignore patterns respected
[ ] Reasonable depth limits enforced

Phase 4 Complete When:¶

[ ] All tools tested in real workflows
[ ] Edge cases identified and handled
[ ] Context formatting refined
[ ] Performance acceptable

Phase 5 Complete When:¶

[ ] /agent command implemented
[ ] Multi-iteration tasks work
[ ] Proper termination conditions
[ ] Interrupt handling works

Phase 6 Complete When:¶

[ ] All unit tests passing
[ ] Integration tests passing
[ ] Documentation complete
[ ] Examples validated

Benefits of Incremental Approach¶

Lower Risk: Test each component in isolation
Faster Feedback: Validate design decisions early
Better Debugging: Isolate issues to specific components
Incremental Value: Each phase delivers usable features
Confidence Building: Know tools work before building automation
User Input: Can incorporate feedback before /agent implementation

Next Steps¶

Create feature branch: feature/agentic-workflow-v1.11.0
Start with Phase 1: Implement file editing tools
Test manually after each phase
Proceed to next phase only when previous is validated
Build /agent loop last with full confidence

Notes¶

All code must maintain backward compatibility
Security review required for file editing operations
Performance testing needed for large projects
Documentation must include real-world examples
Tests must cover edge cases and error conditions

Last Updated: 2025-12-20 Next Review: After Phase 1 completion
