v1.11.x: Agentic Workflow Implementation Plan¶
Created: 2025-12-20 Updated: 2025-12-26 Status: Phases 1-4 Complete, Phase 5 In Progress
Phase Status¶
| Phase | Description | Status | Version |
|---|---|---|---|
| 1 | File editing tools + consent | ✅ Complete | v1.11.0 |
| 2 | @git context provider | ✅ Complete | v1.11.4 |
| 3 | @tree context provider | ✅ Complete | v1.11.4 |
| 4 | Manual testing & refinement | ✅ Complete | v1.11.5-v1.11.7 |
| 5 | /agent loop implementation | 🔧 In Progress | v1.11.8 |
| 6 | Testing & documentation | ⏳ Pending | v1.11.8 |
Note: v1.11.7 removed all legacy code (AIClient, PerplexityClientPromptTools, tool_manager). EngineClient is now the only client interface.
Branch: feature/adding-agent-loop (Phase 5 work)
User Guide: AGENT_MODE_GUIDE.md - Practical examples for research and development workflows
Workflow Diagrams¶
Current Turn-Based Flow (Pre-v1.11.8)¶

The current workflow is turn-based:

1. User sends a message
2. AI processes and optionally calls tools
3. AI returns a single response
4. User must manually direct the next action
Autonomous Agent Flow (v1.11.8+)¶

The /agent command enables autonomous execution:
1. User issues /agent <task> command
2. AI enters an autonomous loop (max 5 iterations)
3. Each iteration: Plan → Execute tools → Check completion
4. Loop continues until TASK_COMPLETE: signal or max iterations
5. User can interrupt with Ctrl-C at any time
Overview¶
This release transforms ppxai from a turn-based chatbot into an autonomous developer agent capable of multi-step task execution. The implementation follows an incremental approach, building and testing individual components before combining them into the autonomous /agent loop.
System Semantics & Coherence¶
Current Command/Tool Landscape (v1.10.8)¶
1. Read-Only Tools (always available when tools enabled):
- search_files - Find files with glob patterns
- read_file - Read file contents
- list_directory - List directory contents
- Purpose: Discovery and inspection of codebase
2. Code Generation Commands (TUI slash commands, VSCode commands):
- /generate, /test, /docs, /implement, /debug, /explain, /convert
- Output: Generated code to chat (markdown)
- User Action: Manual copy/paste to apply changes
- Mode: Consultative - AI suggests, user applies
3. Utility Shell Tool (risky for code editing):
- execute_shell_command - Run any shell command
- Risk: Escaping issues, no atomicity, no validation
- Use Case: Tests, builds, git commands (NOT file editing)
Semantic Enhancement (v1.11.0)¶
4. NEW: File Editing Tools (Phase 1 - enables autonomous mode):
- apply_patch, replace_block, insert_text, delete_lines
- Output: Direct file modification with atomic operations
- Safety: Validation, dry-run, rollback capability
- Mode: Agentic - AI applies changes autonomously
5. NEW: Context Providers (Phases 2-3):
- @git - Auto-inject git diff for code review
- @tree - Auto-inject project structure
- Purpose: Automatic context awareness
6. NEW: Agent Loop (Phase 5):
- /agent <task> - Autonomous multi-step execution
- Flow: Plan → Execute tools → Verify → Repeat (max 5 iterations)
- Uses: All tools above in combination
Coherent User Workflows¶
Workflow A: Consultative (Tools Disabled)

```
User: "/test utils.py"
AI: [Generates test code in chat]
User: [Copies code to test_utils.py manually]
```

Workflow B: Semi-Autonomous (Tools Enabled, Manual Direction)

```
User: "Fix the bug in auth.py line 42"
AI: [Uses read_file → Analyzes → Uses replace_block to fix]
User: [Sees confirmation, change applied]
```

Workflow C: Fully Autonomous (Tools + /agent)

```
User: "/agent implement user authentication with tests"
AI: [Loop 1] Plan: Create auth.py, write tests, verify
AI: [Loop 2] Executes: insert_text to create files
AI: [Loop 3] Executes: execute_shell_command to run tests
AI: [Loop 4] Tests fail → Uses replace_block to fix
AI: [Loop 5] Tests pass → Returns success
```
No Conflicts, Only Enhancement¶
| Feature | Before v1.11.0 | After v1.11.0 |
|---|---|---|
| Code generation | ✅ Output to chat | ✅ Output to chat (unchanged) |
| File reading | ✅ Via tools | ✅ Via tools (unchanged) |
| File editing | ⚠️ Manual or risky shell | ✅ Safe atomic tools |
| Context awareness | Manual @file | ✅ Manual @file + Auto @git/@tree |
| Multi-step tasks | Manual direction | ✅ Autonomous /agent loop |
Conclusion: Phase 1 tools enable a new autonomous mode while preserving the existing consultative mode. Users choose their workflow based on trust level and task complexity.
Architecture Goals¶
Current State (v1.10.8):

- Turn-based interaction: User → AI → User
- Tools available but require explicit user direction
- Manual context injection via @file references
Target State (v1.11.0):
- Autonomous multi-step execution with /agent command
- Safe file editing without shell escaping risks
- Automatic context injection via @git and @tree
- AI can plan, execute, and verify tasks independently
Implementation Phases¶
Phase 1: Native File Editing Tools (6-8 hours)¶
Goal: Create safe, atomic file editing tools with per-file session consent in ppxai/engine/tools/builtin/editor.py
Context: Existing Tools & Commands¶
Existing Tools (Read-Only):
- search_files(pattern, directory) - Find files with glob patterns
- read_file(filepath, max_lines) - Read file contents
- list_directory(path, format) - List directory contents
- execute_shell_command(command, working_dir) - Run shell commands (⚠️ unsafe for file editing)
Existing Code Generation Slash Commands (Output to Chat):
- /generate <desc> - Generate code from description
- /test <file> - Generate unit tests for file
- /docs <file> - Generate documentation for file
- /implement <spec> - Implement feature from specification
- /debug <error> - Analyze and fix errors
- /explain <file> - Explain code logic
- /convert <lang1> <lang2> <file> - Convert code between languages
- /show <file> - Display file content (read-only)
Semantic Consistency: The current system has a clear pattern:

- Manual Mode: Slash commands generate code → User copies/applies manually
- Autonomous Mode (Phase 1 enables): AI uses tools → Direct file modification
This creates two coherent workflows:

1. Consultative (tools disabled): AI suggests changes, user applies
2. Agentic (tools enabled): AI applies changes autonomously
Phase 1 tools complement existing commands by enabling automation of previously manual workflows. No conflicts or redundancy.
Why Not Use execute_shell_command?

- ❌ Shell escaping issues with complex code
- ❌ No atomic operations or rollback
- ❌ No dry-run validation
- ❌ Risk of partial writes on failure
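The atomicity concern above is addressable with the standard write-to-temp-then-rename pattern. A minimal sketch of what the edit tools could build on (the `atomic_write` helper is illustrative, not existing code):

```python
import os
import tempfile
from pathlib import Path

def atomic_write(file_path: str, content: str) -> None:
    """Write content to file_path atomically: either the full new
    content lands, or the original file is left untouched."""
    target = Path(file_path)
    # Create the temp file in the same directory so os.replace is a
    # same-filesystem rename (atomic on POSIX and Windows).
    fd, tmp_path = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
        os.replace(tmp_path, target)  # atomic swap
    except BaseException:
        os.unlink(tmp_path)  # clean up on any failure, then re-raise
        raise
```

None of this is possible through `execute_shell_command`, where a failed redirect can leave a half-written file behind.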
Safety Mechanism: Per-File Session Consent¶
Design Decision: Use per-file consent instead of tool-level configuration for better UX and safety.
How It Works:
```
User: "/tools enable" → All tools available (including edit tools)

[AI attempts to edit file_1.py]
→ Prompt: "⚠️ AI wants to edit file_1.py. Allow? (y/n/always/never)"
→ User: "y"
→ file_1.py added to session's allowed files
→ Edit proceeds

[AI attempts to edit file_1.py again]
→ No prompt (already consented)
→ Edit proceeds

[AI attempts to edit file_2.py]
→ Prompt: "⚠️ AI wants to edit file_2.py. Allow? (y/n/always/never)"
→ User: "always"
→ All future edits allowed (no more prompts this session)
→ Edit proceeds
```
Consent Options:
- y (yes) - Allow editing this specific file (this session)
- n (no) - Deny this edit (tool returns error to AI)
- always - Allow all file edits without prompting (autonomous mode)
- never - Deny all file edits (disables edit tools for session)
Implementation:
```python
# Session state (ppxai/engine/session.py)
class Session:
    def __init__(self):
        self.allowed_files: Set[Path] = set()
        self.edit_consent_mode: str = "ask"  # "ask", "always", "never"

# Edit tools base helper (ppxai/engine/tools/builtin/editor.py)
def _check_edit_consent(self, file_path: str) -> bool:
    """Check if user consents to editing this file.

    Returns:
        True if edit is allowed, False otherwise
    """
    from pathlib import Path

    path = Path(file_path).resolve()

    # Check global consent mode
    if self.session.edit_consent_mode == "always":
        return True
    if self.session.edit_consent_mode == "never":
        return False

    # Check if already consented for this file
    if path in self.session.allowed_files:
        return True

    # Prompt user (TUI or VSCode dialog)
    response = self._prompt_edit_consent(path)
    if response == "y":
        self.session.allowed_files.add(path)
        return True
    elif response == "always":
        self.session.edit_consent_mode = "always"
        return True
    elif response == "never":
        self.session.edit_consent_mode = "never"
        return False
    else:  # "n"
        return False

def _prompt_edit_consent(self, file_path: Path) -> str:
    """Prompt user for edit consent (TUI or VSCode)."""
    # TUI: Use prompt_toolkit input
    # VSCode: Use vscode.window.showWarningMessage with buttons
    pass
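One way to keep the TUI side of `_prompt_edit_consent` testable is to separate raw input collection from response normalization. A sketch under that assumption (the `normalize_consent` helper and its aliases are illustrative, not existing code):

```python
VALID_RESPONSES = {"y", "n", "always", "never"}

def normalize_consent(raw: str) -> str:
    """Map free-form user input to one of the four consent options.
    Unrecognized input falls back to "n" (deny), the safe default."""
    response = raw.strip().lower()
    # Accept a few common spellings (illustrative alias table)
    aliases = {"yes": "y", "no": "n", "a": "always"}
    response = aliases.get(response, response)
    return response if response in VALID_RESPONSES else "n"
```

The prompt loop then only has to call `normalize_consent(input(...))`, and a typo never silently grants edit permission.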
VSCode Extension Handling:
```typescript
// Show modal dialog in VSCode
const response = await vscode.window.showWarningMessage(
    `AI wants to edit ${filePath}. Allow this change?`,
    { modal: true },
    'Yes',
    'Always (this session)',
    'No',
    'Never (this session)'
);
```
Benefits:
- ✅ Granular safety - Per-file consent instead of all-or-nothing
- ✅ Transparent - User sees exactly which files AI wants to edit
- ✅ Flexible - "always" option enables fully autonomous mode
- ✅ Session-scoped - Fresh consent each session for safety
- ✅ No upfront config - Works immediately with /tools enable
- ✅ Compatible with /agent - Only interrupts first time per file
Trade-offs:

- ⚠️ Interrupts /agent - First edit per file pauses for consent
- Mitigated by: "always" option for uninterrupted autonomous workflow
Additional Time: +2 hours for consent mechanism implementation
Architectural Impact & Backward Compatibility¶
Analysis: Phase 1 integrates with existing architecture with zero breaking changes.
Current Architecture (v1.10.8):
- ✅ Event-based communication (EventType enum)
- ✅ Async tool execution (async def execute())
- ✅ UI-agnostic engine (EngineClient)
- ✅ TUI uses prompt_toolkit for input
- ✅ VSCode uses HTTP + SSE for communication
- ✅ Existing tools: read_file, search_files, list_directory, execute_shell_command
Changes Required (all additive, no modifications):
| Component | File | Change | Lines | Breaking? |
|---|---|---|---|---|
| Session state | ppxai/engine/session.py | Add consent fields | ~20 | ❌ No |
| Event types | ppxai/engine/types.py | Add CONSENT_REQUEST | ~5 | ❌ No |
| Engine client | ppxai/engine/client.py | Add consent method | ~40 | ❌ No |
| Edit tools | ppxai/engine/tools/builtin/editor.py | NEW FILE | ~400 | ❌ No |
| Tool registration | ppxai/engine/tools/builtin/__init__.py | Import editor | ~2 | ❌ No |
| TUI | ppxai/main.py or commands.py | Add consent callback | ~30 | ❌ No |
| VSCode | vscode-extension/src/chatPanel.ts | Handle consent events | ~50 | ❌ No |
| HTTP server | ppxai/server/http.py | Add consent endpoint | ~30 | ❌ No |
| TOTAL | | All additive | ~577 | ❌ No |
Consent Implementation Pattern:
```python
# Engine provides consent callback (both sync TUI and async VSCode work)
class EngineClient:
    def __init__(self, consent_callback: Optional[Callable] = None):
        self.consent_callback = consent_callback  # UI provides this

# Tools use the callback
class EditTool(BaseTool):
    async def execute(self, file_path: str, ...):
        # Request consent (awaits response from UI)
        if not await self._check_consent(file_path):
            return "Error: Edit permission denied by user"
        # Proceed...

# TUI: Synchronous prompt
async def tui_consent(file_path: str) -> tuple[bool, str]:
    response = prompt(f"⚠️ Edit {file_path}? (y/n/always/never): ")
    return (response in ['y', 'always'], response)

# VSCode: Event-based via HTTP
# 1. Server emits CONSENT_REQUEST SSE event
# 2. Extension shows modal dialog
# 3. Extension POSTs response to /consent/respond
# 4. Server resolves Future, tool proceeds
```
Why This Works:

- ✅ Callback pattern is flexible (works for both TUI and VSCode)
- ✅ Async-friendly (tools can await consent)
- ✅ Event-based for VSCode (SSE + HTTP endpoint)
- ✅ No changes to existing tools or commands
- ✅ Default behavior: If no callback, auto-approve (backward compatible)
Foundations for Future Phases¶
Phase 1 establishes critical foundations needed by Phases 2-6:
For Phase 2 (@git context) & Phase 3 (@tree context):

- ✅ No dependencies on Phase 1
- ✅ Context injection system already exists (ContextInjector)
- ✅ Working directory tracking already exists
- Phase 1 provides: Examples of clean tool implementation
For Phase 5 (/agent loop) - CRITICAL DEPENDENCIES:

- ✅ Consent mechanism - Agent will make multiple tool calls
- ✅ "always" mode - Essential for uninterrupted autonomous execution
- ✅ Session-scoped state - Consent persists across iterations
- ✅ Async callback pattern - Agent loop is async
- ✅ Event-based coordination - VSCode agent needs non-blocking consent
Phase 1 Consent Design Explicitly Supports /agent Loop:
```
# Agent loop scenario (Phase 5)
User: "/agent implement user auth with tests"

# Iteration 1: Create auth.py
→ Consent prompt: "Edit auth.py? (y/n/always/never)"
→ User: "always"  ← Critical for autonomous mode
→ edit_consent_mode = "always"

# Iteration 2: Create test_auth.py
→ NO prompt (always mode)
→ Tool executes immediately

# Iterations 3-5: Fix tests, refactor
→ NO prompts (always mode)
→ Fully autonomous execution
```
If Phase 1 consent were blocking or not session-scoped:

- ❌ Agent would interrupt on every file edit
- ❌ Autonomous mode would be unusable
- ❌ Would need to redesign in Phase 5
Phase 1 Design Decisions That Enable Phase 5:

1. ✅ Async consent callback (doesn't block event loop)
2. ✅ Session-scoped consent state (persists across tool calls)
3. ✅ "always" mode (enables true autonomy)
4. ✅ Event-based for VSCode (non-blocking UI)
5. ✅ Future-based coordination (tool waits for user response)
New Event Type Required (Phase 1):
```python
# ppxai/engine/types.py
class EventType(Enum):
    # ... existing events ...
    CONSENT_REQUEST = "consent_request"  # NEW - Phase 1
```
New HTTP Endpoint Required (Phase 1):
```python
# ppxai/server/http.py
@app.post("/consent/respond")
async def respond_consent(request: ConsentResponse):
    """VSCode extension responds to consent request."""
    # Resolve Future that edit tool is awaiting
    return {"ok": True}
```
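The Future handshake between the edit tool and the endpoint can be sketched as follows. This is a minimal illustration of the coordination pattern, not the actual server code; the `pending_consents` registry and function names are assumptions:

```python
import asyncio
import uuid

# request_id -> Future the edit tool is awaiting (illustrative registry)
pending_consents: dict[str, asyncio.Future] = {}

async def request_consent(file_path: str) -> str:
    """Called by an edit tool: register a Future, emit the SSE event,
    and suspend until the UI posts a response."""
    request_id = str(uuid.uuid4())
    future = asyncio.get_running_loop().create_future()
    pending_consents[request_id] = future
    # The real server would emit a CONSENT_REQUEST SSE event here,
    # carrying request_id and file_path to the extension.
    try:
        return await future  # "y" / "n" / "always" / "never"
    finally:
        del pending_consents[request_id]

def resolve_consent(request_id: str, response: str) -> None:
    """Called by the POST /consent/respond handler."""
    pending_consents[request_id].set_result(response)
```

The tool never blocks the event loop while waiting, which is exactly the property the /agent loop in Phase 5 depends on.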
Validation Checklist:

- [ ] Consent callback pattern supports both TUI and VSCode
- [ ] Async design doesn't block /agent loop (Phase 5)
- [ ] Session state persists across multiple tool calls
- [ ] "always" mode truly bypasses all prompts
- [ ] Event-based consent works with SSE streaming
- [ ] Backward compatible (no callback = auto-approve)
Tools to Implement¶
1. apply_patch(file_path: str, unified_diff: str) - Apply standard unified diff patches
   - Validate patch format before applying
   - Atomic operation with rollback on failure
   - Return success/failure with line numbers affected
2. replace_block(file_path: str, search: str, replace: str) - Search for exact text block and replace
   - Case-sensitive by default
   - Fail if search text not found or found multiple times
   - Return matched location and new content
3. insert_text(file_path: str, line_number: int, text: str) - Insert text at specific line number
   - Preserve indentation context
   - Support multiple lines
   - Return confirmation with line range
4. delete_lines(file_path: str, start_line: int, end_line: int) - Delete range of lines (inclusive)
   - Validate line numbers exist
   - Return deleted content for undo capability
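The delete_lines spec (validate range, return deleted content for undo) reduces to a small pure function over the file's lines, which keeps it easy to unit-test before wiring into the tool class. A sketch, with the standalone function name chosen for illustration:

```python
def delete_line_range(
    lines: list[str], start_line: int, end_line: int
) -> tuple[list[str], list[str]]:
    """Delete an inclusive 1-based line range.

    Returns (remaining_lines, deleted_lines) so the caller can
    offer undo by re-inserting the deleted block."""
    if not (1 <= start_line <= end_line <= len(lines)):
        raise ValueError(
            f"Invalid range {start_line}-{end_line} for {len(lines)}-line file"
        )
    deleted = lines[start_line - 1:end_line]
    remaining = lines[:start_line - 1] + lines[end_line:]
    return remaining, deleted
```

The tool wrapper would read the file, call this, and write the result back atomically.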
Implementation Details¶
File: ppxai/engine/tools/builtin/editor.py
```python
from ppxai.engine.tools.base import BaseTool
from typing import Dict, Any
import difflib
from pathlib import Path

class ApplyPatchTool(BaseTool):
    """Apply unified diff patch to a file."""

    def name(self) -> str:
        return "apply_patch"

    def description(self) -> str:
        return "Apply a unified diff patch to a file"

    def parameters(self) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "file_path": {
                    "type": "string",
                    "description": "Path to file to patch"
                },
                "unified_diff": {
                    "type": "string",
                    "description": "Unified diff format patch"
                }
            },
            "required": ["file_path", "unified_diff"]
        }

    async def execute(self, file_path: str, unified_diff: str) -> str:
        # Implementation with safety checks:
        # - Validate file exists
        # - Parse diff
        # - Apply atomically
        # - Rollback on failure
        pass

# Similar for ReplaceBlockTool, InsertTextTool, DeleteLinesTool
```
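The core of ReplaceBlockTool, given the spec above (fail on zero or multiple matches), is small enough to sketch as a pure helper the tool's `execute` could call. The standalone function here is illustrative, not the shipped implementation:

```python
def replace_unique_block(text: str, search: str, replace: str) -> str:
    """Replace exactly one occurrence of search with replace.

    Raises ValueError on zero or ambiguous matches, per the tool
    spec: an ambiguous search must never silently pick one match."""
    count = text.count(search)
    if count == 0:
        raise ValueError("search text not found")
    if count > 1:
        raise ValueError(f"search text found {count} times; must be unique")
    return text.replace(search, replace, 1)
```

Making ambiguity a hard error is what lets the AI trust the tool: a failed call tells it to send a longer, more specific search block.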
Manual Testing (Phase 1)¶
Test 1: Tool Functionality (test each tool individually without /agent)
```
# Test apply_patch
User: "Create a patch to fix the typo in README.md line 42 and apply it"
→ Consent prompt: "⚠️ AI wants to edit README.md. Allow? (y/n/always/never)"
→ User: "y"
AI: [generates diff, calls apply_patch]
→ Verify: Typo fixed in README.md

# Test replace_block
User: "In auth.py, replace the old login function with this new implementation: [code]"
→ Consent prompt: "⚠️ AI wants to edit auth.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls replace_block with search/replace]
→ Verify: Function replaced correctly

# Test insert_text
User: "Add error handling at line 100 in client.py: [code]"
→ Consent prompt: "⚠️ AI wants to edit client.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls insert_text]
→ Verify: Code inserted at correct line

# Test delete_lines
User: "Delete the deprecated code from lines 50-75 in utils.py"
→ Consent prompt: "⚠️ AI wants to edit utils.py. Allow? (y/n/always/never)"
→ User: "y"
AI: [calls delete_lines]
→ Verify: Lines deleted correctly
```
Test 2: Consent Flow
```
# Test per-file consent (already allowed)
User: "Fix another typo in README.md line 100"
→ NO consent prompt (already allowed in this session)
AI: [calls apply_patch]
→ Verify: No interruption for same file

# Test consent denial
User: "Edit config.json to change port to 8080"
→ Consent prompt: "⚠️ AI wants to edit config.json. Allow? (y/n/always/never)"
→ User: "n"
AI: [receives error from tool]
→ Verify: AI reports edit was denied, file unchanged

# Test "always" mode
User: "Refactor server.py to use async/await"
→ Consent prompt: "⚠️ AI wants to edit server.py. Allow? (y/n/always/never)"
→ User: "always"
AI: [calls replace_block multiple times]
→ Verify: No more prompts for any file

# Test "never" mode (new session)
User: "/clear" (start fresh session)
User: "/tools enable"
User: "Fix bug in database.py"
→ Consent prompt: "⚠️ AI wants to edit database.py. Allow? (y/n/always/never)"
→ User: "never"
AI: [receives error from tool]
→ Verify: All edit attempts denied for entire session
```
Test 3: VSCode Extension
- Test modal dialog appears in VSCode
- Test all consent options work
- Test consent state persists across multiple edits
Success Criteria:

- ✅ All 4 tools work reliably
- ✅ Proper error messages on failure
- ✅ Atomic operations (no partial edits)
- ✅ Clear confirmation messages
- ✅ Consent prompts appear for first edit of each file
- ✅ Consent state persists within session
- ✅ "y" allows specific file, "always" allows all files, "never" blocks all
- ✅ Denied edits return clear error to AI
- ✅ VSCode modal dialog works correctly
Phase 1 Deliverables Summary¶
What Gets Built:

1. ✅ 4 file editing tools (apply_patch, replace_block, insert_text, delete_lines)
2. ✅ Per-file session consent mechanism
3. ✅ Session state fields (allowed_files, edit_consent_mode)
4. ✅ Async consent callback pattern
5. ✅ TUI consent prompt integration
6. ✅ VSCode consent dialog integration
7. ✅ New EventType.CONSENT_REQUEST
8. ✅ New HTTP endpoint /consent/respond
9. ✅ Tool registration in builtin system
What This Enables:

- ✅ Immediate Value: Safe autonomous file editing
- ✅ Phase 2/3 Foundation: Example of clean tool implementation
- ✅ Phase 5 Critical: Consent mechanism for /agent loop
- ✅ Backward Compatible: Both TUI and VSCode keep working
- ✅ Future-Proof: Designed for autonomous multi-step workflows
What Doesn't Break:

- ✅ Existing tools (read_file, search_files, etc.) unchanged
- ✅ Existing commands unchanged
- ✅ Existing TUI workflows unchanged
- ✅ Existing VSCode workflows unchanged
- ✅ No configuration changes required
Files Created/Modified:
```
NEW:      ppxai/engine/tools/builtin/editor.py (~400 lines)
MODIFIED: ppxai/engine/session.py (+20 lines)
MODIFIED: ppxai/engine/types.py (+5 lines)
MODIFIED: ppxai/engine/client.py (+40 lines)
MODIFIED: ppxai/engine/tools/builtin/__init__.py (+2 lines)
MODIFIED: ppxai/main.py or commands.py (+30 lines)
MODIFIED: vscode-extension/src/chatPanel.ts (+50 lines)
MODIFIED: ppxai/server/http.py (+30 lines)
TOTAL:    ~577 lines added (all additive, no deletions)
```
- Time Investment: 6-8 hours
- Risk Level: Low (additive changes only, well-architected consent)
- Validation: Manual testing (Tests 1-3 above)
- Next Phase Dependency: None (Phases 2-3 can proceed independently)
Phase 2: @git Context Provider (2-3 hours)¶
Goal: Automatically inject git diff context when user references @git
Implementation¶
File: ppxai/engine/context.py (extend existing ContextInjector)
```python
class ContextInjector:
    # ... existing code ...

    GIT_PATTERN = r'@git\b'

    def inject_git_context(self, working_dir: str) -> Optional[InjectedContext]:
        """
        Inject git diff (staged + unstaged) as context.

        Returns:
            InjectedContext with git diff or None if not in git repo
        """
        import subprocess

        try:
            # check=True makes git's non-zero exit (e.g. outside a
            # repository) raise CalledProcessError; without it the
            # except clause below would never fire
            unstaged = subprocess.run(
                ['git', 'diff'],
                cwd=working_dir,
                capture_output=True,
                text=True,
                check=True
            )
            staged = subprocess.run(
                ['git', 'diff', '--staged'],
                cwd=working_dir,
                capture_output=True,
                text=True,
                check=True
            )
        except (subprocess.CalledProcessError, FileNotFoundError):
            return None  # Not a git repository (or git not installed)

        # Combine with headers
        content = ""
        if staged.stdout.strip():
            content += "=== Staged Changes ===\n"
            content += staged.stdout + "\n"
        if unstaged.stdout.strip():
            content += "=== Unstaged Changes ===\n"
            content += unstaged.stdout + "\n"

        if not content:
            return InjectedContext(
                source="@git",
                content="No changes in working directory",
                language="text",
                truncated=False,
                size=0
            )

        return InjectedContext(
            source="@git",
            content=content,
            language="diff",
            truncated=len(content) > self.MAX_FILE_SIZE,
            size=len(content)
        )
```
Update message enhancement:
```python
def enhance_message(self, message: str, working_dir: Optional[str] = None) -> Tuple[str, List[InjectedContext]]:
    """Enhanced to support @git pattern."""
    # ... existing @file logic ...

    # Check for @git pattern
    if re.search(self.GIT_PATTERN, message):
        git_ctx = self.inject_git_context(working_dir or self.working_dir)
        if git_ctx:
            contexts.append(git_ctx)

    # ... rest of enhancement logic ...
```
Manual Testing (Phase 2)¶
```
# Test basic git context
User: "What changes did I make @git"
AI: [sees diff context, summarizes changes]

# Test with no changes
User: "Review my changes @git"
AI: [sees "No changes", responds appropriately]

# Test code review workflow
User: "Review my authentication changes for security issues @git"
AI: [analyzes diff, provides security review]

# Test combined with file
User: "Compare my changes @git with the original design in DESIGN.md"
AI: [sees both git diff and DESIGN.md content]
```
Success Criteria:

- @git detected in messages
- Staged and unstaged changes both captured
- Proper formatting in context block
- Graceful handling of non-git directories
- Works alongside @file references
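Because large diffs can exceed MAX_FILE_SIZE, the `truncated` flag should pair with an actual truncation step that cuts at a line boundary rather than mid-hunk-line. A possible sketch (the `truncate_diff` helper is hypothetical, not part of ContextInjector today):

```python
def truncate_diff(diff: str, max_size: int) -> tuple[str, bool]:
    """Truncate a unified diff to roughly max_size characters,
    cutting at the last line boundary and flagging truncation."""
    if len(diff) <= max_size:
        return diff, False
    cut = diff.rfind("\n", 0, max_size)
    if cut == -1:
        cut = max_size  # single huge line: hard cut
    return diff[:cut] + "\n... [diff truncated]", True
```

The flag lets the AI know it is reasoning over a partial diff and should ask for specific files if needed.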
Phase 3: @tree Context Provider (2-3 hours)¶
Goal: Inject visual project structure when user references @tree
Implementation¶
File: ppxai/engine/context.py (extend existing ContextInjector)
```python
class ContextInjector:
    # ... existing code ...

    TREE_PATTERN = r'@tree\b'

    def inject_tree_context(self, working_dir: str, max_depth: int = 3) -> InjectedContext:
        """
        Inject directory tree structure as context.

        Args:
            working_dir: Root directory to tree
            max_depth: Maximum depth to traverse

        Returns:
            InjectedContext with tree structure
        """
        from pathlib import Path

        def build_tree(path: Path, prefix: str = "", depth: int = 0) -> str:
            """Recursively build tree structure."""
            if depth > max_depth:
                return ""

            # Respect .gitignore; filter ignored entries up front so
            # the "last item" branch markers stay correct
            gitignore_patterns = self._load_gitignore(path)
            items = sorted(path.iterdir(), key=lambda x: (not x.is_dir(), x.name))
            items = [i for i in items if not self._is_ignored(i, gitignore_patterns)]

            output = []
            for i, item in enumerate(items):
                is_last = i == len(items) - 1
                current_prefix = "└── " if is_last else "├── "
                next_prefix = "    " if is_last else "│   "
                if item.is_dir():
                    output.append(f"{prefix}{current_prefix}{item.name}/")
                    output.append(build_tree(item, prefix + next_prefix, depth + 1))
                else:
                    output.append(f"{prefix}{current_prefix}{item.name}")
            return "\n".join(filter(None, output))

        tree = build_tree(Path(working_dir))

        # Add header with stats
        total_files = tree.count('\n') - tree.count('/\n')
        total_dirs = tree.count('/\n')
        content = f"Project: {Path(working_dir).name}\n"
        content += f"Directories: {total_dirs}, Files: {total_files}\n"
        content += f"Max depth: {max_depth}\n\n"
        content += tree

        return InjectedContext(
            source="@tree",
            content=content,
            language="text",
            truncated=False,
            size=len(content)
        )

    def _load_gitignore(self, path: Path) -> List[str]:
        """Load .gitignore patterns."""
        gitignore_file = path / '.gitignore'
        if gitignore_file.exists():
            with open(gitignore_file) as f:
                return [line.strip() for line in f if line.strip() and not line.startswith('#')]
        return []

    def _is_ignored(self, path: Path, patterns: List[str]) -> bool:
        """Check if path matches gitignore patterns."""
        import fnmatch

        name = path.name
        return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)
```
Manual Testing (Phase 3)¶
```
# Test basic tree
User: "Show me the project structure @tree"
AI: [sees tree, describes structure]

# Test architectural planning
User: "Where should I add the new caching layer @tree"
AI: [analyzes structure, suggests location]

# Test refactoring guidance
User: "I want to reorganize the API modules. Current structure: @tree"
AI: [sees structure, provides refactoring suggestions]

# Test combined contexts
User: "Review my changes @git in the context of the project structure @tree"
AI: [sees both, provides comprehensive review]

# Test depth control (future enhancement)
User: "Show a shallow tree @tree depth=1"
AI: [shows only top-level structure]
```
Success Criteria:

- @tree detected in messages
- Directory structure displayed correctly
- .gitignore patterns respected
- Reasonable depth limit (3 levels)
- File/directory counts included
- Works with @git and @file
Phase 4: Manual Testing & Refinement (3-4 hours)¶
Goal: Dogfood all new features to validate design and find issues
Comprehensive Test Scenarios¶
Scenario 1: File Editing Workflow

```
1. User: "Read test_commands.py and find the failing test"
   AI: [uses read_file tool]
2. User: "The test at line 185 is checking the wrong value. Fix it using replace_block"
   AI: [uses editor.replace_block]
3. User: "Run pytest to verify the fix"
   AI: [uses shell.execute_command]
4. User: "If it still fails, apply this patch: [diff]"
   AI: [uses editor.apply_patch if needed]
```

Scenario 2: Code Review Workflow

```
1. User: "What changes are in my working directory @git"
   AI: [sees staged + unstaged diffs, summarizes]
2. User: "Are there any bugs or security issues in these changes @git"
   AI: [analyzes diff, provides review]
3. User: "Fix the SQL injection vulnerability you found"
   AI: [uses editor tool to fix]
4. User: "Show me the updated diff @git"
   AI: [shows new diff with fix applied]
```

Scenario 3: Architecture Planning

```
1. User: "Show me the current structure @tree"
   AI: [displays tree]
2. User: "I need to add caching. Where should the cache module go @tree"
   AI: [analyzes structure, suggests location]
3. User: "Create cache.py in that location with basic structure"
   AI: [uses editor.insert_text or creates file]
4. User: "Show updated structure @tree"
   AI: [displays tree with new file]
```

Scenario 4: Combined Workflow (Ultimate Test)

```
1. User: "Review my authentication changes @git against the project structure @tree"
   AI: [uses both contexts for comprehensive analysis]
2. User: "You suggested moving auth.py to src/auth/. Do that and update imports"
   AI: [uses multiple editor tools for refactoring]
3. User: "Verify no broken imports @tree"
   AI: [checks structure, may run tests]
```
Refinement Checklist¶
Based on testing, refine:
- [ ] Truncation Limits: Adjust MAX_FILE_SIZE if needed
- [ ] Error Messages: Ensure clear, actionable errors
- [ ] Safety Confirmations: Add warnings for destructive operations
- [ ] Context Formatting: Polish how diffs/trees display
- [ ] Performance: Optimize tree generation for large projects
- [ ] Edge Cases: Handle empty repos, binary files, symlinks
- [ ] User Feedback: Add progress indicators for slow operations
Issues to Watch For¶
- File Editing:
- Encoding issues (UTF-8 vs ASCII)
- Line ending differences (CRLF vs LF)
- Indentation preservation
-
Partial edit failures
-
Git Context:
- Large diffs truncation strategy
- Binary file diffs
- Merge conflicts in diff
-
Submodule handling
-
Tree Context:
- Symlink loops
- Hidden files strategy
- Large directory structures
- .gitignore edge cases
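For the CRLF/LF concern above, one approach is for the edit tools to detect the file's dominant line ending before rewriting, then restore it on output. A minimal sketch (helper names are illustrative):

```python
def detect_eol(text: str) -> str:
    """Return the file's dominant line ending ("\r\n" or "\n")."""
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf  # bare LFs only
    return "\r\n" if crlf > lf else "\n"

def rejoin_with_eol(lines: list[str], eol: str) -> str:
    """Join EOL-stripped lines back with the original line ending,
    keeping a trailing newline as most tools expect."""
    return eol.join(lines) + eol if lines else ""
```

With this, an edit to a CRLF file round-trips as CRLF instead of silently producing a whole-file diff of line-ending changes.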
Phase 5: /agent Loop Implementation (6-8 hours)¶
Goal: Build autonomous execution loop using the proven tools
Now that we have confidence in the tools, implement the agent loop:
Command Handler¶
File: ppxai/commands.py (add new handler)
```python
def handle_agent(self, args: str):
    """Handle /agent command for autonomous task execution."""
    if not args.strip():
        console.print("[red]Usage: /agent <task description>[/red]\n")
        return

    task = args.strip()
    max_iterations = 5  # Configurable

    console.print(f"[cyan]🤖 Starting autonomous task:[/cyan] {task}\n")
    console.print(f"[dim]Max iterations: {max_iterations}[/dim]\n")

    iteration = 0
    task_complete = False

    while iteration < max_iterations and not task_complete:
        iteration += 1
        console.print(f"[yellow]--- Iteration {iteration}/{max_iterations} ---[/yellow]\n")

        # Construct prompt for iteration
        if iteration == 1:
            prompt = f"""Task: {task}

Please work on this task autonomously. You have access to tools for:
- File editing (apply_patch, replace_block, insert_text, delete_lines)
- File reading and searching
- Shell commands
- Git context (@git)
- Project structure (@tree)

After each action, assess if the task is complete. If complete, respond with:
TASK_COMPLETE: <summary of what was done>

If you need to continue, explain what you're doing and call the appropriate tools."""
        else:
            prompt = f"""Continue working on the task: {task}

Previous iteration completed. Assess the current state and continue if needed.
If the task is complete, respond with:
TASK_COMPLETE: <summary>"""

        # Send to AI
        try:
            response = self.client.chat(prompt)

            # Check for completion signal
            if "TASK_COMPLETE:" in response:
                task_complete = True
                summary = response.split("TASK_COMPLETE:")[1].strip()
                console.print(f"\n[green]✅ Task completed:[/green] {summary}\n")
        except KeyboardInterrupt:
            console.print("\n[yellow]Agent loop interrupted by user[/yellow]\n")
            break
        except Exception as e:
            console.print(f"[red]Error in iteration {iteration}: {e}[/red]\n")
            break

    if not task_complete and iteration >= max_iterations:
        console.print(f"[yellow]⚠️ Task incomplete after {max_iterations} iterations[/yellow]\n")
```
Engine Support¶
File: ppxai/engine/client.py (add agent mode)
```python
def chat_agent(self, task: str, max_iterations: int = 5) -> Generator[Event, None, None]:
    """
    Autonomous agent mode - AI loops until task complete or max iterations.

    Args:
        task: High-level task description
        max_iterations: Max number of AI turns

    Yields:
        Events for each iteration
    """
    for iteration in range(1, max_iterations + 1):
        # Emit iteration start event
        yield Event(
            type=EventType.AGENT_ITERATION,
            data={"iteration": iteration, "max": max_iterations}
        )

        # Construct iteration prompt
        prompt = self._build_agent_prompt(task, iteration)

        # Stream chat response
        response_text = ""
        for event in self.chat(prompt):
            yield event
            if event.type == EventType.STREAM_CHUNK:
                response_text += event.data

        # Check for completion
        if self._task_complete(response_text):
            yield Event(
                type=EventType.AGENT_COMPLETE,
                data={"iterations": iteration, "summary": self._extract_summary(response_text)}
            )
            break
    else:
        # Max iterations reached without a completion signal (for/else)
        yield Event(
            type=EventType.AGENT_MAX_ITERATIONS,
            data={"iterations": max_iterations}
        )

def _build_agent_prompt(self, task: str, iteration: int) -> str:
    """Build prompt for agent iteration."""
    # Implementation
    pass

def _task_complete(self, response: str) -> bool:
    """Check if AI signaled task completion."""
    return "TASK_COMPLETE:" in response

def _extract_summary(self, response: str) -> str:
    """Extract completion summary from response."""
    if "TASK_COMPLETE:" in response:
        return response.split("TASK_COMPLETE:")[1].strip()
    return ""
```
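`_build_agent_prompt` is deliberately left as a stub. One possible sketch, reusing the first-iteration and continuation prompts from the command-level loop earlier in this plan (the exact wording, and the standalone function form, are assumptions):

```python
def build_agent_prompt(task: str, iteration: int, max_iterations: int = 5) -> str:
    """Sketch of the per-iteration prompt builder. Iteration 1 states
    the task; later iterations ask the model to reassess and continue."""
    if iteration == 1:
        return (
            f"You are working autonomously on this task: {task}\n\n"
            "Plan your approach, call the tools you need, and when the task "
            "is fully done respond with:\nTASK_COMPLETE: <summary>"
        )
    return (
        f"Continue working on the task: {task}\n\n"
        f"This is iteration {iteration} of {max_iterations}. "
        "Previous iteration completed. Assess the current state and continue if needed.\n"
        "If the task is complete, respond with:\nTASK_COMPLETE: <summary>"
    )
```

Keeping the completion-signal instructions in every iteration's prompt matters: `_task_complete` only fires if the model knows to emit the marker.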
Safety Features¶
- Max Iterations: Prevent infinite loops
- Interrupt Support: Ctrl-C breaks agent loop
- Confirmation Prompts: For destructive operations
- Rollback Capability: Track edits for undo
- Progress Logging: Clear visibility into agent actions
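The rollback capability is listed above but not specified. A minimal sketch of one way to track edits for undo: snapshot each file before its first modification in a session, then restore on demand (the class and method names are hypothetical, not the project's API):

```python
import shutil
import tempfile
from pathlib import Path

class EditJournal:
    """Minimal rollback sketch: keep a pre-edit snapshot of each file
    the first time it is touched, so the whole session can be undone."""

    def __init__(self) -> None:
        self._backup_dir = Path(tempfile.mkdtemp(prefix="ppxai-undo-"))
        self._snapshots: dict[Path, Path] = {}

    def before_edit(self, path: Path) -> None:
        """Snapshot a file once, before its first modification."""
        path = path.resolve()
        if path not in self._snapshots and path.exists():
            backup = self._backup_dir / f"{len(self._snapshots)}-{path.name}"
            shutil.copy2(path, backup)
            self._snapshots[path] = backup

    def rollback(self) -> int:
        """Restore all snapshotted files; return how many were restored."""
        for original, backup in self._snapshots.items():
            shutil.copy2(backup, original)
        return len(self._snapshots)
```

Each editing tool would call `before_edit` prior to writing, and an interrupt handler could offer `rollback` as part of the clean-shutdown path.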
Manual Testing (Phase 5)¶
```bash
# Test simple task
/agent Fix the failing test in test_commands.py

# Test multi-step task
/agent Review my changes @git and fix any issues you find

# Test planning task
/agent Reorganize the auth module based on @tree structure

# Test complex task
/agent Implement caching for the API, add tests, and update docs

# Test interrupt
/agent <long task>
[Press Ctrl-C during execution]

# Test max iterations
/agent <task that takes 10+ steps>
[Should stop at 5 iterations with clear message]
```
**Success Criteria:**

- Agent completes simple tasks autonomously
- Proper iteration tracking
- Clean interrupt handling
- Clear completion/failure messaging
- Doesn't exceed max iterations
Phase 6: Testing & Documentation (4-5 hours)¶
Goal: Comprehensive testing and documentation for v1.11.0
Unit Tests¶
File: tests/test_editor_tools.py
```python
def test_apply_patch():
    """Test unified diff patch application."""
    # Create temp file
    # Apply valid patch
    # Verify result
    # Test invalid patch
    # Test non-existent file

def test_replace_block():
    """Test block replacement."""
    # Test exact match
    # Test no match
    # Test multiple matches (should fail)
    # Test case sensitivity

def test_insert_text():
    """Test text insertion."""
    # Test at line number
    # Test out of bounds
    # Test indentation preservation

def test_delete_lines():
    """Test line deletion."""
    # Test valid range
    # Test invalid range
    # Test beyond file end
```
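As an illustration of how one of these skeletons might be filled in, here is a self-contained version of the `replace_block` exact-match, no-match, and multiple-match cases. It tests a local stand-in rather than the real tool, whose signature may differ:

```python
import tempfile
from pathlib import Path

def replace_block(path: Path, old: str, new: str) -> None:
    """Stand-in for the real replace_block tool: replace an exact,
    unique block of text; fail on zero or multiple matches."""
    text = path.read_text()
    count = text.count(old)
    if count == 0:
        raise ValueError("block not found")
    if count > 1:
        raise ValueError("block is ambiguous (multiple matches)")
    path.write_text(text.replace(old, new))

def test_replace_block() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        f = Path(tmp) / "sample.py"
        f.write_text("def a():\n    pass\n\ndef b():\n    pass\n")
        # Exact unique match succeeds
        replace_block(f, "def a():\n    pass", "def a():\n    return 1")
        assert "return 1" in f.read_text()
        # No match raises
        try:
            replace_block(f, "def c():", "x")
            assert False, "expected ValueError"
        except ValueError:
            pass
        # Multiple matches raise
        f.write_text("x = 1\nx = 1\n")
        try:
            replace_block(f, "x = 1", "x = 2")
            assert False, "expected ValueError"
        except ValueError:
            pass
```

The "fail on multiple matches" behavior is the important edge case: silently replacing the first occurrence is how an agent corrupts a file.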
File: tests/test_context_providers.py
```python
def test_git_context_injection():
    """Test @git context provider."""
    # Create temp git repo
    # Make changes
    # Test @git detection
    # Verify diff in context
    # Test no changes
    # Test non-git directory

def test_tree_context_injection():
    """Test @tree context provider."""
    # Create temp directory structure
    # Test @tree detection
    # Verify tree in context
    # Test depth limiting
    # Test .gitignore respect

def test_combined_contexts():
    """Test multiple context providers together."""
    # Test @git + @tree
    # Test @file + @git
    # Test all three
```
File: tests/test_agent_loop.py
```python
def test_agent_basic_task():
    """Test simple agent task completion."""
    # Mock AI that completes in 1 iteration
    # Verify completion event
    # Verify summary extracted

def test_agent_max_iterations():
    """Test max iteration limit."""
    # Mock AI that never completes
    # Verify stops at max iterations
    # Verify appropriate event

def test_agent_interrupt():
    """Test agent interrupt handling."""
    # Start agent task
    # Interrupt during iteration
    # Verify clean shutdown
```
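The first two of these can be sketched without the real `EngineClient` by driving a miniature version of the agent loop with canned responses (all names here are stand-ins, not the project's API):

```python
from typing import Iterator

def run_agent(client_responses: Iterator[str], max_iterations: int = 5) -> dict:
    """Miniature stand-in for chat_agent: consume one response per
    iteration and stop on TASK_COMPLETE: or the iteration cap."""
    for iteration in range(1, max_iterations + 1):
        response = next(client_responses)
        if "TASK_COMPLETE:" in response:
            return {"event": "complete", "iterations": iteration}
    return {"event": "max_iterations", "iterations": max_iterations}

def test_agent_max_iterations() -> None:
    # Client that never completes: loop must stop at the cap
    never_done = iter(["still working"] * 100)
    result = run_agent(never_done)
    assert result == {"event": "max_iterations", "iterations": 5}

def test_agent_basic_task() -> None:
    # Client that completes on the first turn
    done = iter(["TASK_COMPLETE: all tests pass"])
    result = run_agent(done)
    assert result == {"event": "complete", "iterations": 1}
```

The same pattern extends to the interrupt test by having the fake client raise `KeyboardInterrupt` mid-iteration.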
Integration Tests¶
End-to-end workflows:
```python
def test_code_review_workflow():
    """Test full code review workflow with @git."""
    # Setup git repo with changes
    # Ask for review with @git
    # Verify diff in context
    # Verify AI response references changes

def test_refactoring_workflow():
    """Test refactoring with @tree."""
    # Setup project structure
    # Ask for refactoring suggestions with @tree
    # Verify tree in context
    # Verify AI considers structure

def test_autonomous_debugging():
    """Test agent finding and fixing a bug."""
    # Create file with known bug
    # /agent fix the bug
    # Verify bug found
    # Verify fix applied
    # Verify tests pass
```
Documentation Updates¶
File: CLAUDE.md
## Current Version: v1.11.0 (Agentic Workflow)
**What's New in v1.11.0:**
- **Autonomous Execution**: New `/agent` command for multi-step task execution
- **Native File Editing**: Safe, atomic file editing tools (apply_patch, replace_block, insert_text, delete_lines)
- **Git Context**: `@git` automatically injects diffs for code review workflows
- **Project Structure**: `@tree` provides architectural awareness
- **Combined Workflows**: Mix @git, @tree, and @file for comprehensive context
**Examples:**
```bash
/agent Fix all failing tests
/agent Review my changes @git
/agent Refactor the auth module based on @tree
```

**File**: `README.md`
Update "What's New" section and add examples:
```markdown
### v1.11.0 - Agentic Workflow
Transform ppxai into an autonomous developer agent:
- `/agent <task>` - Autonomous multi-step task execution
- `@git` - Automatic diff injection for code review
- `@tree` - Project structure awareness
- Native file editing tools (apply_patch, replace_block, insert_text, delete_lines)
```

**Example Workflows:**
```bash
# Autonomous debugging
/agent Fix the failing test in test_auth.py
# Code review with context
Review my authentication changes @git
# Architecture planning
Where should I add caching given this structure @tree
# Combined contexts
/agent Refactor the API based on @tree and review changes @git
```

**File**: `docs/AGENTIC_WORKFLOW.md` (New)
Create comprehensive guide:
```markdown
# Agentic Workflow Guide
## Overview
The `/agent` command enables ppxai to work autonomously on multi-step tasks...
## File Editing Tools
### apply_patch
### replace_block
### insert_text
### delete_lines
## Context Providers
### @git - Git Diff Context
### @tree - Project Structure
### @file - File Content (existing)
## Agent Loop
### How It Works
### Best Practices
### Limitations
### Troubleshooting
## Example Workflows
### Debugging
### Code Review
### Refactoring
### Feature Implementation
```

File: vscode-extension/README.md
Add section on agentic features and context providers.
Timeline Summary¶
| Phase | Component | Hours | Dependencies |
|---|---|---|---|
| 1 | File editing tools + consent | 6-8 | None |
| 2 | @git context | 2-3 | None |
| 3 | @tree context | 2-3 | None |
| 4 | Manual testing | 3-4 | Phases 1-3 |
| 5 | /agent loop | 6-8 | Phases 1-4 |
| 6 | Testing & docs | 4-5 | All phases |
| **Total** | | **23-31** | |
Success Criteria¶
Phase 1 Complete When:¶
- [ ] All 4 file editing tools implemented
- [ ] Per-file session consent mechanism working
- [ ] Each tool tested manually
- [ ] Consent flow tested (y/n/always/never)
- [ ] Atomic operations confirmed
- [ ] Error handling validated
- [ ] VSCode modal dialog working
Phase 2 Complete When:¶
- [ ] @git pattern detected correctly
- [ ] Staged and unstaged diffs captured
- [ ] Works in git and non-git directories
- [ ] Formatted appropriately in context
Phase 3 Complete When:¶
- [ ] @tree pattern detected correctly
- [ ] Directory structure displayed
- [ ] .gitignore patterns respected
- [ ] Reasonable depth limits enforced
Phase 4 Complete When:¶
- [ ] All tools tested in real workflows
- [ ] Edge cases identified and handled
- [ ] Context formatting refined
- [ ] Performance acceptable
Phase 5 Complete When:¶
- [ ] /agent command implemented
- [ ] Multi-iteration tasks work
- [ ] Proper termination conditions
- [ ] Interrupt handling works
Phase 6 Complete When:¶
- [ ] All unit tests passing
- [ ] Integration tests passing
- [ ] Documentation complete
- [ ] Examples validated
Benefits of Incremental Approach¶
- Lower Risk: Test each component in isolation
- Faster Feedback: Validate design decisions early
- Better Debugging: Isolate issues to specific components
- Incremental Value: Each phase delivers usable features
- Confidence Building: Know tools work before building automation
- User Input: Can incorporate feedback before /agent implementation
Next Steps¶
- Create feature branch: `feature/agentic-workflow-v1.11.0`
- Start with Phase 1: Implement file editing tools
- Test manually after each phase
- Proceed to next phase only when previous is validated
- Build /agent loop last with full confidence
Notes¶
- All code must maintain backward compatibility
- Security review required for file editing operations
- Performance testing needed for large projects
- Documentation must include real-world examples
- Tests must cover edge cases and error conditions
Last Updated: 2025-12-20 Next Review: After Phase 1 completion