Phase 14: AI Agents - Assignment¶

Build a Production-Ready AI Agent System

📋 Assignment Overview¶

Objective: Build a fully functional AI agent that can autonomously accomplish complex tasks using multiple tools and reasoning.

Estimated Time: 10-15 hours
Weight: 100 points + 20 bonus points
Due: End of Week 3

🎯 Learning Objectives¶

After completing this assignment, you will be able to:

✅ Design and implement tool schemas for AI agents
✅ Build agents that use multiple tools effectively
✅ Implement error handling and validation
✅ Add memory and state management
✅ Evaluate agent performance
✅ Deploy a production-ready agent

📦 Deliverables¶

Agent Implementation (Python code)
Tool Definitions (JSON schemas + implementations)
Test Suite (Unit tests + integration tests)
Documentation (README + API docs)
Demo Video or Live Demo (3-5 minutes)
Report (2-3 pages analyzing your agent)

📝 Part 4: Documentation & Demo (20 points)¶

4.1 README.md (8 points)¶

# [Your Agent Name]

## Overview
Brief description of what your agent does

## Features
- Feature 1
- Feature 2

## Installation
```bash
pip install -r requirements.txt

Usage¶

from my_agent import Agent
agent = Agent()
result = agent.run("your query")

Architecture¶

Diagram showing components

API Reference¶

Tool descriptions and parameters

Examples¶

5+ example interactions

### 4.2 Code Documentation (5 points)
- [ ] Docstrings for all functions
- [ ] Type hints
- [ ] Inline comments for complex logic
- [ ] API reference (auto-generated)

### 4.3 Demo (7 points)
**Option 1: Video Demo (3-5 minutes)**
- Show agent handling 3+ different queries
- Explain tool selection decisions
- Demonstrate error handling

**Option 2: Live Demo + Gradio UI**
- Build web interface
- Demo during presentation
- Include example queries

---

## 🎁 Bonus Challenges (+20 points)

### Bonus 1: Advanced Reasoning (+5 points)
Implement **ReAct** (Reasoning + Acting) pattern:

Thought: I need to find the revenue data Action: execute_query(“SELECT SUM(revenue) FROM sales WHERE year=2024”) Observation: Total revenue is \(5.2M Thought: Now I should compare to 2023 Action: execute_query("SELECT SUM(revenue) FROM sales WHERE year=2023") Observation: 2023 revenue was \)4.1M Thought: Growth is 26.8%, I can now respond Final Answer: Revenue grew by 26.8% from \(4.1M to \)5.2M

### Bonus 2: Parallel Tool Execution (+5 points)
- Execute multiple independent tools concurrently
- Aggregate results efficiently
- Handle parallel errors gracefully

### Bonus 3: Agent Optimization (+5 points)
- Cache frequent API calls
- Optimize token usage
- Reduce latency with streaming
- Smart tool selection (skip unnecessary tools)

### Bonus 4: Production Deployment (+5 points)
- Deploy as REST API (FastAPI/Flask)
- Add authentication
- Rate limiting
- Monitoring dashboard
- Docker containerization

---

## 📊 Grading Rubric

### Part 1: Agent Design & Implementation (40 points)
| Criteria | Points | Description |
|----------|--------|-------------|
| **Architecture** | 15 | Clean code, separation of concerns, configurability |
| **Tools** | 15 | All tools work correctly, proper schemas, error handling |
| **Reasoning** | 10 | Intelligent tool selection, multi-step planning |

### Part 2: Memory & State (20 points)
| Criteria | Points | Description |
|----------|--------|-------------|
| **Conversation History** | 8 | Properly stores and retrieves context |
| **Task Memory** | 7 | Tracks progress, resumes from failures |
| **Implementation** | 5 | Clean code, efficient storage |

### Part 3: Testing & Evaluation (20 points)
| Criteria | Points | Description |
|----------|--------|-------------|
| **Unit Tests** | 8 | Comprehensive coverage, edge cases |
| **Integration Tests** | 7 | End-to-end scenarios, error cases |
| **Metrics** | 5 | Proper evaluation methodology |

### Part 4: Documentation & Demo (20 points)
| Criteria | Points | Description |
|----------|--------|-------------|
| **README** | 8 | Clear, comprehensive, examples |
| **Code Docs** | 5 | Docstrings, type hints, comments |
| **Demo** | 7 | Shows key features, explains decisions |

### Bonus (up to +20 points)
- ReAct pattern: +5
- Parallel execution: +5
- Optimization: +5
- Deployment: +5

---

## 💡 Hints & Tips

### Getting Started
1. **Start simple:** Build basic agent with 1-2 tools first
2. **Test early:** Write tests as you build tools
3. **Iterate:** Add features incrementally
4. **Use frameworks:** LangChain can simplify development

### Tool Design
- Keep tools focused (single responsibility)
- Validate inputs rigorously
- Return structured data (JSON)
- Include helpful error messages

### Debugging
- Log all LLM calls and tool executions
- Test tools independently before agent integration
- Use `print` statements liberally
- Check token usage to avoid context overflow

### Common Pitfalls
- ❌ Tools that do too much (break into smaller tools)
- ❌ Poor error handling (always validate inputs)
- ❌ No logging (impossible to debug)
- ❌ Ignoring context limits (manage tokens carefully)

---

## 📚 Resources

### Code Examples
- [OpenAI Function Calling Examples](https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models)
- [LangChain Agent Templates](https://python.langchain.com/docs/modules/agents/agent_types/)
- [Agent Design Patterns](https://github.com/microsoft/ai-agents-for-beginners)

### Testing
- [Pytest Documentation](https://docs.pytest.org/)
- [Unit Testing Best Practices](https://realpython.com/python-testing/)

### Deployment
- [FastAPI Tutorial](https://fastapi.tiangolo.com/tutorial/)
- [Docker for Python](https://docs.docker.com/language/python/)

---

## 🤝 Collaboration Policy

- **Individual assignment:** Complete independently
- **Getting help:** Office hours, Discord, Stack Overflow
- **Code sharing:** Don't share solutions, but discuss approaches
- **AI assistance:** OK to use for debugging, not for writing entire agent

---

## 📅 Submission

**Submit via GitHub:**
1. Create repo: `ai-agent-[your-name]`
2. Include all deliverables
3. Add comprehensive README
4. Submit repo link

**Deadline:** [Date]

**Late Policy:** -10% per day, up to 3 days

---

## ❓ FAQ

**Q: Can I use LangChain or must I build from scratch?**  
A: You can use frameworks, but you must understand and explain the code.

**Q: How many tools are required?**  
A: Minimum 4 tools. More is better if they're all useful.

**Q: Can I use mock/fake APIs for testing?**  
A: Yes for testing, but include at least one real API integration.

**Q: What if my agent makes mistakes?**  
A: That's OK! Document the failure cases and explain why they occur.

**Q: Can I work in a team?**  
A: No, this is individual. But you can discuss ideas with classmates.

---

**Good luck building your AI agent! 🚀🤖**

Phase 14: AI Agents - Assignment¶

📋 Assignment Overview¶

🎯 Learning Objectives¶

📦 Deliverables¶

🏗️ Part 1: Agent Design & Implementation (40 points)¶

Choose ONE Agent Type:¶

Option A: SQL Agent (Recommended for beginners)¶

Option B: Research Agent¶

Option C: Code Debugging Agent¶

Option D: Personal Assistant Agent¶

Requirements (All Options):¶

🧠 Part 2: Memory & State Management (20 points)¶

2.1 Conversation History (8 points)¶

2.2 Task Memory (7 points)¶

2.3 Long-Term Memory (Optional - 5 bonus points)¶

🧪 Part 3: Testing & Evaluation (20 points)¶

3.1 Unit Tests (8 points)¶

3.2 Integration Tests (7 points)¶

3.3 Evaluation Metrics (5 points)¶