Fine-tuning models for reasoning (e.g., DeepSeek-R1)
Iterative refinement of plans based on outcomes
Instead of having LLMs learn "what" to answer, they learn "how" to answer!
ReAct Framework
ReAct combines reasoning and acting in a structured cycle for autonomous behavior (see the sketch below).
Thought: Reasoning about current situation
Action: Executing tools or operations
Observation: Analyzing action results
Continues until task completion
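A minimal sketch of that cycle, assuming a generic `llm(prompt)` callable and a toy tool dictionary; the prompt format and names are illustrative, not the original ReAct implementation:

```python
# Minimal ReAct-style loop (illustrative sketch, not the original implementation).
# `llm` is any callable prompt -> str; `tools` maps tool names to callables.
def react(task: str, llm, tools, max_steps: int = 5) -> str:
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        # Thought + Action: the model reasons, then emits an action or a final answer.
        step = llm(transcript + "\nThink, then write 'Action: <tool>: <input>' or 'Finish: <answer>'.")
        transcript += step + "\n"
        if "Finish:" in step:                                  # task completion
            return step.split("Finish:", 1)[1].strip()
        if "Action:" in step:                                  # execute the requested tool
            action = step.split("Action:", 1)[1].strip()       # e.g. "search: ReAct paper"
            name, _, arg = action.partition(":")
            observation = tools.get(name.strip(), lambda x: "unknown tool")(arg.strip())
            transcript += f"Observation: {observation}\n"      # feeds the next Thought
    return transcript                                          # step budget exhausted

# Example wiring with toy tools:
# answer = react("What year was the ReAct paper published?", llm=my_model,
#                tools={"search": my_search, "calculator": lambda e: str(eval(e))})
```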
Self-Reflection and Learning
Agents improve through reflection on past failures and successes (see the sketch below).
Reflexion: Verbal reinforcement from prior failures
Three roles: Actor, Evaluator, Self-reflection
SELF-REFINE: Iterative output refinement
Memory modules track reflections for future use
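A rough sketch of the Reflexion-style loop: the role names (Actor, Evaluator, Self-reflection) follow the paper, but the function signatures and code are assumptions for illustration:

```python
# Illustrative Reflexion-style loop; actor/evaluator/reflector stand in for LLM calls.
def reflexion(task: str, actor, evaluator, reflector, max_trials: int = 3) -> str:
    """actor(task, reflections) -> attempt
    evaluator(task, attempt) -> (passed: bool, feedback: str)
    reflector(task, attempt, feedback) -> verbal reflection on the failure
    """
    memory: list[str] = []                            # reflections carried across trials
    attempt = ""
    for _ in range(max_trials):
        attempt = actor(task, memory)                 # Actor: generate using past reflections
        passed, feedback = evaluator(task, attempt)   # Evaluator: score the attempt
        if passed:
            return attempt
        # Self-reflection: turn the failure into verbal guidance for the next trial
        memory.append(reflector(task, attempt, feedback))
    return attempt                                    # best effort after the trial budget
```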
---
Multi-Agent Systems
Multiple specialized agents collaborate to solve complex problems (see the sketch after this list).
Each agent has specific tools and expertise
Supervisor orchestrates agent communication
Reduces complexity through specialization
Examples: AutoGen, MetaGPT, CAMEL frameworks
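A minimal supervisor/worker sketch of this pattern; it is not the API of AutoGen, MetaGPT, or CAMEL, and all names here are illustrative:

```python
# Minimal supervisor pattern: each "agent" is a callable backed by its own prompt,
# tools, and expertise; the planner decides who works next.
def supervise(task: str, planner, agents: dict, max_rounds: int = 4) -> str:
    """planner(task, history) -> (agent_name, subtask), or ("DONE", final_answer)."""
    history: list[str] = []
    for _ in range(max_rounds):
        agent_name, subtask = planner(task, history)   # supervisor decides the next step
        if agent_name == "DONE":
            return subtask                             # planner returns the final answer
        result = agents[agent_name](subtask)           # delegate to the specialist agent
        history.append(f"{agent_name}: {subtask} -> {result}")
    return "\n".join(history)                          # fallback: return the work log

# Example wiring with toy specialists:
# agents = {"researcher": search_agent, "coder": coding_agent, "reviewer": review_agent}
```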
Coding with Claude
This is not sci-fi
I use such agents every day for coding in Claude Code
I also use Gemini / Jules as another agent platform to review and improve Claude's output (security audits, code reviews, refactoring, SEO improvements, performance audits, etc.)
Both platforms
have a plan or generate mode (backed by different models)
always create a plan first
can call tools: web search, GitHub connection, file upload
can call a multitude of functions: search the codebase, execute command-line tools (see the sketch below)
have memory: the context window
can self-reflect
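As a hypothetical sketch of how such a platform might expose functions to the model (the tool names and signatures here are mine, not any platform's real API):

```python
# Illustrative tool registry: the platform describes these functions in the prompt
# and routes the model's tool calls back to the Python implementations.
import subprocess

def search_codebase(query: str) -> str:
    """Grep-style search over the repository (illustrative)."""
    result = subprocess.run(["grep", "-rn", query, "."], capture_output=True, text=True)
    return result.stdout[:2000]      # truncate so results fit in the context window

def run_command(cmd: str) -> str:
    """Execute a shell command and return its output (illustrative, unsandboxed)."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

TOOLS = {"search_codebase": search_codebase, "run_command": run_command}
```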
How do we augment humans?
Sports?
Gene editing with CRISPR (might take a few generations)
Intermission
---
Projects
Let's take the rest of the time to work on your projects
Next time
Inference via APIs: OpenAI, Gemini
Fast inference with other models: OpenRouter, Groq
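As a quick preview, a minimal sketch of calling an OpenAI-compatible chat endpoint (OpenRouter and Groq both expose one); the model name and API key are placeholders:

```python
# OpenAI-compatible chat call; switch provider by changing base_url and model.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # or https://api.groq.com/openai/v1, or omit for OpenAI
    api_key="YOUR_API_KEY",                    # placeholder
)
response = client.chat.completions.create(
    model="your-model-name",                   # placeholder; any model the provider serves
    messages=[{"role": "user", "content": "Explain the ReAct loop in two sentences."}],
)
print(response.choices[0].message.content)
```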