1. Intro and Session Overview
- Presenter introduction: "Hi everyone, I'm Nico and I work on the AI SDK at Vercel."
- Session goal: "In this session, we'll explore how to build agents with the AI SDK."
- Structure
  - Fundamentals part: Introduction to the core building blocks of the AI SDK
  - Hands-on part: Building a Deep Research clone in Node.js
2. Project Setup and How to Run
- Clone the repo, install dependencies, copy environment variables: "First, clone the repository, install the dependencies, then copy the environment variables."
- How to run
  - A single index.ts file
  - Run with pnpm run dev (alias pd)
3. Core Building Blocks: Text Generation Functions
3.1 generateText Function
- Basic usage
  - Call generateText inside an async main function
  - Model: OpenAI GPT-4o mini
  - Prompt: "hello world"
  - Print the result: "Hello, how can I assist you today?"
- Message array support
  - You can pass a message array instead of a prompt
  - Each message consists of a role and content
- Advantage of a unified interface
  - "One of the key features of the AI SDK is its unified interface. You can switch between different models by changing just one line!"
3.2 Flexibility of Model Switching
- When web search is needed
  - GPT-4o mini only has training data up to 2024 and doesn't know about 2025 events: "I'm sorry, but I don't have information about the AI Engineer Summit scheduled for 2025."
  - Switch to a Perplexity model
    - "Just swap the model."
    - Using Perplexity Sonar Pro: "The AI Engineer Summit 2025 was held in New York from February 19 to 22."
    - Sources can be verified via the sources property
- Support for various providers
  - "You can find various providers in the SDK docs."
  - Example: Google Gemini 1.5 Flash + search grounding
4. Tools and Function Calling: Interacting with the Outside World
4.1 The Concept of Tools
- "Tools can seem complex at first glance, but they're simple at their core."
- How tools are provided
  - Pass a list of available tools to the model alongside the prompt
  - Each tool includes a name, a description, and the data it requires (its parameters)
- Tool calling flow
  - When the model determines a tool is needed, it generates a tool call
  - The developer parses that tool call, executes it, and processes the result
4.2 Simple Example: Adding Two Numbers
- Code structure
  - Pass a tools object to generateText
  - Tool name: addNumbers
  - Includes description, parameters, and execute function
  - Uses the tool utility for type safety
- Execution result: "Tool result: addNumbers, args: 10, 5, result: 15"
- Only a tool call is generated, no text
  - "Now the model generates only a tool call, not text."
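The flow from 4.1 and 4.2 can be sketched without any dependencies. This is not the AI SDK's implementation: the Tool and ToolCall types and the runToolCall helper are hypothetical stand-ins that mirror the description / parameters / execute shape described above, and the tool call is hard-coded where a real model would generate it.

```typescript
// Hypothetical shapes that loosely mirror the description/parameters/execute
// structure from the talk. They are NOT the AI SDK's own types.
type Tool<Args, Result> = {
  description: string;
  // Validates raw arguments coming from the model; throws if malformed.
  parse: (raw: unknown) => Args;
  execute: (args: Args) => Result;
};

type ToolCall = { toolName: string; args: unknown };

const addNumbers: Tool<{ num1: number; num2: number }, number> = {
  description: "Add two numbers together",
  parse: (raw) => {
    const r = raw as { num1?: unknown; num2?: unknown } | null;
    if (!r || typeof r.num1 !== "number" || typeof r.num2 !== "number") {
      throw new Error("addNumbers expects { num1: number, num2: number }");
    }
    return { num1: r.num1, num2: r.num2 };
  },
  execute: ({ num1, num2 }) => num1 + num2,
};

// Registry of tools that would be offered to the model alongside the prompt.
const tools: Record<string, Tool<any, unknown>> = { addNumbers };

// The developer's side of the flow: look up the tool, validate, execute.
function runToolCall(call: ToolCall): { toolName: string; args: unknown; result: unknown } {
  const tool = tools[call.toolName];
  if (!tool) throw new Error(`Unknown tool: ${call.toolName}`);
  const args = tool.parse(call.args);
  return { toolName: call.toolName, args, result: tool.execute(args) };
}

// Hard-coded stand-in for the tool call a model might emit for "What is 10 + 5?".
const toolResult = runToolCall({ toolName: "addNumbers", args: { num1: 10, num2: 5 } });
console.log(toolResult.result); // 15
```

Note that nothing here produces text: as in the talk's example, the outcome of a single step is only the tool call and its result.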
4.3 Synthesizing Tool Results into Text
- maxSteps option
  - "maxSteps causes the model to take the tool call result, feed it back into the conversation context, and repeat the next step."
  - "It repeats until text is generated or the maximum number of steps is reached."
  - Enables multi-step agent behavior
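What maxSteps automates can be pictured as a loop like the one below. This is a minimal sketch rather than the SDK's code: runSteps, the Model type, and scriptedModel are hypothetical names, and the model is a scripted function so the example runs offline and synchronously (real model calls would be async).

```typescript
type AddArgs = { num1: number; num2: number };
type ToolCall = { toolName: string; args: AddArgs };
type Message = { role: "user" | "assistant" | "tool"; content: string };
// A step yields either tool calls or final text.
type StepResult =
  | { kind: "tool-calls"; toolCalls: ToolCall[] }
  | { kind: "text"; text: string };
type Model = (messages: Message[]) => StepResult;

const toolImpls: Record<string, (args: AddArgs) => number> = {
  addNumbers: ({ num1, num2 }) => num1 + num2,
};

function runSteps(model: Model, prompt: string, maxSteps: number): { text: string; steps: number } {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let step = 1; step <= maxSteps; step++) {
    const result = model(messages);
    // Stop as soon as the model produces text.
    if (result.kind === "text") return { text: result.text, steps: step };
    for (const call of result.toolCalls) {
      const impl = toolImpls[call.toolName];
      if (!impl) throw new Error(`Unknown tool: ${call.toolName}`);
      // Feed the tool result back into the conversation for the next step.
      messages.push({ role: "tool", content: `${call.toolName} => ${impl(call.args)}` });
    }
  }
  return { text: "", steps: maxSteps }; // step budget exhausted without text
}

// Scripted stand-in for a model: call the tool first, then answer once a
// tool result is the latest message.
const scriptedModel: Model = (messages) => {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "tool") {
    return { kind: "text", text: `The answer is ${last.content.split(" => ")[1]}.` };
  }
  return {
    kind: "tool-calls",
    toolCalls: [{ toolName: "addNumbers", args: { num1: 10, num2: 5 } }],
  };
};

const withLoop = runSteps(scriptedModel, "What is 10 + 5?", 2);
const singleStep = runSteps(scriptedModel, "What is 10 + 5?", 1);
console.log(withLoop.text);   // "The answer is 15."
console.log(singleStep.text); // "" (only the tool call happened)
```

With a budget of one step the loop ends after the tool call, which matches the "only a tool call, not text" behavior from 4.2; a second step lets the model turn the tool result into a sentence.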
4.4 Demonstrating Agent Behavior with Multiple Tools
- Adding a weather information tool
  - getWeather tool: requires latitude, longitude, and city name
  - "If only the city name is in the prompt, the model can infer the remaining parameters."
- Compound prompt example: "Get the weather for San Francisco and New York, then add the two values together."
- Execution result: "The current temperature in San Francisco is 12.3°C and in New York is 15.2°C. Adding the two temperatures gives 27.5°C."
- Step structure
  - Step 1: Tool calls for weather in both cities
  - Step 2: Tool call to add the two values
  - Step 3: Final text generation
5. Structured Output
5.1 experimental_output Option in generateText
- Defining a schema with Zod
  - "Zod is a TypeScript validation library that pairs perfectly with the AI SDK."
  - Example: experimental_output: Output.object({ schema: z.object({ sum: z.number() }) })
- Execution result: "experimental output: sum = 27.5"
5.2 generateObject Function
- Definition
  - "generateObject is a function specialized for structured output, and it's my favorite function."
- Example
  - Prompt: "Generate 10 definitions of an AI agent."
  - Schema: definitions: string[]
  - Result: "An AI agent is a software entity that performs tasks autonomously or semi-autonomously using artificial intelligence technology." (plus 9 more)
- Zod's describe feature
  - You can attach detailed instructions to a schema field, such as "use as much jargon as possible and make it completely incomprehensible"
  - Result: "An autonomous entity leverages algorithmic heuristics to optimize decision-making processes in dynamic environments." (and so on)
6. Hands-on: Building a Deep Research Clone
6.1 What is Deep Research?
- "Deep Research is a service where you enter a topic, it browses the web, gathers materials, follows threads of thought, and ultimately produces a report."
- Core workflow
  - Input query
  - Generate sub-queries
  - Web search for each query
  - Analyze search results and generate follow-up questions
  - Recursively repeat as needed
  - Finally, synthesize all information into a report
- Depth and breadth concepts
  - "Depth: how many levels deep to go; breadth: how many queries to generate at each level"
6.2 Step-by-Step Implementation Details
1) Sub-query Generation Function
- generateSearchQueries
  - Input: query, number of queries to generate
  - Output: array of search-friendly queries
  - Example: "What does it take to become a D1 shot put athlete?"
    - "D1 shot put athlete requirements"
    - "D1 shot put athlete training methods"
    - "NCAA Division 1 shot put qualifications"
2) Web Search Function
- Using the Exa API
  - "Exa is a fast and affordable web search API."
- Remove unnecessary information from results (favicons, etc.)
  - "Reducing the token count lowers costs and improves model efficiency."
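The trimming step might look like the following. The RawSearchResult fields are assumptions made for illustration (the actual shape returned by the Exa client may differ), and trimResults is a hypothetical name; the point is only that fields the model doesn't need are dropped before results enter the prompt.

```typescript
// Assumed shape of a raw search hit; NOT Exa's documented response type.
type RawSearchResult = {
  title: string | null;
  url: string;
  text: string;
  favicon?: string;
  image?: string;
  score?: number;
};

// Only the fields the model actually needs.
type SearchResult = { title: string; url: string; content: string };

function trimResults(raw: RawSearchResult[]): SearchResult[] {
  return raw.map((r) => ({ title: r.title ?? "", url: r.url, content: r.text }));
}

const sample: RawSearchResult[] = [
  {
    title: "NCAA D1 standards",
    url: "https://example.com/d1",
    text: "Recruiting marks...",
    favicon: "https://example.com/favicon.ico",
    score: 0.9,
  },
];
const trimmed = trimResults(sample);
console.log(JSON.stringify(trimmed));
// [{"title":"NCAA D1 standards","url":"https://example.com/d1","content":"Recruiting marks..."}]
```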
3) Search Result Evaluation and Agent Loop
- searchAndProcess function
  - Uses two tools:
    - Web search tool
    - Search result evaluation tool (relevant/irrelevant)
  - Repeated using maxSteps
    - "If the evaluation result is irrelevant, it leaves feedback saying 'Please search again with a more specific query.'"
    - "This causes the model to keep trying to find better results."
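The search-evaluate-retry behavior can be sketched as a plain loop. In the talk the model drives this through two tools and maxSteps; here the search, evaluate, and refine callbacks are hypothetical stand-ins (refine plays the model's role of rewriting the query after feedback), and everything is synchronous so the example runs offline.

```typescript
type SearchResult = { title: string; url: string; content: string };
type Verdict = "relevant" | "irrelevant";

function searchAndProcess(
  query: string,
  search: (q: string) => SearchResult | undefined,
  evaluate: (q: string, r: SearchResult) => Verdict,
  refine: (q: string, feedback: string) => string,
  maxSteps: number,
): { results: SearchResult[]; attempts: string[] } {
  const results: SearchResult[] = [];
  const attempts: string[] = [];
  let current = query;
  for (let step = 0; step < maxSteps; step++) {
    attempts.push(current);
    const hit = search(current);
    if (hit !== undefined && evaluate(query, hit) === "relevant") {
      results.push(hit);
      break; // a relevant result ends the loop
    }
    // Feedback text quoted in the talk; the callback decides how to retry.
    current = refine(current, "Please search again with a more specific query.");
  }
  return { results, attempts };
}

// Stub search index and callbacks, for illustration only.
const fakeIndex: Record<string, SearchResult> = {
  "shot put": { title: "Shot put history", url: "https://example.com/history", content: "..." },
  "shot put D1 requirements": { title: "D1 recruiting marks", url: "https://example.com/d1", content: "..." },
};

const outcome = searchAndProcess(
  "shot put",
  (q) => fakeIndex[q],
  (_q, r) => (r.title.includes("D1") ? "relevant" : "irrelevant"),
  (q) => `${q} D1 requirements`,
  3,
);
console.log(outcome.attempts); // ["shot put", "shot put D1 requirements"]
console.log(outcome.results.map((r) => r.url)); // ["https://example.com/d1"]
```

The step budget matters here for the same reason as in 4.3: without it, a query that never yields a relevant result would retry forever.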
4) Generating Learnings and Follow-up Questions
- generateLearnings function
  - Input: query, search results
  - Output:
    - learning (insight)
    - followUpQuestions (array of follow-up questions)
  - Example:
    - "To become a D1 shot put athlete, you need four years of competitive experience in high school, placements at state/national competitions, and a throw of over 60 feet."
    - "Follow-up questions: What are training methods for shot put athletes? What are the differences between D1, D2, and D3?"
5) Recursive Deep Research Function
- deepResearch function
  - Recursively uses followUpQuestions as new queries
  - Manages accumulated research state globally
  - Iterations limited by depth and breadth
- Preventing duplicate sources
  - "Sources already used are marked as irrelevant to prevent duplication."
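A minimal sketch of the recursion, under several stated assumptions: the three steps are injected as stub functions through a hypothetical Deps object so the example runs offline and synchronously; state is passed as a parameter rather than kept globally as in the talk; breadth is halved at each level, which is an illustrative choice and not something the summary specifies; and duplicates are skipped by URL, a simplification of the talk's approach of having the evaluator mark used sources as irrelevant.

```typescript
type Learning = { learning: string; followUpQuestions: string[] };
type Research = { queries: string[]; learnings: string[]; sourceUrls: Set<string> };

// Hypothetical injection points standing in for the model-backed functions.
type Deps = {
  generateSearchQueries: (query: string, n: number) => string[];
  searchAndProcess: (query: string) => { url: string; content: string }[];
  generateLearnings: (query: string, content: string) => Learning;
};

function deepResearch(
  prompt: string,
  depth: number,
  breadth: number,
  deps: Deps,
  state: Research = { queries: [], learnings: [], sourceUrls: new Set<string>() },
): Research {
  if (depth === 0) return state; // depth limit reached
  for (const query of deps.generateSearchQueries(prompt, breadth)) {
    state.queries.push(query);
    for (const result of deps.searchAndProcess(query)) {
      if (state.sourceUrls.has(result.url)) continue; // skip sources already used
      state.sourceUrls.add(result.url);
      const { learning, followUpQuestions } = deps.generateLearnings(query, result.content);
      state.learnings.push(learning);
      // Follow-up questions become the next level's prompt.
      // Halving breadth per level is an assumption of this sketch.
      deepResearch(followUpQuestions.join("\n"), depth - 1, Math.ceil(breadth / 2), deps, state);
    }
  }
  return state;
}

// Deterministic stubs, for illustration only.
const stubDeps: Deps = {
  generateSearchQueries: (query, n) => Array.from({ length: n }, (_, i) => `${query} #${i + 1}`),
  searchAndProcess: (query) => [
    { url: `https://example.com/${encodeURIComponent(query)}`, content: `notes on ${query}` },
  ],
  generateLearnings: (query, content) => ({
    learning: `learned from ${content}`,
    followUpQuestions: [`more on ${query}`],
  }),
};

const research = deepResearch("shot put", 2, 2, stubDeps);
console.log(research.queries.length, research.learnings.length); // 4 4
```

With depth 2 and breadth 2, the first level issues two queries and each spawns one narrower query at the second level, so four queries run in total before the depth limit stops the recursion.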
6) Final Report Generation
- generateReport function
  - Generates a report based on accumulated research data
  - Uses the o3-mini model
  - Saves the result as a Markdown file
- Adding a system prompt
  - "You are a professional researcher. Today's date is XX. Use Markdown format. Clearly label any speculation or predictions."
  - "To prevent the model from relying on reasoning alone, clearly specify the structure and format you want."
- Execution result
  - "Below is a comprehensive report on the requirements, skills, training, common beginner mistakes, and more for becoming a D1 shot put athlete."
  - "Top-level competitiveness, four years of athletic experience, throws of 55 feet or more, differences by gender, and more..."
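One way to assemble the prompts for the report step is sketched below. Both helper names are hypothetical, the wording of the user prompt is an assumption, and the model call and the file write are left out so the example stays dependency-free; only the system prompt text comes from the talk.

```typescript
// Hypothetical helper: assembles the system prompt quoted in the talk,
// taking the date as a parameter instead of hard-coding it.
function buildReportSystemPrompt(today: Date): string {
  const isoDate = today.toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
  return [
    "You are a professional researcher.",
    `Today's date is ${isoDate}.`,
    "Use Markdown format.",
    "Clearly label any speculation or predictions.",
  ].join(" ");
}

// Hypothetical helper: serializes the accumulated research into the user
// prompt. The exact wording is an assumption of this sketch.
function buildReportPrompt(research: { query: string; learnings: string[] }): string {
  return (
    "Generate a report based on the following research data:\n\n" +
    JSON.stringify(research, null, 2)
  );
}

const systemPrompt = buildReportSystemPrompt(new Date("2025-02-20T12:00:00Z"));
const reportPrompt = buildReportPrompt({
  query: "What does it take to become a D1 shot put athlete?",
  learnings: ["Example learning (made up for illustration)."],
});
console.log(systemPrompt);
```

Passing the date in explicitly keeps the prompt reproducible in tests while still letting the model reason about how recent its sources are.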
7. Closing and Resources
- "That's how you can build a Deep Research agent in just 218 lines of code!"
- "If you have questions, DM me on X (Twitter) @nicoalbanese10."
- "Also check out the docs and cookbook guides at sdk.vercel.ai."
- "Thanks to swyx for suggesting this session, and see you next time!"
Keyword Summary
- AI SDK
- generateText / generateObject
- Tools / function calling
- Multi-step agents
- Structured output
- Zod schema
- Deep Research workflow
- Recursive agents
- Web search API (Exa)
- Markdown report generation
- System prompt design
"I hope this session inspires you to go build your own amazing agents and research systems with the AI SDK!" 😊
