Vercel AI SDK Masterclass: From Fundamentals to Deep Research Clone

1. Intro and Session Overview

Presenter introduction: "Hi everyone, I'm Nico and I work on the AI SDK at Vercel."
Session goal: "In this session, we'll explore how to build agents with the AI SDK."
Structure
- Fundamentals part: Introduction to the core building blocks of the AI SDK
- Hands-on part: Building a Deep Research clone in Node.js

2. Project Setup and How to Run

Setup: "First, clone the repository, install the dependencies, then copy the environment variables."
How to run
- A single index.ts file
- Run with pnpm run dev (aliased to pd in the session)

3. Core Building Blocks: Text Generation Functions

3.1 generateText Function

Basic usage
- Call generateText inside an async main function
- Model: OpenAI GPT-4o mini
- Prompt: "hello world"
- Print the result
  
  "Hello, how can I assist you today?"
Message array support
- You can pass a message array instead of a prompt
- Messages are composed of role and content
Advantage of a unified interface
- "One of the key features of the AI SDK is its unified interface. You can switch between different models by changing just one line!"

3.2 Flexibility of Model Switching

When web search is needed
- GPT-4o mini only has training data up to 2024 and doesn't know about 2025 events
  
  "I'm sorry, but I don't have information about the AI Engineer Summit scheduled for 2025."
- Switch to a Perplexity model
  - "Just swap the model."
  - Using Perplexity Sonar Pro
    
    "The AI Engineer Summit 2025 was held in New York from February 19 to 22."
  - Sources can be verified via the sources property
Support for various providers
- "You can find various providers in the SDK docs."
- Example: Google Gemini 1.5 Flash + search grounding

4. Tools and Function Calling: Interacting with the Outside World

4.1 The Concept of Tools

"Tools can seem complex at first glance, but they're simple at their core."
How tools are provided
- Pass a list of available tools to the model alongside the prompt
- Each tool includes: name, description, and required data
Tool calling flow
- When the model determines a tool is needed, it generates a tool call
- The developer parses that tool call, executes it, and processes the result
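
The flow above can be sketched without the SDK. This is a minimal, dependency-free illustration, not the AI SDK's own types: `ToolDefinition`, `ToolCall`, and `runToolCall` are hypothetical names, and the parameter list is simplified to an array of required argument names.

```typescript
// Hypothetical sketch of the tool-calling flow.
// ToolDefinition and ToolCall are illustrative types, not AI SDK exports.
type ToolDefinition = {
  description: string;
  // Names of the numeric arguments the tool requires.
  parameters: string[];
  execute: (args: Record<string, number>) => number;
};

type ToolCall = { toolName: string; args: Record<string, number> };

// The list of available tools that would accompany the prompt.
const availableTools: Record<string, ToolDefinition> = {
  addNumbers: {
    description: "Add two numbers together",
    parameters: ["num1", "num2"],
    execute: (args) => args.num1 + args.num2,
  },
};

// Parse a tool call emitted by the model, check it, and run the tool.
function runToolCall(call: ToolCall): number {
  const tool = availableTools[call.toolName];
  if (tool === undefined) {
    throw new Error(`Unknown tool: ${call.toolName}`);
  }
  for (const name of tool.parameters) {
    if (typeof call.args[name] !== "number") {
      throw new Error(`Missing argument "${name}" for ${call.toolName}`);
    }
  }
  return tool.execute(call.args);
}

// A tool call as a model might emit it for "What is 10 + 5?".
const sumResult = runToolCall({ toolName: "addNumbers", args: { num1: 10, num2: 5 } });
console.log(sumResult); // 15
```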

4.2 Simple Example: Adding Two Numbers

Code structure
- Pass a tools object to generateText
- Tool name: addNumbers
- Includes description, parameters, and execute function
- Uses the tool utility for type safety
Execution result

"Tool result: addNumbers, args: 10, 5, result: 15"
Only a tool call is generated, no text
- "Now the model generates only a tool call, not text."

4.3 Synthesizing Tool Results into Text

maxSteps option
- "maxSteps causes the model to take the tool call result, feed it back into the conversation context, and repeat the next step."
- "It repeats until text is generated or the maximum number of steps is reached."
- Enables multi-step agent behavior
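
A rough sketch of the loop that maxSteps describes, assuming a stubbed model in place of a real one. `runSteps`, `stubModel`, and the message and reply shapes are hypothetical and are not the SDK's internals; the point is only the control flow: call the model, run any tool call, append the result to the conversation, and stop on text or when the step budget runs out.

```typescript
// Hypothetical sketch of the loop that maxSteps enables. The model is a stub;
// none of these types or functions are AI SDK exports.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ModelReply =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string; args: number[] };
type Model = (messages: Message[]) => Promise<ModelReply>;
type Tools = Record<string, (args: number[]) => number>;

async function runSteps(
  model: Model,
  tools: Tools,
  prompt: string,
  maxSteps: number,
): Promise<{ text: string; steps: number }> {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let step = 1; step <= maxSteps; step++) {
    const reply = await model(messages);
    if (reply.type === "text") {
      // Text was generated, so the loop ends.
      return { text: reply.text, steps: step };
    }
    // Execute the tool and feed its result back into the conversation.
    const execute = tools[reply.toolName];
    if (execute === undefined) throw new Error(`Unknown tool: ${reply.toolName}`);
    const toolResult = execute(reply.args);
    messages.push({ role: "assistant", content: `call ${reply.toolName}` });
    messages.push({ role: "tool", content: String(toolResult) });
  }
  // Step budget exhausted before any text was produced.
  return { text: "", steps: maxSteps };
}

// Stub model: asks for the tool first, then answers once a tool result exists.
const stubModel: Model = async (messages) => {
  const last = messages[messages.length - 1];
  return last.role === "tool"
    ? { type: "text", text: `The sum is ${last.content}.` }
    : { type: "tool-call", toolName: "addNumbers", args: [10, 5] };
};

const stepTools: Tools = { addNumbers: (args) => args.reduce((a, b) => a + b, 0) };

runSteps(stubModel, stepTools, "What is 10 + 5?", 2).then((r) => console.log(r));
```

With a budget of one step the stub produces only a tool call and the returned text is empty; with two steps the tool result is fed back and the stub answers in text.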

4.4 Demonstrating Agent Behavior with Multiple Tools

Adding a weather information tool
- getWeather tool: requires latitude, longitude, and city name
- "If only the city name is in the prompt, the model can infer the remaining parameters."
Compound prompt example

"Get the weather for San Francisco and New York, then add the two values together."
Execution result

"The current temperature in San Francisco is 12.3°C and in New York is 15.2°C. Adding the two temperatures gives 27.5°C."
Step structure
- Step 1: Tool calls for weather in both cities
- Step 2: Tool call to add the two values
- Step 3: Final text generation

5. Structured Output

5.1 experimental_output Option in generateText

Defining a schema with Zod
- "Zod is a TypeScript validation library that pairs perfectly with the AI SDK."
- Example:
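```
// Passed as an option to generateText; Output is imported from "ai"
experimental_output: Output.object({
  schema: z.object({
    sum: z.number(),
  }),
}),
```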
Execution result

"experimental output: sum = 27.5"

5.2 generateObject Function

Definition
- "generateObject is a function specialized for structured output, and it's my favorite function."
Example
- Prompt: "Generate 10 definitions of an AI agent."
- Schema: definitions: string[]
- Result:
  
  "An AI agent is a software entity that performs tasks autonomously or semi-autonomously using artificial intelligence technology." (plus 9 more)
Zod's describe feature
- You can add detailed instructions per definition, such as "use as much jargon as possible and make it completely incomprehensible"
- Result:
  
  "An autonomous entity leverages algorithmic heuristics to optimize decision-making processes in dynamic environments." (and so on)

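To make concrete what a schema buys you, here is a hand-written check for the `definitions: string[]` shape. In the session Zod and generateObject do this work; `parseDefinitions` is a hypothetical, dependency-free stand-in that only illustrates the idea of rejecting replies that do not match the expected structure.

```typescript
// Hypothetical sketch: validating a model's JSON reply against the
// { definitions: string[] } shape. This is not how Zod is implemented.
type Definitions = { definitions: string[] };

function parseDefinitions(json: string): Definitions {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null) {
    throw new Error("Expected a JSON object");
  }
  const definitions = (value as { definitions?: unknown }).definitions;
  if (!Array.isArray(definitions) || !definitions.every((d) => typeof d === "string")) {
    throw new Error("Expected definitions to be an array of strings");
  }
  return { definitions: definitions as string[] };
}

const parsed = parseDefinitions('{"definitions":["An AI agent is a software entity."]}');
console.log(parsed.definitions.length); // 1
```
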
6. Hands-on: Building a Deep Research Clone

6.1 What is Deep Research?

"Deep Research is a service where you enter a topic, it browses the web, gathers materials, follows threads of thought, and ultimately produces a report."
Core workflow
1. Input query
2. Generate sub-queries
3. Web search for each query
4. Analyze search results and generate follow-up questions
5. Recursively repeat as needed
6. Finally, synthesize all information into a report
Depth and breadth concepts
- "Depth: how many levels deep to go; breadth: how many queries to generate at each level"
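
Depth and breadth together bound how much searching a run can do. The sketch below computes an upper bound under one assumption that the notes do not confirm: every query at one level spawns `breadth` new queries at the next level. `maxQueries` is a hypothetical helper.

```typescript
// Hypothetical sketch: upper bound on generated queries, assuming each query
// at one level spawns `breadth` queries at the next level.
function maxQueries(depth: number, breadth: number): number {
  let total = 0;
  let atLevel = 1; // the single input query
  for (let level = 1; level <= depth; level++) {
    atLevel *= breadth; // queries generated at this level
    total += atLevel;
  }
  return total;
}

console.log(maxQueries(2, 3)); // 3 + 9 = 12
```

Under that assumption the count grows geometrically with depth, which is why both values are usually kept small.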

6.2 Step-by-Step Implementation Details

1) Sub-query Generation Function

generateSearchQueries
- Input: query, number of queries to generate
- Output: array of search-friendly queries
- Example:
  "What does it take to become a D1 shot put athlete?"
  - "D1 shot put athlete requirements"
  - "D1 shot put athlete training methods"
  - "NCAA Division 1 shot put qualifications"

2) Web Search Function

Using the Exa API
- "Exa is a fast and affordable web search API."
- Remove unnecessary information from results (favicons, etc.)
- "Reducing the token count lowers costs and improves model efficiency."
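
A small sketch of that trimming step. The `RawResult` field names are assumptions made for illustration and are not Exa's actual response schema; `trimResults` is a hypothetical helper that keeps only the fields a model needs.

```typescript
// Hypothetical result shape; the field names are assumptions, not Exa's schema.
type RawResult = {
  title: string;
  url: string;
  text: string;
  favicon?: string;
  image?: string;
  score?: number;
};
type TrimmedResult = { title: string; url: string; content: string };

// Keep the title, URL, and page text; drop everything else.
function trimResults(results: RawResult[]): TrimmedResult[] {
  return results.map((r) => ({
    title: r.title,
    url: r.url,
    content: r.text,
  }));
}

const trimmedExample = trimResults([
  { title: "T", url: "https://example.com", text: "body", favicon: "f.ico", score: 0.9 },
]);
console.log(trimmedExample); // [ { title: 'T', url: 'https://example.com', content: 'body' } ]
```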

3) Search Result Evaluation and Agent Loop

searchAndProcess function
- Uses two tools:
  1. Web search tool
  2. Search result evaluation tool (relevant/irrelevant)
- Repeated using maxSteps
- "If the evaluation result is irrelevant, it leaves feedback saying 'Please search again with a more specific query.'"
- "This causes the model to keep trying to find better results."
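
The retry behaviour can be sketched as an explicit loop. In the session the model itself decides when to call each tool and maxSteps caps the number of rounds; the version below hard-codes the order (search, then evaluate) to show the feedback mechanism. `searchAndProcessSketch` and its injected `search` and `evaluate` functions are hypothetical stand-ins for the two tools.

```typescript
// Hypothetical sketch of the search-then-evaluate loop. search and evaluate
// are injected stand-ins for the web search tool and the evaluation tool.
type SearchResult = { title: string; url: string; content: string };
type Verdict = "relevant" | "irrelevant";

async function searchAndProcessSketch(
  query: string,
  search: (query: string, feedback: string | null) => Promise<SearchResult>,
  evaluate: (query: string, result: SearchResult) => Promise<Verdict>,
  maxAttempts: number,
): Promise<SearchResult[]> {
  const relevant: SearchResult[] = [];
  let feedback: string | null = null;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await search(query, feedback);
    const verdict = await evaluate(query, result);
    if (verdict === "relevant") {
      relevant.push(result);
      break; // a usable result was found
    }
    // Feedback steers the next attempt toward a better query.
    feedback = "Please search again with a more specific query.";
  }
  return relevant;
}
```

If every attempt is judged irrelevant, the function returns an empty array once the attempt budget is spent.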

4) Generating Learnings and Follow-up Questions

generateLearnings function
- Input: query, search results
- Output:
  - learning (insight)
  - followUpQuestions (array of follow-up questions)
- Example:
  
  "To become a D1 shot put athlete, you need four years of competitive experience in high school, placements at state/national competitions, and a throw of over 60 feet." "Follow-up questions: What are training methods for shot put athletes? What are the differences between D1, D2, and D3?"

5) Recursive Deep Research Function

deepResearch function
- Recursively uses followUpQuestions as new queries
- Manages accumulated research state globally
- Iterations limited by depth and breadth
Preventing duplicate sources
- "Sources already used are marked as irrelevant to prevent duplication."
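
A compact sketch of the recursion and the duplicate check, with the three steps injected as functions so it runs without a model or search API. All names and shapes here are hypothetical. Unlike the session's code, which keeps the accumulated research in global state, this version passes the state as a parameter so each run starts clean, and it skips already-seen URLs directly instead of marking them irrelevant.

```typescript
// Hypothetical sketch of the recursion. The injected functions stand in for
// query generation, web search, and learning extraction.
type Source = { url: string; content: string };
type Learning = { learning: string; followUpQuestions: string[] };
type Research = {
  queries: string[];
  sources: Source[];
  learnings: Learning[];
  seenUrls: Set<string>;
};
type Steps = {
  generateQueries: (prompt: string, n: number) => Promise<string[]>;
  search: (query: string, seenUrls: Set<string>) => Promise<Source[]>;
  learn: (query: string, source: Source) => Promise<Learning>;
};

function emptyResearch(): Research {
  return { queries: [], sources: [], learnings: [], seenUrls: new Set<string>() };
}

async function deepResearchSketch(
  prompt: string,
  depth: number,
  breadth: number,
  steps: Steps,
  research: Research = emptyResearch(),
): Promise<Research> {
  if (depth === 0) return research; // depth budget exhausted
  const queries = await steps.generateQueries(prompt, breadth);
  research.queries.push(...queries);
  for (const query of queries) {
    const found = await steps.search(query, research.seenUrls);
    for (const source of found) {
      if (research.seenUrls.has(source.url)) continue; // skip sources already used
      research.seenUrls.add(source.url);
      research.sources.push(source);
      const learning = await steps.learn(query, source);
      research.learnings.push(learning);
      if (learning.followUpQuestions.length === 0) continue;
      // Follow-up questions become the prompt for the next level.
      const next = learning.followUpQuestions.join("\n");
      await deepResearchSketch(next, depth - 1, breadth, steps, research);
    }
  }
  return research;
}
```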

6) Final Report Generation

generateReport function
- Generates a report based on accumulated research data
- Uses the o3-mini model
- Saves the result as a Markdown file
Adding a system prompt
- "You are a professional researcher. Today's date is XX. Use Markdown format. Clearly label any speculation or predictions."
- "To prevent the model from relying on reasoning alone, clearly specify the structure and format you want."
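
One way to fill in the date at run time is a small builder function. `buildSystemPrompt` is a hypothetical helper; the wording mirrors the prompt quoted above, and the ISO date format is an assumption.

```typescript
// Hypothetical helper that assembles the system prompt with the current date.
function buildSystemPrompt(today: Date): string {
  return [
    "You are a professional researcher.",
    // toISOString is in UTC; slice keeps the YYYY-MM-DD part.
    `Today's date is ${today.toISOString().slice(0, 10)}.`,
    "Use Markdown format.",
    "Clearly label any speculation or predictions.",
  ].join(" ");
}

console.log(buildSystemPrompt(new Date()));
```
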
Execution result
"Below is a comprehensive report on the requirements, skills, training, common beginner mistakes, and more for becoming a D1 shot put athlete."
- "Top-level competitiveness, four years of athletic experience, throws of 55 feet or more, differences by gender, and more..."

7. Closing and Resources

"That's how you can build a Deep Research agent in just 218 lines of code!"
"If you have questions, DM me on X (Twitter) @nicoalbanese10."
"Also check out the docs and cookbook guides at sdk.vercel.ai."
"Thanks to swyx for suggesting this session, and see you next time!"

Keyword Summary

AI SDK
generateText / generateObject
Tools / function calling
Multi-step agents
Structured output
Zod schema
Deep Research workflow
Recursive agents
Web search API (Exa)
Markdown report generation
System prompt design

"I hope this session inspires you to go build your own amazing agents and research systems with the AI SDK!" 😊
