Overview
Satori provides first-class integration with the Vercel AI SDK through the @satori/tools package. This guide covers everything from basic setup to advanced patterns.
Installation
npm install @satori/tools ai @ai-sdk/openai
Basic Integration
Step 1: Create Memory Tools
The memoryTools() function creates AI SDK-compatible tools that the LLM can use to manage memories:
import { memoryTools, getMemoryContext } from '@satori/tools';

const tools = memoryTools({
  apiKey: process.env.SATORI_API_KEY!,
  baseUrl: process.env.SATORI_URL!,
  userId: 'user-123',
});
Step 2: Pre-fetch Memory Context
Fetch relevant memories before calling the LLM:
const memoryContext = await getMemoryContext(
  {
    apiKey: process.env.SATORI_API_KEY!,
    baseUrl: process.env.SATORI_URL!,
    userId: 'user-123',
  },
  userMessage,
  { limit: 5, threshold: 0.7 }
);
Step 3: Stream with Memory
Use streamText() with memory tools and context:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with long-term memory.
What you know about this user:
${memoryContext}
When the user shares important information, use the add_memory tool to save it.`,
  messages,
  tools,
});

return result.toDataStreamResponse();
Complete API Route Example
Here’s a full Next.js API route with memory:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { memoryTools, getMemoryContext } from '@satori/tools';

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const userMessage = messages[messages.length - 1].content;

    // Get user ID from your auth system
    const session = await getSession(req);
    if (!session?.userId) {
      return new Response('Unauthorized', { status: 401 });
    }

    // Create memory configuration
    const memoryConfig = {
      apiKey: process.env.SATORI_API_KEY!,
      baseUrl: process.env.SATORI_URL!,
      userId: session.userId,
    };

    // Create memory tools
    const tools = memoryTools(memoryConfig);

    // Pre-fetch relevant context
    const memoryContext = await getMemoryContext(
      memoryConfig,
      userMessage,
      { limit: 5 }
    );

    // Stream response with memory
    const result = await streamText({
      model: openai('gpt-4o'),
      system: `You are a helpful assistant with long-term memory.
What you know about this user:
${memoryContext}
Use the add_memory tool when the user:
- Shares preferences or opinions
- Provides personal information
- Mentions important dates or events
- Expresses goals or intentions
Be natural and conversational. Don't explicitly mention that you're saving memories.`,
      messages,
      tools,
      maxSteps: 5, // Allow multiple tool calls
    });

    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);
    return new Response('Internal Server Error', { status: 500 });
  }
}
Set maxSteps: 5 to allow the LLM to make multiple tool calls in a single response (e.g., search and then save).
Tool Reference
The memoryTools() function provides two tools:
add_memory
Saves information to memory. The LLM calls this automatically when it detects important information.
// LLM automatically calls this
{
  tool: 'add_memory',
  parameters: {
    memory: 'User prefers TypeScript over JavaScript for type safety'
  }
}
memory (string, required)
The information to save. Should be a complete, self-contained statement.

metadata (object, optional)
Optional metadata for categorization:

{
  memory: 'User prefers TypeScript',
  metadata: {
    category: 'preferences',
    tags: ['programming', 'languages']
  }
}
delete_memory
Removes a specific memory by ID.
// LLM calls this when user asks to forget something
{
  tool: 'delete_memory',
  parameters: {
    memoryId: 'uuid-of-memory'
  }
}
memoryId (string, required)
The UUID of the memory to delete. The LLM can get this from the context.
Advanced Patterns
Pattern 1: Conditional Context Injection
Only inject context when relevant:
const userMessage = messages[messages.length - 1].content;

// Check if message might need memory context
const needsContext = /what|remember|know|told|said/i.test(userMessage);

let memoryContext = '';
if (needsContext) {
  memoryContext = await getMemoryContext(config, userMessage);
}

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant.
${memoryContext ? `\nWhat you know:\n${memoryContext}` : ''}`,
  messages,
  tools,
});
Pattern 2: Category-Based Memory
Use metadata to organize memories by category:
import { tool } from 'ai';
import { z } from 'zod';

// `client` is a configured Satori client for the current user
const tools = {
  add_preference: tool({
    description: 'Save a user preference',
    parameters: z.object({
      preference: z.string(),
    }),
    execute: async ({ preference }) => {
      await client.addMemory(preference, {
        metadata: { category: 'preference' },
      });
      return 'Preference saved';
    },
  }),
  add_fact: tool({
    description: 'Save a factual piece of information',
    parameters: z.object({
      fact: z.string(),
    }),
    execute: async ({ fact }) => {
      await client.addMemory(fact, {
        metadata: { category: 'fact' },
      });
      return 'Fact saved';
    },
  }),
};
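These category-specific tools can be passed to streamText alongside the built-in memory tools. A minimal sketch, assuming config is the memory configuration from the route example above:

// Combine the built-in memory tools with the category-specific ones.
// `config` is assumed to be the memory configuration shown earlier.
const result = await streamText({
  model: openai('gpt-4o'),
  system: 'Save preferences with add_preference and facts with add_fact.',
  messages,
  tools: {
    ...memoryTools(config),
    ...tools, // add_preference and add_fact defined above
  },
  maxSteps: 5,
});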
Pattern 3: Memory Save Indicators
Show users when memories are being saved:
'use client';

import { useChat } from 'ai/react';

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    onToolCall: ({ toolCall }) => {
      if (toolCall.toolName === 'add_memory') {
        console.log('Saving memory:', toolCall.args.memory);
        // Show toast notification
      }
    },
  });

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>
          <p>{message.content}</p>
          {/* Show tool calls */}
          {message.toolInvocations?.map((tool, i) => (
            <div key={i} className="text-sm text-gray-500">
              {tool.toolName === 'add_memory' && (
                <span>💾 Saved to memory</span>
              )}
            </div>
          ))}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
Pattern 4: Multi-Step Reasoning
Allow the LLM to search before responding:
// `tool` and `z` are imported as in Pattern 2; `client` is a configured Satori client
const tools = {
  ...memoryTools(config),
  search_memory: tool({
    description: 'Search for relevant memories',
    parameters: z.object({
      query: z.string().describe('What to search for'),
    }),
    execute: async ({ query }) => {
      const memories = await client.searchMemories(query, { limit: 3 });
      return memories.map(m => m.content).join('\n');
    },
  }),
};

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with memory.
Use search_memory to find relevant information before answering questions.
Use add_memory to save important new information.`,
  messages,
  tools,
  maxSteps: 5, // Allow search → respond → save flow
});
This pattern is less reliable than pre-fetching context. The LLM may not always call search_memory when needed.
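A middle ground is to keep the pre-fetched context for reliability and expose search_memory only for follow-up lookups. A rough sketch combining the two approaches from this guide:

// Hybrid: pre-fetch context up front, keep search_memory for follow-up lookups.
const memoryContext = await getMemoryContext(config, userMessage, { limit: 5 });

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with memory.
What you know about this user:
${memoryContext}
Use search_memory only if you need details that are not in the context above.
Use add_memory to save important new information.`,
  messages,
  tools, // memoryTools(config) plus search_memory from above
  maxSteps: 5,
});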
Error Handling
Handle errors gracefully in production:
export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    // Validate input
    if (!messages || !Array.isArray(messages)) {
      return new Response('Invalid request', { status: 400 });
    }

    // Get user ID from your auth system (as in the complete example above)
    const session = await getSession(req);
    if (!session?.userId) {
      return new Response('Unauthorized', { status: 401 });
    }

    const memoryConfig = {
      apiKey: process.env.SATORI_API_KEY!,
      baseUrl: process.env.SATORI_URL!,
      userId: session.userId,
    };

    // Try to fetch context, but don't fail if it errors
    let memoryContext = '';
    try {
      memoryContext = await getMemoryContext(
        memoryConfig,
        messages[messages.length - 1].content
      );
    } catch (error) {
      console.error('Failed to fetch memory context:', error);
      // Continue without context
    }

    const tools = memoryTools(memoryConfig);

    const result = await streamText({
      model: openai('gpt-4o'),
      system: `You are a helpful assistant.
${memoryContext ? `\nWhat you know:\n${memoryContext}` : ''}`,
      messages,
      tools,
    });

    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);

    // Return user-friendly error
    return new Response(
      JSON.stringify({ error: 'Failed to process message' }),
      { status: 500, headers: { 'Content-Type': 'application/json' } }
    );
  }
}
Testing
Test your memory integration:
import { POST } from './route';

// `client` is a Satori client pointed at your test instance
describe('Chat API with Memory', () => {
  it('saves memories when user shares information', async () => {
    const request = new Request('http://localhost:3000/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [
          { role: 'user', content: 'Remember that I love TypeScript' }
        ],
      }),
    });

    const response = await POST(request);
    expect(response.status).toBe(200);

    // Verify memory was saved
    const memories = await client.searchMemories('TypeScript');
    expect(memories).toHaveLength(1);
    expect(memories[0].content).toContain('TypeScript');
  });
});
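If you don't want tests to hit a live Satori instance, you can stub the memory layer at the module boundary. A minimal sketch, assuming Vitest (the mocked return values are assumptions, not the real types):

import { vi } from 'vitest';

// Replace @satori/tools with stubs so the route never calls the memory service.
vi.mock('@satori/tools', () => ({
  getMemoryContext: vi.fn().mockResolvedValue('User loves TypeScript'),
  memoryTools: vi.fn().mockReturnValue({}),
}));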
Performance Optimization
Cache memory context for common queries
const contextCache = new Map<string, string>();

async function getCachedContext(query: string) {
  if (contextCache.has(query)) {
    return contextCache.get(query)!;
  }

  const context = await getMemoryContext(config, query);
  contextCache.set(query, context);
  return context;
}
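The cache above never expires, so it can serve stale context after new memories are saved. A simple time-based variant (the 60-second window is an arbitrary assumption):

// Cache entries expire after a fixed TTL so newly saved memories show up eventually.
const ttlCache = new Map<string, { value: string; expires: number }>();
const CONTEXT_TTL_MS = 60_000; // assumption: one minute of staleness is acceptable

async function getCachedContextWithTtl(query: string) {
  const hit = ttlCache.get(query);
  if (hit && hit.expires > Date.now()) {
    return hit.value;
  }

  const value = await getMemoryContext(config, query);
  ttlCache.set(query, { value, expires: Date.now() + CONTEXT_TTL_MS });
  return value;
}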
Parallel context fetching
// Fetch memory context in parallel with any other async work you need
const [memoryContext] = await Promise.all([
  getMemoryContext(config, userMessage),
  // Other async operations
]);
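A more concrete version of the same idea: fetch query-specific context and a small set of general memories concurrently (assumes client is a configured Satori client, as in Pattern 2):

// Run the targeted context lookup and a broader preference search at the same time.
const [memoryContext, preferenceMemories] = await Promise.all([
  getMemoryContext(config, userMessage),
  client.searchMemories('user preferences', { limit: 3 }),
]);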
Reduce the memory limit
// Fetch fewer memories for faster responses
const context = await getMemoryContext(config, userMessage, {
  limit: 3, // Instead of the default 10
});
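The threshold option from Step 2 can be combined with a lower limit to keep only highly relevant matches. A sketch, assuming a higher threshold means stricter similarity matching:

// Fewer, more relevant memories (threshold semantics are an assumption here)
const context = await getMemoryContext(config, userMessage, {
  limit: 3,
  threshold: 0.8,
});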
Next Steps