Overview

Satori provides first-class integration with the Vercel AI SDK through the @satori/tools package. This guide covers everything from basic setup to advanced patterns.

Installation

npm install @satori/tools ai @ai-sdk/openai
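
The examples in this guide read Satori credentials from environment variables. In a Next.js project these typically live in .env.local (placeholder values shown):
SATORI_API_KEY=your-satori-api-key
SATORI_URL=https://your-satori-instance.example.com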

Basic Integration

Step 1: Create Memory Tools

The memoryTools() function creates AI SDK-compatible tools that the LLM can use to manage memories:
import { memoryTools, getMemoryContext } from '@satori/tools';

const tools = memoryTools({
  apiKey: process.env.SATORI_API_KEY!,
  baseUrl: process.env.SATORI_URL!,
  userId: 'user-123',
});

Step 2: Pre-fetch Memory Context

Fetch relevant memories before calling the LLM. Here limit caps how many memories are returned, and threshold sets the minimum similarity score a memory must meet to be included:
const memoryContext = await getMemoryContext(
  {
    apiKey: process.env.SATORI_API_KEY!,
    baseUrl: process.env.SATORI_URL!,
    userId: 'user-123',
  },
  userMessage,
  { limit: 5, threshold: 0.7 }
);

Step 3: Stream with Memory

Use streamText() with memory tools and context:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with long-term memory.
  
What you know about this user:
${memoryContext}

When the user shares important information, use the add_memory tool to save it.`,
  messages,
  tools,
});

return result.toDataStreamResponse();

Complete API Route Example

Here’s a full Next.js API route with memory:
app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { memoryTools, getMemoryContext } from '@satori/tools';

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const userMessage = messages[messages.length - 1].content;
    
    // Get user ID from your auth system
    const session = await getSession(req);
    if (!session?.userId) {
      return new Response('Unauthorized', { status: 401 });
    }
    
    // Create memory configuration
    const memoryConfig = {
      apiKey: process.env.SATORI_API_KEY!,
      baseUrl: process.env.SATORI_URL!,
      userId: session.userId,
    };
    
    // Create memory tools
    const tools = memoryTools(memoryConfig);
    
    // Pre-fetch relevant context
    const memoryContext = await getMemoryContext(
      memoryConfig,
      userMessage,
      { limit: 5 }
    );
    
    // Stream response with memory
    const result = await streamText({
      model: openai('gpt-4o'),
      system: `You are a helpful assistant with long-term memory.
      
What you know about this user:
${memoryContext}

Use the add_memory tool when the user:
- Shares preferences or opinions
- Provides personal information
- Mentions important dates or events
- Expresses goals or intentions

Be natural and conversational. Don't explicitly mention that you're saving memories.`,
      messages,
      tools,
      maxSteps: 5, // Allow multiple tool calls
    });
    
    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);
    return new Response('Internal Server Error', { status: 500 });
  }
}
Set maxSteps: 5 to allow the LLM to make multiple tool calls in a single response (e.g., search and then save).

Available Tools

The memoryTools() function provides two tools:

add_memory

Saves information to memory. The LLM calls this automatically when it detects important information.
// LLM automatically calls this
{
  tool: 'add_memory',
  parameters: {
    memory: 'User prefers TypeScript over JavaScript for type safety'
  }
}
memory (string, required)
The information to save. Should be a complete, self-contained statement.

metadata (object, optional)
Metadata for categorization:
{
  memory: 'User prefers TypeScript',
  metadata: {
    category: 'preferences',
    tags: ['programming', 'languages']
  }
}

delete_memory

Removes a specific memory by ID.
// LLM calls this when user asks to forget something
{
  tool: 'delete_memory',
  parameters: {
    memoryId: 'uuid-of-memory'
  }
}
memoryId (string, required)
The UUID of the memory to delete. The LLM can get this from the context.
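
For deletion to work, memory IDs have to be visible to the model. If the context string from getMemoryContext() doesn't include them, you can build the context yourself. A minimal sketch, assuming a Satori client exposing the searchMemories() method used elsewhere in this guide and results with id and content fields:
const memories = await client.searchMemories(userMessage, { limit: 5 });

// Prefix each memory with its ID so the model can pass it to delete_memory
const contextWithIds = memories
  .map((m) => `- [${m.id}] ${m.content}`)
  .join('\n');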

Advanced Patterns

Pattern 1: Conditional Context Injection

Only inject context when relevant:
const userMessage = messages[messages.length - 1].content;

// Check if message might need memory context
const needsContext = /what|remember|know|told|said/i.test(userMessage);

let memoryContext = '';
if (needsContext) {
  memoryContext = await getMemoryContext(config, userMessage);
}

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant.
  ${memoryContext ? `\nWhat you know:\n${memoryContext}` : ''}`,
  messages,
  tools,
});

Pattern 2: Category-Based Memory

Use metadata to organize memories by category:
import { tool } from 'ai';
import { z } from 'zod';

// `client` is assumed to be an initialized Satori client
const tools = {
  add_preference: tool({
    description: 'Save a user preference',
    parameters: z.object({
      preference: z.string(),
    }),
    execute: async ({ preference }) => {
      await client.addMemory(preference, {
        metadata: { category: 'preference' },
      });
      return 'Preference saved';
    },
  }),
  add_fact: tool({
    description: 'Save a factual piece of information',
    parameters: z.object({
      fact: z.string(),
    }),
    execute: async ({ fact }) => {
      await client.addMemory(fact, {
        metadata: { category: 'fact' },
      });
      return 'Fact saved';
    },
  }),
};
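
These custom tools can be combined with the defaults from memoryTools() using a plain object spread; entries later in the spread win on name collisions:
const result = await streamText({
  model: openai('gpt-4o'),
  system: 'You are a helpful assistant with memory.',
  messages,
  tools: { ...memoryTools(config), ...tools },
});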

Pattern 3: Streaming with Tool Call Feedback

Show users when memories are being saved:
'use client';

import { useChat } from 'ai/react';

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    onToolCall: ({ toolCall }) => {
      if (toolCall.toolName === 'add_memory') {
        console.log('Saving memory:', toolCall.args.memory);
        // Show toast notification
      }
    },
  });
  
  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>
          <p>{message.content}</p>
          
          {/* Show tool calls */}
          {message.toolInvocations?.map((tool, i) => (
            <div key={i} className="text-sm text-gray-500">
              {tool.toolName === 'add_memory' && (
                <span>💾 Saved to memory</span>
              )}
            </div>
          ))}
        </div>
      ))}
      
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
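
To turn the console.log above into a real notification, wire onToolCall to whatever toast library you use. A sketch assuming react-hot-toast (which needs a <Toaster /> mounted at the app root); message rendering is omitted since it is identical to the component above:
'use client';

import { useChat } from 'ai/react';
import toast from 'react-hot-toast';

export default function ChatPageWithToasts() {
  const { input, handleInputChange, handleSubmit } = useChat({
    onToolCall: ({ toolCall }) => {
      if (toolCall.toolName === 'add_memory') {
        // Surface the save as a toast instead of a console log
        toast('💾 Saved to memory');
      }
    },
  });

  return (
    <form onSubmit={handleSubmit}>
      <input value={input} onChange={handleInputChange} />
      <button type="submit">Send</button>
    </form>
  );
}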

Pattern 4: Multi-Step Reasoning

Allow the LLM to search before responding:
import { tool } from 'ai';
import { z } from 'zod';

// `client` is again an initialized Satori client
const tools = {
  ...memoryTools(config),
  search_memory: tool({
    description: 'Search for relevant memories',
    parameters: z.object({
      query: z.string().describe('What to search for'),
    }),
    execute: async ({ query }) => {
      const memories = await client.searchMemories(query, { limit: 3 });
      return memories.map(m => m.content).join('\n');
    },
  }),
};

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with memory.
  
Use search_memory to find relevant information before answering questions.
Use add_memory to save important new information.`,
  messages,
  tools,
  maxSteps: 5, // Allow search → respond → save flow
});
This pattern is less reliable than pre-fetching context. The LLM may not always call search_memory when needed.

Error Handling

Handle errors gracefully in production:
export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    
    // Validate input
    if (!Array.isArray(messages) || messages.length === 0) {
      return new Response('Invalid request', { status: 400 });
    }
    
    // Get user ID from your auth system
    const session = await getSession(req);
    if (!session?.userId) {
      return new Response('Unauthorized', { status: 401 });
    }
    
    const memoryConfig = {
      apiKey: process.env.SATORI_API_KEY!,
      baseUrl: process.env.SATORI_URL!,
      userId: session.userId,
    };
    
    // Try to fetch context, but don't fail if it errors
    let memoryContext = '';
    try {
      memoryContext = await getMemoryContext(
        memoryConfig,
        messages[messages.length - 1].content
      );
    } catch (error) {
      console.error('Failed to fetch memory context:', error);
      // Continue without context
    }
    
    const tools = memoryTools(memoryConfig);
    
    const result = await streamText({
      model: openai('gpt-4o'),
      system: `You are a helpful assistant.
      ${memoryContext ? `\nWhat you know:\n${memoryContext}` : ''}`,
      messages,
      tools,
    });
    
    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);
    
    // Return user-friendly error
    return new Response(
      JSON.stringify({ error: 'Failed to process message' }),
      { status: 500, headers: { 'Content-Type': 'application/json' } }
    );
  }
}
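
Because the context fetch sits on the critical path of every request, you may also want to bound its latency. A sketch using Promise.race; withTimeout is a hypothetical helper, not part of @satori/tools:
function withTimeout<T>(promise: Promise<T>, ms: number, fallback: T): Promise<T> {
  const timeout = new Promise<T>((resolve) =>
    setTimeout(() => resolve(fallback), ms)
  );
  // Resolve with the fallback on errors as well as on timeout
  return Promise.race([promise.catch(() => fallback), timeout]);
}

// Give the memory service 500 ms, then continue with an empty context
const memoryContext = await withTimeout(
  getMemoryContext(memoryConfig, userMessage),
  500,
  ''
);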

Testing

Test your memory integration:
import { POST } from './route';

// `client` is an initialized Satori client used to verify side effects;
// construct it the same way as in your application code.

describe('Chat API with Memory', () => {
  it('saves memories when user shares information', async () => {
    const request = new Request('http://localhost:3000/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [
          { role: 'user', content: 'Remember that I love TypeScript' }
        ],
      }),
    });
    
    const response = await POST(request);
    expect(response.status).toBe(200);
    
    // Verify memory was saved
    const memories = await client.searchMemories('TypeScript');
    expect(memories).toHaveLength(1);
    expect(memories[0].content).toContain('TypeScript');
  });
});
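
The example above exercises the full stack, model call included, so it behaves like an integration test. For fast unit tests you can mock the memory layer instead; a sketch assuming Vitest (the model call inside the route would need a similar mock before assertions pass):
import { describe, it, expect, vi } from 'vitest';
import { POST } from './route';

// Replace @satori/tools with deterministic stubs so no Satori instance is needed
vi.mock('@satori/tools', () => ({
  getMemoryContext: vi.fn().mockResolvedValue('- User prefers TypeScript'),
  memoryTools: vi.fn().mockReturnValue({}),
}));

describe('Chat API with mocked memory', () => {
  it('responds without a live memory service', async () => {
    const request = new Request('http://localhost:3000/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: 'Hello' }],
      }),
    });

    const response = await POST(request);
    expect(response.status).toBe(200);
  });
});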

Performance Optimization

Cache the memory context for repeated queries so identical lookups skip a network round trip:
const contextCache = new Map<string, string>();

async function getCachedContext(query: string) {
  if (contextCache.has(query)) {
    return contextCache.get(query)!;
  }
  
  const context = await getMemoryContext(config, query);
  contextCache.set(query, context);
  return context;
}
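
The Map above never evicts entries, so memories saved mid-session won't show up in cached results until the process restarts. A variant with a simple time-to-live (the 60-second TTL is an arbitrary choice):
type CachedContext = { context: string; expires: number };

const ttlCache = new Map<string, CachedContext>();
const TTL_MS = 60_000;

async function getCachedContextWithTtl(query: string): Promise<string> {
  const hit = ttlCache.get(query);
  if (hit && hit.expires > Date.now()) {
    return hit.context;
  }

  const context = await getMemoryContext(config, query);
  ttlCache.set(query, { context, expires: Date.now() + TTL_MS });
  return context;
}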

Fetch the context concurrently with other per-request async work; it cannot overlap the LLM call itself, since the prompt needs the context:
// Run the context fetch alongside other async work
const [memoryContext] = await Promise.all([
  getMemoryContext(config, userMessage),
  // Other async operations
]);

Fetch fewer memories when latency matters more than recall:
const context = await getMemoryContext(config, userMessage, {
  limit: 3, // instead of the default 10
});

Next Steps