Overview
Satori provides first-class integration with the Vercel AI SDK through the @satori/tools package. This guide covers everything from basic setup to advanced patterns.
Installation
npm install @satori/tools ai @ai-sdk/openai
Basic Integration
Step 1: Create Memory Tools
The memoryTools() function creates AI SDK-compatible tools that the LLM can use to manage memories:
import { memoryTools, getMemoryContext } from '@satori/tools';

const tools = memoryTools({
  apiKey: process.env.SATORI_API_KEY!,
  baseUrl: process.env.SATORI_URL!,
  userId: 'user-123',
});
Step 2: Pre-fetch Memory Context
Fetch relevant memories before calling the LLM:
const memoryContext = await getMemoryContext(
  {
    apiKey: process.env.SATORI_API_KEY!,
    baseUrl: process.env.SATORI_URL!,
    userId: 'user-123',
  },
  userMessage,
  { limit: 5, threshold: 0.7 }
);
Step 3: Stream with Memory
Use streamText() with memory tools and context:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with long-term memory.
What you know about this user:
${memoryContext}
When the user shares important information, use the add_memory tool to save it.`,
  messages,
  tools,
});

return result.toDataStreamResponse();
Complete API Route Example
Here’s a full Next.js API route with memory:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { memoryTools, getMemoryContext } from '@satori/tools';

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const userMessage = messages[messages.length - 1].content;

    // Get user ID from your auth system
    const session = await getSession(req);
    if (!session?.userId) {
      return new Response('Unauthorized', { status: 401 });
    }

    // Create memory configuration
    const memoryConfig = {
      apiKey: process.env.SATORI_API_KEY!,
      baseUrl: process.env.SATORI_URL!,
      userId: session.userId,
    };

    // Create memory tools
    const tools = memoryTools(memoryConfig);

    // Pre-fetch relevant context
    const memoryContext = await getMemoryContext(
      memoryConfig,
      userMessage,
      { limit: 5 }
    );

    // Stream response with memory
    const result = await streamText({
      model: openai('gpt-4o'),
      system: `You are a helpful assistant with long-term memory.
What you know about this user:
${memoryContext}
Use the add_memory tool when the user:
- Shares preferences or opinions
- Provides personal information
- Mentions important dates or events
- Expresses goals or intentions
Be natural and conversational. Don't explicitly mention that you're saving memories.`,
      messages,
      tools,
      maxSteps: 5, // Allow multiple tool calls
    });

    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);
    return new Response('Internal Server Error', { status: 500 });
  }
}
Set maxSteps: 5 to allow the LLM to make multiple tool calls in a single response (e.g., search and then save).
Tool Reference
The memoryTools() function provides two tools:
add_memory
Saves information to memory. The LLM calls this automatically when it detects important information.
// LLM automatically calls this
{
  tool: 'add_memory',
  parameters: {
    memory: 'User prefers TypeScript over JavaScript for type safety'
  }
}
memory (string, required)
The information to save. Should be a complete, self-contained statement.

metadata (object, optional)
Optional metadata for categorization:

{
  memory: 'User prefers TypeScript',
  metadata: {
    category: 'preferences',
    tags: ['programming', 'languages']
  }
}
delete_memory
Removes a specific memory by ID.
// LLM calls this when user asks to forget something
{
  tool: 'delete_memory',
  parameters: {
    memoryId: 'uuid-of-memory'
  }
}
memoryId (string, required)
The UUID of the memory to delete. The LLM can get this from the context.
Advanced Patterns
Pattern 1: Conditional Context Injection
Only inject context when relevant:
const userMessage = messages[messages.length - 1].content;

// Check if message might need memory context
const needsContext = /what|remember|know|told|said/i.test(userMessage);

let memoryContext = '';
if (needsContext) {
  memoryContext = await getMemoryContext(config, userMessage);
}

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant.
${memoryContext ? `\nWhat you know:\n${memoryContext}` : ''}`,
  messages,
  tools,
});
Pattern 2: Category-Based Memory
Use metadata to organize memories by category:
import { tool } from 'ai';
import { z } from 'zod';

// `client` is a configured Satori client for the current user
const tools = {
  add_preference: tool({
    description: 'Save a user preference',
    parameters: z.object({
      preference: z.string(),
    }),
    execute: async ({ preference }) => {
      await client.addMemory(preference, {
        metadata: { category: 'preference' },
      });
      return 'Preference saved';
    },
  }),
  add_fact: tool({
    description: 'Save a factual piece of information',
    parameters: z.object({
      fact: z.string(),
    }),
    execute: async ({ fact }) => {
      await client.addMemory(fact, {
        metadata: { category: 'fact' },
      });
      return 'Fact saved';
    },
  }),
};
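These category-specific tools can be passed to streamText alongside the built-in memory tools. A minimal sketch, assuming config is the memory configuration from the route example above:

// Combine the built-in memory tools with the category-specific ones.
// `config` is assumed to be the memory configuration shown earlier.
const result = await streamText({
  model: openai('gpt-4o'),
  system: 'Save preferences with add_preference and facts with add_fact.',
  messages,
  tools: {
    ...memoryTools(config),
    ...tools, // add_preference and add_fact defined above
  },
  maxSteps: 5,
});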
Pattern 3: Memory Save Indicators
Show users when memories are being saved:
'use client';

import { useChat } from 'ai/react';

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    onToolCall: ({ toolCall }) => {
      if (toolCall.toolName === 'add_memory') {
        console.log('Saving memory:', toolCall.args.memory);
        // Show toast notification
      }
    },
  });

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>
          <p>{message.content}</p>
          {/* Show tool calls */}
          {message.toolInvocations?.map((tool, i) => (
            <div key={i} className="text-sm text-gray-500">
              {tool.toolName === 'add_memory' && (
                <span>💾 Saved to memory</span>
              )}
            </div>
          ))}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
Pattern 4: Multi-Step Reasoning
Allow the LLM to search before responding:
// `tool` and `z` are imported as in Pattern 2; `client` is a configured Satori client
const tools = {
  ...memoryTools(config),
  search_memory: tool({
    description: 'Search for relevant memories',
    parameters: z.object({
      query: z.string().describe('What to search for'),
    }),
    execute: async ({ query }) => {
      const memories = await client.searchMemories(query, { limit: 3 });
      return memories.map(m => m.content).join('\n');
    },
  }),
};

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with memory.
Use search_memory to find relevant information before answering questions.
Use add_memory to save important new information.`,
  messages,
  tools,
  maxSteps: 5, // Allow search → respond → save flow
});
This pattern is less reliable than pre-fetching context. The LLM may not always call search_memory when needed.
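A middle ground is to keep the pre-fetched context for reliability and expose search_memory only for follow-up lookups. A rough sketch combining the two approaches from this guide:

// Hybrid: pre-fetch context up front, keep search_memory for follow-up lookups.
const memoryContext = await getMemoryContext(config, userMessage, { limit: 5 });

const result = await streamText({
  model: openai('gpt-4o'),
  system: `You are a helpful assistant with memory.
What you know about this user:
${memoryContext}
Use search_memory only if you need details that are not in the context above.
Use add_memory to save important new information.`,
  messages,
  tools, // memoryTools(config) plus search_memory from above
  maxSteps: 5,
});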
Error Handling
Handle errors gracefully in production:
export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    // Validate input
    if (!messages || !Array.isArray(messages)) {
      return new Response('Invalid request', { status: 400 });
    }

    // Get user ID from your auth system (as in the complete example above)
    const session = await getSession(req);
    if (!session?.userId) {
      return new Response('Unauthorized', { status: 401 });
    }

    const memoryConfig = {
      apiKey: process.env.SATORI_API_KEY!,
      baseUrl: process.env.SATORI_URL!,
      userId: session.userId,
    };

    // Try to fetch context, but don't fail if it errors
    let memoryContext = '';
    try {
      memoryContext = await getMemoryContext(
        memoryConfig,
        messages[messages.length - 1].content
      );
    } catch (error) {
      console.error('Failed to fetch memory context:', error);
      // Continue without context
    }

    const tools = memoryTools(memoryConfig);

    const result = await streamText({
      model: openai('gpt-4o'),
      system: `You are a helpful assistant.
${memoryContext ? `\nWhat you know:\n${memoryContext}` : ''}`,
      messages,
      tools,
    });

    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);

    // Return user-friendly error
    return new Response(
      JSON.stringify({ error: 'Failed to process message' }),
      { status: 500, headers: { 'Content-Type': 'application/json' } }
    );
  }
}
Testing
Test your memory integration:
import { POST } from './route';

// `client` is a Satori client pointed at your test instance
describe('Chat API with Memory', () => {
  it('saves memories when user shares information', async () => {
    const request = new Request('http://localhost:3000/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [
          { role: 'user', content: 'Remember that I love TypeScript' }
        ],
      }),
    });

    const response = await POST(request);
    expect(response.status).toBe(200);

    // Verify memory was saved
    const memories = await client.searchMemories('TypeScript');
    expect(memories).toHaveLength(1);
    expect(memories[0].content).toContain('TypeScript');
  });
});
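If you don't want tests to hit a live Satori instance, you can stub the memory layer at the module boundary. A minimal sketch, assuming Vitest (the mocked return values are assumptions, not the real types):

import { vi } from 'vitest';

// Replace @satori/tools with stubs so the route never calls the memory service.
vi.mock('@satori/tools', () => ({
  getMemoryContext: vi.fn().mockResolvedValue('User loves TypeScript'),
  memoryTools: vi.fn().mockReturnValue({}),
}));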
Performance Optimization
Cache memory context for common queries
const contextCache = new Map<string, string>();

async function getCachedContext(query: string) {
  if (contextCache.has(query)) {
    return contextCache.get(query)!;
  }

  const context = await getMemoryContext(config, query);
  contextCache.set(query, context);
  return context;
}
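The cache above never expires, so it can serve stale context after new memories are saved. A simple time-based variant (the 60-second window is an arbitrary assumption):

// Cache entries expire after a fixed TTL so newly saved memories show up eventually.
const ttlCache = new Map<string, { value: string; expires: number }>();
const CONTEXT_TTL_MS = 60_000; // assumption: one minute of staleness is acceptable

async function getCachedContextWithTtl(query: string) {
  const hit = ttlCache.get(query);
  if (hit && hit.expires > Date.now()) {
    return hit.value;
  }

  const value = await getMemoryContext(config, query);
  ttlCache.set(query, { value, expires: Date.now() + CONTEXT_TTL_MS });
  return value;
}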
Parallel context fetching
// Fetch memory context in parallel with any other async work you need
const [memoryContext] = await Promise.all([
  getMemoryContext(config, userMessage),
  // Other async operations
]);
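A more concrete version of the same idea: fetch query-specific context and a small set of general memories concurrently (assumes client is a configured Satori client, as in Pattern 2):

// Run the targeted context lookup and a broader preference search at the same time.
const [memoryContext, preferenceMemories] = await Promise.all([
  getMemoryContext(config, userMessage),
  client.searchMemories('user preferences', { limit: 3 }),
]);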
Reduce the memory limit
// Fetch fewer memories for faster responses
const context = await getMemoryContext(config, userMessage, {
  limit: 3, // Instead of the default 10
});
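The threshold option from Step 2 can be combined with a lower limit to keep only highly relevant matches. A sketch, assuming a higher threshold means stricter similarity matching:

// Fewer, more relevant memories (threshold semantics are an assumption here)
const context = await getMemoryContext(config, userMessage, {
  limit: 3,
  threshold: 0.8,
});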
Next Steps