Embeddings

Generate text embeddings for semantic search, similarity matching, and RAG applications.

Semantic Search

Find similar documents

RAG Ready

Retrieval-augmented generation

Batch Support

Multiple texts at once

Dimension Control

Reduce for efficiency

Overview

Embeddings convert text into numerical vectors that capture semantic meaning. Use them for semantic search, document similarity, clustering, and retrieval-augmented generation (RAG).

Generate Embeddings

Create embeddings for text:

import { platform } from '@/lib/platform'

// Single text
const result = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'The quick brown fox jumps over the lazy dog.',
})

console.log(result.embeddings[0]) // Float array [0.023, -0.012, ...]
console.log(result.dimensions)     // 1536

// Multiple texts (more efficient)
const results = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: [
    'First document text...',
    'Second document text...',
    'Third document text...',
  ],
})

// results.embeddings[0], results.embeddings[1], results.embeddings[2]
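
Because the embeddings are plain float arrays, you can compare them directly. The sketch below assumes the embed API shown above and uses a small cosineSimilarity helper (not part of the platform SDK) to rank the documents against a query:

import { platform } from '@/lib/platform'

// Cosine similarity between two equal-length float arrays.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

const docs = [
  'First document text...',
  'Second document text...',
  'Third document text...',
]

// Embed the documents and the query with the same model.
const docEmbeds = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: docs,
})

const query = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'Which document covers the second topic?',
})

// Rank documents by similarity to the query.
const ranked = docs
  .map((text, i) => ({
    text,
    score: cosineSimilarity(query.embeddings[0], docEmbeds.embeddings[i]),
  }))
  .sort((a, b) => b.score - a.score)

console.log(ranked[0]) // most similar document first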

Supported Models

Available embedding models:

Model | Dimensions | Description
openai/text-embedding-3-large | 3072 | Highest quality
openai/text-embedding-3-small | 1536 | Balanced (recommended)
openai/text-embedding-ada-002 | 1536 | Legacy, cost-effective

RAG Pattern

Retrieval-Augmented Generation with embeddings:

// 1. User asks a question
const question = 'What is your refund policy?'

// 2. Find relevant documents
const queryEmbed = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: question,
})

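// Assumes `db` is a Postgres client and the documents table has a pgvector
// embedding column; some drivers need the array serialized to a vector literal.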
const relevantDocs = await db.query(`
  SELECT content FROM documents
  ORDER BY embedding <=> $1
  LIMIT 3
`, [queryEmbed.embeddings[0]])

// 3. Generate answer with context
const context = relevantDocs.map(d => d.content).join('\n\n')

const answer = await platform.ai.chat({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    {
      role: 'system',
      content: `Answer the user's question based on the following context:

${context}

If the answer is not in the context, say you don't know.`,
    },
    { role: 'user', content: question },
  ],
})
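
The query above assumes the documents were embedded and stored ahead of time. Here is a minimal indexing sketch, assuming a pgvector-backed documents table (id, content, embedding) and the same db client as above; vector serialization may differ depending on your Postgres driver:

// Index documents ahead of time: embed each one and store it with its vector.
const documents = [
  { id: 1, content: 'Refunds are available within 30 days of purchase...' },
  { id: 2, content: 'Shipping usually takes 3-5 business days...' },
]

const embedded = await platform.ai.embed({
  model: 'openai/text-embedding-3-small', // must match the query-time model
  input: documents.map(d => d.content),
})

for (let i = 0; i < documents.length; i++) {
  await db.query(`
    INSERT INTO documents (id, content, embedding)
    VALUES ($1, $2, $3)
    ON CONFLICT (id) DO UPDATE SET content = $2, embedding = $3
  `, [
    documents[i].id,
    documents[i].content,
    // pgvector accepts a '[0.1,0.2,...]' literal; JSON.stringify of a number
    // array produces that shape
    JSON.stringify(embedded.embeddings[i]),
  ])
}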

Dimension Reduction

Reduce dimensions for storage efficiency:

// text-embedding-3-* supports dimension reduction
const result = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
  dimensions: 512, // Reduce from 1536 to 512
})

// Smaller vectors = less storage, faster search
// Trade-off: slightly lower accuracy

Storage Optimization

For large document collections, reducing dimensions to 512 or 256 can significantly reduce storage costs while maintaining good search quality.
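
As a sketch of the storage side, assuming pgvector and the db client from the RAG example, the vector column just needs to be declared with the reduced dimension:

// The vector column dimension must match the reduced embedding size.
await db.query(`
  CREATE TABLE IF NOT EXISTS documents_small (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(512)
  )
`)

const reduced = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
  dimensions: 512,
})

console.log(reduced.embeddings[0].length) // 512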

Best Practices

Chunk large docs

Split into ~500 token chunks for better retrieval (a simple chunking sketch follows this list)

Use consistent models

Query and document embeddings must come from the same model

Batch requests

Embed multiple texts in one request

Cache embeddings

Store vectors in your database instead of re-generating them

Use vector DBs

pgvector, Pinecone, or Weaviate
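
A minimal chunking sketch for the first practice above. It approximates tokens with a word count, which is a rough heuristic rather than a real tokenizer; swap in a proper tokenizer if you need exact chunk sizes:

// Split a long document into overlapping word-based chunks before embedding.
// Word count stands in for tokens here; real token counts will differ.
function chunkText(text: string, maxWords = 400, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean)
  const chunks: string[] = []
  for (let start = 0; start < words.length; start += maxWords - overlap) {
    chunks.push(words.slice(start, start + maxWords).join(' '))
    if (start + maxWords >= words.length) break
  }
  return chunks
}

const longDocument = '...your long article or policy text...'
const chunks = chunkText(longDocument)

// Embed all chunks in one batched request, then store each vector with its chunk.
const chunkEmbeds = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: chunks,
})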