Embeddings

Generate text embeddings for semantic search, similarity matching, and RAG applications.

Semantic Search

Find similar documents

RAG Ready

Retrieval-augmented generation

Batch Support

Multiple texts at once

Dimension Control

Reduce for efficiency

Overview

Embeddings convert text into numerical vectors that capture semantic meaning. Use them for semantic search, document similarity, clustering, and retrieval-augmented generation (RAG).

Generate Embeddings

Create embeddings for text:

import { platform } from '@/lib/platform'

// Single text
const result = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'The quick brown fox jumps over the lazy dog.',
})

console.log(result.embeddings[0]) // Float array [0.023, -0.012, ...]
console.log(result.dimensions)     // 1536

// Multiple texts (more efficient)
const results = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: [
    'First document text...',
    'Second document text...',
    'Third document text...',
  ],
})

// results.embeddings[0], results.embeddings[1], results.embeddings[2]
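
Because the embeddings are plain float arrays, you can compare them directly. The sketch below assumes the embed API shown above and uses a small cosineSimilarity helper (not part of the platform SDK) to rank the documents against a query:

import { platform } from '@/lib/platform'

// Cosine similarity between two equal-length float arrays.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

const docs = [
  'First document text...',
  'Second document text...',
  'Third document text...',
]

// Embed the documents and the query with the same model.
const docEmbeds = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: docs,
})

const query = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'Which document covers the second topic?',
})

// Rank documents by similarity to the query.
const ranked = docs
  .map((text, i) => ({
    text,
    score: cosineSimilarity(query.embeddings[0], docEmbeds.embeddings[i]),
  }))
  .sort((a, b) => b.score - a.score)

console.log(ranked[0]) // most similar document first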

Supported Models

Available embedding models:

Model | Dimensions | Description
openai/text-embedding-3-large | 3072 | Highest quality
openai/text-embedding-3-small | 1536 | Balanced (recommended)
openai/text-embedding-ada-002 | 1536 | Legacy, cost-effective

RAG Pattern

Retrieval-Augmented Generation with embeddings:

// 1. User asks a question
const question = 'What is your refund policy?'

// 2. Find relevant documents
const queryEmbed = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: question,
})

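// Assumes `db` is a Postgres client and the documents table has a pgvector
// embedding column; some drivers need the array serialized to a vector literal.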
const relevantDocs = await db.query(`
  SELECT content FROM documents
  ORDER BY embedding <=> $1
  LIMIT 3
`, [queryEmbed.embeddings[0]])

// 3. Generate answer with context
const context = relevantDocs.map(d => d.content).join('\n\n')

const answer = await platform.ai.chat({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    {
      role: 'system',
      content: `Answer the user's question based on the following context:

${context}

If the answer is not in the context, say you don't know.`,
    },
    { role: 'user', content: question },
  ],
})
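
The query above assumes the documents were embedded and stored ahead of time. Here is a minimal indexing sketch, assuming a pgvector-backed documents table (id, content, embedding) and the same db client as above; vector serialization may differ depending on your Postgres driver:

// Index documents ahead of time: embed each one and store it with its vector.
const documents = [
  { id: 1, content: 'Refunds are available within 30 days of purchase...' },
  { id: 2, content: 'Shipping usually takes 3-5 business days...' },
]

const embedded = await platform.ai.embed({
  model: 'openai/text-embedding-3-small', // must match the query-time model
  input: documents.map(d => d.content),
})

for (let i = 0; i < documents.length; i++) {
  await db.query(`
    INSERT INTO documents (id, content, embedding)
    VALUES ($1, $2, $3)
    ON CONFLICT (id) DO UPDATE SET content = $2, embedding = $3
  `, [
    documents[i].id,
    documents[i].content,
    // pgvector accepts a '[0.1,0.2,...]' literal; JSON.stringify of a number
    // array produces that shape
    JSON.stringify(embedded.embeddings[i]),
  ])
}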

Dimension Reduction

Reduce dimensions for storage efficiency:

// text-embedding-3-* supports dimension reduction
const result = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
  dimensions: 512, // Reduce from 1536 to 512
})

// Smaller vectors = less storage, faster search
// Trade-off: slightly lower accuracy

Storage Optimization

For large document collections, reducing dimensions to 512 or 256 can significantly reduce storage costs while maintaining good search quality.
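
As a sketch of the storage side, assuming pgvector and the db client from the RAG example, the vector column just needs to be declared with the reduced dimension:

// The vector column dimension must match the reduced embedding size.
await db.query(`
  CREATE TABLE IF NOT EXISTS documents_small (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(512)
  )
`)

const reduced = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
  dimensions: 512,
})

console.log(reduced.embeddings[0].length) // 512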

Best Practices

Chunk large docs

Split into ~500 token chunks for better retrieval (a simple chunking sketch follows this list)

Use consistent models

Query and document embeddings must come from the same model

Batch requests

Embed multiple texts in one request

Cache embeddings

Store vectors in your database instead of re-generating them

Use vector DBs

pgvector, Pinecone, or Weaviate
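
A minimal chunking sketch for the first practice above. It approximates tokens with a word count, which is a rough heuristic rather than a real tokenizer; swap in a proper tokenizer if you need exact chunk sizes:

// Split a long document into overlapping word-based chunks before embedding.
// Word count stands in for tokens here; real token counts will differ.
function chunkText(text: string, maxWords = 400, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean)
  const chunks: string[] = []
  for (let start = 0; start < words.length; start += maxWords - overlap) {
    chunks.push(words.slice(start, start + maxWords).join(' '))
    if (start + maxWords >= words.length) break
  }
  return chunks
}

const longDocument = '...your long article or policy text...'
const chunks = chunkText(longDocument)

// Embed all chunks in one batched request, then store each vector with its chunk.
const chunkEmbeds = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: chunks,
})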