- **Semantic Search:** Find similar documents
- **RAG Ready:** Retrieval-augmented generation
- **Batch Support:** Embed multiple texts at once
- **Dimension Control:** Reduce dimensions for efficiency
## Overview
Embeddings convert text into numerical vectors that capture semantic meaning. Use them for semantic search, document similarity, clustering, and retrieval-augmented generation (RAG).
## Generate Embeddings
Create embeddings for text:
```ts
import { platform } from '@/lib/platform'
// Single text
const result = await platform.ai.embed({
model: 'openai/text-embedding-3-small',
input: 'The quick brown fox jumps over the lazy dog.',
})
console.log(result.embeddings[0]) // Float array [0.023, -0.012, ...]
console.log(result.dimensions) // 1536
// Multiple texts (more efficient)
const results = await platform.ai.embed({
model: 'openai/text-embedding-3-small',
input: [
'First document text...',
'Second document text...',
'Third document text...',
],
})
// results.embeddings[0], results.embeddings[1], results.embeddings[2]
```
## Supported Models

Available embedding models:
| Model | Dimensions | Notes |
|---|---|---|
| `openai/text-embedding-3-large` | 3072 | Highest quality |
| `openai/text-embedding-3-small` | 1536 | Balanced (recommended) |
| `openai/text-embedding-ada-002` | 1536 | Legacy, cost-effective |
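If retrieval quality matters more than storage or cost, the large model is a drop-in swap; a sketch using the same `platform.ai.embed` call shown above:

```ts
// Same API, higher-quality 3072-dimension vectors (more storage, higher cost)
const hq = await platform.ai.embed({
  model: 'openai/text-embedding-3-large',
  input: 'The quick brown fox jumps over the lazy dog.',
})
console.log(hq.dimensions) // 3072
```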
## Semantic Search

Find similar documents by embedding both your documents and the query, then ranking by cosine similarity.
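For reference, the cosine similarity computed in the code below is the normalized dot product of the two vectors:

$$\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}}$$

Values near 1 mean the two texts are semantically close.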
```ts
// 1. Index your documents (do this once)
const documents = [
{ id: '1', text: 'How to reset your password' },
{ id: '2', text: 'Billing and payment methods' },
{ id: '3', text: 'Getting started with the API' },
]
const docEmbeddings = await platform.ai.embed({
model: 'openai/text-embedding-3-small',
input: documents.map(d => d.text),
})
// Store embeddings in your database (e.g., with pgvector)
// 2. Search at query time
const query = 'How do I change my password?'
const queryEmbedding = await platform.ai.embed({
model: 'openai/text-embedding-3-small',
input: query,
})
// Find similar documents using cosine similarity
// With pgvector: SELECT * FROM documents ORDER BY embedding <=> $1 LIMIT 5
```

Or compute the similarity in application code:

```ts
function cosineSimilarity(a: number[], b: number[]): number {
let dotProduct = 0
let normA = 0
let normB = 0
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i]
normA += a[i] * a[i]
normB += b[i] * b[i]
}
return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB))
}
// Use it
const similarities = docEmbeddings.embeddings.map((docEmb, i) => ({
id: documents[i].id,
score: cosineSimilarity(queryEmbedding.embeddings[0], docEmb),
}))
const topResults = similarities.sort((a, b) => b.score - a.score).slice(0, 5)
```
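In production you would persist the document embeddings instead of recomputing them on every search. A minimal storage sketch with pgvector, reusing the `db.query` helper from the RAG example below (the table schema is illustrative):

```ts
// Illustrative schema, run once:
//   CREATE EXTENSION IF NOT EXISTS vector;
//   CREATE TABLE documents (id text PRIMARY KEY, content text, embedding vector(1536));

// pgvector accepts '[0.1,0.2,...]' literals, which JSON.stringify produces
for (let i = 0; i < documents.length; i++) {
  await db.query(
    'INSERT INTO documents (id, content, embedding) VALUES ($1, $2, $3)',
    [documents[i].id, documents[i].text, JSON.stringify(docEmbeddings.embeddings[i])],
  )
}
```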
## RAG Pattern

Retrieval-Augmented Generation with embeddings:
```ts
// 1. User asks a question
const question = 'What is your refund policy?'
// 2. Find relevant documents
const queryEmbed = await platform.ai.embed({
model: 'openai/text-embedding-3-small',
input: question,
})
const relevantDocs = await db.query(`
SELECT content FROM documents
ORDER BY embedding <=> $1
LIMIT 3
`, [queryEmbed.embeddings[0]])
// 3. Generate answer with context
const context = relevantDocs.map(d => d.content).join('\n\n')
const answer = await platform.ai.chat({
model: 'anthropic/claude-3.5-sonnet',
messages: [
{
role: 'system',
content: `Answer the user's question based on the following context:
${context}
If the answer is not in the context, say you don't know.`,
},
{ role: 'user', content: question },
],
})
```
## Dimension Reduction

Reduce dimensions for storage efficiency:
```ts
// text-embedding-3-* supports dimension reduction
const result = await platform.ai.embed({
model: 'openai/text-embedding-3-small',
input: 'Your text here',
dimensions: 512, // Reduce from 1536 to 512
})
// Smaller vectors = less storage, faster search
// Trade-off: slightly lower accuracy
```
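The storage impact is easy to estimate. Assuming 4 bytes per dimension (single-precision floats, pgvector's default), a quick back-of-the-envelope calculation:

```ts
// Approximate vector storage at 4 bytes per dimension (float32)
const bytesPerVector = (dims: number) => dims * 4
console.log(bytesPerVector(1536)) // 6144 bytes ≈ 6 KB
console.log(bytesPerVector(512))  // 2048 bytes ≈ 2 KB
// Across 1,000,000 documents: ~6.1 GB vs ~2.0 GB, before index overhead
```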
## Best Practices

- **Chunk large docs:** Split into ~500-token chunks for better retrieval (see the chunking sketch below).
- **Use consistent models:** Query and document embeddings must come from the same model.
- **Batch requests:** Embed multiple texts in one request.
- **Cache embeddings:** Store them in your database; don't re-generate.
- **Use vector DBs:** pgvector, Pinecone, or Weaviate.
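A naive chunker is enough to start with. This sketch targets roughly 500 tokens per chunk using a ~4-characters-per-token heuristic (both numbers are rough assumptions to tune, not exact tokenizer output), then embeds all chunks in one batched call:

```ts
// Split text into ~2000-character chunks (~500 tokens) on word boundaries
function chunkText(text: string, maxChars = 2000): string[] {
  const words = text.split(/\s+/)
  const chunks: string[] = []
  let current: string[] = []
  let length = 0
  for (const word of words) {
    if (length + word.length + 1 > maxChars && current.length > 0) {
      chunks.push(current.join(' '))
      current = []
      length = 0
    }
    current.push(word)
    length += word.length + 1
  }
  if (current.length > 0) chunks.push(current.join(' '))
  return chunks
}

// longDocument is a placeholder for your source text
const longDocument = 'Your long document text...'
const chunks = chunkText(longDocument)
const chunkEmbeddings = await platform.ai.embed({
  model: 'openai/text-embedding-3-small',
  input: chunks,
})
```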