- **Usage Analytics**: tokens & requests
- **Cost Tracking**: per-model breakdown
- **Spending Quotas**: monthly & daily limits
- **Rate Limits**: per-model throttling
Overview
The platform tracks all AI usage automatically, providing detailed analytics on tokens, requests, and costs. Set quotas to control spending and prevent unexpected bills.
Get Usage Statistics
Query your AI usage data:
```typescript
import { platform } from '@/lib/platform'

// Get usage for the last 30 days
const usage = await platform.ai.getUsage({
  days: 30,
})

console.log(usage)
// {
//   totalRequests: 15420,
//   totalTokens: 2345678,
//   totalCost: 45.67,
//   byModel: {
//     'anthropic/claude-3.5-sonnet': { requests: 1200, tokens: 890000, cost: 32.50 },
//     'anthropic/claude-3-haiku': { requests: 14000, tokens: 1400000, cost: 12.00 },
//     'openai/text-embedding-3-small': { requests: 220, tokens: 55678, cost: 1.17 },
//   },
//   byDay: [
//     { date: '2024-01-15', requests: 520, tokens: 78000, cost: 1.52 },
//     { date: '2024-01-16', requests: 480, tokens: 72000, cost: 1.44 },
//     // ...
//   ],
// }

// Get usage for a specific user
const userUsage = await platform.ai.getUsage({
  userId: user.id,
  days: 30,
})
```

Per-Request Usage
Each AI response includes usage information:
```typescript
const response = await platform.ai.chat({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [...],
})

console.log(response.usage)
// {
//   promptTokens: 150,
//   completionTokens: 320,
//   totalTokens: 470,
// }

// Cost is tracked automatically by the platform;
// view per-model costs in the dashboard
```

Set Quotas
Control AI spending with quotas:
```typescript
// Set app-wide monthly quota
await platform.ai.setQuota({
  type: 'monthly',
  limit: 100, // $100 per month
  action: 'block', // 'block' or 'warn'
})

// Set per-user quota
await platform.ai.setUserQuota({
  userId: user.id,
  type: 'daily',
  limit: 5, // $5 per day per user
})
```

Configure quotas in the dashboard under App Settings → AI Services → Quotas.
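When a quota uses the `'warn'` action, requests keep succeeding past the limit, so the app may want to decide for itself how to react before making another call. The helper below is a sketch, not a platform API: `checkQuota` is a hypothetical name, and the spend figure would come from `platform.ai.getUsage()`.

```typescript
// Hypothetical pre-flight quota check (illustrative, not a platform API).
// Mirrors the 'block' / 'warn' actions passed to setQuota above.
type QuotaAction = 'block' | 'warn'

function checkQuota(
  spentUsd: number, // current spend, e.g. usage.totalCost from getUsage()
  limitUsd: number, // the configured quota
  action: QuotaAction,
): 'ok' | 'warn' | 'block' {
  if (spentUsd < limitUsd) return 'ok'
  return action // over the limit: behave as the quota is configured
}
```

For example, `checkQuota(45.67, 100, 'block')` returns `'ok'`, while `checkQuota(101.2, 100, 'warn')` returns `'warn'`.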
Rate Limits
Default rate limits per model category:
| Model category | Requests | Tokens |
|---|---|---|
| Large models (Claude 3 Opus, etc.) | 50 req/min | 40,000 tokens/min |
| Medium models (Claude 3.5 Sonnet, Gemini Pro) | 100 req/min | 80,000 tokens/min |
| Fast models (Claude 3 Haiku, Gemini Flash) | 500 req/min | 150,000 tokens/min |
| Embeddings | 1,000 req/min | 1,000,000 tokens/min |
| Image generation | 10 req/min | N/A |
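To stay under the per-minute request limits above without waiting to hit rate-limit errors, a client can throttle itself. Below is a minimal token-bucket sketch; it is not part of the platform SDK, and time is passed in explicitly so the refill arithmetic is easy to follow.

```typescript
// Minimal client-side token bucket (illustrative, not a platform API).
// capacity = burst size; refillPerMs = sustained rate,
// e.g. 100 / 60_000 for a 100 req/min model.
class TokenBucket {
  private tokens: number
  private lastRefill: number

  constructor(
    private capacity: number,
    private refillPerMs: number,
    now: number = Date.now(),
  ) {
    this.tokens = capacity
    this.lastRefill = now
  }

  // Returns true if a request may be sent now, false if the caller should wait.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsed = now - this.lastRefill
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs)
    this.lastRefill = now
    if (this.tokens >= 1) {
      this.tokens -= 1
      return true
    }
    return false
  }
}

// A 2-request burst refilling at 2 req/min:
const bucket = new TokenBucket(2, 2 / 60_000, 0)
bucket.tryAcquire(0)      // true
bucket.tryAcquire(0)      // true
bucket.tryAcquire(0)      // false (bucket empty)
bucket.tryAcquire(30_000) // true (30 s refills one token)
```

Gate each `platform.ai.chat` call on `tryAcquire()` and sleep briefly when it returns false.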
Handle Rate Limits
Handle rate limit errors gracefully:
```typescript
// Retry on rate limits, honoring the server-provided retryAfter
const maxAttempts = 3
for (let attempt = 1; attempt <= maxAttempts; attempt++) {
  try {
    const response = await platform.ai.chat({
      model: 'anthropic/claude-3.5-sonnet',
      messages: [...],
    })
    break // success
  } catch (error) {
    if (error.code === 'RATE_LIMIT_EXCEEDED' && attempt < maxAttempts) {
      const retryAfter = error.retryAfter // seconds to wait
      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
      continue // retry the request
    }
    if (error.code === 'QUOTA_EXCEEDED') {
      // User or app quota exceeded:
      // show an upgrade prompt or wait until the quota resets
    }
    throw error
  }
}
```

Cost Optimization
Tips to reduce AI costs:
- **Right-size models**: use Haiku/Flash for simple tasks
- **Limit tokens**: set maxTokens to cap responses
- **Cache responses**: avoid duplicate API calls
- **Batch embeddings**: send multiple texts per request
- **Trim prompts**: remove unnecessary context
- **Stream early**: fail fast on bad outputs
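The "cache responses" tip can be as simple as memoizing calls by their serialized arguments. The sketch below is an illustrative helper, not part of the SDK; a production cache would add TTLs and size bounds.

```typescript
// Memoize an async function by JSON-serialized arguments so identical
// calls hit the underlying API only once (illustrative, in-memory only).
function memoizeAsync<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  const cache = new Map<string, Promise<R>>()
  return (...args: A) => {
    const key = JSON.stringify(args)
    const hit = cache.get(key)
    if (hit) return hit
    const pending = fn(...args)
    cache.set(key, pending) // cache the promise so concurrent duplicates share it
    return pending
  }
}

// Usage (hypothetical): repeated identical prompts reuse the first response
// const cachedChat = memoizeAsync(platform.ai.chat.bind(platform.ai))
```

Caching the promise (rather than the resolved value) means concurrent identical requests also collapse into a single API call.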
Billing Tiers
- **Free**: limited monthly AI credits included
- **Pro**: higher credits + pay-as-you-go
- **Enterprise**: custom pricing & volume discounts
View detailed billing in the dashboard under Settings → Billing → AI Usage.