Welcome!

Slide to unlock and explore

Slide to unlock

Command Palette

Search for a command to run...

0
Blog
PreviousNext

Introducing APIForge - From Plain English to a Deployed REST API in Under a Minute

APIForge turns a plain-English description into a fully deployed REST API — isolated PostgreSQL database, Docker container, subdomain routing, and auto-generated OpenAPI docs included.

The Idea

Most developer tools that promise "deploy in one click" are lying. They deploy to their infrastructure, give you a subdomain on their platform, and lock you in. Or they're scaffolding tools that spit out boilerplate code you still have to set up, host, and maintain yourself.

APIForge is different. You describe what you want in plain English. You get a real REST API — running in its own Docker container, with its own isolated PostgreSQL database, accessible at its own subdomain — fully provisioned and live in under a minute.

No YAML. No Dockerfiles. No database setup. No Nginx config. Just describe it, and it exists.

What It Actually Does

Here's a real example. You type:

A task manager with users and tasks. Each task has a title, a completion status, a due date, and belongs to a user.

APIForge:

  1. Parses that into a structured schema (tables, columns, relationships)
  2. Spins up a dedicated PostgreSQL container with your schema initialized
  3. Generates and builds a Node.js + Express API with full CRUD routes
  4. Deploys the API container behind Nginx at task-manager-abc123.apiforgelive.xyz
  5. Generates an OpenAPI spec and serves interactive docs at /docs

Your API is immediately queryable:

# Create a user
curl -X POST https://task-manager-abc123.apiforgelive.xyz/users \
  -H "Content-Type: application/json" \
  -H "X-API-Key: af_live_xxxx" \
  -d '{"name": "Arnab", "email": "arnab@example.com"}'
 
# Create a task
curl -X POST https://task-manager-abc123.apiforgelive.xyz/tasks \
  -H "Content-Type: application/json" \
  -H "X-API-Key: af_live_xxxx" \
  -d '{"title": "Write blog post", "done": false, "due_date": "2026-02-01", "user_id": 1}'
 
# Get all tasks
curl https://task-manager-abc123.apiforgelive.xyz/tasks \
  -H "X-API-Key: af_live_xxxx"

This is real data, in a real PostgreSQL database, returning real JSON. Not a mock.

Why Not Just Use Supabase / Firebase / PlanetScale?

Fair question. Those are great products. APIForge isn't competing with managed databases — it's solving a different problem.

With Supabase, you still have to:

  • Create the project
  • Design the schema in the UI or write migrations
  • Set up Row Level Security
  • Write your API layer (or use PostgREST and figure out its quirks)
  • Configure auth

APIForge does all of that from a sentence. The target user is someone who needs a backend for a prototype, a hackathon, a demo, or a side project — and doesn't want to spend two hours on setup before writing a single line of their actual application code.

Full Isolation Per API

Every API provisioned on APIForge gets:

  • Its own PostgreSQL container — not a shared database with a tenant_id column. A separate container, separate credentials, separate data.
  • Its own Docker network — containers for different APIs cannot communicate with each other.
  • Its own Nginx config block — subdomain routing is per-API with no shared upstream.
  • Resource caps — each container is limited to 256MB RAM and 50% of one CPU, so a runaway query can't affect other users.

This is more expensive than a shared-database model. It's a deliberate trade-off for data isolation and the ability to give users direct database credentials without any risk of cross-tenant access.

The Tech Stack

  • Frontend: Next.js dashboard for managing APIs, viewing analytics, and billing
  • Backend: Node.js + Express orchestration server
  • Generated APIs: Node.js + Express (generated code, containerized)
  • Database: PostgreSQL 15 (one container per API, plus one for platform data)
  • Container runtime: Docker Engine
  • Reverse proxy: Nginx (dynamic config, wildcard subdomain)
  • Billing: Stripe (subscriptions, webhooks)
  • Rate limiting: Redis (sliding window counters)
  • Hosting: Single VPS — no Kubernetes, no cloud overhead

What's In This Series

This is the first of four posts covering how APIForge is built:

  1. This post — What it does and why
  2. Dynamic Container Provisioning & Isolated PostgreSQL — The provisioning pipeline, from LLM schema extraction to a running container
  3. Subdomain Routing, OpenAPI Generation & API Management — Dynamic Nginx config, auto-generated docs, rate limiting, and request logging
  4. Stripe Billing & Production Deployment — Tier enforcement, webhook handling, and what shipping to prod looked like

APIForge is live at apiforgelive.xyz. Source at github.com/ishikabhoyar/apiforge.

Manage API access across your organization:

// Admin creates team-scoped keys
const teamKey = await apiforge.keys.create({
  name: "Marketing Team",
  quota: {
    monthlyBudget: 500, // $500/month
    requestsPerMinute: 100,
  },
  allowedModels: ["gpt-4", "claude-3"],
  allowedIPs: ["203.0.113.0/24"],
});
 
// Team member uses key with automatic quota enforcement
const client = new APIForge({ apiKey: teamKey.key });

Enterprise Features:

  • Team-based quota management
  • IP whitelisting
  • Model access controls
  • Audit logging
  • Chargeback reporting

Intelligent Routing Strategies

Strategy 1: Cost-Optimized Routing

APIForge automatically selects the cheapest provider for your request:

// Real-time pricing comparison
const pricing = {
  "gpt-4": { input: 0.03, output: 0.06 }, // per 1K tokens
  "claude-3-opus": { input: 0.015, output: 0.075 },
  "gemini-pro": { input: 0.00025, output: 0.0005 },
};
 
// APIForge calculates expected cost and routes accordingly
const response = await client.chat.completions.create({
  model: "auto",
  routingStrategy: "cheapest",
  messages: [{ role: "user", content: longPrompt }],
});
 
// Response includes cost breakdown
console.log(response.usage.cost); // { provider: 'gemini-pro', cost: 0.002 }

Strategy 2: Latency-Optimized Routing

Route to the fastest provider based on real-time latency:

const response = await client.chat.completions.create({
  model: "auto",
  routingStrategy: "fastest",
  messages: [{ role: "user", content: "Quick response needed" }],
});

How it Works:

  • Real-time latency monitoring per provider
  • Geographic routing (nearest provider)
  • Historical performance data
  • Automatic failover on timeout

Strategy 3: Quality-Optimized Routing

Use the best model for the task:

const response = await client.chat.completions.create({
  model: "auto",
  routingStrategy: "quality",
  task: "code-generation", // or 'creative-writing', 'analysis', etc.
  messages: [{ role: "user", content: "Write a Python function" }],
});

Real-Time Billing & Usage Tracking

Usage Dashboard

// Get real-time usage statistics
const usage = await client.usage.get({
  period: "today",
});
 
console.log(usage);
// {
//   requests: 1247,
//   tokens: { input: 89234, output: 34521 },
//   cost: 12.34,
//   breakdown: {
//     'openai:gpt-4': { requests: 850, cost: 9.50 },
//     'anthropic:claude-3': { requests: 397, cost: 2.84 }
//   }
// }

Cost Attribution

// Tag requests for cost tracking
const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: prompt }],
  metadata: {
    project: "chatbot-v2",
    environment: "production",
    userId: "user_123",
  },
});
 
// Query costs by tag
const costs = await client.billing.costs({
  groupBy: "project",
  period: "month",
});

Budget Alerts

// Set up spending alerts
await client.billing.alerts.create({
  type: "budget",
  threshold: 100, // $100
  period: "monthly",
  action: "notify", // or 'block'
  notification: {
    email: "admin@example.com",
    slack: "https://hooks.slack.com/...",
  },
});

Rate Limiting & Quota Management

APIForge provides sophisticated rate limiting at multiple levels:

Per-Key Rate Limits

// Create key with specific rate limits
const key = await apiforge.keys.create({
  name: "High-Volume Key",
  limits: {
    requestsPerMinute: 1000,
    tokensPerDay: 10000000,
    concurrentRequests: 50,
  },
});

User-Level Quotas

// Enforce quotas per end-user
const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: prompt }],
  user: "user_123", // Quota enforced per user
});
 
// Get user's quota status
const quota = await client.quotas.get({ userId: "user_123" });
// { used: 850, limit: 1000, resetAt: '2025-02-05T00:00:00Z' }

Tiered Pricing

// Different tiers with different quotas
const tiers = {
  free: {
    requestsPerDay: 100,
    maxTokens: 10000,
    models: ["gpt-3.5-turbo"],
  },
  pro: {
    requestsPerDay: 10000,
    maxTokens: 1000000,
    models: ["gpt-4", "claude-3", "gemini-pro"],
  },
  enterprise: {
    requestsPerDay: Infinity,
    maxTokens: Infinity,
    models: "*",
    sla: "99.9%",
  },
};

Edge Architecture & Performance

Global Distribution

APIForge is deployed on Cloudflare Workers, ensuring:

Request Flow:
User (Tokyo) → Nearest Edge (Tokyo) → APIForge → OpenAI API
                     ↓
              1-5ms overhead
                     ↓
              Total: ~50-100ms (vs 200-400ms without edge)

Performance Benchmarks

Operation                    | Direct API | APIForge
-----------------------------|------------|----------
Simple completion (100 tokens) | 80ms     | 85ms
Large completion (2000 tokens) | 3.2s     | 3.21s
Provider failover              | N/A      | 150ms
Rate limit check               | N/A      | <1ms
Usage tracking                 | N/A      | <1ms

Key Insight: APIForge adds only 1-5ms overhead while providing routing, billing, and failover.

API Schema & Versioning

Unified Schema

APIForge normalizes responses across providers:

// Unified completion response
interface CompletionResponse {
  id: string;
  object: "chat.completion";
  created: number;
  model: string;
  provider: string; // 'openai' | 'anthropic' | 'google'
  choices: Array<{
    index: number;
    message: {
      role: "assistant";
      content: string;
    };
    finish_reason: "stop" | "length" | "content_filter";
  }>;
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
    cost: number; // In USD
  };
}

Version Management

// Use specific API version
const client = new APIForge({
  apiKey: "af_live_xxx",
  version: "v1.1.0",
});
 
// Or auto-upgrade
const client = new APIForge({
  apiKey: "af_live_xxx",
  version: "latest",
});

Security & Compliance

API Key Management

// Create scoped keys
const readOnlyKey = await apiforge.keys.create({
  name: "Analytics Dashboard",
  permissions: ["usage:read", "billing:read"],
  expiresAt: "2025-12-31",
});
 
// Rotate keys without downtime
const newKey = await apiforge.keys.rotate({
  oldKey: "af_live_old_xxx",
  gracePeriod: 86400, // 24 hours
});

Audit Logging

// Every request is logged
const logs = await client.audit.logs({
  startDate: "2025-02-01",
  endDate: "2025-02-04",
  filters: {
    userId: "user_123",
    statusCode: [400, 401, 403],
  },
});
 
// Logs include:
// - Timestamp
// - User/Key
// - Request details
// - Response status
// - Cost incurred
// - Provider used

Best Practices

1. Use Routing Strategies Wisely

// For production: prioritize reliability
const prodConfig = {
  routingStrategy: "quality",
  fallbackProviders: ["anthropic", "google"],
  retryAttempts: 3,
};
 
// For development: prioritize cost
const devConfig = {
  routingStrategy: "cheapest",
  fallbackProviders: [],
  retryAttempts: 1,
};

2. Tag Requests for Analytics

// Always include metadata
const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: prompt }],
  metadata: {
    feature: "chat",
    version: "v2",
    userId: user.id,
    sessionId: session.id,
  },
});

3. Implement Graceful Degradation

async function getAIResponse(prompt) {
  try {
    return await client.chat.completions.create({
      model: "gpt-4",
      messages: [{ role: "user", content: prompt }],
    });
  } catch (error) {
    if (error.code === "QUOTA_EXCEEDED") {
      // Fallback to cheaper model
      return await client.chat.completions.create({
        model: "gpt-3.5-turbo",
        messages: [{ role: "user", content: prompt }],
      });
    }
    throw error;
  }
}

4. Monitor Usage Proactively

// Set up webhooks for real-time alerts
await client.webhooks.create({
  url: "https://your-app.com/webhooks/apiforge",
  events: [
    "quota.warning", // 80% quota used
    "quota.exceeded",
    "provider.down",
    "unusual.activity",
  ],
});

Cost Comparison

Scenario: 1M Requests/Month

Direct OpenAI:
- 1M requests × GPT-4 × avg 500 tokens
= 500M tokens
= $15,000/month

APIForge (Intelligent Routing):
- 400K requests → GPT-4 (high-quality tasks) = $6,000
- 300K requests → Claude-3 Sonnet (analysis) = $2,250
- 300K requests → Gemini Pro (simple tasks) = $75
= $8,325/month + $50 APIForge fee
= $8,375/month

Savings: $6,625/month (44% reduction)

Conclusion

APIForge revolutionizes how developers integrate AI services:

  1. Single API: One endpoint for all AI providers
  2. Intelligent Routing: Cost, latency, and quality optimization
  3. Real-Time Billing: Track every token and dollar spent
  4. Enterprise-Ready: Rate limiting, quotas, team management
  5. Edge-First: Sub-10ms overhead with global deployment
  6. Developer-Friendly: Clean APIs, comprehensive SDKs

Whether you're building a simple chatbot or a complex multi-model AI platform, APIForge provides the infrastructure to scale reliably while optimizing costs.

Quick Start Checklist

  • Sign up at apiforge.com
  • Create your first API key
  • Install the SDK: npm install @apiforge/sdk
  • Make your first request with automatic provider routing
  • Set up usage alerts and budgets
  • Configure routing strategies for your use case
  • Monitor real-time usage in the dashboard
  • Scale to millions of requests without infrastructure changes

Start building with APIForge today and stop worrying about AI provider complexity!


Next in this series: Part 2 - Deep Dive into Intelligent Routing Strategies where we'll explore advanced routing algorithms, custom routing logic, and real-world optimization techniques.