GPU / CPU

Fastino offers two types of model-serving endpoints, CPU and GPU, designed to support different usage patterns and performance needs.

CPU Endpoints

CPU endpoints are the default option for Build (Free) tier users and provide a cost-effective, energy-efficient environment for lightweight, experimental, or offline workloads where latency and throughput are less critical.

Key details:

✅ Available on Build, Pro, and Team plans

✅ Energy efficient

❌ Batching not supported – only one input per request is allowed
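
Because a CPU endpoint accepts only one input per request, a list of texts has to be sent as separate requests. The following is a minimal sketch of that split. The function name `build_cpu_requests` and the `{"text": ...}` body shape are hypothetical placeholders for illustration, not Fastino's documented request schema.

```python
from typing import Dict, List


def build_cpu_requests(inputs: List[str]) -> List[Dict[str, str]]:
    """Return one request body per input text.

    CPU endpoints do not support batching, so each body carries exactly
    one input. The {"text": ...} shape is a hypothetical placeholder.
    """
    return [{"text": text} for text in inputs]


if __name__ == "__main__":
    bodies = build_cpu_requests(["first text", "second text", "third text"])
    # Three inputs produce three separate request bodies.
    print(len(bodies))
```

Each body would then be sent in its own call, so total latency grows with the number of inputs.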

GPU Endpoints

GPU endpoints deliver high-throughput, low-latency performance and are exclusively available to Pro and Team users. They’re ideal for real-time production use cases and high-volume processing.

Key details:

✅ Available only on Pro and Team plans

✅ Fully supports both batched and single-input requests

✅ Lower latency

✅ Higher throughput
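
The plan and batching rules for the two endpoint types can be sketched as a small selection helper. This is an illustration only: the function `choose_endpoint`, the lowercase plan strings, and the policy of preferring GPU for latency-sensitive work are assumptions, not part of the Fastino API.

```python
# Plan availability as stated on this page (lowercase names are an assumption).
CPU_PLANS = {"build", "pro", "team"}
GPU_PLANS = {"pro", "team"}


def choose_endpoint(plan: str, batch_size: int = 1,
                    latency_sensitive: bool = False) -> str:
    """Pick "cpu" or "gpu" for a request (hypothetical helper)."""
    if plan not in CPU_PLANS:
        raise ValueError(f"unknown plan: {plan!r}")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    has_gpu = plan in GPU_PLANS
    if batch_size > 1:
        # Batching is a hard requirement: only GPU endpoints accept it.
        if not has_gpu:
            raise ValueError(
                "batching needs a GPU endpoint (Pro or Team); "
                "on CPU, send inputs one at a time"
            )
        return "gpu"
    # Single input: use GPU only when low latency matters and the plan allows it.
    return "gpu" if (latency_sensitive and has_gpu) else "cpu"


if __name__ == "__main__":
    print(choose_endpoint("build"))
    print(choose_endpoint("pro", batch_size=8))
```

Treating batching as a hard constraint and latency as a preference means a Build user is never blocked from single-input work, while a batched request on a plan without GPU access fails early with a clear message.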


Last updated 1 day ago