GPU / CPU

Fastino offers two types of model-serving endpoints—CPU and GPU—designed to support different usage patterns and performance needs.

CPU Endpoints

CPU endpoints are the default option for Build (Free) tier users. They provide a cost-effective, energy-efficient environment for lightweight, experimental, or offline workloads where latency and throughput are less critical.

Key details:

✅ Available on Build, Pro, and Team plans

✅ Energy efficient

❌ Batching not supported – only one input per request is allowed
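
Because CPU endpoints accept only one input per request, a client holding several inputs has to send them as separate requests. The following is a minimal sketch of that split. The function name and the `{"input": ...}` payload shape are hypothetical placeholders for illustration, not the documented Fastino request schema.

```python
from typing import Dict, Iterator, List


def single_input_payloads(inputs: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one request payload per input, since CPU endpoints do not batch.

    The {"input": ...} shape is an assumed placeholder, not the real schema.
    """
    for text in inputs:
        yield {"input": text}


# Two inputs become two separate request payloads.
payloads = list(single_input_payloads(["first text", "second text"]))
print(len(payloads))
```

Each payload would then be sent in its own request, so total processing time grows with the number of inputs.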

GPU Endpoints

GPU endpoints deliver high-throughput, low-latency performance and are exclusively available to Pro and Team users. They’re ideal for real-time production use cases and high-volume processing.

Key details:

✅ Available only on Pro and Team plans

✅ Supports both batched and single-input requests

✅ Lower latency

✅ Higher throughput
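
The plan and batching rules on this page can be captured in a small client-side helper. This is only a sketch: the plan and endpoint identifiers (`"build"`, `"cpu"`, and so on), the function names, and the default `batch_size` of 8 are assumptions made for illustration, and this page does not state a batch size limit.

```python
from typing import Dict, List, Set

# Plan availability as described on this page; the lowercase keys are assumed names.
ENDPOINTS_BY_PLAN: Dict[str, Set[str]] = {
    "build": {"cpu"},
    "pro": {"cpu", "gpu"},
    "team": {"cpu", "gpu"},
}


def can_use(plan: str, endpoint: str) -> bool:
    """Return True if the given plan may call the given endpoint type."""
    return endpoint in ENDPOINTS_BY_PLAN.get(plan.lower(), set())


def group_requests(inputs: List[str], endpoint: str, batch_size: int = 8) -> List[List[str]]:
    """Group inputs into per-request lists.

    GPU requests may carry several inputs; CPU requests carry exactly one.
    batch_size is a hypothetical client-side choice, not a documented limit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    size = batch_size if endpoint == "gpu" else 1
    return [inputs[i:i + size] for i in range(0, len(inputs), size)]


print(can_use("build", "gpu"))
print(group_requests(["a", "b", "c"], "gpu", batch_size=2))
```

Checking availability before building requests lets a Build-tier client fall back to single-input CPU requests instead of failing on a GPU call.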

Last updated 1 day ago
