Model Fleet

Choose Your Intelligence Level

Select the right neural model for your workload, from lightning-fast edge models to our most capable reasoning engine.

Edge / Fast

Lightning fast responses for simple tasks and extraction.

Context Window: 8,000 tokens

Generation Speed: Very High (~120 t/s)

Input (per 1M tokens): $0.10

Output (per 1M tokens): $0.20

Balanced / Capable

The sweet spot of intelligence, speed, and cost efficiency.

Context Window: 128,000 tokens

Generation Speed: High (~50 t/s)

Input (per 1M tokens): $0.50

Output (per 1M tokens): $1.50

Heavy / Reasoning

Maximum intelligence for solving the hardest multi-step problems.

Context Window: 200,000 tokens

Generation Speed: Moderate (~20 t/s)

Input (per 1M tokens): $5.00

Output (per 1M tokens): $15.00
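
To compare tiers for a concrete workload, the listed prices and speeds can be turned into a rough per-request estimate. The sketch below is illustrative only: the tier keys ("edge", "balanced", "heavy"), the Tier class, and the estimate function are hypothetical names, not part of any published SDK. It also assumes that the context window covers input and output tokens combined, and that generation time depends only on output tokens at the approximate listed speed, ignoring prompt processing and network latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    context_window: int        # tokens
    tokens_per_second: float   # approximate generation speed from the listing
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Figures copied from the tier listing; the dictionary keys are hypothetical.
TIERS = {
    "edge": Tier("Edge / Fast", 8_000, 120.0, 0.10, 0.20),
    "balanced": Tier("Balanced / Capable", 128_000, 50.0, 0.50, 1.50),
    "heavy": Tier("Heavy / Reasoning", 200_000, 20.0, 5.00, 15.00),
}


def estimate(tier_key: str, input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cost in USD, approximate generation time in seconds)."""
    tier = TIERS[tier_key]
    # Assumption: input and output share one context window.
    if input_tokens + output_tokens > tier.context_window:
        raise ValueError(
            f"{input_tokens + output_tokens} tokens exceed the "
            f"{tier.context_window}-token window of {tier.name}"
        )
    cost = (
        input_tokens * tier.input_per_million
        + output_tokens * tier.output_per_million
    ) / 1_000_000
    seconds = output_tokens / tier.tokens_per_second
    return cost, seconds


if __name__ == "__main__":
    for key in TIERS:
        cost, seconds = estimate(key, input_tokens=2_000, output_tokens=500)
        print(f"{TIERS[key].name}: ${cost:.5f}, about {seconds:.1f} s")
```

For a request with 2,000 input tokens and 500 output tokens, this works out to about $0.0003 on Edge, $0.00175 on Balanced, and $0.0175 on Heavy, with generation taking roughly 4, 10, and 25 seconds respectively. A 10,000-token prompt would be rejected for Edge under the shared-window assumption.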