Pricing

Explore pay-per-token plans to get started and dedicated resources to scale.

Start for free

Start experimenting with $20 in free credit to fine-tune or run inference on any model.

Choose the right model

Select from a variety of models with different capabilities and price points.

Built to scale

Switch from pay-per-token to dedicated resources to save on usage costs as you scale.

Start with pay-per-token

Choose the right model for your task and only pay for the resources you use.

Prices are per 1,000 tokens. A token is roughly 4 characters, so 1,000 tokens is about 750 words.

Model parameters    Completions usage
6 billion           $0.003 / 1,000 tokens
16 billion          $0.008 / 1,000 tokens
20 billion          $0.010 / 1,000 tokens
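
The pay-per-token arithmetic can be sketched in a few lines of Python. This is a rough estimate only: the function names are hypothetical, the 4-characters-per-token figure is the rule of thumb stated above (a real tokenizer will differ), and how prompt and completion tokens are counted on an invoice is not specified here.

```python
import math

# Prices in USD per 1,000 tokens, copied from the pay-per-token table,
# keyed by model size in billions of parameters.
PRICE_PER_1K_TOKENS = {6: 0.003, 16: 0.008, 20: 0.010}

# Rule of thumb from the text; actual tokenization varies by model and input.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounding up to a whole token."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def completion_cost(tokens: int, model_size_b: int) -> float:
    """Estimated cost in USD for `tokens` tokens on the given model size."""
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model_size_b]


def tokens_for_budget(budget_usd: float, model_size_b: int) -> float:
    """Approximate number of tokens a budget covers at the listed rate."""
    return budget_usd / PRICE_PER_1K_TOKENS[model_size_b] * 1000
```

For instance, `tokens_for_budget(20, 6)` suggests that $20 of credit covers roughly 6.7 million tokens on the 6 billion parameter model at the listed rate.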

Scale with dedicated resources

Dedicated resources are GPUs that host your models. Compared to shared resources (pay-per-token), they offer more reliable latency and higher throughput, and they are more cost-efficient at scale.

Multiple models of the same type and size can be hosted on a single GPU.

Model parameters    GPU usage
6 billion           $1.39 / GPU hour
16 billion          $1.67 / GPU hour
20 billion          $1.94 / GPU hour

Model parameters    GPU usage
6 billion           $2.22 / GPU hour
16 billion          $2.50 / GPU hour
20 billion          $2.78 / GPU hour

Model parameters    GPU usage
6 billion           $2.78 / GPU hour
16 billion          $3.88 / GPU hour
20 billion          $4.17 / GPU hour
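
To get a feel for when dedicated resources start to pay off, the sketch below computes the hourly token volume at which one GPU hour costs the same as pay-per-token usage. It rests on several assumptions: the default GPU rates are taken from the first (lowest-priced) of the three GPU tables, whose labels are not shown here; one GPU is kept busy for the full hour; and the GPU can actually sustain that throughput, which this page does not state. The function name is hypothetical.

```python
# USD per 1,000 tokens on shared (pay-per-token) resources.
PRICE_PER_1K_TOKENS = {6: 0.003, 16: 0.008, 20: 0.010}

# USD per GPU hour, assumed to be the lowest-priced of the three GPU tables.
GPU_PRICE_PER_HOUR = {6: 1.39, 16: 1.67, 20: 1.94}


def break_even_tokens_per_hour(model_size_b, gpu_price_per_hour=None):
    """Tokens per hour at which one GPU hour equals the pay-per-token cost.

    Above this volume a dedicated GPU is cheaper, provided it can serve
    that many tokens in an hour (not verified here).
    """
    if gpu_price_per_hour is None:
        gpu_price_per_hour = GPU_PRICE_PER_HOUR[model_size_b]
    return gpu_price_per_hour / PRICE_PER_1K_TOKENS[model_size_b] * 1000


if __name__ == "__main__":
    for size in sorted(PRICE_PER_1K_TOKENS):
        tokens = break_even_tokens_per_hour(size)
        print(f"{size}B parameters: about {tokens:,.0f} tokens per hour")
```

Pass a different `gpu_price_per_hour` (for example 2.22 or 2.78 for the 6 billion parameter model) to compare against the rates in the other two tables.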

Fine-tune your models

Train your own custom models by fine-tuning on your training data.

When using your fine-tuned model, you’ll be billed at the same rate as the base model.

Model parameters    Training cost
6 billion           $0.0002 / training example
16 billion          $0.0005 / training example
20 billion          $0.0008 / training example
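
A fine-tuning budget can be estimated by multiplying the number of training examples by the per-example rate. The sketch below assumes each example is billed once; whether examples are counted again for each training epoch is not stated here. The function name is hypothetical.

```python
# USD per training example, copied from the fine-tuning table,
# keyed by model size in billions of parameters.
TRAINING_PRICE_PER_EXAMPLE = {6: 0.0002, 16: 0.0005, 20: 0.0008}


def fine_tuning_cost(num_examples: int, model_size_b: int) -> float:
    """Estimated training cost in USD, assuming each example is billed once."""
    if num_examples < 0:
        raise ValueError("num_examples must be non-negative")
    return num_examples * TRAINING_PRICE_PER_EXAMPLE[model_size_b]
```

Under that assumption, 10,000 training examples on the 6 billion parameter model come to about $2.00, and the same dataset on the 20 billion parameter model to about $8.00.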

Frequently asked questions

What is a token?
What is a GPU hour?
How will I know how many tokens or GPU hours I use?

Ready to get started?

Start fine-tuning and deploying language models or explore Forefront Solutions.

Transparent, flexible pricing

Pay per token or per hour with flat-rate hourly GPUs. No hidden fees or confusing math.

Start your integration

Get up and running with your models in just a few minutes.
