llm

Groq

World's fastest AI inference — LPU chips delivering 500+ tokens/sec Groq offers ultra-fast inference for open-source LLMs (Llama 3, Mixtral, Gemma) using proprietary LPU hardware.

Visit Groq

Pricing

API / Token Pricing

Model / Plan

Price

Unit

Llama 4 Scout Input

$0.11

1M tokens (input)

Llama 4 Scout Output

$0.34

1M tokens (output)

Llama 4 Maverick Input

$0.5

1M tokens (input)

Llama 4 Maverick Output

$0.77

1M tokens (output)

Llama 3.3 70B Input

$0.59

1M tokens (input)

Llama 3.3 70B Output

$0.79

1M tokens (output)

Llama 3.1 8B Input

$0.05

1M tokens (input)

Llama 3.1 8B Output

$0.08

1M tokens (output)

Features

context window131072

api available✓

supports vision✓

function calling✓

open source✗

streaming✓

free tier✓

Links

Official Website Full Changelog

Data Freshness

Pricing updated: 19m ago