
Get Started with JetStream

A throughput- and memory-optimized engine for LLM inference on XLA devices.

Getting Started

1. Read the official documentation

The JetStream team maintains comprehensive docs that cover installation, configuration, and common patterns; a minimal client sketch follows these steps.

Open JetStream Docs
2. Visit the project site

JetStream is open source; visit the JetStream website to get the code and review setup requirements.

Visit JetStream
3. Review strengths, tradeoffs, and alternatives

Our full tool profile covers JetStream's strengths, weaknesses, pricing, and how it compares to alternatives.

View full profile
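
To give a concrete feel for the patterns the docs describe, below is a minimal Python sketch of a streaming request against a locally running JetStream server. The stub module path (jetstream.core.proto), the Orchestrator service, the Decode RPC, the request and response field names, and the port are assumptions based on the JetStream repository layout; verify them against the official documentation before relying on this.

# Minimal sketch: stream one prompt through a local JetStream server.
# ASSUMPTIONS: generated gRPC stubs live under jetstream.core.proto, the
# service is named Orchestrator with a server-streaming Decode RPC, and the
# field names below match the proto. Confirm all of these against the docs.
import grpc

from jetstream.core.proto import jetstream_pb2
from jetstream.core.proto import jetstream_pb2_grpc


def stream_decode(prompt: str, address: str = "localhost:9000") -> str:
    """Sends one prompt and concatenates the streamed text chunks."""
    with grpc.insecure_channel(address) as channel:
        stub = jetstream_pb2_grpc.OrchestratorStub(channel)
        request = jetstream_pb2.DecodeRequest(
            text_content=jetstream_pb2.DecodeRequest.TextContent(text=prompt),
            max_tokens=64,  # assumed cap on generated tokens
        )
        pieces = []
        # Decode is assumed to be a server-streaming RPC: each response
        # carries the next chunk of generated content.
        for response in stub.Decode(request):
            for sample in response.stream_content.samples:
                pieces.append(sample.text)
        return "".join(pieces)


if __name__ == "__main__":
    print(stream_decode("Why are TPUs well suited to LLM inference?"))

Throughput-oriented deployments would issue many such requests concurrently against the same server rather than one at a time.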

Best For

Teams working on high-throughput applications that require efficient use of TPUs.

Projects focused on optimizing memory and throughput in LLM inference tasks.

Resources