Google's 7th-gen TPU built from the ground up for the age of inference.
Pros
- Massive 9,216-chip single-pod scale
- Excellent perf/watt and price/performance via cloud
- Deep integration with Google Cloud and JAX
Cons
- Cloud-only; cannot be purchased on-prem
- Locked to Google Cloud ecosystem
- Smaller third-party software ecosystem than CUDA
✓ Where it shines / best for
- Large-scale, latency-sensitive LLM inference serving
- Frontier-model and Mixture-of-Experts deployment on Google Cloud
- Enterprises needing high memory-capacity-per-chip for big models
✕ Not the best fit for
- On-premises hardware buyers (cloud-only)
- Tiny or hobbyist workloads
- Teams requiring native CUDA ecosystem
Features
- ✓ LLM
- ✓ API access
- ✓ HBM3E
- ✓ Inference
- ✓ Training
- ✓ High Bandwidth
- ✓ JAX
- ✓ Cloud TPU
- ✓ MoE
Pricing
| Plan | Price | Billing | Notes |
|---|---|---|---|
| Cloud rental (GA pricing) | Contact Google Cloud | per chip-hour | Ironwood (7th-gen TPU) offered via Google Cloud; public per-chip-hour list pricing limited at launch — quoted through sales/committed-use. |
| Committed-use discounts | varies | 1-yr / 3-yr | Standard Google Cloud committed-use discounts (CUDs) apply once generally available. |
Pricing verified from the official source. Prices change often — confirm on the vendor's site before buying.
Specifications
| Spec | Value |
|---|---|
| Memory | 192 GB HBM per chip, ~7.37 TB/s |
| Cooling | liquid-cooled, cloud-only |
| Pod scale | 9,216 chips per pod |
| Architecture | Ironwood (TPU v7 / tpu7x), Google custom ASIC |
| Perf per watt | 2x vs TPU v6e Trillium |
| FP8 performance | ~4.6 PFLOPS peak per chip |
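To put the per-chip figures in pod-level terms, the following is a rough back-of-the-envelope sketch in Python. It simply multiplies the approximate values from the table above by the pod size, assuming perfect linear scaling (an assumption, not a vendor claim). The function names `pod_aggregates` and `flops_per_byte` are hypothetical helpers written for this illustration.

```python
# Per-chip figures copied from the specifications table (approximate values).
CHIPS_PER_POD = 9_216
HBM_GB_PER_CHIP = 192
HBM_TBPS_PER_CHIP = 7.37      # approximate HBM bandwidth, TB/s
FP8_PFLOPS_PER_CHIP = 4.6     # approximate peak FP8 throughput, PFLOPS


def pod_aggregates(chips=CHIPS_PER_POD):
    """Naive pod-level totals, assuming perfect linear scaling (illustrative only)."""
    return {
        "hbm_pb": chips * HBM_GB_PER_CHIP / 1e6,               # GB -> PB (decimal)
        "hbm_bandwidth_pbps": chips * HBM_TBPS_PER_CHIP / 1e3,  # TB/s -> PB/s
        "fp8_eflops": chips * FP8_PFLOPS_PER_CHIP / 1e3,        # PFLOPS -> EFLOPS
    }


def flops_per_byte():
    """Peak FP8 FLOPs available per byte of HBM traffic on one chip."""
    return (FP8_PFLOPS_PER_CHIP * 1e15) / (HBM_TBPS_PER_CHIP * 1e12)


if __name__ == "__main__":
    for name, value in pod_aggregates().items():
        print(f"{name}: {value:.2f}")
    print(f"flops_per_byte: {flops_per_byte():.0f}")
```

These are peak, paper numbers: real workloads will land below them, and the FLOPs-per-byte ratio is only a first hint at whether a given model is compute-bound or memory-bound on this chip.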