FuriosaAI RNGD: power-efficient inference chip, 512 TFLOPS FP8 at 150-200 W
Pros
- Outstanding performance per watt
- Air-coolable inference option
- Strong LLM efficiency
- Independent NVIDIA alternative
Cons
- Small software ecosystem
- Inference-focused (not training)
- Limited availability
- Newer vendor risk
✓ Where it shines / best for
- Power- and cost-efficient LLM inference in data centers
- Enterprises wanting a low-TDP alternative to inference GPUs
- On-prem inference deployments in standard PCIe servers
✕ Not the best fit for
- Large-scale model training (it is inference-focused)
- CUDA-only workloads without porting to the Furiosa SDK
- Consumer or edge/mobile on-device use
Features
- ✓ AI inference
- ✓ Data-center scale
- ✓ LLM
- ✓ Energy Efficient
- ✓ HBM3
- ✓ PCIe
- ✓ Generative AI
- ✓ Low-power / efficient
Pricing
| Plan | Price | Billing | Notes |
|---|---|---|---|
| Card/system (purchase) | Not publicly listed | One-time | RNGD PCIe inference card, sold to enterprises through direct sales and server partners. Contact FuriosaAI for quotes or evaluation units. |
Pricing verified from the official source. Prices change often — confirm on the vendor's site before buying.
Specifications
| Spec | Value |
|---|---|
| Use | Data center inference |
| Power | 150-200 W |
| Memory | HBM3 |
| Performance | 512 TFLOPS FP8 / 512 TOPS INT8 |
| Architecture | Tensor Contraction Processor (TSMC 5nm) |
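The performance-per-watt claim can be sanity-checked from the listed figures alone. The sketch below is a rough back-of-the-envelope calculation, not a benchmark: it assumes the 512 TFLOPS FP8 figure is a peak number and that 150-200 W is the board power range, both taken as listed. The function name `tflops_per_watt` is illustrative and not part of any vendor tool.

```python
def tflops_per_watt(tflops: float, watts: float) -> float:
    """Return peak throughput per watt for a single card."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tflops / watts


# Figures as listed in the specifications (assumed to be peak values).
PEAK_FP8_TFLOPS = 512.0
POWER_RANGE_W = (150.0, 200.0)

# Lower power draw gives the higher efficiency bound, and vice versa.
best = tflops_per_watt(PEAK_FP8_TFLOPS, POWER_RANGE_W[0])
worst = tflops_per_watt(PEAK_FP8_TFLOPS, POWER_RANGE_W[1])

print(f"{worst:.2f} to {best:.2f} peak FP8 TFLOPS per watt")
# prints "2.56 to 3.41 peak FP8 TFLOPS per watt"
```

Sustained efficiency on real LLM workloads will be lower than these peak ratios, so treat the range as an upper bound when comparing against other accelerators.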