What is Throughput?
AI Engineering
The number of AI requests or tokens a system can process per unit of time.
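Throughput measures how much work a system completes over time, typically expressed as requests per second or tokens per second, and so indicates its capacity for handling concurrent AI requests. Common optimization strategies include batching, model parallelism, and efficient infrastructure scaling.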
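As a rough illustration of the definition, the sketch below computes both forms of throughput (requests per second and tokens per second) from a set of completed requests. The `CompletedRequest` class, the `throughput` function, and the sample numbers are hypothetical, chosen only for this example; a real system would take these counts from its own logs or metrics.

```python
from dataclasses import dataclass


@dataclass
class CompletedRequest:
    """One finished request; the field name is illustrative, not from any library."""
    tokens_generated: int


def throughput(requests, window_seconds):
    """Return (requests per second, tokens per second) over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total_tokens = sum(r.tokens_generated for r in requests)
    return len(requests) / window_seconds, total_tokens / window_seconds


# Hypothetical measurement: 4 requests completed within a 2-second window.
completed = [
    CompletedRequest(120),
    CompletedRequest(80),
    CompletedRequest(200),
    CompletedRequest(100),
]
rps, tps = throughput(completed, window_seconds=2.0)
print(f"{rps:.1f} requests/s, {tps:.1f} tokens/s")  # prints "2.0 requests/s, 250.0 tokens/s"
```

The two figures can move independently: a system serving a few long generations may show low requests per second but high tokens per second, so it helps to state which unit a throughput number refers to.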