Best use case
Use Inference Optimization Engineer when you need to optimize model serving with batching, quantization, and streaming under deployment-aware latency budgets that preserve output quality.
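One of the serving techniques named above, dynamic batching under a latency budget, can be sketched as a trace-driven simulation: requests are grouped into a batch until the batch is full or the oldest request has waited past its deadline. The function name, parameters, and timings below are illustrative assumptions, not part of any specific serving stack.

```python
def microbatch(requests, max_batch=8, max_wait_ms=5.0):
    """Group requests into batches, flushing when a batch is full or
    the oldest request in it has exceeded its latency budget.

    `requests` is an iterable of (arrival_time_s, payload) tuples,
    sorted by arrival time. Returns a list of payload batches.
    """
    batches = []
    batch = []
    deadline = None  # wall-clock time by which the current batch must flush
    for arrival, payload in requests:
        # Flush if the batch is full or the oldest member's budget expired.
        if batch and (len(batch) >= max_batch or arrival >= deadline):
            batches.append(batch)
            batch = []
        if not batch:
            # Deadline is set by the first request placed in the batch.
            deadline = arrival + max_wait_ms / 1000.0
        batch.append(payload)
    if batch:
        batches.append(batch)
    return batches
```

Raising `max_wait_ms` trades tail latency for larger (more GPU-efficient) batches; a real server would flush on a timer rather than only on new arrivals.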