LLM Inference Engineering
- Subtitle
- Quantization, KV-Cache Optimization, and High-Throughput Serving: A Production Engineer's Guide to INT4/INT8 Quantization, vLLM, TGI, Speculative Decoding, and Cost Optimization
- Author
- Chatvariety Team
- ISBN
- 9798180985187
- Language
- English
- Weight
- 122 grams
- Publication date
- 10 June 2026
- Publisher
- Independently Published
- Pages
- 84
