1 post
#llm-inference
vLLM v0.21.0 Production Update: KV Offload and Multi-Server Port Bug
v0.22.0 doesn't exist yet. v0.21.0 ships KV offload and speculative decoding, along with a multi-server port bug that is still under review.
Creeta