A high-throughput and memory-efficient inference and serving engine for LLMs
vLLM is a standout tool, offering language-model inference and persistent memory, with native support for REST.
- Leverage language models for enhanced productivity
- Persistent memory for better context
| Attribute | Value |
| --- | --- |
| Type | Tool |
| Language | Python |
| Trust Score | 12.0/100 (New) |
| Protocols | REST |
| Install | `pip install vllm` |
| Source | https://pypi.org/project/vllm/ |
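The REST protocol listed above refers to vLLM's OpenAI-compatible HTTP server (started with, e.g., `vllm serve <model>`). As a minimal sketch, assuming a server on `localhost:8000` and a placeholder model name, a client can build and send a chat-completions request with only the standard library:

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request for a vLLM server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "my-model" and the port are assumptions; match them to your `vllm serve` invocation.
    req = build_chat_request("http://localhost:8000", "my-model", "Hello!")
    # Uncomment with a live server:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
    print(req.full_url)
```

The same endpoint shape works with any OpenAI-compatible client library by pointing its base URL at the vLLM server.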
Add a trust badge to your README:
[vllm trust badge](https://fushu.dev/agent/c8fe6b6ce20e)
Install now and integrate into your workflow in minutes.