LoRA Resolver Plugins¶
This directory contains vLLM's LoRA resolver plugins built on the LoRAResolver framework. They automatically discover and load LoRA adapters from a specified local storage path, eliminating the need for manual configuration or server restarts.
Overview¶
LoRA Resolver Plugins provide a flexible way to dynamically load LoRA adapters at runtime. When vLLM receives a request for a LoRA adapter that hasn't been loaded yet, the resolver plugins will attempt to locate and load the adapter from their configured storage locations. This enables:
- Dynamic LoRA Loading: Load adapters on-demand without server restarts
- Multiple Storage Backends: Support for filesystem, S3, and custom backends. The built-in lora_filesystem_resolver requires a local storage path, but custom resolvers can be implemented to fetch from any source.
- Automatic Discovery: Seamless integration with existing LoRA workflows
- Scalable Deployment: Centralized adapter management across multiple vLLM instances
Prerequisites¶
Before using LoRA Resolver Plugins, ensure the following environment variables are configured:
Required Environment Variables¶
- VLLM_ALLOW_RUNTIME_LORA_UPDATING: Must be set to true or 1 to enable dynamic LoRA loading
- VLLM_PLUGINS: If set, must include the desired resolver plugins (comma-separated list)
- VLLM_LORA_RESOLVER_CACHE_DIR: Must be set to a valid directory path for the filesystem resolver
Optional Environment Variables¶
- VLLM_PLUGINS: If not set, all available plugins will be loaded. If set to an empty string, no plugins will be loaded.
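The VLLM_PLUGINS rules above can be modelled with a small function. This is only an illustration of the described behaviour, not vLLM's actual plugin loader; the function name select_plugins and the plugin name some_other_plugin are hypothetical.

```python
from typing import Optional


def select_plugins(available: list[str], vllm_plugins: Optional[str]) -> list[str]:
    """Illustrative model of the VLLM_PLUGINS rules (not vLLM's own code)."""
    if vllm_plugins is None:
        # Variable not set: every available plugin is loaded.
        return list(available)
    # Comma-separated list; an empty string yields no names at all.
    wanted = [name.strip() for name in vllm_plugins.split(",") if name.strip()]
    return [name for name in available if name in wanted]


# "some_other_plugin" is a made-up name used only for this demonstration.
available = ["lora_filesystem_resolver", "some_other_plugin"]
print(select_plugins(available, None))
print(select_plugins(available, ""))
print(select_plugins(available, "lora_filesystem_resolver"))
```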
Available Resolvers¶
lora_filesystem_resolver¶
The filesystem resolver is installed with vLLM by default and enables loading LoRA adapters from a local directory structure.
Setup Steps¶
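- Create the LoRA adapter storage directory, for example /path/to/lora/adapters.
- Set the environment variables listed under Prerequisites, pointing VLLM_LORA_RESOLVER_CACHE_DIR at the storage directory.
- Start the vLLM server. Your base model can be meta-llama/Llama-2-7b-hf. Make sure your Hugging Face token is set in your environment, e.g. export HF_TOKEN=xxx235.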
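The three setup steps can be sketched in Python as follows. The helper name prepare_launch is hypothetical, the temporary directory stands in for /path/to/lora/adapters, and the launch command (the vllm serve CLI with --enable-lora) is an assumption about how the server is started rather than something this page specifies. The sketch only builds the environment and command; it does not start a server.

```python
import os
import tempfile
from pathlib import Path


def prepare_launch(adapter_dir: str, base_model: str) -> tuple[dict, list]:
    """Create the adapter directory and build the environment and command."""
    # Step 1: create the LoRA adapter storage directory.
    Path(adapter_dir).mkdir(parents=True, exist_ok=True)

    # Step 2: environment variables from the Prerequisites section.
    env = dict(os.environ)
    env["VLLM_ALLOW_RUNTIME_LORA_UPDATING"] = "true"
    env["VLLM_PLUGINS"] = "lora_filesystem_resolver"
    env["VLLM_LORA_RESOLVER_CACHE_DIR"] = adapter_dir

    # Step 3 (assumption): start the server with the `vllm serve` CLI.
    command = ["vllm", "serve", base_model, "--enable-lora"]
    return env, command


# A temporary directory stands in for /path/to/lora/adapters.
adapter_dir = os.path.join(tempfile.mkdtemp(), "adapters")
env, command = prepare_launch(adapter_dir, "meta-llama/Llama-2-7b-hf")
print(" ".join(command))
```

To actually launch the server, the returned values could be passed to subprocess.run(command, env=env) on a machine where vLLM is installed and HF_TOKEN is set.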
Directory Structure Requirements¶
The filesystem resolver expects LoRA adapters to be organized in the following structure:
/path/to/lora/adapters/
├── adapter1/
│ ├── adapter_config.json
│ ├── adapter_model.bin
│ └── tokenizer files (if applicable)
├── adapter2/
│ ├── adapter_config.json
│ ├── adapter_model.bin
│ └── tokenizer files (if applicable)
└── ...
Each adapter directory must contain:
- adapter_config.json: Required configuration file. Fields referenced in this document include peft_type, base_model_name_or_path, target_modules, and r.
- adapter_model.bin: The LoRA adapter weights file
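A minimal adapter_config.json might look like the one generated below. The fields peft_type, base_model_name_or_path, r, and target_modules are the ones this document mentions under Troubleshooting; lora_alpha and every value shown are assumptions based on a typical PEFT LoRA configuration, so adjust them to match your adapter.

```python
import json

# Illustrative values only; a real adapter's config is written by the
# training tool that produced the adapter.
adapter_config = {
    "peft_type": "LORA",
    "base_model_name_or_path": "meta-llama/Llama-2-7b-hf",
    "r": 16,
    "lora_alpha": 32,  # assumption: common PEFT field, not named in this document
    "target_modules": ["q_proj", "v_proj"],  # assumption: typical attention projections
}

config_text = json.dumps(adapter_config, indent=2)
print(config_text)
```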
Usage Example¶
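- Prepare your LoRA adapter by placing it in its own subdirectory of the storage path, for example /path/to/lora/adapters/my_sql_adapter/.
- Verify the directory structure: the adapter directory should contain adapter_config.json and adapter_model.bin.
- Make a request using the adapter, passing the adapter directory name as the model name.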
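The verification and request steps can be sketched with the standard library alone. The helper names missing_files and build_completion_request are hypothetical, and the endpoint (an OpenAI-compatible /v1/completions route on http://localhost:8000) is an assumption about how the server is exposed. The request is only constructed here, not sent.

```python
import json
import tempfile
import urllib.request
from pathlib import Path

REQUIRED_FILES = ("adapter_config.json", "adapter_model.bin")


def missing_files(adapter_dir: Path) -> list[str]:
    """Return the required files that are absent from an adapter directory."""
    return [name for name in REQUIRED_FILES if not (adapter_dir / name).is_file()]


def build_completion_request(base_url: str, adapter_name: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a completion request that names the adapter as the model."""
    payload = {"model": adapter_name, "prompt": prompt, "max_tokens": 50}
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A temporary directory stands in for /path/to/lora/adapters.
root = Path(tempfile.mkdtemp())
adapter = root / "my_sql_adapter"
adapter.mkdir()
(adapter / "adapter_config.json").write_text('{"peft_type": "LORA"}')
print(missing_files(adapter))  # the weights file has not been copied yet

request = build_completion_request("http://localhost:8000", "my_sql_adapter", "Write a SQL query")
print(request.full_url)
```

Once the server is running, the request could be sent with urllib.request.urlopen(request).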
How It Works¶
- vLLM receives a request for a LoRA adapter named my_sql_adapter
- The filesystem resolver checks if /path/to/lora/adapters/my_sql_adapter/ exists
- If found, it validates the adapter_config.json file
- If the configuration matches the base model and is valid, the adapter is loaded
- The request is processed normally with the newly loaded adapter
- The adapter remains available for future requests
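The lookup part of this flow can be sketched as a plain function. This is an approximation written for illustration, not the filesystem resolver's actual source; the function name resolve_adapter is hypothetical, and the exact validation the real resolver performs may differ.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def resolve_adapter(cache_dir: Path, base_model: str, lora_name: str) -> Optional[Path]:
    """Return the adapter directory if it exists and its config is acceptable."""
    adapter_dir = cache_dir / lora_name
    config_path = adapter_dir / "adapter_config.json"
    if not config_path.is_file():
        return None  # adapter directory or config file not found
    try:
        config = json.loads(config_path.read_text())
    except json.JSONDecodeError:
        return None  # config is not valid JSON
    if not isinstance(config, dict):
        return None
    if config.get("peft_type") != "LORA":
        return None  # not a LoRA adapter
    if config.get("base_model_name_or_path") != base_model:
        return None  # adapter was trained for a different base model
    return adapter_dir


# A temporary directory stands in for /path/to/lora/adapters.
cache_dir = Path(tempfile.mkdtemp())
adapter_dir = cache_dir / "my_sql_adapter"
adapter_dir.mkdir()
(adapter_dir / "adapter_config.json").write_text(
    json.dumps({"peft_type": "LORA", "base_model_name_or_path": "meta-llama/Llama-2-7b-hf"})
)

print(resolve_adapter(cache_dir, "meta-llama/Llama-2-7b-hf", "my_sql_adapter"))
print(resolve_adapter(cache_dir, "meta-llama/Llama-2-7b-hf", "unknown_adapter"))
```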
Advanced Configuration¶
Multiple Resolvers¶
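You can configure multiple resolver plugins to load adapters from different sources by listing them, comma-separated, in VLLM_PLUGINS (for example, lora_filesystem_resolver,lora_s3_resolver).
Here, lora_s3_resolver is an example of a custom resolver you would need to implement.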
All listed resolvers are enabled; at request time, vLLM tries them in order until one succeeds.
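The "try each resolver in order until one succeeds" behaviour can be illustrated with two stub resolvers. Everything here is a stand-in: resolve_in_order, filesystem_stub, s3_stub, and the s3://example-bucket location are hypothetical and do not come from vLLM.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# A resolver takes (base_model, lora_name) and returns a location or None.
Resolver = Callable[[str, str], Awaitable[Optional[str]]]


async def resolve_in_order(
    resolvers: dict[str, Resolver], base_model: str, lora_name: str
) -> Optional[tuple[str, str]]:
    """Ask each resolver in insertion order; stop at the first that succeeds."""
    for name, resolver in resolvers.items():
        location = await resolver(base_model, lora_name)
        if location is not None:
            return name, location
    return None


async def filesystem_stub(base_model: str, lora_name: str) -> Optional[str]:
    return None  # pretend the adapter is not on local disk


async def s3_stub(base_model: str, lora_name: str) -> Optional[str]:
    return f"s3://example-bucket/{lora_name}"  # hypothetical location


resolvers = {"lora_filesystem_resolver": filesystem_stub, "lora_s3_resolver": s3_stub}
result = asyncio.run(resolve_in_order(resolvers, "meta-llama/Llama-2-7b-hf", "my_sql_adapter"))
print(result)
```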
Custom Resolver Implementation¶
To implement your own resolver plugin:
- Create a new resolver class built on the LoRAResolver framework.
- Register the resolver and include its plugin name in VLLM_PLUGINS.
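The shape of a custom resolver can be sketched without importing vLLM. In a real plugin the class would build on vLLM's LoRAResolver framework and be registered with vLLM itself; here the method name resolve_lora, the ResolvedAdapter type, the REGISTRY dict, and register_resolver are all stand-ins chosen for illustration, so check the vLLM source for the actual base class and registration call.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResolvedAdapter:
    """Hypothetical stand-in for the object a real resolver would return."""
    lora_name: str
    lora_path: str


class ManifestResolver:
    """Hypothetical resolver that looks adapters up in an in-memory manifest."""

    def __init__(self, manifest: dict[tuple[str, str], str]) -> None:
        # Maps (base model, adapter name) to the adapter's location.
        self._manifest = manifest

    async def resolve_lora(self, base_model_name: str, lora_name: str) -> Optional[ResolvedAdapter]:
        path = self._manifest.get((base_model_name, lora_name))
        if path is None:
            return None  # unknown adapter: let another resolver try
        return ResolvedAdapter(lora_name=lora_name, lora_path=path)


# Hypothetical registry; a real plugin would register with vLLM instead.
REGISTRY: dict[str, ManifestResolver] = {}


def register_resolver(name: str, resolver: ManifestResolver) -> None:
    REGISTRY[name] = resolver


manifest = {("meta-llama/Llama-2-7b-hf", "my_sql_adapter"): "/path/to/lora/adapters/my_sql_adapter"}
register_resolver("manifest_resolver", ManifestResolver(manifest))

found = asyncio.run(
    REGISTRY["manifest_resolver"].resolve_lora("meta-llama/Llama-2-7b-hf", "my_sql_adapter")
)
print(found)
```

Returning None for an unknown adapter, rather than raising, fits the multi-resolver behaviour in which resolvers are tried in order until one succeeds.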
Troubleshooting¶
Common Issues¶
- "VLLM_LORA_RESOLVER_CACHE_DIR must be set to a valid directory"
  - Ensure the directory exists and is accessible
  - Check file permissions on the directory
- "LoRA adapter not found"
  - Verify the adapter directory name matches the requested model name
  - Check that adapter_config.json exists and is valid JSON
  - Ensure adapter_model.bin exists in the directory
- "Invalid adapter configuration"
  - Verify peft_type is set to "LORA"
  - Check that base_model_name_or_path matches your base model
  - Ensure target_modules is properly configured
- "LoRA rank exceeds maximum"
  - Check that the r value in adapter_config.json doesn't exceed the max_lora_rank setting
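The adapter-level checks in this list can be bundled into one diagnostic helper. This is a sketch for local troubleshooting, not part of vLLM; the function name diagnose_adapter and its messages are hypothetical, and max_lora_rank must be passed in to match whatever your server is configured with.

```python
import json
import tempfile
from pathlib import Path


def diagnose_adapter(adapter_dir: Path, base_model: str, max_lora_rank: int) -> list[str]:
    """Return human-readable problems found in an adapter directory."""
    problems = []
    if not (adapter_dir / "adapter_model.bin").is_file():
        problems.append("adapter_model.bin is missing")

    config_path = adapter_dir / "adapter_config.json"
    if not config_path.is_file():
        problems.append("adapter_config.json is missing")
        return problems
    try:
        config = json.loads(config_path.read_text())
    except json.JSONDecodeError:
        problems.append("adapter_config.json is not valid JSON")
        return problems
    if not isinstance(config, dict):
        problems.append("adapter_config.json is not a JSON object")
        return problems

    if config.get("peft_type") != "LORA":
        problems.append("peft_type is not LORA")
    if config.get("base_model_name_or_path") != base_model:
        problems.append("base_model_name_or_path does not match the base model")
    if not config.get("target_modules"):
        problems.append("target_modules is empty or missing")
    rank = config.get("r")
    if not isinstance(rank, int):
        problems.append("r is missing or not an integer")
    elif rank > max_lora_rank:
        problems.append(f"r={rank} exceeds max_lora_rank={max_lora_rank}")
    return problems


# Demonstration with a deliberately broken adapter: no weights, rank too high.
adapter_dir = Path(tempfile.mkdtemp()) / "my_sql_adapter"
adapter_dir.mkdir()
(adapter_dir / "adapter_config.json").write_text(json.dumps({
    "peft_type": "LORA",
    "base_model_name_or_path": "meta-llama/Llama-2-7b-hf",
    "target_modules": ["q_proj", "v_proj"],
    "r": 64,
}))
problems = diagnose_adapter(adapter_dir, "meta-llama/Llama-2-7b-hf", max_lora_rank=16)
print(problems)
```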
Debugging Tips¶
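- Enable debug logging, for example by setting VLLM_LOGGING_LEVEL=DEBUG.
- Verify environment variables: confirm that VLLM_ALLOW_RUNTIME_LORA_UPDATING, VLLM_PLUGINS, and VLLM_LORA_RESOLVER_CACHE_DIR hold the expected values.
- Test adapter configuration: confirm that adapter_config.json parses as valid JSON.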
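The environment check can be automated with a short script. The helper name check_environment and its messages are hypothetical; the rules it applies are the ones stated under Prerequisites, interpreted strictly (only the literal values true and 1 are accepted).

```python
import os
from typing import Mapping

RESOLVER_ENV_VARS = (
    "VLLM_ALLOW_RUNTIME_LORA_UPDATING",
    "VLLM_PLUGINS",
    "VLLM_LORA_RESOLVER_CACHE_DIR",
)


def check_environment(env: Mapping[str, str]) -> list[str]:
    """Return problems with the resolver-related environment variables."""
    problems = []
    if env.get("VLLM_ALLOW_RUNTIME_LORA_UPDATING") not in ("true", "1"):
        problems.append("VLLM_ALLOW_RUNTIME_LORA_UPDATING must be true or 1")

    plugins = env.get("VLLM_PLUGINS")
    # Unset means all plugins are loaded, so only a set value can exclude the resolver.
    if plugins is not None:
        names = [name.strip() for name in plugins.split(",")]
        if "lora_filesystem_resolver" not in names:
            problems.append("VLLM_PLUGINS does not include lora_filesystem_resolver")

    cache_dir = env.get("VLLM_LORA_RESOLVER_CACHE_DIR")
    if not cache_dir or not os.path.isdir(cache_dir):
        problems.append("VLLM_LORA_RESOLVER_CACHE_DIR must be set to a valid directory")
    return problems


# Report the current process environment.
for name in RESOLVER_ENV_VARS:
    print(f"{name}={os.environ.get(name, '<unset>')}")
print(check_environment(os.environ))
```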