feat: add remote vLLM support with provider abstraction layer

- LLMClient Protocol for provider-agnostic inference
- VLLMClient for OpenAI-compatible /v1/chat/completions API
- LLM client factory with provider routing (ollama/vllm)
- VLLMConfig with VLLM_* environment variable loading
- Updated extractor worker with health check and provider switching
- Updated event classifier to use LLMClient protocol
- Helm values for vLLM configuration
- 18 unit tests + 6 property-based tests
- Full backward compatibility preserved
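
A minimal sketch of how the pieces listed above could fit together, assuming a structural `LLMClient` protocol, a factory keyed on a provider string, and a config object read from `VLLM_*` variables. Only the `call_llm` signature is taken from this commit's diff. The names `create_llm_client`, `StubOllamaClient`, `StubVLLMClient`, `VLLM_BASE_URL`, `VLLM_MODEL`, and all defaults are hypothetical stand-ins, not the names used in the repository.

```python
import asyncio
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Protocol, runtime_checkable


@runtime_checkable
class LLMClient(Protocol):
    """Structural interface: any object with a call_llm() method qualifies."""

    async def call_llm(
        self,
        prompts: dict[str, str],
        json_schema: dict[str, object],
        document_text: str = "",
    ) -> object: ...


@dataclass(frozen=True)
class VLLMConfig:
    # Field names, variable names, and defaults are assumptions for this sketch.
    base_url: str = "http://localhost:8000"
    model: str = "example-model"

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "VLLMConfig":
        # Fall back to the process environment when no mapping is supplied.
        env = os.environ if env is None else env
        return cls(
            base_url=env.get("VLLM_BASE_URL", cls.base_url),
            model=env.get("VLLM_MODEL", cls.model),
        )


class StubOllamaClient:
    """Hypothetical stand-in for the real Ollama client; performs no I/O."""

    async def call_llm(self, prompts, json_schema, document_text=""):
        return {"provider": "ollama", "chars": len(document_text)}


class StubVLLMClient:
    """Hypothetical stand-in for the real vLLM client; performs no I/O."""

    def __init__(self, config: VLLMConfig) -> None:
        self.config = config

    async def call_llm(self, prompts, json_schema, document_text=""):
        return {"provider": "vllm", "model": self.config.model}


def create_llm_client(
    provider: str, env: Optional[Mapping[str, str]] = None
) -> LLMClient:
    """Route a provider name to a client; unknown names raise ValueError."""
    name = provider.strip().lower()
    if name == "ollama":
        return StubOllamaClient()
    if name == "vllm":
        return StubVLLMClient(VLLMConfig.from_env(env))
    raise ValueError(f"unknown LLM provider: {provider!r}")


if __name__ == "__main__":
    client = create_llm_client("vllm", {"VLLM_MODEL": "demo"})
    print(asyncio.run(client.call_llm({}, {}, "hello")))
```

Because the protocol is structural, neither client needs to inherit from `LLMClient`; callers such as the extractor worker or event classifier can depend only on `call_llm`.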
Celes Renata
2026-04-23 08:17:23 +00:00
parent 63e4fb96ea
commit 117b693b19
15 changed files with 1876 additions and 77 deletions
@@ -155,6 +155,19 @@ class OllamaClient:
         if self._owns_client:
             await self._http.aclose()

+    async def call_llm(
+        self,
+        prompts: dict[str, str],
+        json_schema: dict[str, object],
+        document_text: str = "",
+    ) -> ExtractionAttempt:
+        """Public LLM client interface: delegates to _call_ollama().
+
+        Satisfies the LLMClient protocol so OllamaClient can be used
+        interchangeably with VLLMClient.
+        """
+        return await self._call_ollama(prompts, json_schema, document_text)
+
     async def extract(
         self,
         document_text: str,