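- Add nvidia_router.py: manually load the NVIDIA prompt-task-and-complexity-classifier model
- DeBERTa-v3-base backbone + 8 classification heads (task_type/creativity/reasoning/domain, etc.)
- Combine the scores across dimensions to implement three-tier simple/medium/complex routing
- Mapping: simple->qwen-flash, medium->qwen-plus, complex->qwen-max
- main.py switches to NVIDIA routing, replacing the RouteLLM BERT binary classifier
- Remove the LiteLLM dependency to resolve version conflicts; call the API with native httpx instead
- Bump version to v0.3.0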
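
The three-tier routing described in the commit message could look roughly like the sketch below. Only the tier-to-model mapping comes from the commit message. The helper names (`combine_scores`, `tier_for_score`, `route`), the weighted-average scoring, and the 0.35/0.65 thresholds are hypothetical and are not taken from nvidia_router.py.

```python
from typing import Mapping

# Tier -> model mapping, as stated in the commit message.
TIER_TO_MODEL = {
    "simple": "qwen-flash",
    "medium": "qwen-plus",
    "complex": "qwen-max",
}


def combine_scores(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-dimension scores (assumed scoring rule)."""
    total_weight = sum(weights.get(name, 0.0) for name in scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions present in scores")
    weighted_sum = sum(value * weights.get(name, 0.0) for name, value in scores.items())
    return weighted_sum / total_weight


def tier_for_score(score: float, low: float = 0.35, high: float = 0.65) -> str:
    """Map a combined score to a tier; the thresholds are illustrative assumptions."""
    if score < low:
        return "simple"
    if score < high:
        return "medium"
    return "complex"


def route(scores: Mapping[str, float], weights: Mapping[str, float]) -> str:
    """Return the model name for a prompt given its per-dimension scores."""
    return TIER_TO_MODEL[tier_for_score(combine_scores(scores, weights))]
```

For example, `route({"reasoning": 0.9, "creativity": 0.8}, {"reasoning": 1.0, "creativity": 1.0})` selects `qwen-max` under these assumed thresholds. The real router may weight the classifier heads differently.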
14 lines
325 B
Plaintext
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
pydantic>=2.5.0
litellm>=1.0.0
tiktoken>=0.5.0
httpx>=0.25.0
python-dotenv>=1.0.0
transformers>=4.30.0
torch>=2.0.0
# NVIDIA Multi-head Classifier for 3-tier routing
# nvidia/prompt-task-and-complexity-classifier will be loaded via transformers
pytest>=7.4.0
pytest-asyncio>=0.21.0