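feat(router): integrate NVIDIA multi-head classifier for 3-tier intelligent routing

- Add nvidia_router.py: manually load the NVIDIA prompt-task-and-complexity-classifier model
- DeBERTa-v3-base backbone + 8 classification heads (task_type/creativity/reasoning/domain, etc.)
- Combine the multi-dimensional scores to route into three tiers: simple/medium/complex
- Mapping: simple->qwen-flash, medium->qwen-plus, complex->qwen-max
- Switch main.py to NVIDIA routing, replacing the RouteLLM BERT binary classifier
- Remove the LiteLLM dependency to resolve version conflicts; call the API with native httpx
- Bump version to v0.3.0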
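
The tier selection described above can be sketched as follows. This is a minimal illustration, not the contents of nvidia_router.py: the tier-to-model mapping comes from the commit message, while the head names used as keys, the weights, the thresholds, and the function names are all hypothetical.

```python
from typing import Mapping

# Tier-to-model mapping as stated in the commit message.
TIER_TO_MODEL = {
    "simple": "qwen-flash",
    "medium": "qwen-plus",
    "complex": "qwen-max",
}

# Hypothetical weights and thresholds, chosen only for illustration;
# the values actually used by the router are not given in the commit.
WEIGHTS = {"creativity": 0.4, "reasoning": 0.4, "domain": 0.2}
MEDIUM_THRESHOLD = 0.3
COMPLEX_THRESHOLD = 0.6


def combined_score(head_scores: Mapping[str, float]) -> float:
    """Weighted sum of per-head scores; heads missing from the input count as 0."""
    return sum(w * head_scores.get(name, 0.0) for name, w in WEIGHTS.items())


def pick_tier(head_scores: Mapping[str, float]) -> str:
    """Map the combined score onto one of the three tiers."""
    score = combined_score(head_scores)
    if score >= COMPLEX_THRESHOLD:
        return "complex"
    if score >= MEDIUM_THRESHOLD:
        return "medium"
    return "simple"


def pick_model(head_scores: Mapping[str, float]) -> str:
    """Resolve the tier to the model name from the commit's mapping."""
    return TIER_TO_MODEL[pick_tier(head_scores)]
```

In this sketch the per-head scores would come from the classifier's output heads; a prompt with uniformly low scores resolves to qwen-flash and one with uniformly high scores to qwen-max.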
@@ -7,5 +7,7 @@ httpx>=0.25.0
python-dotenv>=1.0.0
transformers>=4.30.0
torch>=2.0.0
# NVIDIA Multi-head Classifier for 3-tier routing
# nvidia/prompt-task-and-complexity-classifier will be loaded via transformers
pytest>=7.4.0
pytest-asyncio>=0.21.0