262 lines
5.3 KiB
Markdown
262 lines
5.3 KiB
Markdown
|
|
# DeepSeek-R1 32B 配置完成 ✅
|
|||
|
|
|
|||
|
|
## 📋 当前配置
|
|||
|
|
|
|||
|
|
```java
|
|||
|
|
API地址: http://127.0.0.1:11434/v1/chat/completions
|
|||
|
|
模型: deepseek-r1:32b
|
|||
|
|
温度: 0.2 (更精确)
|
|||
|
|
最大Token: 800 (详细反馈)
|
|||
|
|
Top-P: 0.9 (高质量输出)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🎯 32B模型优势
|
|||
|
|
|
|||
|
|
相比7B模型:
|
|||
|
|
- ✅ **更强的语义理解能力**
|
|||
|
|
- ✅ **更准确的评分**
|
|||
|
|
- ✅ **更详细的反馈建议**
|
|||
|
|
- ✅ **更好的中文处理**
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🚀 快速测试(3步)
|
|||
|
|
|
|||
|
|
### **步骤1:验证DeepSeek服务**
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# 测试模型是否可用
|
|||
|
|
curl http://127.0.0.1:11434/api/tags
|
|||
|
|
|
|||
|
|
# 应该看到:
|
|||
|
|
{
|
|||
|
|
"models": [
|
|||
|
|
{
|
|||
|
|
"name": "deepseek-r1:32b",
|
|||
|
|
...
|
|||
|
|
}
|
|||
|
|
]
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### **步骤2:启动Whisper服务**
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
cd Test/python
|
|||
|
|
python whisper_server.py
|
|||
|
|
|
|||
|
|
# 看到:
|
|||
|
|
🎤 本地Whisper语音识别服务
|
|||
|
|
📌 API接口:
|
|||
|
|
http://localhost:5001/health
|
|||
|
|
http://localhost:5001/evaluate
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### **步骤3:重新编译并启动后端**
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
cd Study-Vue-redis
|
|||
|
|
mvn clean package -DskipTests
|
|||
|
|
|
|||
|
|
# 重启后端
|
|||
|
|
# Windows: 双击ry-study-admin.jar
|
|||
|
|
# Linux: java -jar ry-study-admin/target/ry-study-admin.jar
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📊 预期日志
|
|||
|
|
|
|||
|
|
### **启动时:**
|
|||
|
|
```
|
|||
|
|
DeepSeek本地大模型 (URL: http://127.0.0.1:11434/v1/chat/completions,
|
|||
|
|
Model: deepseek-r1:32b,
|
|||
|
|
状态: 运行中✅)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### **评测时:**
|
|||
|
|
```
|
|||
|
|
🎤 Whisper识别结果: 你好世界
|
|||
|
|
🧠 使用DeepSeek智能评分(语义理解)
|
|||
|
|
调用DeepSeek: 你是一位专业的语音评测专家。请对以下语音...
|
|||
|
|
✅ DeepSeek响应成功
|
|||
|
|
✅ DeepSeek智能评测完成: 得分=95, 反馈=发音清晰准确,语言表达流畅自然,完全符合标准要求。建议:保持当前发音水平。
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🎯 32B模型特别优势
|
|||
|
|
|
|||
|
|
### **1. 更智能的同义词识别**
|
|||
|
|
|
|||
|
|
**输入:**
|
|||
|
|
- 标准:"你好"
|
|||
|
|
- 识别:"您好"
|
|||
|
|
|
|||
|
|
**7B模型:** 可能扣分
|
|||
|
|
**32B模型:** ✅ 完全识别为正确(理解礼貌用语)
|
|||
|
|
|
|||
|
|
### **2. 更详细的反馈**
|
|||
|
|
|
|||
|
|
**7B模型反馈:**
|
|||
|
|
```
|
|||
|
|
"发音清晰"
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**32B模型反馈:**
|
|||
|
|
```
|
|||
|
|
"发音清晰准确,语调自然流畅。'您好'作为礼貌用语使用恰当,
|
|||
|
|
体现了良好的语言素养。建议:可以尝试增加语速变化,使表达
|
|||
|
|
更加生动。"
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### **3. 语法纠错能力**
|
|||
|
|
|
|||
|
|
能识别并指出:
|
|||
|
|
- 语序错误
|
|||
|
|
- 用词不当
|
|||
|
|
- 语法问题
|
|||
|
|
- 逻辑不通
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🔧 性能优化建议
|
|||
|
|
|
|||
|
|
### **32B模型资源需求:**
|
|||
|
|
|
|||
|
|
| 配置 | 最低要求 | 推荐配置 |
|
|||
|
|
|------|---------|----------|
|
|||
|
|
| **内存** | 16GB | 32GB |
|
|||
|
|
| **显存** | 12GB | 24GB |
|
|||
|
|
| **推理速度(CPU)** | ~8秒 | ~5秒 |
|
|||
|
|
| **推理速度(GPU)** | ~1秒 | ~0.5秒 |
|
|||
|
|
|
|||
|
|
### **如果速度慢,可以:**
|
|||
|
|
|
|||
|
|
1. **启用GPU加速**(如果有NVIDIA显卡)
|
|||
|
|
```bash
|
|||
|
|
# Ollama自动使用GPU,无需配置
|
|||
|
|
# 确认GPU使用:
|
|||
|
|
nvidia-smi
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
2. **使用量化模型**(速度提升2-3倍)
|
|||
|
|
```bash
|
|||
|
|
# 下载4-bit量化版本
|
|||
|
|
ollama pull deepseek-r1:32b-q4
|
|||
|
|
|
|||
|
|
# 修改配置
|
|||
|
|
private static final String MODEL_NAME = "deepseek-r1:32b-q4";
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
3. **降级到7B模型**(如果资源不足)
|
|||
|
|
```bash
|
|||
|
|
ollama pull deepseek-r1:7b
|
|||
|
|
|
|||
|
|
# 修改配置
|
|||
|
|
private static final String MODEL_NAME = "deepseek-r1:7b";
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📝 测试用例
|
|||
|
|
|
|||
|
|
### **测试1:基本评测**
|
|||
|
|
|
|||
|
|
**录音内容:** "你好世界"
|
|||
|
|
**标准文本:** "你好世界"
|
|||
|
|
|
|||
|
|
**预期结果:**
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"score": 98,
|
|||
|
|
"accuracy": 100,
|
|||
|
|
"fluency": 98,
|
|||
|
|
"completeness": 100,
|
|||
|
|
"pronunciation": 96,
|
|||
|
|
"feedback": "发音准确,表达流畅"
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### **测试2:同义词识别**
|
|||
|
|
|
|||
|
|
**录音内容:** "您好,今天天气非常不错"
|
|||
|
|
**标准文本:** "你好,今天天气很好"
|
|||
|
|
|
|||
|
|
**预期结果:**
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"score": 95,
|
|||
|
|
"accuracy": 98,
|
|||
|
|
"feedback": "语义完全正确。'您好'='你好'(礼貌用语),'非常不错'='很好'(程度副词)"
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### **测试3:语法检查**
|
|||
|
|
|
|||
|
|
**录音内容:** "我昨天去了公园玩"
|
|||
|
|
**标准文本:** "我昨天去公园玩了"
|
|||
|
|
|
|||
|
|
**预期结果:**
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"score": 92,
|
|||
|
|
"accuracy": 95,
|
|||
|
|
"feedback": "语义正确,但语序略有不同。建议:'去公园玩了'更符合中文表达习惯"
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🎯 评分标准(32B模型优化)
|
|||
|
|
|
|||
|
|
### **准确度 (Accuracy)**
|
|||
|
|
- 100%: 完全一致或同义词
|
|||
|
|
- 90-99%: 语义正确,表达略有差异
|
|||
|
|
- 80-89%: 主要内容正确,细节有误
|
|||
|
|
- <80%: 内容有明显错误
|
|||
|
|
|
|||
|
|
### **流利度 (Fluency)**
|
|||
|
|
- 100%: 表达自然流畅
|
|||
|
|
- 90-99%: 基本流畅,略有停顿
|
|||
|
|
- 80-89%: 有明显停顿但可理解
|
|||
|
|
- <80%: 不流畅,影响理解
|
|||
|
|
|
|||
|
|
### **完整度 (Completeness)**
|
|||
|
|
- 100%: 完整表达所有内容
|
|||
|
|
- 90-99%: 内容基本完整
|
|||
|
|
- 80-89%: 遗漏部分内容
|
|||
|
|
- <80%: 内容严重不完整
|
|||
|
|
|
|||
|
|
### **发音 (Pronunciation)**
|
|||
|
|
- 100%: 发音标准清晰
|
|||
|
|
- 90-99%: 发音清楚,略有口音
|
|||
|
|
- 80-89%: 发音可辨,但不够清晰
|
|||
|
|
- <80%: 发音模糊,难以辨认
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## ✅ 配置检查清单
|
|||
|
|
|
|||
|
|
- [x] DeepSeek API地址:http://127.0.0.1:11434
|
|||
|
|
- [x] 模型名称:deepseek-r1:32b
|
|||
|
|
- [x] 温度参数:0.2(精确)
|
|||
|
|
- [x] 最大Token:800(详细反馈)
|
|||
|
|
- [ ] Whisper服务已启动(5001端口)
|
|||
|
|
- [ ] 后端已重新编译
|
|||
|
|
- [ ] 后端服务已重启
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🎉 下一步
|
|||
|
|
|
|||
|
|
1. ✅ **启动Whisper服务**
|
|||
|
|
2. ✅ **重新编译后端**
|
|||
|
|
3. ✅ **在APP中测试录音**
|
|||
|
|
4. ✅ **查看后端日志**
|
|||
|
|
5. ✅ **验证智能评分效果**
|
|||
|
|
|
|||
|
|
**32B模型将提供最智能、最准确的语音评测!** 🧠✨
|