# RAG Knowledge Base Service - Deployment Guide

## Requirements

- Python 3.11 (must be exactly 3.11; 3.14 cannot be used)
- Ollama (local LLM service)

## 1. Install Python 3.11

Download: https://www.python.org/downloads/release/python-3119/

During installation, check "Add Python to PATH".

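Because the guide requires exactly Python 3.11, it can help to confirm which interpreter is actually being used before installing anything. The following is a minimal sketch; the function names are illustrative and not part of the project.

```python
import sys

REQUIRED = (3, 11)  # major.minor version this guide requires


def is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the major.minor version matches the required one exactly."""
    return tuple(version_info[:2]) == required


def describe(version_info=sys.version_info):
    """Build a one-line status message for the given interpreter version."""
    found = ".".join(str(part) for part in version_info[:2])
    wanted = ".".join(str(part) for part in REQUIRED)
    status = "OK" if is_supported(version_info) else "unsupported"
    return f"Python {found} ({status}, this guide requires {wanted})"


print(describe())
```

Saved under any file name and run with `py -3.11`, it should report `OK`; run with another interpreter, it reports `unsupported`.
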
## 2. Install Ollama

Download: https://ollama.com/download

After installing, run the following commands to download the models:
```bash
ollama pull nomic-embed-text
ollama pull qwen2.5:7b
```

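To check that both models were pulled, one option is to ask the local Ollama server which models it has. This sketch assumes Ollama is listening on its default port 11434 and uses its `/api/tags` endpoint; the helper names are illustrative and not part of the project.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"
REQUIRED_MODELS = ["nomic-embed-text", "qwen2.5:7b"]


def normalize(name):
    """Append Ollama's implicit ':latest' tag when a name has no tag."""
    return name if ":" in name else f"{name}:latest"


def missing_models(tags_response, required=REQUIRED_MODELS):
    """Return the required models that are absent from an /api/tags response."""
    installed = {normalize(m["name"]) for m in tags_response.get("models", [])}
    return [name for name in required if normalize(name) not in installed]


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Query the local Ollama server; raises URLError if it is not reachable."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the Ollama server running, `missing_models(fetch_tags())` should return an empty list once both pulls have finished.
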
## 3. Install dependencies

From the `rag-python` directory, run the command below. (`py` is the Python launcher for Windows; on other systems, use `python3.11` instead.)
```bash
py -3.11 -m pip install -r requirements.txt
```

## 4. Usage

### 1. Add documents

Place the documents you want to index in the `knowledge_docs` folder.

Supported formats: `.txt`, `.md`, `.pdf`, `.docx`

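Before indexing, it can be useful to see which files in the folder have a supported extension and which would likely be ignored. The sketch below is not part of the project. It assumes extensions are matched case-insensitively and only looks at the top level of the folder, since this guide does not say whether `batch_index.py` descends into subfolders.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf", ".docx"}


def split_documents(folder="knowledge_docs"):
    """Return (supported, skipped) lists of files directly inside `folder`."""
    root = Path(folder)
    if not root.is_dir():
        return [], []
    files = sorted(p for p in root.iterdir() if p.is_file())
    # Assumption: extension matching ignores case (e.g. ".PDF" counts as ".pdf").
    supported = [p for p in files if p.suffix.lower() in SUPPORTED_EXTENSIONS]
    skipped = [p for p in files if p.suffix.lower() not in SUPPORTED_EXTENSIONS]
    return supported, skipped


supported, skipped = split_documents()
print(f"{len(supported)} supported file(s), {len(skipped)} with other extensions")
```
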
### 2. Build the index

```bash
py -3.11 batch_index.py
```

Note: scanned PDFs require OCR, which is slow (roughly 5-10 seconds per page).

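The per-page figure can be turned into a rough time budget for a scanned document. This is only back-of-the-envelope arithmetic based on the 5-10 seconds per page quoted above; real throughput depends on the hardware and on the pages themselves.

```python
def estimate_ocr_minutes(pages, seconds_per_page=(5, 10)):
    """Return a (low, high) OCR time estimate in minutes for a scanned PDF."""
    low, high = seconds_per_page
    return pages * low / 60, pages * high / 60


low, high = estimate_ocr_minutes(100)
print(f"100 pages: about {low:.0f}-{high:.0f} minutes")  # 100 pages: about 8-17 minutes
```
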
### 3. Start the service

```bash
py -3.11 app.py
```

By default, the service runs at http://localhost:5000

## 5. FAQ

### Q: A "missing module" error appears?
Install the module named in the error message:
```bash
py -3.11 -m pip install <module-name>
```

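To find every missing module in one pass instead of one error at a time, a short check like the following can be used. It is a sketch, not part of the project, and the module list passed in is a placeholder: this guide does not list the contents of `requirements.txt`.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a spec is malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


# Placeholder list: replace with the import names from your error messages.
print(missing_modules(["json", "pathlib"]))
```

Note that the name used in `import` is not always the name of the pip package (for example, `PIL` is installed as `Pillow`), so the name to pass to `pip install` may need to be looked up.
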
### Q: OCR is very slow?
Scanned PDFs are recognized page by page; a 272-page document takes roughly 20-30 minutes. A GPU makes this much faster.

### Q: How do I test the service?
```bash
curl "http://localhost:5000/api/knowledge/search?query=test"
```

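Queries containing non-ASCII text (for example Chinese) need to be percent-encoded in the URL, which is easy to get wrong in a shell. The sketch below builds the same request from Python using only the standard library. It assumes the endpoint accepts a GET request with a `query` parameter, as in the `curl` command; the response format is not documented in this guide, so the code falls back to plain text when the body is not JSON.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"


def build_search_url(query, base_url=BASE_URL):
    """Return the search URL with the query percent-encoded as UTF-8."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url}/api/knowledge/search?{params}"


def search(query, base_url=BASE_URL, timeout=30):
    """Send the request and return the parsed JSON, or raw text if not JSON."""
    url = build_search_url(query, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body  # assumption: unknown response format, keep it as text
```

With the service running, `search("测试")` sends the request to `http://localhost:5000/api/knowledge/search?query=%E6%B5%8B%E8%AF%95`.
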
## 6. Directory structure

```
rag-python/
├── knowledge_docs/     # put the documents to index here
├── index_data/         # generated index files (created automatically)
├── batch_index.py      # batch indexing script
├── app.py              # web service entry point
└── requirements.txt    # dependency list
```

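A quick way to confirm that a checkout matches this layout is to test for each expected entry. This is a minimal sketch and not part of the project; `index_data/` is left out because it is created automatically.

```python
from pathlib import Path

EXPECTED_FILES = ["batch_index.py", "app.py", "requirements.txt"]
EXPECTED_DIRS = ["knowledge_docs"]  # index_data/ is created automatically


def check_layout(root="."):
    """Return the expected files and folders that are missing under `root`."""
    base = Path(root)
    missing = [name for name in EXPECTED_FILES if not (base / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (base / name).is_dir()]
    return missing


print(check_layout("."))
```

Run from inside `rag-python`, it should print an empty list when nothing is missing.
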