Compare commits
54 Commits
33f0e48bf5
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| 498d5110e9 | |||
| 851d536b59 | |||
| adc9c76864 | |||
| 624e158be9 | |||
| 5bc40abbc1 | |||
| bd2c457f54 | |||
| 179bfa327b | |||
| c2357ffb67 | |||
| 0d287e7c1f | |||
| 674ee1e1e2 | |||
| 0cf231f9f7 | |||
| f82da3bab1 | |||
| 22a50ad5c6 | |||
| 0d9e427a34 | |||
| ec68b83827 | |||
| 130bbfb090 | |||
| 6e83136dc6 | |||
| f6f4da7d07 | |||
| a2be43d42a | |||
| a4c106fa5a | |||
| f24ca9aa29 | |||
| a537d3825b | |||
| e67931c3ca | |||
| b7cd03434d | |||
| a9d6c4699d | |||
| 3984b81f86 | |||
| d62cd2fcca | |||
| d44a294bf7 | |||
| 57e0029eb1 | |||
| a2ecc7f451 | |||
| 6ae10c9d36 | |||
| 20b2f46533 | |||
| 43ec564daa | |||
| 8cc25b7c2e | |||
| a158e3d6bf | |||
| 71bef2bd06 | |||
| b62d4ff40d | |||
| 272f4440fd | |||
| 1693c1963f | |||
| e614bfcf93 | |||
| 28ea813110 | |||
| 18aff6b945 | |||
| 9c48648b26 | |||
| afeb00ccc4 | |||
| deea6764cf | |||
| 9e20d439bf | |||
| 389486ad6e | |||
| f24acb18cf | |||
| c2836428ca | |||
| 9026aa8f4b | |||
| ff7e114324 | |||
| 1c91dd45ed | |||
| b32549c5df | |||
| 8fcf3bcfe2 |
68
.gitignore
vendored
Normal file
@@ -0,0 +1,68 @@
# Python
__pycache__/
*.py[cod]
*$py.class

# Logs
*.log
integrated_product_system.log

# Databases
*.db
*.sqlite

# IDE
.trae/
.vscode/
.idea/
*.swp
*.swo
*~

# OS
.DS_Store
Thumbs.db

# Test files
*test*.py
*Test*.py
pytest_cache/
.tox/
.coverage
coverage.xml

# Temporary files
*.tmp
*.temp
temp*.txt
*.bak
*.swp
*.swo
*~
temp_*.txt

# Bug and debug files
*debug*.png
*bug*.txt

# Batch files
*.bat

# Output files
*.out
*.output

# Environment
.env
.env.local
.env.*.local

# Documentation build
_build/
build/
dist/
*.egg-info/

# Other
2025年12月*.txt
*.png
68
.trae/documents/实现用户关注数转换功能.md
Normal file
@@ -0,0 +1,68 @@
## Implementation plan

### 1. Update the database structure

* **Modify the `init_database` method**: add a `follows` column to the `product_analysis` table to store the converted follower count

### 2. Add a follower-count conversion method

* **Create the `convert_user_count_to_number` method**: use the Ollama API to convert the `user_count` text into a number
  * Handle different formats: "53 followers" → 53, "1.9K followers" → 1900
  * Call the Ollama API to perform the conversion
  * Return the converted number

### 3. Integrate into the existing analysis flow

* **Modify the `get_product_data` method**: include the `user_count` and `url` columns in the query
* **Update the `analyze_products` method**:
  * Extend the handling of returned rows to include `user_count` and `url`
  * Call the conversion method on the follower count during analysis
  * Pass the converted number to the save method

### 4. Update the save method

* **Modify the `save_analysis_result` method**: add a `follows` parameter and save the converted follower count to the database

### 5. Add a follower-count update pass

* **Create the `analyze_follower_counts` method**:
  * Query all products together with their analysis records
  * For each product, convert `user_count` and update `product_analysis.follows`
  * Update the follower count on analysis records that already exist

### 6. Complete the workflow

* **Update the `run_full_workflow_async` method**: add a step 4 that runs the follower-count update pass

## Expected results

* The new `product_analysis` table contains a `follows` column holding the converted numeric follower count
* Newly analyzed products have their follower count converted and saved automatically
* Existing products get their follower count updated by the extra step
* The Ollama API is used for the conversion, with the aim of keeping it accurate

## Key technical points

* Modifying the SQLite table structure
* Calling the Ollama API and parsing its results
* Converting text to numbers
* Integrating with the existing code
* Batch data processing and updates

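The two formats named in step 2 ("53 followers" → 53, "1.9K followers" → 1900) can also be handled without a model call. The sketch below is an illustration under assumptions, not the project's implementation: the plan delegates the conversion to the Ollama API, whereas this version swaps in a plain regular expression, which could serve as a deterministic fallback or as a sanity check on the model's answer. The name `parse_follower_count` and the K/M/B suffix table are hypothetical.

```python
import re
from typing import Optional

# Multipliers for abbreviated suffixes (assumption: only K, M and B occur).
_SUFFIXES = {"": 1, "k": 1_000, "m": 1_000_000, "b": 1_000_000_000}

# A number with an optional decimal part, followed by an optional suffix letter.
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*([kmb]?)\b", re.IGNORECASE)


def parse_follower_count(text: str) -> Optional[int]:
    """Convert text such as '1.9K followers' to an int, or None if no number is found."""
    if not text:
        return None
    # Drop thousands separators so "1,234" is read as one number.
    match = _PATTERN.search(text.replace(",", ""))
    if match is None:
        return None
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES[suffix.lower()])
```

Returning `None` rather than 0 for unparseable text keeps "unknown" distinguishable from "zero followers" when the value is later written to `product_analysis.follows`.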
5900
2025年11月26日19179.txt
File diff suppressed because it is too large
5795
2025年11月27日19551.txt
File diff suppressed because it is too large
5790
2026年1月15日1991.txt
Normal file
File diff suppressed because it is too large
5820
2026年1月17日16419.txt
Normal file
File diff suppressed because it is too large
5800
2026年1月18日9249.txt
Normal file
File diff suppressed because it is too large
5840
2026年1月21日19238.txt
Normal file
File diff suppressed because it is too large
5795
2026年1月22日18556.txt
Normal file
File diff suppressed because it is too large
5855
2026年1月29日20470.txt
Normal file
File diff suppressed because it is too large
5795
2026年1月31日91239.txt
Normal file
File diff suppressed because it is too large
5800
2026年3月10日183431.txt
Normal file
File diff suppressed because it is too large
5810
2026年3月8日18119.txt
Normal file
File diff suppressed because it is too large
286
README.md
@@ -1,21 +1,60 @@
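# TopHub Data Processing System
# TopHub Data Processing and Product Analysis System

This project processes temporary files scraped from the TopHub website, classifies the data, and stores it in a SQLite database.
This project contains two core functional modules:
1. TopHub website data scraping and processing system
2. ProductHunt product scraping and AI analysis system

## Features

1. **File parsing**: reads temporary files (named "date+time.txt"), treating every 5 lines as one data unit
2. **Data extraction**: extracts the title and link from each data unit
3. **Smart classification**: calls a local API (Ollama) to classify titles automatically
4. **Deduplication**: checks whether the title + date already exists in the database to avoid duplicate entries
5. **Progress display**: shows processing progress with a progress bar
6. **Category standardization**: merges similar categories into standard categories
### TopHub data scraping and processing
- **Website scraping**: scrapes data from tophub.today, with support for iterating over a range of node IDs
- **Smart filtering**: skips the columns named in a filter list
- **Data storage**: saves scraped data to a SQLite database
- **Classification**: calls a local API to classify entries
- **Deduplication**: avoids duplicate entries
- **Category standardization**: merges similar categories automatically

### ProductHunt product analysis
- **Product scraping**: scrapes detailed product information from ProductHunt
- **AI analysis**: calls the Ollama API to analyze product development difficulty
- **Data management**: full management of the product database
- **Follower-count conversion**: converts follower counts from text to numbers
- **Difficulty scoring**: computes a development difficulty score automatically
- **Missing-data backfill**: fills in missing product links and scores automatically

### Data visualization
- **GUI viewer**: a visual data viewer built with PySide6
- **Search and filtering**: supports keyword search and category filtering
- **Category statistics**: shows category statistics in real time
- **Data operations**: supports batch deletion, marking entries as interesting, and adjusting scores

## File descriptions

### Core scripts

1. **process_temp_files.py** - main processing script
1. **tophub_scraper.py** - TopHub website scraping script
   - Scrapes data from tophub.today
   - Filters content according to the filter list
   - Saves data to a temporary file
   - Calls the data import script

2. **product/integrated_product_system.py** - full-featured product scraping and analysis system
   - Combines product scraping and AI analysis
   - Queries ProductHunt links from the tophub database
   - Scrapes detailed product information with Playwright
   - Calls the Ollama API to analyze product development difficulty
   - Manages the product database
   - Provides the complete workflow

3. **db_viewer.py** - TopHub data viewer
   - PySide6 GUI application
   - Displays the scraped data stored in the SQLite database
   - Supports search, filtering, and category statistics
   - Supports clickable links and data operations

### Helper scripts

1. **process_temp_files.py** - temporary file processing script
   - Parses temporary files
   - Calls the API for classification
   - Stores results in the database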
@@ -28,30 +67,76 @@
   - Merges similar categories into standard categories
   - Provides the category mapping rules

### Helper scripts
4. **run_viewer.py** - database viewer launcher script
   - Checks dependencies
   - Launches the SQLite database viewer

1. **check_db.py** - database structure check script
2. **test_api.py** - API test script
3. **view_categories.py** - category sample viewer script
5. **check_db.py** - database structure check script
6. **test_api.py** - API test script
7. **view_categories.py** - category sample viewer script

## Usage

### 1. Process temporary files
### 1. TopHub data scraping

```bash
python process_temp_files.py
python tophub_scraper.py
```

The script will:
- Scan the current directory for all temporary files (named "date+time.txt")
- Parse the file contents and extract titles and links
- Call the local API to classify the titles
- Check for and avoid duplicate data
- Store the results in the tophub_data.db database
- Scrape data from tophub.today
- Filter content according to the filter list (configurable in tophub_ban_column.txt)
- Save the scraped data as a temporary file (format: YYYY年MM月DD日HHMMSS.txt)
- Call the data import script to process the scraped results

### 2. Clean up and standardize categories
### 2. ProductHunt product scraping and analysis

```bash
# Run the full workflow: scraping + analysis + data backfill
python product/integrated_product_system.py

# Analyze only, without scraping
python product/integrated_product_system.py --analyze-only

# Limit the maximum number of products to analyze
python product/integrated_product_system.py --max-products 100
```

Main functions:
- Query ProductHunt links from the tophub database
- Scrape detailed product information with Playwright
- Call the Ollama API to analyze product development difficulty
- Compute the difficulty score automatically
- Convert follower counts to numeric form
- Fill in missing product links
- Re-analyze invalid difficulty scores

### 3. Viewing the data

```bash
# Launch the database viewer
python db_viewer.py
```

Or use the launcher script:

```bash
python run_viewer.py
```

Viewer features:
- Displays the scraped data in the database
- Supports keyword search and category filtering
- Shows category statistics in real time
- Opens links in the browser when clicked
- Supports batch deletion and score adjustment

### 4. Category processing

```bash
# Process temporary files
python process_temp_files.py

# Remove special characters from categories
python cleanup_categories.py

@@ -59,74 +144,118 @@ python cleanup_categories.py
python standardize_categories.py
```

### 3. View the data

```bash
# View category samples
python view_categories.py

# Check the database structure
python check_db.py
```

## Database structure

The database file is `tophub_data.db` and contains the following tables:
### 1. TopHub data database (tophub_data.db)

1. **tophub_entries** - main data table
   - id: primary key
   - text_content: title text (not null)
   - link: link
   - category: category
   - scrape_time: scrape time
Contains the raw data scraped from the TopHub website:

2. **classification_progress** - classification progress table
   - id: primary key
   - total_count: total count
   - processed_count: processed count
   - last_updated: last update time
- **articles** - main data table
  - id: primary key
  - title: title text
  - url: link
  - category: category
  - source_date: source date
  - score: score
  - is_interested: whether marked as interesting

- **classification_progress** - classification progress table
  - id: primary key
  - total_count: total count
  - processed_count: processed count
  - last_updated: last update time
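
The deduplication rule described in the README (skip an entry when the same title already exists for the same date) maps directly onto the `articles` columns listed above. The following is a minimal sketch under assumptions: the column names come from the list above, but the column types, the helper names `create_articles_table` and `insert_if_new`, and the choice to compare on `source_date` are illustrative rather than taken from the project's code.

```python
import sqlite3


def create_articles_table(conn: sqlite3.Connection) -> None:
    """Create an articles table with the documented columns (types are assumed)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS articles (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            title TEXT,
            url TEXT,
            category TEXT,
            source_date TEXT,
            score INTEGER,
            is_interested INTEGER
        )
        """
    )


def insert_if_new(conn: sqlite3.Connection, title: str, url: str, source_date: str) -> bool:
    """Insert an article unless the same title already exists for that date.

    Returns True when a row was inserted and False when it was skipped.
    """
    exists = conn.execute(
        "SELECT 1 FROM articles WHERE title = ? AND source_date = ? LIMIT 1",
        (title, source_date),
    ).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO articles (title, url, source_date) VALUES (?, ?, ?)",
        (title, url, source_date),
    )
    conn.commit()
    return True
```

A `UNIQUE (title, source_date)` constraint combined with `INSERT OR IGNORE` would enforce the same rule inside SQLite itself; the explicit check is shown here because it mirrors the wording of the README.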

### 2. Product analysis database (products.db)

Contains detailed information and analysis results for ProductHunt products:

- **products** - product information table
  - id: primary key
  - url: product link (unique)
  - name: product name
  - introduction: product introduction
  - user_count: user count
  - maker_link: maker link
  - maker_statement: maker statement
  - created_at: creation time
  - updated_at: update time

- **product_analysis** - product analysis results table
  - id: primary key
  - original_name: original product name
  - product_intro: product introduction
  - development_difficulty: development difficulty description
  - ai_response: raw AI response
  - difficulty_score: difficulty score
  - product_link: product link
  - follows: follower count
  - created_at: creation time

## API configuration

The scripts use the local Ollama API for classification:
- API address: http://localhost:11434/api/generate
- Model: gemma3:4b
- Request format: JSON
The project uses the local Ollama API for AI-related tasks:
- **API address**: http://localhost:11434/api/generate
- **Model**: qwen3:8b
- **Request format**: JSON

Main uses:
1. **TopHub data classification**: classifies the scraped titles
2. **Product development difficulty analysis**: analyzes the development difficulty of ProductHunt products
3. **Follower-count conversion**: converts follower counts from text to numbers
4. **Difficulty score calculation**: computes the development difficulty score automatically
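
A request to the endpoint above can be sketched as follows. This is a minimal illustration, not the project's code: the URL and model name come from the configuration above, while the function names `build_payload`, `extract_text`, and `generate` are hypothetical. It uses `urllib` from the standard library instead of `requests` (which the README lists as a dependency) so that it runs without extra packages, and it sets `"stream": False` so that the server returns a single JSON object whose `response` field holds the generated text.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # from the README
MODEL = "qwen3:8b"  # from the README


def build_payload(prompt: str, model: str = MODEL) -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def extract_text(raw: bytes) -> str:
    """Pull the generated text out of a non-streaming response body."""
    return json.loads(raw.decode("utf-8")).get("response", "").strip()


def generate(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the local Ollama server and return its reply."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(response.read())
```

Keeping payload construction and response parsing in separate functions makes both testable without a running Ollama server; only `generate` touches the network.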
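
## Core dependencies

### Base dependencies
- requests: HTTP requests
- sqlite3: database operations
- loguru: logging
- tqdm: progress bars

### Product analysis dependencies
- asyncio: asynchronous programming
- playwright: web scraping
- PySide6: GUI (viewer only)

## Log files

The system produces the following log files:
- **tophub_scraper.log** - TopHub scraping log
- **integrated_product_system.log** - product analysis system log
- **process_temp_files.log** - temporary file processing log
- **cleanup_categories.log** - category cleanup log
- **standardize_categories.log** - category standardization log

## Category standard

The system supports the following standard categories:

1. 科技 (Technology) - new-quality technology, the internet, etc.
2. 社会 (Society) - social news, everyday services, etc.
3. 体育 (Sports) - sports news, football, etc.
4. 历史 (History) - historical events, historical figures, etc.
5. 安全 (Security) - security vulnerabilities, security technology, etc.
6. 军事 (Military) - military news, national defense, etc.
7. 金融 (Finance) - financial news, market analysis, etc.
8. 购物 (Shopping) - e-commerce, shopping, etc.
9. 游戏 (Games) - game news, etc.
10. 娱乐 (Entertainment) - celebrity gossip, music, etc.
11. 健康 (Health) - healthcare, healthy living, etc.
1. 科技 (Technology) - new-quality technology, the internet, artificial intelligence, etc.
2. 社会 (Society) - social news, everyday services, trending events, etc.
3. 体育 (Sports) - sports news, football, basketball, etc.
4. 历史 (History) - historical events, historical figures, archaeological discoveries, etc.
5. 安全 (Security) - security vulnerabilities, network security, data security, etc.
6. 军事 (Military) - military news, national defense, weapons and equipment, etc.
7. 金融 (Finance) - financial news, market analysis, investment, etc.
8. 购物 (Shopping) - e-commerce, shopping, consumer spending, etc.
9. 游戏 (Games) - game news, game development, game reviews, etc.
10. 娱乐 (Entertainment) - celebrity gossip, music, film and television, etc.
11. 健康 (Health) - healthcare, healthy living, fitness, etc.
12. 其他 (Other) - other uncategorized content

## Notes

1. Make sure the local Ollama service is running and reachable
2. Temporary files must be named "date+time.txt"
3. Each data unit contains 5 lines: node ID, category, title, link, and a separator line
4. The database file is created automatically; there is no need to create it by hand

## Log files

The system produces the following log files:
- process_temp_files.log - main processing log
- cleanup_categories.log - category cleanup log
- standardize_categories.log - category standardization log
1. **Ollama service**: make sure the local Ollama service is running and reachable (default port 11434)
2. **Chrome browser**: product scraping requires a running Chrome browser instance (debug port 9222)
3. **Temporary file format**: temporary files produced by the TopHub scraper are named "YYYY年MM月DD日HHMMSS.txt"
4. **Data unit structure**: each data unit contains 5 lines: node ID, category, title, link, and a separator line
5. **Automatic database creation**: all database files are created automatically; there is no need to create them by hand
6. **Dependency installation**: before using the GUI viewer, install its dependencies: `pip install -r requirements_gui.txt`
7. **Filter list configuration**: edit tophub_ban_column.txt to configure which columns are filtered out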
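
The 5-line data unit described in the notes (node ID, category, title, link, separator) can be read with a simple fixed-size grouping. The sketch below is an assumption about how such a reader might look, not the logic of `process_temp_files.py`: the names `Unit` and `iter_units` are hypothetical, and lines are returned as raw text because only the node ID line's label is visible in this README, so any label stripping is left to the caller.

```python
from typing import Iterator, List, NamedTuple

UNIT_SIZE = 5  # node ID, category, title, link, separator line


class Unit(NamedTuple):
    node_id: str
    category: str
    title: str
    link: str


def iter_units(lines: List[str]) -> Iterator[Unit]:
    """Yield one Unit per complete 5-line group; an incomplete trailing group is ignored."""
    cleaned = [line.rstrip("\n") for line in lines]
    for start in range(0, len(cleaned) - UNIT_SIZE + 1, UNIT_SIZE):
        node_id, category, title, link, _separator = cleaned[start:start + UNIT_SIZE]
        yield Unit(node_id, category, title, link)
```

With a file on disk, the lines could be obtained with `open(path, encoding="utf-8").readlines()` before being passed to `iter_units`.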

## Examples

### Temporary file format example
### TopHub scraper temporary file example

```
节点ID: 102
@@ -141,9 +270,18 @@ python check_db.py
--------------------------------------------------
```

### Processing result example
### Product analysis result example

```
Title '女机器人' classified as: 科技
Title '这个应该属于底盘不行吗' classified as: 其他
```
Product 'AI Assistant' analysis complete
- Difficulty description: medium difficulty, requires some AI development experience
- Difficulty score: 60/100
- Follower count: 1500
```

### Database viewer interface

- Displays all scraped data, with real-time search and filtering
- Category statistics are shown at the top
- Clicking a link opens it directly in the browser
- The context menu supports batch operations and score adjustment
@@ -1,106 +0,0 @@
|
||||
2025-11-26 23:10:42.930 | INFO | __main__:check_database_structure:31 - 找到数据库文件: ['product\\product.db', 'product\\products.db']
|
||||
2025-11-26 23:10:42.931 | INFO | __main__:check_database_structure:34 -
|
||||
检查数据库: product\product.db
|
||||
2025-11-26 23:10:42.932 | INFO | __main__:check_database_structure:48 - 数据库中的表:
|
||||
2025-11-26 23:10:42.932 | INFO | __main__:check_database_structure:51 - - products
|
||||
2025-11-26 23:10:42.932 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:42.932 | INFO | __main__:check_database_structure:59 - id (INTEGER)
|
||||
2025-11-26 23:10:42.934 | INFO | __main__:check_database_structure:59 - url (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - name (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - introduction (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - user_count (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - maker_link (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - maker_statement (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - created_at (TEXT)
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:59 - updated_at (TEXT)
|
||||
2025-11-26 23:10:42.935 | SUCCESS | __main__:check_database_structure:64 - 表 products 包含name和introduction字段
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:69 - 示例数据:
|
||||
2025-11-26 23:10:42.935 | ERROR | __main__:check_database_structure:78 - 检查数据库 product\product.db 时出错: 'NoneType' object is not subscriptable
|
||||
2025-11-26 23:10:42.935 | INFO | __main__:check_database_structure:34 -
|
||||
检查数据库: product\products.db
|
||||
2025-11-26 23:10:42.936 | INFO | __main__:check_database_structure:48 - 数据库中的表:
|
||||
2025-11-26 23:10:42.936 | INFO | __main__:check_database_structure:51 - - products
|
||||
2025-11-26 23:10:42.936 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - id (INTEGER)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - url (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - name (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - introduction (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - user_count (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - maker_link (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - maker_statement (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - created_at (TEXT)
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:59 - updated_at (TEXT)
|
||||
2025-11-26 23:10:42.937 | SUCCESS | __main__:check_database_structure:64 - 表 products 包含name和introduction字段
|
||||
2025-11-26 23:10:42.937 | INFO | __main__:check_database_structure:69 - 示例数据:
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:71 - 示例1: name='Pixley AI', introduction='Pixley is the first platform that lets children tu...'
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:71 - 示例2: name='Burner', introduction='Burner is a small, secure computer that keeps your...'
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:71 - 示例3: name='American Ratings Lead Magnet Portal', introduction='Build verified business credibility with the Ameri...'
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:51 - - sqlite_sequence
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:59 - name ()
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:59 - seq ()
|
||||
2025-11-26 23:10:42.938 | WARNING | __main__:check_database_structure:73 - 表 sqlite_sequence 缺少name或introduction字段
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:51 - - product_analysis
|
||||
2025-11-26 23:10:42.938 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - id (INTEGER)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - original_id (INTEGER)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - original_name (TEXT)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - product_name (TEXT)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - product_intro (TEXT)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - development_difficulty (TEXT)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - ai_response (TEXT)
|
||||
2025-11-26 23:10:42.939 | INFO | __main__:check_database_structure:59 - created_at (TIMESTAMP)
|
||||
2025-11-26 23:10:42.939 | WARNING | __main__:check_database_structure:73 - 表 product_analysis 缺少name或introduction字段
|
||||
2025-11-26 23:10:49.516 | INFO | __main__:check_database_structure:31 - 找到数据库文件: ['product\\product.db', 'product\\products.db']
|
||||
2025-11-26 23:10:49.516 | INFO | __main__:check_database_structure:34 -
|
||||
检查数据库: product\product.db
|
||||
2025-11-26 23:10:49.517 | INFO | __main__:check_database_structure:48 - 数据库中的表:
|
||||
2025-11-26 23:10:49.518 | INFO | __main__:check_database_structure:51 - - products
|
||||
2025-11-26 23:10:49.519 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:49.519 | INFO | __main__:check_database_structure:59 - id (INTEGER)
|
||||
2025-11-26 23:10:49.519 | INFO | __main__:check_database_structure:59 - url (TEXT)
|
||||
2025-11-26 23:10:49.519 | INFO | __main__:check_database_structure:59 - name (TEXT)
|
||||
2025-11-26 23:10:49.520 | INFO | __main__:check_database_structure:59 - introduction (TEXT)
|
||||
2025-11-26 23:10:49.521 | INFO | __main__:check_database_structure:59 - user_count (TEXT)
|
||||
2025-11-26 23:10:49.521 | INFO | __main__:check_database_structure:59 - maker_link (TEXT)
|
||||
2025-11-26 23:10:49.521 | INFO | __main__:check_database_structure:59 - maker_statement (TEXT)
|
||||
2025-11-26 23:10:49.521 | INFO | __main__:check_database_structure:59 - created_at (TEXT)
|
||||
2025-11-26 23:10:49.521 | INFO | __main__:check_database_structure:59 - updated_at (TEXT)
|
||||
2025-11-26 23:10:49.521 | SUCCESS | __main__:check_database_structure:64 - 表 products 包含name和introduction字段
|
||||
2025-11-26 23:10:49.522 | INFO | __main__:check_database_structure:69 - 示例数据:
|
||||
2025-11-26 23:10:49.522 | ERROR | __main__:check_database_structure:78 - 检查数据库 product\product.db 时出错: 'NoneType' object is not subscriptable
|
||||
2025-11-26 23:10:49.522 | INFO | __main__:check_database_structure:34 -
|
||||
检查数据库: product\products.db
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:48 - 数据库中的表:
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:51 - - products
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:59 - id (INTEGER)
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:59 - url (TEXT)
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:59 - name (TEXT)
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:59 - introduction (TEXT)
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:59 - user_count (TEXT)
|
||||
2025-11-26 23:10:49.523 | INFO | __main__:check_database_structure:59 - maker_link (TEXT)
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:59 - maker_statement (TEXT)
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:59 - created_at (TEXT)
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:59 - updated_at (TEXT)
|
||||
2025-11-26 23:10:49.524 | SUCCESS | __main__:check_database_structure:64 - 表 products 包含name和introduction字段
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:69 - 示例数据:
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:71 - 示例1: name='Pixley AI', introduction='Pixley is the first platform that lets children tu...'
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:71 - 示例2: name='Burner', introduction='Burner is a small, secure computer that keeps your...'
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:71 - 示例3: name='American Ratings Lead Magnet Portal', introduction='Build verified business credibility with the Ameri...'
|
||||
2025-11-26 23:10:49.524 | INFO | __main__:check_database_structure:51 - - sqlite_sequence
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:59 - name ()
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:59 - seq ()
|
||||
2025-11-26 23:10:49.525 | WARNING | __main__:check_database_structure:73 - 表 sqlite_sequence 缺少name或introduction字段
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:51 - - product_analysis
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:57 - 表结构:
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:59 - id (INTEGER)
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:59 - original_id (INTEGER)
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:59 - original_name (TEXT)
|
||||
2025-11-26 23:10:49.525 | INFO | __main__:check_database_structure:59 - product_name (TEXT)
|
||||
2025-11-26 23:10:49.526 | INFO | __main__:check_database_structure:59 - product_intro (TEXT)
|
||||
2025-11-26 23:10:49.526 | INFO | __main__:check_database_structure:59 - development_difficulty (TEXT)
|
||||
2025-11-26 23:10:49.526 | INFO | __main__:check_database_structure:59 - ai_response (TEXT)
|
||||
2025-11-26 23:10:49.526 | INFO | __main__:check_database_structure:59 - created_at (TIMESTAMP)
|
||||
2025-11-26 23:10:49.526 | WARNING | __main__:check_database_structure:73 - 表 product_analysis 缺少name或introduction字段
|
||||
1669
db_modify_zhipu.log
File diff suppressed because it is too large
Binary file not shown.
|
Before Width: | Height: | Size: 281 KiB After Width: | Height: | Size: 526 KiB |
25084
integrated_product_system.log
Normal file
File diff suppressed because it is too large
Binary file not shown.
|
Before Width: | Height: | Size: 166 KiB After Width: | Height: | Size: 231 KiB |
213
product/README.md
Normal file
@@ -0,0 +1,213 @@
# Full-Featured Product Scraping and Analysis System

This is a complete system that integrates product scraping and AI analysis, merging the former `integrated_scraper.py` and `product_ai_analysis.py` into one unified system.

## Features

### Data scraping
- Queries ProductHunt links from the tophub_data.db database
- Uses playwright to connect to a Chrome browser and scrape product information
- Deduplicates automatically to avoid scraping the same URL twice
- Supports batch scraping with progress display
- Saves product information to the products table

### AI analysis
- Calls the Ollama AI API (qwen3:8b model) to analyze product development difficulty
- Parses the AI response automatically, extracting the product name, introduction, and development difficulty
- Saves analysis results to the product_analysis table
- Supports resuming an interrupted analysis run without repeating finished products
- Adds an automatic delay between calls to avoid overloading the API

### System characteristics
- Unified configuration management (config.py)
- Complete logging (loguru)
- Progress bars (tqdm)
- Error handling and retry mechanism
- Modular design that is easy to extend

## File structure

```
product/
├── integrated_product_system.py  # Main system file (core functionality)
├── run_system.py                 # Simplified command-line interface
├── config.py                     # Configuration file
├── README.md                     # Usage instructions
└── playwright-get-data.py        # playwright scraping module (dependency)
```

## Usage

### 1. Basic usage (full mode)
```bash
# Run the full workflow (scraping + analysis)
python run_system.py --mode full

# Or use the main system file
python integrated_product_system.py
```

### 2. Scraping-only mode
```bash
# Run scraping only
python run_system.py --mode scraping

# Limit the number of items scraped
python run_system.py --mode scraping --limit 50

# Do not skip duplicate URLs
python run_system.py --mode scraping --no-skip-duplicates
```

### 3. Analysis-only mode
```bash
# Run AI analysis only
python run_system.py --mode analysis

# Limit the number of products analyzed
python run_system.py --mode analysis --max-products 100
```

### 4. Advanced options
```bash
# Specify database paths
python run_system.py --tophub-db /path/to/tophub_data.db --product-db /path/to/products.db

# Specify the Chrome debug port
python run_system.py --debug-port 9222

# Specify the log file and level
python run_system.py --log-file my_log.log --log-level DEBUG

# Scrape specific URLs
python run_system.py --mode scraping --urls https://www.producthunt.com/posts/example-product
```

## Changelog

### v1.0.1 (current version)
- ✅ Fixed the async invocation issue; the system can now run inside an existing event loop
- ✅ Improved error handling and event loop management
- ✅ Tested and verified that all run modes work

### v1.0.0
- ✨ Merged the functionality of integrated_scraper.py and product_ai_analysis.py
- ✨ Added unified configuration management
- ✨ Provided a simplified command-line interface
- ✨ Improved error handling and logging
- ✨ Added support for multiple run modes

## Database structure

### products table (product information)
```sql
CREATE TABLE products (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE,
    name TEXT,
    introduction TEXT,
    user_count TEXT,
    maker_link TEXT,
    maker_statement TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
```

### product_analysis table (AI analysis results)
```sql
CREATE TABLE product_analysis (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    original_id INTEGER,
    original_name TEXT,
    product_name TEXT,
    product_intro TEXT,
    development_difficulty TEXT,
    difficulty_score INTEGER,
    ai_response TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (original_id) REFERENCES products (id)
);
```
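
The `product_analysis` definition above has no `follows` column, whereas the top-level README lists one and the plan document in `.trae/documents` calls for adding it. For a database created from the older definition, the column could be added in place. The sketch below is one possible approach under assumptions: the function name `ensure_follows_column` and the `INTEGER` column type are hypothetical, and the project's own `init_database` may handle this differently.

```python
import sqlite3


def ensure_follows_column(conn: sqlite3.Connection) -> bool:
    """Add product_analysis.follows if it is missing.

    Returns True when the column was added and False when it already existed.
    """
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute("PRAGMA table_info(product_analysis)")]
    if "follows" in columns:
        return False
    # Assumption: the converted follower count is stored as an integer.
    conn.execute("ALTER TABLE product_analysis ADD COLUMN follows INTEGER")
    conn.commit()
    return True
```

Checking `PRAGMA table_info` first makes the call safe to repeat on every startup, since SQLite raises an error when `ADD COLUMN` names a column that already exists.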

## Configuration

Edit `config.py` to change the system configuration:

- **DATABASE_CONFIG**: database paths
- **CHROME_CONFIG**: Chrome browser settings
- **AI_CONFIG**: AI API settings (Ollama)
- **SCRAPING_CONFIG**: scraping settings
- **LOGGING_CONFIG**: logging settings
- **ANALYSIS_CONFIG**: analysis settings

## System requirements

- Python 3.7+
- Chrome browser (running, with the debug port open)
- Ollama service (running, with the qwen3:8b model installed)
- SQLite database

## Dependencies

```bash
pip install loguru tqdm requests playwright
```

## Running the system

1. **Make sure Chrome is running with the debug port open**
   ```bash
   # Windows
   chrome.exe --remote-debugging-port=9222

   # macOS
   /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
   ```

2. **Make sure the Ollama service is running**
   ```bash
   # Start the Ollama service
   ollama serve

   # Install the qwen3:8b model (if not already installed)
   ollama pull qwen3:8b
   ```

3. **Make sure the tophub_data.db database exists**
   - The database should contain an articles table with a url column

4. **Run the system**
   ```bash
   python run_system.py
   ```

## FAQ

### Q: The system reports that the Chrome connection failed. What should I do?
A: Make sure Chrome is running with the debug port open (default 9222).

### Q: AI analysis reports that the API call failed. What should I do?
A: Make sure the Ollama service is running and the qwen3:8b model is installed.

### Q: How can I see the progress of scraping and analysis?
A: The system shows a progress bar automatically and also records details in the log file.

### Q: How do I analyze only a certain number of products?
A: Use the `--max-products` option, for example: `python run_system.py --max-products 50`

### Q: How do I re-analyze products that have already been analyzed?
A: The system skips analyzed products by default. To re-analyze them, delete the corresponding records from the product_analysis table.

## Changelog

### v1.0.0 (current version)
- ✨ Merged the functionality of integrated_scraper.py and product_ai_analysis.py
- ✨ Added unified configuration management
- ✨ Provided a simplified command-line interface
- ✨ Improved error handling and logging
- ✨ Added support for multiple run modes

## Support

If you run into problems, check the log file for details or verify that the system configuration is correct.
BIN
product/__pycache__/config.cpython-313.pyc
Normal file
Binary file not shown.
BIN
product/__pycache__/integrated_product_system.cpython-313.pyc
Normal file
Binary file not shown.
BIN
product/__pycache__/web_sqlite_viewer.cpython-313.pyc
Normal file
Binary file not shown.
@@ -1,312 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
高级ProductHunt抓取器 - 处理Cloudflare Turnstile挑战
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import sqlite3
|
||||
from loguru import logger
|
||||
import os
|
||||
from urllib.parse import urlparse
|
||||
|
||||
class AdvancedProductHuntScraper:
|
||||
def __init__(self, db_path="test_product.db"):
|
||||
self.db_path = db_path
|
||||
self.init_database()
|
||||
|
||||
def init_database(self):
|
||||
"""初始化数据库"""
|
||||
conn = sqlite3.connect(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 创建products表
|
||||
cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS products (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
name TEXT,
|
||||
url TEXT UNIQUE,
|
||||
introduction TEXT,
|
||||
user_count INTEGER,
|
||||
maker_link TEXT,
|
||||
maker_statement TEXT,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||||
)
|
||||
""")
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
logger.info(f"数据库已初始化: {self.db_path}")
|
||||
|
||||
def check_duplicate(self, url):
|
||||
"""检查URL是否已存在"""
|
||||
conn = sqlite3.connect(self.db_path)
|
||||
cursor = conn.cursor()
|
||||
cursor.execute("SELECT id FROM products WHERE url = ?", (url,))
|
||||
result = cursor.fetchone()
|
||||
conn.close()
|
||||
return result is not None
|
||||
    def save_product_info(self, product_info):
        """Save product information to the database."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Check whether the record already exists
        cursor.execute("SELECT id FROM products WHERE url = ?", (product_info['url'],))
        existing = cursor.fetchone()

        if existing:
            # Update the existing record
            cursor.execute("""
                UPDATE products SET
                    name = ?, introduction = ?, user_count = ?,
                    maker_link = ?, maker_statement = ?, updated_at = CURRENT_TIMESTAMP
                WHERE url = ?
            """, (
                product_info['name'], product_info['introduction'],
                product_info['user_count'], product_info['maker_link'],
                product_info['maker_statement'], product_info['url']
            ))
            logger.info(f"Updated product info: {product_info['name']}")
        else:
            # Insert a new record
            cursor.execute("""
                INSERT INTO products (name, url, introduction, user_count, maker_link, maker_statement)
                VALUES (?, ?, ?, ?, ?, ?)
            """, (
                product_info['name'], product_info['url'], product_info['introduction'],
                product_info['user_count'], product_info['maker_link'], product_info['maker_statement']
            ))
            logger.info(f"Saved product info: {product_info['name']}")

        conn.commit()
        conn.close()

    async def scrape_with_stealth(self, url):
        """Scrape product information using a stealth browser configuration."""
        try:
            from playwright.async_api import async_playwright

            logger.info(f"Starting advanced scrape: {url}")

            # Create the Playwright instance
            playwright = await async_playwright().start()

            # Use a less detectable browser configuration
            browser = await playwright.chromium.launch(
                headless=False,  # headed mode so the run can be observed
                args=[
                    '--disable-blink-features=AutomationControlled',
                    '--disable-features=VizDisplayCompositor',
                    '--disable-background-timer-throttling',
                    '--disable-backgrounding-occluded-windows',
                    '--disable-renderer-backgrounding',
                    '--disable-web-security',
                    '--disable-features=TranslateUI',
                    '--disable-ipc-flooding-protection',
                    '--no-sandbox',
                    '--disable-setuid-sandbox'
                ]
            )

            # Create the context and page
            context = await browser.new_context(
                viewport={'width': 1920, 'height': 1080},
                user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
                extra_http_headers={
                    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
                    'Accept-Language': 'en-US,en;q=0.9',
                    'Accept-Encoding': 'gzip, deflate, br',
                    'DNT': '1',
                    'Connection': 'keep-alive',
                    'Upgrade-Insecure-Requests': '1',
                }
            )

            page = await context.new_page()

            # Hide automation fingerprints
            await page.add_init_script("""
                Object.defineProperty(navigator, 'webdriver', {
                    get: () => undefined,
                });
                Object.defineProperty(navigator, 'plugins', {
                    get: () => [1, 2, 3, 4, 5],
                });
                Object.defineProperty(navigator, 'languages', {
                    get: () => ['en-US', 'en'],
                });
            """)

            # Set the default timeout
            page.set_default_timeout(300000)  # 5 minutes

            # Navigate to the page
            await page.goto(url, wait_until="domcontentloaded")

            # Check the page state
            page_title = await page.title()
            logger.info(f"Page title: {page_title}")

            # Check whether this is a Cloudflare challenge page.
            # '请稍候' ("please wait") is matched literally against the page title, so it stays untranslated.
            if "请稍候" in page_title or "Checking" in page_title or "Verifying" in page_title:
                logger.info("Cloudflare challenge page detected, waiting for manual verification...")

                # Wait for the user to complete verification manually
                try:
                    # Wait for the page title to change or a specific element to appear
                    await page.wait_for_function(
                        """() => {
                            const title = document.title;
                            return !title.includes('请稍候') &&
                                   !title.includes('Checking') &&
                                   !title.includes('Verifying') &&
                                   title !== '请稍候…';
                        }""",
                        timeout=300000  # 5 minutes
                    )
                    logger.info("Cloudflare challenge completed")
                except Exception as e:
                    logger.warning(f"Timed out waiting for the Cloudflare challenge: {e}")

                    # On timeout, try reloading the page
                    await page.reload(wait_until="domcontentloaded")
                    logger.info("Page reloaded")

            # Wait for the page to load
            await page.wait_for_timeout(5000)

            # Get the current page URL
            current_url = page.url
            logger.info(f"Current page URL: {current_url}")

            # Check whether the page redirected elsewhere
            if current_url != url:
                logger.warning(f"Page redirected: {url} -> {current_url}")

            # Try to extract product information
            product_info = {'url': url}

            # Extract the product name
            name_selectors = [
                "h1",
                "[data-test='product-name']",
                ".product-name",
                "title"
            ]

            for selector in name_selectors:
                try:
                    element = await page.query_selector(selector)
                    if element:
                        name = await element.text_content()
                        if name and name.strip() and name.strip() != "www.producthunt.com":
                            product_info['name'] = name.strip()
                            logger.info(f"Extracted product name: {product_info['name']}")
                            break
                except Exception as e:
                    logger.debug(f"Selector {selector} failed: {e}")

            if 'name' not in product_info:
                # Fall back to deriving the product name from the URL
                parsed_url = urlparse(url)
                path_parts = parsed_url.path.split('/')
                if len(path_parts) >= 3 and path_parts[-2] == 'products':
                    product_info['name'] = path_parts[-1].replace('-', ' ').title()
                    logger.info(f"Product name derived from URL: {product_info['name']}")
                else:
                    product_info['name'] = "Unknown Product"
                    logger.warning("Could not extract the product name")

            # Extract the remaining fields (simplified version)
            product_info['introduction'] = None
            product_info['user_count'] = None
            product_info['maker_link'] = None
            product_info['maker_statement'] = None

            # Close the browser
            await browser.close()
            await playwright.stop()

            logger.success(f"Scrape finished: {product_info['name']}")
            return product_info

        except Exception as e:
            logger.error(f"Scrape failed: {e}")
            return {'url': url, 'name': 'Error', 'introduction': None, 'user_count': None, 'maker_link': None, 'maker_statement': None}

    async def run_test(self):
        """Run the test."""
        # Fetch ProductHunt links from tophub_data.db
        tophub_db_path = os.path.join(os.path.dirname(self.db_path), "..", "tophub_data.db")

        conn = sqlite3.connect(tophub_db_path)
        cursor = conn.cursor()

        # Query links that contain producthunt.com
        cursor.execute("""
            SELECT url FROM articles
            WHERE url LIKE '%producthunt.com%'
            LIMIT 3
        """)

        urls = [row[0] for row in cursor.fetchall()]
        conn.close()

        logger.info(f"Found {len(urls)} ProductHunt links")

        # Process each URL
        for url in urls:
            logger.info(f"Processing URL: {url}")

            # Check for duplicates (skip logic commented out to force a re-scrape)
            # if self.check_duplicate(url):
            #     logger.info(f"Link already exists, skipping: {url}")
            #     continue

            # Scrape the product information
            product_info = await self.scrape_with_stealth(url)

            # Save to the database
            self.save_product_info(product_info)

        # Summarize the results
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute("SELECT COUNT(*) FROM products")
        count = cursor.fetchone()[0]

        cursor.execute("SELECT name, url FROM products")
        products = cursor.fetchall()
        conn.close()

        logger.success("Test run finished")

        print("\n=== Test result summary ===")
        print(f"Number of products in the database: {count}")
        print("Scraped products:")
        for name, url in products:
            print(f" - {name}: {url}")

async def main():
    """Main entry point."""
    # Configure logging
    logger.remove()
    logger.add(
        "advanced_scraper.log",
        level="DEBUG",
        format="{time:YYYY-MM-DD HH:mm:ss} | {level:<8} | {name}:{function}:{line} - {message}",
        rotation="10 MB",
        retention="7 days"
    )

    # Create the scraper instance
    scraper = AdvancedProductHuntScraper()

    # Run the test
    await scraper.run_test()

if __name__ == "__main__":
    asyncio.run(main())
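The removed `save_product_info` above runs a SELECT and then either an UPDATE or an INSERT. Because `url` is declared `UNIQUE`, the same effect can probably be had with a single SQLite upsert statement. The following is only a sketch under that assumption: `upsert_product`, `SCHEMA` and `UPSERT_SQL` are hypothetical names that do not appear in the diff, and `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import sqlite3

# Same table layout as init_database() in the removed file.
SCHEMA = """
    CREATE TABLE IF NOT EXISTS products (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        url TEXT UNIQUE,
        introduction TEXT,
        user_count INTEGER,
        maker_link TEXT,
        maker_statement TEXT,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
"""

# Hypothetical statement: insert a row, or update it when the url already exists.
UPSERT_SQL = """
    INSERT INTO products (name, url, introduction, user_count, maker_link, maker_statement)
    VALUES (:name, :url, :introduction, :user_count, :maker_link, :maker_statement)
    ON CONFLICT(url) DO UPDATE SET
        name = excluded.name,
        introduction = excluded.introduction,
        user_count = excluded.user_count,
        maker_link = excluded.maker_link,
        maker_statement = excluded.maker_statement,
        updated_at = CURRENT_TIMESTAMP
"""


def upsert_product(conn, product_info):
    """Insert or update one product row; product_info is a dict with the six named keys."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT_SQL, product_info)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    row = {
        "name": "Example", "url": "https://www.producthunt.com/products/example",
        "introduction": None, "user_count": None,
        "maker_link": None, "maker_statement": None,
    }
    upsert_product(conn, row)
    upsert_product(conn, {**row, "name": "Example v2"})
    print(conn.execute("SELECT name, url FROM products").fetchall())
    conn.close()
```

A single statement also avoids the gap between the existence check and the write, which may matter if more than one process writes to the same database file.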
@@ -1,245 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
ProductHunt API scraper - fetches product information via the API
"""

import asyncio
import sqlite3
import requests
from loguru import logger
import os
import json
from urllib.parse import urlparse

class ProductHuntAPIScraper:
    def __init__(self, db_path="test_product.db"):
        self.db_path = db_path
        self.init_database()

    def init_database(self):
        """Initialize the database."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Create the products table
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS products (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                name TEXT,
                url TEXT UNIQUE,
                introduction TEXT,
                user_count INTEGER,
                maker_link TEXT,
                maker_statement TEXT,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        """)

        conn.commit()
        conn.close()
        logger.info(f"Database initialized: {self.db_path}")

    def save_product_info(self, product_info):
        """Save product information to the database."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Check whether the record already exists
        cursor.execute("SELECT id FROM products WHERE url = ?", (product_info['url'],))
        existing = cursor.fetchone()

        if existing:
            # Update the existing record
            cursor.execute("""
                UPDATE products SET
                    name = ?, introduction = ?, user_count = ?,
                    maker_link = ?, maker_statement = ?, updated_at = CURRENT_TIMESTAMP
                WHERE url = ?
            """, (
                product_info['name'], product_info['introduction'],
                product_info['user_count'], product_info['maker_link'],
                product_info['maker_statement'], product_info['url']
            ))
            logger.info(f"Updated product info: {product_info['name']}")
        else:
            # Insert a new record
            cursor.execute("""
                INSERT INTO products (name, url, introduction, user_count, maker_link, maker_statement)
                VALUES (?, ?, ?, ?, ?, ?)
            """, (
                product_info['name'], product_info['url'], product_info['introduction'],
                product_info['user_count'], product_info['maker_link'], product_info['maker_statement']
            ))
            logger.info(f"Saved product info: {product_info['name']}")

        conn.commit()
        conn.close()

    def extract_product_name_from_url(self, url):
        """Extract the product name from the URL."""
        try:
            parsed_url = urlparse(url)
            path_parts = parsed_url.path.split('/')

            # Look for the 'products' path segment
            for i, part in enumerate(path_parts):
                if part == 'products' and i + 1 < len(path_parts):
                    product_slug = path_parts[i + 1]
                    # Convert the slug into a readable name
                    name = product_slug.replace('-', ' ').title()
                    return name

            # If there is no 'products' segment, use the last path segment
            if path_parts:
                last_part = path_parts[-1]
                if last_part:
                    name = last_part.replace('-', ' ').title()
                    return name

            return "Unknown Product"
        except Exception as e:
            logger.error(f"Failed to extract product name from URL: {e}")
            return "Unknown Product"

    def get_product_info_from_api(self, url):
        """Try to fetch product information via the API."""
        try:
            # Extract the product slug from the URL
            parsed_url = urlparse(url)
            path_parts = parsed_url.path.split('/')

            product_slug = None
            for i, part in enumerate(path_parts):
                if part == 'products' and i + 1 < len(path_parts):
                    product_slug = path_parts[i + 1]
                    break

            if not product_slug:
                logger.warning(f"Could not extract a product slug from the URL: {url}")
                return None

            # The intent is to use ProductHunt's GraphQL API (requires an API key).
            # For now we use a simplified approach and only derive basic information.

            product_info = {
                'url': url,
                'name': self.extract_product_name_from_url(url),
                'introduction': f"Product from ProductHunt: {product_slug}",
                'user_count': None,  # requires API access
                'maker_link': None,  # requires API access
                'maker_statement': None  # requires API access
            }

            logger.info(f"Got product info via API: {product_info['name']}")
            return product_info

        except Exception as e:
            logger.error(f"Failed to get product info via API: {e}")
            return None

    def get_product_info_fallback(self, url):
        """Fallback: derive basic information from the URL."""
        try:
            product_name = self.extract_product_name_from_url(url)

            product_info = {
                'url': url,
                'name': product_name,
                'introduction': f"Product from ProductHunt: {product_name}",
                'user_count': None,
                'maker_link': None,
                'maker_statement': None
            }

            logger.info(f"Got product info via the fallback method: {product_info['name']}")
            return product_info

        except Exception as e:
            logger.error(f"Fallback method failed to get product info: {e}")
            return None

    def run_test(self):
        """Run the test."""
        # Fetch ProductHunt links from tophub_data.db
        tophub_db_path = os.path.join(os.path.dirname(self.db_path), "..", "tophub_data.db")

        conn = sqlite3.connect(tophub_db_path)
        cursor = conn.cursor()

        # Query links that contain producthunt.com
        cursor.execute("""
            SELECT url FROM articles
            WHERE url LIKE '%producthunt.com%'
            LIMIT 3
        """)

        urls = [row[0] for row in cursor.fetchall()]
        conn.close()

        logger.info(f"Found {len(urls)} ProductHunt links")

        # Process each URL
        for url in urls:
            logger.info(f"Processing URL: {url}")

            # Try to fetch the product information via the API
            product_info = self.get_product_info_from_api(url)

            # If the API fails, use the fallback method
            if not product_info:
                product_info = self.get_product_info_fallback(url)

            # If both methods fail, build a minimal product record
            if not product_info:
                product_info = {
                    'url': url,
                    'name': 'Unknown Product',
                    'introduction': 'Unable to fetch product information',
                    'user_count': None,
                    'maker_link': None,
                    'maker_statement': None
                }

            # Save to the database
            self.save_product_info(product_info)

        # Summarize the results
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute("SELECT COUNT(*) FROM products")
        count = cursor.fetchone()[0]

        cursor.execute("SELECT name, url FROM products")
        products = cursor.fetchall()
        conn.close()

        logger.success("Test run finished")

        print("\n=== Test result summary ===")
        print(f"Number of products in the database: {count}")
        print("Scraped products:")
        for name, url in products:
            print(f" - {name}: {url}")

def main():
    """Main entry point."""
    # Configure logging
    logger.remove()
    logger.add(
        "api_scraper.log",
        level="DEBUG",
        format="{time:YYYY-MM-DD HH:mm:ss} | {level:<8} | {name}:{function}:{line} - {message}",
        rotation="10 MB",
        retention="7 days"
    )

    # Create the scraper instance
    scraper = ProductHuntAPIScraper()

    # Run the test
    scraper.run_test()

if __name__ == "__main__":
    main()
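Both removed scrapers repeat the same "find the segment after `products`" logic in several places. A shared helper along the following lines might have covered all of them. This is a sketch only: `extract_product_slug` and `slug_to_name` are hypothetical names, and unlike the removed code it drops empty path segments, so a trailing slash does not produce an empty slug.

```python
from urllib.parse import urlparse


def extract_product_slug(url):
    """Return the path segment that follows 'products', or None if there is none."""
    # Dropping empty segments makes '/products/foo/' behave like '/products/foo'.
    parts = [p for p in urlparse(url).path.split('/') if p]
    for i, part in enumerate(parts[:-1]):
        if part == 'products':
            return parts[i + 1]
    return None


def slug_to_name(slug):
    """Turn 'my-cool-app' into 'My Cool App'; fall back to a placeholder name."""
    return slug.replace('-', ' ').title() if slug else "Unknown Product"


if __name__ == "__main__":
    slug = extract_product_slug("https://www.producthunt.com/products/my-cool-app")
    print(slug, "->", slug_to_name(slug))
```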
52
product/config.py
Normal file
@@ -0,0 +1,52 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Configuration file for the full-featured product system
"""

import os

# Database configuration
DATABASE_CONFIG = {
    'tophub_db_path': os.path.join(os.path.dirname(os.path.dirname(__file__)), "tophub_data.db"),
    'product_db_path': os.path.join(os.path.dirname(__file__), "products.db"),
}

# Chrome debugging configuration
CHROME_CONFIG = {
    'debug_port': 9222,
    'headless': False,
    'timeout': 30,
}

# AI analysis configuration
AI_CONFIG = {
    'api_url': "http://localhost:11434/api/generate",
    'model': "qwen3:8b",
    'timeout': 60,
    'retry_count': 3,
    'retry_delay': 5,
}

# Scraping configuration
SCRAPING_CONFIG = {
    'default_limit': 0,  # 0 means no limit
    'skip_duplicates': True,
    'batch_size': 10,
    'delay_between_requests': 2,
}

# Logging configuration
LOGGING_CONFIG = {
    'log_file': "integrated_product_system.log",
    'log_level': "INFO",
    'log_rotation': "10 MB",
    'log_format': "<green>{time:YYYY-MM-DD HH:mm:ss}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>",
}

# Analysis configuration
ANALYSIS_CONFIG = {
    'max_products': None,  # None means analyze all products
    'batch_size': 1,  # number of products per analysis run
    'delay_between_analyses': 2,  # delay between analyses (seconds)
}
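The diff does not show how `retry_count` and `retry_delay` from `AI_CONFIG` are consumed. One plausible reading is a simple retry loop around the model call. The sketch below assumes that `retry_count` means the total number of attempts and `retry_delay` is a pause in seconds; `call_with_retry` is a hypothetical helper, not a function from this repository.

```python
import time


def call_with_retry(func, retry_count=3, retry_delay=5, sleep=time.sleep):
    """Call func() up to retry_count times, pausing retry_delay seconds after each failure."""
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")
    last_error = None
    for attempt in range(1, retry_count + 1):
        try:
            return func()
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < retry_count:
                sleep(retry_delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


if __name__ == "__main__":
    attempts = []

    def flaky():
        # Fails twice, then succeeds, to exercise the retry path.
        attempts.append(1)
        if len(attempts) < 3:
            raise RuntimeError("temporary failure")
        return "ok"

    print(call_with_retry(flaky, retry_count=3, retry_delay=0))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; in real use it would be called with the values from `AI_CONFIG`.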
@@ -1,818 +0,0 @@
2025-11-27 22:15:02.065 | INFO | __main__:__init__:38 - Initializing product difficulty scorer, database: products.db
2025-11-27 22:15:02.066 | INFO | __main__:score_products:190 - Starting product difficulty scoring
2025-11-27 22:15:02.066 | SUCCESS | __main__:connect_to_database:44 - Connected to database: products.db
2025-11-27 22:15:02.071 | SUCCESS | __main__:add_difficulty_score_column:62 - Added difficulty_score column
2025-11-27 22:15:02.074 | INFO | __main__:get_unscored_products:93 - Found 251 unscored products
2025-11-27 22:15:02.074 | INFO | __main__:score_products:207 - Preparing to score 251 products
2025-11-27 22:15:02.074 | INFO | __main__:score_products:212 -
Scoring progress: 1/251 - product ID: 1
2025-11-27 22:15:02.075 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:15:22.897 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 90
2025-11-27 22:15:22.900 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 1 to: 90
2025-11-27 22:15:22.900 | SUCCESS | __main__:score_products:221 - Scoring complete: 90 points
2025-11-27 22:15:22.900 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:15:24.901 | INFO | __main__:score_products:212 -
Scoring progress: 2/251 - product ID: 2
2025-11-27 22:15:24.901 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:15:42.061 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:15:42.066 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 2 to: 85
2025-11-27 22:15:42.066 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:15:42.066 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:15:44.067 | INFO | __main__:score_products:212 -
Scoring progress: 3/251 - product ID: 3
2025-11-27 22:15:44.068 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:15:59.877 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:15:59.882 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 3 to: 75
2025-11-27 22:15:59.882 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:15:59.882 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:16:01.883 | INFO | __main__:score_products:212 -
Scoring progress: 4/251 - product ID: 4
2025-11-27 22:16:01.884 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:16:12.907 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:16:12.912 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 4 to: 95
2025-11-27 22:16:12.912 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:16:12.912 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:16:14.913 | INFO | __main__:score_products:212 -
Scoring progress: 5/251 - product ID: 5
2025-11-27 22:16:14.914 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:16:30.206 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:16:30.211 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 5 to: 75
2025-11-27 22:16:30.211 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:16:30.211 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:16:32.212 | INFO | __main__:score_products:212 -
Scoring progress: 6/251 - product ID: 6
2025-11-27 22:16:32.213 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:16:37.802 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 60
2025-11-27 22:16:37.806 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 6 to: 60
2025-11-27 22:16:37.806 | SUCCESS | __main__:score_products:221 - Scoring complete: 60 points
2025-11-27 22:16:37.806 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:16:39.807 | INFO | __main__:score_products:212 -
Scoring progress: 7/251 - product ID: 7
2025-11-27 22:16:39.807 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:16:52.409 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:16:52.414 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 7 to: 85
2025-11-27 22:16:52.414 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:16:52.414 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:16:54.414 | INFO | __main__:score_products:212 -
Scoring progress: 8/251 - product ID: 8
2025-11-27 22:16:54.416 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:17:04.041 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:17:04.045 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 8 to: 95
2025-11-27 22:17:04.045 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:17:04.045 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:17:06.045 | INFO | __main__:score_products:212 -
Scoring progress: 9/251 - product ID: 9
2025-11-27 22:17:06.046 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:17:24.896 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 60
2025-11-27 22:17:24.900 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 9 to: 60
2025-11-27 22:17:24.900 | SUCCESS | __main__:score_products:221 - Scoring complete: 60 points
2025-11-27 22:17:24.900 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:17:26.901 | INFO | __main__:score_products:212 -
Scoring progress: 10/251 - product ID: 10
2025-11-27 22:17:26.901 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:17:42.131 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:17:42.135 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 10 to: 85
2025-11-27 22:17:42.135 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:17:42.136 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:17:44.136 | INFO | __main__:score_products:212 -
Scoring progress: 11/251 - product ID: 11
2025-11-27 22:17:44.137 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:17:58.158 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:17:58.162 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 11 to: 95
2025-11-27 22:17:58.162 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:17:58.162 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:18:00.163 | INFO | __main__:score_products:212 -
Scoring progress: 12/251 - product ID: 12
2025-11-27 22:18:00.164 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:18:08.974 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 100
2025-11-27 22:18:08.977 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 12 to: 100
2025-11-27 22:18:08.977 | SUCCESS | __main__:score_products:221 - Scoring complete: 100 points
2025-11-27 22:18:08.977 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:18:10.978 | INFO | __main__:score_products:212 -
Scoring progress: 13/251 - product ID: 13
2025-11-27 22:18:10.979 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:18:21.194 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 90
2025-11-27 22:18:21.198 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 13 to: 90
2025-11-27 22:18:21.198 | SUCCESS | __main__:score_products:221 - Scoring complete: 90 points
2025-11-27 22:18:21.198 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:18:23.200 | INFO | __main__:score_products:212 -
Scoring progress: 14/251 - product ID: 14
2025-11-27 22:18:23.201 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:18:29.891 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:18:29.895 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 14 to: 95
2025-11-27 22:18:29.895 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:18:29.895 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:18:31.895 | INFO | __main__:score_products:212 -
Scoring progress: 15/251 - product ID: 15
2025-11-27 22:18:31.896 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:18:45.906 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:18:45.910 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 15 to: 75
2025-11-27 22:18:45.910 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:18:45.910 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:18:47.911 | INFO | __main__:score_products:212 -
Scoring progress: 16/251 - product ID: 16
2025-11-27 22:18:47.912 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:18:59.078 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:18:59.082 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 16 to: 75
2025-11-27 22:18:59.082 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:18:59.082 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:19:01.083 | INFO | __main__:score_products:212 -
Scoring progress: 17/251 - product ID: 17
2025-11-27 22:19:01.083 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:19:11.227 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 60
2025-11-27 22:19:11.231 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 17 to: 60
2025-11-27 22:19:11.231 | SUCCESS | __main__:score_products:221 - Scoring complete: 60 points
2025-11-27 22:19:11.231 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:19:13.232 | INFO | __main__:score_products:212 -
Scoring progress: 18/251 - product ID: 18
2025-11-27 22:19:13.232 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:19:27.810 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:19:27.813 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 18 to: 75
2025-11-27 22:19:27.813 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:19:27.813 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:19:29.814 | INFO | __main__:score_products:212 -
Scoring progress: 19/251 - product ID: 19
2025-11-27 22:19:29.814 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:19:38.474 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:19:38.478 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 19 to: 85
2025-11-27 22:19:38.478 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:19:38.478 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:19:40.478 | INFO | __main__:score_products:212 -
Scoring progress: 20/251 - product ID: 20
2025-11-27 22:19:40.479 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:19:56.459 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:19:56.463 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 20 to: 75
2025-11-27 22:19:56.463 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:19:56.463 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:19:58.464 | INFO | __main__:score_products:212 -
Scoring progress: 21/251 - product ID: 21
2025-11-27 22:19:58.464 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:20:08.851 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:20:08.855 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 21 to: 85
2025-11-27 22:20:08.855 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:20:08.856 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:20:10.857 | INFO | __main__:score_products:212 -
Scoring progress: 22/251 - product ID: 22
2025-11-27 22:20:10.858 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:20:28.350 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:20:28.355 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 22 to: 95
2025-11-27 22:20:28.355 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:20:28.355 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:20:30.356 | INFO | __main__:score_products:212 -
Scoring progress: 23/251 - product ID: 23
2025-11-27 22:20:30.356 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:20:46.974 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:20:46.979 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 23 to: 95
2025-11-27 22:20:46.979 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:20:46.979 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:20:48.979 | INFO | __main__:score_products:212 -
Scoring progress: 24/251 - product ID: 24
2025-11-27 22:20:48.979 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:21:02.432 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 65
2025-11-27 22:21:02.437 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 24 to: 65
2025-11-27 22:21:02.437 | SUCCESS | __main__:score_products:221 - Scoring complete: 65 points
2025-11-27 22:21:02.437 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:21:04.438 | INFO | __main__:score_products:212 -
Scoring progress: 25/251 - product ID: 25
2025-11-27 22:21:04.438 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:21:10.182 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:21:10.187 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 25 to: 85
2025-11-27 22:21:10.187 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:21:10.187 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:21:12.188 | INFO | __main__:score_products:212 -
Scoring progress: 26/251 - product ID: 26
2025-11-27 22:21:12.189 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:21:25.692 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:21:25.696 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 26 to: 85
2025-11-27 22:21:25.696 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:21:25.697 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:21:27.697 | INFO | __main__:score_products:212 -
Scoring progress: 27/251 - product ID: 27
2025-11-27 22:21:27.698 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:21:42.789 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:21:42.793 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 27 to: 95
2025-11-27 22:21:42.793 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:21:42.794 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:21:44.794 | INFO | __main__:score_products:212 -
Scoring progress: 28/251 - product ID: 28
2025-11-27 22:21:44.795 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:21:58.897 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 95
2025-11-27 22:21:58.902 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 28 to: 95
2025-11-27 22:21:58.902 | SUCCESS | __main__:score_products:221 - Scoring complete: 95 points
2025-11-27 22:21:58.902 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:22:00.903 | INFO | __main__:score_products:212 -
Scoring progress: 29/251 - product ID: 29
2025-11-27 22:22:00.903 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:22:10.583 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 85
2025-11-27 22:22:10.587 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 29 to: 85
2025-11-27 22:22:10.587 | SUCCESS | __main__:score_products:221 - Scoring complete: 85 points
2025-11-27 22:22:10.587 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:22:12.588 | INFO | __main__:score_products:212 -
Scoring progress: 30/251 - product ID: 30
2025-11-27 22:22:12.589 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:22:30.462 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:22:30.467 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 30 to: 75
2025-11-27 22:22:30.467 | SUCCESS | __main__:score_products:221 - Scoring complete: 75 points
2025-11-27 22:22:30.467 | INFO | __main__:score_products:226 - Waiting 2 seconds before continuing...
2025-11-27 22:22:32.467 | INFO | __main__:score_products:212 -
Scoring progress: 31/251 - product ID: 31
2025-11-27 22:22:32.468 | INFO | __main__:call_ollama_for_scoring:139 - Calling Ollama API for difficulty scoring
2025-11-27 22:22:41.026 | SUCCESS | __main__:call_ollama_for_scoring:157 - Score received: 75
2025-11-27 22:22:41.032 | SUCCESS | __main__:update_difficulty_score:182 - Updated difficulty score for product ID 31 to: 75
2025-11-27 22:22:41.032 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:22:41.032 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:22:43.033 | INFO | __main__:score_products:212 -
|
||||
评分进度: 32/251 - 产品ID: 32
|
||||
2025-11-27 22:22:43.034 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:22:51.204 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:22:51.208 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 32 的难度评分为: 85
|
||||
2025-11-27 22:22:51.208 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:22:51.208 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:22:53.209 | INFO | __main__:score_products:212 -
|
||||
评分进度: 33/251 - 产品ID: 33
|
||||
2025-11-27 22:22:53.209 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:23:07.564 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:23:07.568 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 33 的难度评分为: 90
|
||||
2025-11-27 22:23:07.568 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:23:07.568 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:23:09.569 | INFO | __main__:score_products:212 -
|
||||
评分进度: 34/251 - 产品ID: 34
|
||||
2025-11-27 22:23:09.570 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:23:21.371 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:23:21.375 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 34 的难度评分为: 75
|
||||
2025-11-27 22:23:21.375 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:23:21.375 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:23:23.375 | INFO | __main__:score_products:212 -
|
||||
评分进度: 35/251 - 产品ID: 35
|
||||
2025-11-27 22:23:23.376 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:23:38.365 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:23:38.368 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 35 的难度评分为: 75
|
||||
2025-11-27 22:23:38.369 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:23:38.369 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:23:40.369 | INFO | __main__:score_products:212 -
|
||||
评分进度: 36/251 - 产品ID: 36
|
||||
2025-11-27 22:23:40.369 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:23:50.821 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:23:50.826 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 36 的难度评分为: 85
|
||||
2025-11-27 22:23:50.826 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:23:50.826 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:23:52.827 | INFO | __main__:score_products:212 -
|
||||
评分进度: 37/251 - 产品ID: 37
|
||||
2025-11-27 22:23:52.827 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:24:07.978 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:24:07.983 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 37 的难度评分为: 95
|
||||
2025-11-27 22:24:07.983 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:24:07.983 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:24:09.983 | INFO | __main__:score_products:212 -
|
||||
评分进度: 38/251 - 产品ID: 38
|
||||
2025-11-27 22:24:09.984 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:24:31.439 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:24:31.443 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 38 的难度评分为: 85
|
||||
2025-11-27 22:24:31.443 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:24:31.443 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:24:33.444 | INFO | __main__:score_products:212 -
|
||||
评分进度: 39/251 - 产品ID: 39
|
||||
2025-11-27 22:24:33.445 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:25:04.537 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:25:04.541 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 39 的难度评分为: 85
|
||||
2025-11-27 22:25:04.541 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:25:04.541 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:25:06.541 | INFO | __main__:score_products:212 -
|
||||
评分进度: 40/251 - 产品ID: 40
|
||||
2025-11-27 22:25:06.542 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:25:18.764 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:25:18.767 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 40 的难度评分为: 85
|
||||
2025-11-27 22:25:18.767 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:25:18.767 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:25:20.768 | INFO | __main__:score_products:212 -
|
||||
评分进度: 41/251 - 产品ID: 41
|
||||
2025-11-27 22:25:20.769 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:25:36.627 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:25:36.632 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 41 的难度评分为: 75
|
||||
2025-11-27 22:25:36.632 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:25:36.632 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:25:38.632 | INFO | __main__:score_products:212 -
|
||||
评分进度: 42/251 - 产品ID: 42
|
||||
2025-11-27 22:25:38.633 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:26:02.058 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:26:02.063 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 42 的难度评分为: 85
|
||||
2025-11-27 22:26:02.063 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:26:02.063 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:26:04.064 | INFO | __main__:score_products:212 -
|
||||
评分进度: 43/251 - 产品ID: 43
|
||||
2025-11-27 22:26:04.064 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:26:15.507 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:26:15.511 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 43 的难度评分为: 95
|
||||
2025-11-27 22:26:15.511 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:26:15.511 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:26:17.512 | INFO | __main__:score_products:212 -
|
||||
评分进度: 44/251 - 产品ID: 44
|
||||
2025-11-27 22:26:17.512 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:26:31.613 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:26:31.617 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 44 的难度评分为: 85
|
||||
2025-11-27 22:26:31.617 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:26:31.617 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:26:33.618 | INFO | __main__:score_products:212 -
|
||||
评分进度: 45/251 - 产品ID: 45
|
||||
2025-11-27 22:26:33.619 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:26:54.906 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:26:54.910 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 45 的难度评分为: 85
|
||||
2025-11-27 22:26:54.910 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:26:54.910 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:26:56.911 | INFO | __main__:score_products:212 -
|
||||
评分进度: 46/251 - 产品ID: 46
|
||||
2025-11-27 22:26:56.911 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:27:09.484 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:27:09.489 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 46 的难度评分为: 85
|
||||
2025-11-27 22:27:09.489 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:27:09.489 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:27:11.490 | INFO | __main__:score_products:212 -
|
||||
评分进度: 47/251 - 产品ID: 47
|
||||
2025-11-27 22:27:11.491 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:27:25.136 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:27:25.140 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 47 的难度评分为: 90
|
||||
2025-11-27 22:27:25.141 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:27:25.141 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:27:27.141 | INFO | __main__:score_products:212 -
|
||||
评分进度: 48/251 - 产品ID: 48
|
||||
2025-11-27 22:27:27.142 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:27:52.128 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:27:52.131 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 48 的难度评分为: 90
|
||||
2025-11-27 22:27:52.131 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:27:52.131 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:27:54.132 | INFO | __main__:score_products:212 -
|
||||
评分进度: 49/251 - 产品ID: 49
|
||||
2025-11-27 22:27:54.133 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:28:10.443 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:28:10.447 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 49 的难度评分为: 95
|
||||
2025-11-27 22:28:10.447 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:28:10.448 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:28:12.448 | INFO | __main__:score_products:212 -
|
||||
评分进度: 50/251 - 产品ID: 50
|
||||
2025-11-27 22:28:12.448 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:28:24.343 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:28:24.348 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 50 的难度评分为: 95
|
||||
2025-11-27 22:28:24.348 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:28:24.348 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:28:26.349 | INFO | __main__:score_products:212 -
|
||||
评分进度: 51/251 - 产品ID: 51
|
||||
2025-11-27 22:28:26.350 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:28:41.099 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:28:41.104 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 51 的难度评分为: 85
|
||||
2025-11-27 22:28:41.104 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:28:41.104 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:28:43.105 | INFO | __main__:score_products:212 -
|
||||
评分进度: 52/251 - 产品ID: 52
|
||||
2025-11-27 22:28:43.106 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:28:55.393 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:28:55.397 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 52 的难度评分为: 75
|
||||
2025-11-27 22:28:55.397 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:28:55.397 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:28:57.398 | INFO | __main__:score_products:212 -
|
||||
评分进度: 53/251 - 产品ID: 53
|
||||
2025-11-27 22:28:57.398 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:29:10.087 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:29:10.091 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 53 的难度评分为: 75
|
||||
2025-11-27 22:29:10.091 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:29:10.091 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:29:12.092 | INFO | __main__:score_products:212 -
|
||||
评分进度: 54/251 - 产品ID: 54
|
||||
2025-11-27 22:29:12.092 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:29:23.753 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:29:23.755 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 54 的难度评分为: 85
|
||||
2025-11-27 22:29:23.756 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:29:23.756 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:29:25.756 | INFO | __main__:score_products:212 -
|
||||
评分进度: 55/251 - 产品ID: 55
|
||||
2025-11-27 22:29:25.756 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:29:37.465 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:29:37.469 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 55 的难度评分为: 75
|
||||
2025-11-27 22:29:37.469 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:29:37.469 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:29:39.469 | INFO | __main__:score_products:212 -
|
||||
评分进度: 56/251 - 产品ID: 56
|
||||
2025-11-27 22:29:39.470 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:29:53.805 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 70
|
||||
2025-11-27 22:29:53.810 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 56 的难度评分为: 70
|
||||
2025-11-27 22:29:53.810 | SUCCESS | __main__:score_products:221 - 评分完成: 70分
|
||||
2025-11-27 22:29:53.811 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:29:55.812 | INFO | __main__:score_products:212 -
|
||||
评分进度: 57/251 - 产品ID: 57
|
||||
2025-11-27 22:29:55.812 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:30:11.152 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:30:11.156 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 57 的难度评分为: 85
|
||||
2025-11-27 22:30:11.156 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:30:11.156 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:30:13.157 | INFO | __main__:score_products:212 -
|
||||
评分进度: 58/251 - 产品ID: 58
|
||||
2025-11-27 22:30:13.157 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:30:21.557 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:30:21.561 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 58 的难度评分为: 95
|
||||
2025-11-27 22:30:21.561 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:30:21.561 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:30:23.562 | INFO | __main__:score_products:212 -
|
||||
评分进度: 59/251 - 产品ID: 59
|
||||
2025-11-27 22:30:23.562 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:30:34.610 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:30:34.613 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 59 的难度评分为: 90
|
||||
2025-11-27 22:30:34.613 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:30:34.613 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:30:36.613 | INFO | __main__:score_products:212 -
|
||||
评分进度: 60/251 - 产品ID: 60
|
||||
2025-11-27 22:30:36.614 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:30:53.797 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 70
|
||||
2025-11-27 22:30:53.801 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 60 的难度评分为: 70
|
||||
2025-11-27 22:30:53.801 | SUCCESS | __main__:score_products:221 - 评分完成: 70分
|
||||
2025-11-27 22:30:53.801 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:30:55.802 | INFO | __main__:score_products:212 -
|
||||
评分进度: 61/251 - 产品ID: 61
|
||||
2025-11-27 22:30:55.802 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:31:07.842 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:31:07.846 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 61 的难度评分为: 75
|
||||
2025-11-27 22:31:07.846 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:31:07.847 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:31:09.847 | INFO | __main__:score_products:212 -
|
||||
评分进度: 62/251 - 产品ID: 62
|
||||
2025-11-27 22:31:09.847 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:31:17.957 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:31:17.961 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 62 的难度评分为: 85
|
||||
2025-11-27 22:31:17.961 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:31:17.961 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:31:19.962 | INFO | __main__:score_products:212 -
|
||||
评分进度: 63/251 - 产品ID: 63
|
||||
2025-11-27 22:31:19.963 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:31:35.601 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:31:35.606 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 63 的难度评分为: 75
|
||||
2025-11-27 22:31:35.606 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:31:35.606 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:31:37.606 | INFO | __main__:score_products:212 -
|
||||
评分进度: 64/251 - 产品ID: 64
|
||||
2025-11-27 22:31:37.607 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:31:54.718 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:31:54.722 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 64 的难度评分为: 85
|
||||
2025-11-27 22:31:54.722 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:31:54.723 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:31:56.723 | INFO | __main__:score_products:212 -
|
||||
评分进度: 65/251 - 产品ID: 65
|
||||
2025-11-27 22:31:56.724 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:32:06.981 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 65
|
||||
2025-11-27 22:32:06.987 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 65 的难度评分为: 65
|
||||
2025-11-27 22:32:06.987 | SUCCESS | __main__:score_products:221 - 评分完成: 65分
|
||||
2025-11-27 22:32:06.987 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:32:08.987 | INFO | __main__:score_products:212 -
|
||||
评分进度: 66/251 - 产品ID: 66
|
||||
2025-11-27 22:32:08.988 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:32:22.253 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:32:22.257 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 66 的难度评分为: 75
|
||||
2025-11-27 22:32:22.257 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:32:22.257 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:32:24.258 | INFO | __main__:score_products:212 -
|
||||
评分进度: 67/251 - 产品ID: 67
|
||||
2025-11-27 22:32:24.258 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:32:42.900 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:32:42.906 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 67 的难度评分为: 85
|
||||
2025-11-27 22:32:42.906 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:32:42.906 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:32:44.906 | INFO | __main__:score_products:212 -
|
||||
评分进度: 68/251 - 产品ID: 68
|
||||
2025-11-27 22:32:44.907 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:32:58.072 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 60
|
||||
2025-11-27 22:32:58.078 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 68 的难度评分为: 60
|
||||
2025-11-27 22:32:58.078 | SUCCESS | __main__:score_products:221 - 评分完成: 60分
|
||||
2025-11-27 22:32:58.078 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:33:00.078 | INFO | __main__:score_products:212 -
|
||||
评分进度: 69/251 - 产品ID: 69
|
||||
2025-11-27 22:33:00.079 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:33:17.223 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 100
|
||||
2025-11-27 22:33:17.228 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 69 的难度评分为: 100
|
||||
2025-11-27 22:33:17.228 | SUCCESS | __main__:score_products:221 - 评分完成: 100分
|
||||
2025-11-27 22:33:17.228 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:33:19.229 | INFO | __main__:score_products:212 -
|
||||
评分进度: 70/251 - 产品ID: 70
|
||||
2025-11-27 22:33:19.230 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:33:35.768 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:33:35.773 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 70 的难度评分为: 85
|
||||
2025-11-27 22:33:35.773 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:33:35.773 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:33:37.774 | INFO | __main__:score_products:212 -
|
||||
评分进度: 71/251 - 产品ID: 71
|
||||
2025-11-27 22:33:37.774 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:33:50.953 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:33:50.957 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 71 的难度评分为: 75
|
||||
2025-11-27 22:33:50.957 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:33:50.957 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:33:52.958 | INFO | __main__:score_products:212 -
|
||||
评分进度: 72/251 - 产品ID: 72
|
||||
2025-11-27 22:33:52.959 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:34:06.272 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:34:06.278 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 72 的难度评分为: 75
|
||||
2025-11-27 22:34:06.278 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:34:06.278 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:34:08.278 | INFO | __main__:score_products:212 -
|
||||
评分进度: 73/251 - 产品ID: 73
|
||||
2025-11-27 22:34:08.279 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:34:27.380 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:34:27.387 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 73 的难度评分为: 90
|
||||
2025-11-27 22:34:27.387 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:34:27.387 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:34:29.387 | INFO | __main__:score_products:212 -
|
||||
评分进度: 74/251 - 产品ID: 74
|
||||
2025-11-27 22:34:29.388 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:34:41.841 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:34:41.844 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 74 的难度评分为: 85
|
||||
2025-11-27 22:34:41.844 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:34:41.844 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:34:43.845 | INFO | __main__:score_products:212 -
|
||||
评分进度: 75/251 - 产品ID: 75
|
||||
2025-11-27 22:34:43.845 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:34:54.980 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:34:54.984 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 75 的难度评分为: 75
|
||||
2025-11-27 22:34:54.984 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:34:54.984 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:34:56.984 | INFO | __main__:score_products:212 -
|
||||
评分进度: 76/251 - 产品ID: 76
|
||||
2025-11-27 22:34:56.985 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:35:08.186 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:35:08.191 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 76 的难度评分为: 75
|
||||
2025-11-27 22:35:08.191 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:35:08.191 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:35:10.192 | INFO | __main__:score_products:212 -
|
||||
评分进度: 77/251 - 产品ID: 77
|
||||
2025-11-27 22:35:10.193 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:35:15.593 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:35:15.597 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 77 的难度评分为: 85
|
||||
2025-11-27 22:35:15.597 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:35:15.597 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:35:17.597 | INFO | __main__:score_products:212 -
|
||||
评分进度: 78/251 - 产品ID: 78
|
||||
2025-11-27 22:35:17.598 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:35:30.231 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:35:30.235 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 78 的难度评分为: 75
|
||||
2025-11-27 22:35:30.235 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:35:30.235 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:35:32.235 | INFO | __main__:score_products:212 -
|
||||
评分进度: 79/251 - 产品ID: 79
|
||||
2025-11-27 22:35:32.236 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:35:45.524 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:35:45.528 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 79 的难度评分为: 75
|
||||
2025-11-27 22:35:45.528 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:35:45.528 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:35:47.528 | INFO | __main__:score_products:212 -
|
||||
评分进度: 80/251 - 产品ID: 80
|
||||
2025-11-27 22:35:47.529 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:36:01.332 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 65
|
||||
2025-11-27 22:36:01.335 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 80 的难度评分为: 65
|
||||
2025-11-27 22:36:01.335 | SUCCESS | __main__:score_products:221 - 评分完成: 65分
|
||||
2025-11-27 22:36:01.335 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:36:03.336 | INFO | __main__:score_products:212 -
|
||||
评分进度: 81/251 - 产品ID: 81
|
||||
2025-11-27 22:36:03.337 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:36:15.964 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:36:15.967 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 81 的难度评分为: 85
|
||||
2025-11-27 22:36:15.967 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:36:15.967 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:36:17.967 | INFO | __main__:score_products:212 -
|
||||
评分进度: 82/251 - 产品ID: 82
|
||||
2025-11-27 22:36:17.968 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:36:33.251 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:36:33.255 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 82 的难度评分为: 95
|
||||
2025-11-27 22:36:33.256 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:36:33.256 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:36:35.256 | INFO | __main__:score_products:212 -
|
||||
评分进度: 83/251 - 产品ID: 83
|
||||
2025-11-27 22:36:35.256 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:36:49.059 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:36:49.063 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 83 的难度评分为: 90
|
||||
2025-11-27 22:36:49.063 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:36:49.063 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:36:51.064 | INFO | __main__:score_products:212 -
|
||||
评分进度: 84/251 - 产品ID: 84
|
||||
2025-11-27 22:36:51.064 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:37:05.285 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:37:05.288 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 84 的难度评分为: 85
|
||||
2025-11-27 22:37:05.289 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:37:05.289 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:37:07.289 | INFO | __main__:score_products:212 -
|
||||
评分进度: 85/251 - 产品ID: 85
|
||||
2025-11-27 22:37:07.290 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:37:19.469 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:37:19.473 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 85 的难度评分为: 90
|
||||
2025-11-27 22:37:19.473 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:37:19.473 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:37:21.473 | INFO | __main__:score_products:212 -
|
||||
评分进度: 86/251 - 产品ID: 86
|
||||
2025-11-27 22:37:21.474 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:37:34.519 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:37:34.522 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 86 的难度评分为: 85
|
||||
2025-11-27 22:37:34.523 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:37:34.523 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:37:36.523 | INFO | __main__:score_products:212 -
|
||||
评分进度: 87/251 - 产品ID: 87
|
||||
2025-11-27 22:37:36.524 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:37:50.313 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:37:50.317 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 87 的难度评分为: 85
|
||||
2025-11-27 22:37:50.317 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:37:50.317 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:37:52.317 | INFO | __main__:score_products:212 -
|
||||
评分进度: 88/251 - 产品ID: 88
|
||||
2025-11-27 22:37:52.318 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:37:59.835 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:37:59.839 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 88 的难度评分为: 75
|
||||
2025-11-27 22:37:59.839 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:37:59.839 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:38:01.839 | INFO | __main__:score_products:212 -
|
||||
评分进度: 89/251 - 产品ID: 89
|
||||
2025-11-27 22:38:01.840 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:38:17.211 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 65
|
||||
2025-11-27 22:38:17.215 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 89 的难度评分为: 65
|
||||
2025-11-27 22:38:17.215 | SUCCESS | __main__:score_products:221 - 评分完成: 65分
|
||||
2025-11-27 22:38:17.215 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:38:19.216 | INFO | __main__:score_products:212 -
|
||||
评分进度: 90/251 - 产品ID: 90
|
||||
2025-11-27 22:38:19.216 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:38:41.217 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:38:41.221 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 90 的难度评分为: 75
|
||||
2025-11-27 22:38:41.221 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:38:41.221 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:38:43.222 | INFO | __main__:score_products:212 -
|
||||
评分进度: 91/251 - 产品ID: 91
|
||||
2025-11-27 22:38:43.223 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:38:56.247 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 70
|
||||
2025-11-27 22:38:56.252 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 91 的难度评分为: 70
|
||||
2025-11-27 22:38:56.252 | SUCCESS | __main__:score_products:221 - 评分完成: 70分
|
||||
2025-11-27 22:38:56.252 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:38:58.252 | INFO | __main__:score_products:212 -
|
||||
评分进度: 92/251 - 产品ID: 92
|
||||
2025-11-27 22:38:58.253 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:39:05.522 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 100
|
||||
2025-11-27 22:39:05.527 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 92 的难度评分为: 100
|
||||
2025-11-27 22:39:05.527 | SUCCESS | __main__:score_products:221 - 评分完成: 100分
|
||||
2025-11-27 22:39:05.527 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:39:07.527 | INFO | __main__:score_products:212 -
|
||||
评分进度: 93/251 - 产品ID: 93
|
||||
2025-11-27 22:39:07.528 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:39:22.890 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 60
|
||||
2025-11-27 22:39:22.894 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 93 的难度评分为: 60
|
||||
2025-11-27 22:39:22.895 | SUCCESS | __main__:score_products:221 - 评分完成: 60分
|
||||
2025-11-27 22:39:22.895 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:39:24.895 | INFO | __main__:score_products:212 -
|
||||
评分进度: 94/251 - 产品ID: 94
|
||||
2025-11-27 22:39:24.895 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:39:42.951 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:39:42.956 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 94 的难度评分为: 75
|
||||
2025-11-27 22:39:42.956 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:39:42.956 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:39:44.957 | INFO | __main__:score_products:212 -
|
||||
评分进度: 95/251 - 产品ID: 95
|
||||
2025-11-27 22:39:44.958 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:39:58.088 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:39:58.093 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 95 的难度评分为: 85
|
||||
2025-11-27 22:39:58.093 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:39:58.094 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:40:00.094 | INFO | __main__:score_products:212 -
|
||||
评分进度: 96/251 - 产品ID: 96
|
||||
2025-11-27 22:40:00.095 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:40:09.793 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:40:09.797 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 96 的难度评分为: 75
|
||||
2025-11-27 22:40:09.797 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:40:09.797 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:40:11.797 | INFO | __main__:score_products:212 -
|
||||
评分进度: 97/251 - 产品ID: 97
|
||||
2025-11-27 22:40:11.798 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:40:27.589 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:40:27.593 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 97 的难度评分为: 75
|
||||
2025-11-27 22:40:27.594 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:40:27.594 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:40:29.594 | INFO | __main__:score_products:212 -
|
||||
评分进度: 98/251 - 产品ID: 98
|
||||
2025-11-27 22:40:29.595 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:40:42.639 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:40:42.645 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 98 的难度评分为: 95
|
||||
2025-11-27 22:40:42.645 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:40:42.645 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:40:44.646 | INFO | __main__:score_products:212 -
|
||||
评分进度: 99/251 - 产品ID: 99
|
||||
2025-11-27 22:40:44.646 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:40:54.784 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:40:54.788 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 99 的难度评分为: 85
|
||||
2025-11-27 22:40:54.788 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:40:54.788 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:40:56.788 | INFO | __main__:score_products:212 -
|
||||
评分进度: 100/251 - 产品ID: 100
|
||||
2025-11-27 22:40:56.789 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:41:12.314 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:41:12.318 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 100 的难度评分为: 85
|
||||
2025-11-27 22:41:12.318 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:41:12.318 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:41:14.319 | INFO | __main__:score_products:212 -
|
||||
评分进度: 101/251 - 产品ID: 101
|
||||
2025-11-27 22:41:14.320 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:41:21.103 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 60
|
||||
2025-11-27 22:41:21.107 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 101 的难度评分为: 60
|
||||
2025-11-27 22:41:21.107 | SUCCESS | __main__:score_products:221 - 评分完成: 60分
|
||||
2025-11-27 22:41:21.107 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:41:23.108 | INFO | __main__:score_products:212 -
|
||||
评分进度: 102/251 - 产品ID: 102
|
||||
2025-11-27 22:41:23.109 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:41:33.685 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:41:33.689 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 102 的难度评分为: 95
|
||||
2025-11-27 22:41:33.689 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:41:33.689 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:41:35.690 | INFO | __main__:score_products:212 -
|
||||
评分进度: 103/251 - 产品ID: 103
|
||||
2025-11-27 22:41:35.690 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:41:46.143 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:41:46.147 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 103 的难度评分为: 85
|
||||
2025-11-27 22:41:46.147 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:41:46.147 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:41:48.148 | INFO | __main__:score_products:212 -
|
||||
评分进度: 104/251 - 产品ID: 104
|
||||
2025-11-27 22:41:48.148 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:41:59.316 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:41:59.321 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 104 的难度评分为: 85
|
||||
2025-11-27 22:41:59.321 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:41:59.321 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:42:01.321 | INFO | __main__:score_products:212 -
|
||||
评分进度: 105/251 - 产品ID: 105
|
||||
2025-11-27 22:42:01.322 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:42:15.088 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:42:15.093 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 105 的难度评分为: 75
|
||||
2025-11-27 22:42:15.093 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:42:15.093 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:42:17.094 | INFO | __main__:score_products:212 -
|
||||
评分进度: 106/251 - 产品ID: 106
|
||||
2025-11-27 22:42:17.094 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:42:30.720 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 65
|
||||
2025-11-27 22:42:30.724 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 106 的难度评分为: 65
|
||||
2025-11-27 22:42:30.724 | SUCCESS | __main__:score_products:221 - 评分完成: 65分
|
||||
2025-11-27 22:42:30.724 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:42:32.725 | INFO | __main__:score_products:212 -
|
||||
评分进度: 107/251 - 产品ID: 107
|
||||
2025-11-27 22:42:32.726 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:42:42.705 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:42:42.710 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 107 的难度评分为: 85
|
||||
2025-11-27 22:42:42.710 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:42:42.710 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:42:44.711 | INFO | __main__:score_products:212 -
|
||||
评分进度: 108/251 - 产品ID: 108
|
||||
2025-11-27 22:42:44.712 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:42:57.337 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:42:57.341 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 108 的难度评分为: 75
|
||||
2025-11-27 22:42:57.341 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:42:57.341 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:42:59.342 | INFO | __main__:score_products:212 -
|
||||
评分进度: 109/251 - 产品ID: 109
|
||||
2025-11-27 22:42:59.342 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:43:10.384 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:43:10.388 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 109 的难度评分为: 85
|
||||
2025-11-27 22:43:10.388 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:43:10.388 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:43:12.389 | INFO | __main__:score_products:212 -
|
||||
评分进度: 110/251 - 产品ID: 110
|
||||
2025-11-27 22:43:12.389 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:43:24.284 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:43:24.287 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 110 的难度评分为: 75
|
||||
2025-11-27 22:43:24.287 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:43:24.287 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:43:26.288 | INFO | __main__:score_products:212 -
|
||||
评分进度: 111/251 - 产品ID: 111
|
||||
2025-11-27 22:43:26.289 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:43:36.921 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:43:36.925 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 111 的难度评分为: 85
|
||||
2025-11-27 22:43:36.925 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:43:36.925 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:43:38.926 | INFO | __main__:score_products:212 -
|
||||
评分进度: 112/251 - 产品ID: 112
|
||||
2025-11-27 22:43:38.926 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:43:46.973 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:43:46.978 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 112 的难度评分为: 85
|
||||
2025-11-27 22:43:46.978 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:43:46.978 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:43:48.979 | INFO | __main__:score_products:212 -
|
||||
评分进度: 113/251 - 产品ID: 113
|
||||
2025-11-27 22:43:48.979 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:44:06.897 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 95
|
||||
2025-11-27 22:44:06.901 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 113 的难度评分为: 95
|
||||
2025-11-27 22:44:06.901 | SUCCESS | __main__:score_products:221 - 评分完成: 95分
|
||||
2025-11-27 22:44:06.901 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:44:08.902 | INFO | __main__:score_products:212 -
|
||||
评分进度: 114/251 - 产品ID: 114
|
||||
2025-11-27 22:44:08.902 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:44:31.885 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 85
|
||||
2025-11-27 22:44:31.890 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 114 的难度评分为: 85
|
||||
2025-11-27 22:44:31.890 | SUCCESS | __main__:score_products:221 - 评分完成: 85分
|
||||
2025-11-27 22:44:31.890 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:44:33.891 | INFO | __main__:score_products:212 -
|
||||
评分进度: 115/251 - 产品ID: 115
|
||||
2025-11-27 22:44:33.891 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:45:10.222 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 90
|
||||
2025-11-27 22:45:10.226 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 115 的难度评分为: 90
|
||||
2025-11-27 22:45:10.226 | SUCCESS | __main__:score_products:221 - 评分完成: 90分
|
||||
2025-11-27 22:45:10.227 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
2025-11-27 22:45:12.227 | INFO | __main__:score_products:212 -
|
||||
评分进度: 116/251 - 产品ID: 116
|
||||
2025-11-27 22:45:12.228 | INFO | __main__:call_ollama_for_scoring:139 - 调用Ollama API进行难度评分
|
||||
2025-11-27 22:45:44.910 | SUCCESS | __main__:call_ollama_for_scoring:157 - 获得评分: 75
|
||||
2025-11-27 22:45:44.914 | SUCCESS | __main__:update_difficulty_score:182 - 更新产品ID 116 的难度评分为: 75
|
||||
2025-11-27 22:45:44.914 | SUCCESS | __main__:score_products:221 - 评分完成: 75分
|
||||
2025-11-27 22:45:44.914 | INFO | __main__:score_products:226 - 等待2秒后继续...
|
||||
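The log excerpt above pairs each `call_ollama_for_scoring` INFO line with a SUCCESS line, so per-call latency can be read off the timestamps. The following is a minimal sketch of that calculation, assuming the loguru line layout shown in the log. The names `LOG_LINE` and `call_latencies` are hypothetical helpers and are not part of the repository.

```python
import re
from datetime import datetime
from typing import Iterable, List

# Matches lines such as:
# 2025-11-27 22:36:17.968 | INFO | __main__:call_ollama_for_scoring:139 - ...
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s*\|\s*(\w+)\s*\|\s*"
    r"[\w.]+:(\w+):\d+\s*-\s*(.*)$"
)


def call_latencies(lines: Iterable[str]) -> List[float]:
    """Seconds between each INFO call line and the SUCCESS line that follows it."""
    started = None
    latencies = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Skip separator residue and lines logged by other functions.
        if not match or match.group(3) != "call_ollama_for_scoring":
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        level = match.group(2)
        if level == "INFO":
            started = stamp
        elif level == "SUCCESS" and started is not None:
            latencies.append((stamp - started).total_seconds())
            started = None
    return latencies
```

Feeding the function the lines of the log file yields one latency per scored product. Calls that failed (no SUCCESS line) are simply dropped in this sketch.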
@@ -1,250 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
Product difficulty scoring script
|
||||
Reads the product_analysis table, adds a difficulty_score column, and scores each product via the Ollama API
|
||||
"""
|
||||
|
||||
import sqlite3
|
||||
import os
|
||||
import time
|
||||
from typing import List, Tuple, Optional
|
||||
from loguru import logger
|
||||
import requests
|
||||
import json
|
||||
|
||||
class DifficultyScorer:
|
||||
"""产品难度评分器"""
|
||||
|
||||
def __init__(self, db_path: str = "products.db"):
|
||||
"""
|
||||
初始化评分器
|
||||
|
||||
Args:
|
||||
db_path: 数据库文件路径
|
||||
"""
|
||||
self.db_path = db_path
|
||||
self.api_url = "http://localhost:11434/api/generate"
|
||||
|
||||
# 检查数据库文件是否存在
|
||||
if not os.path.exists(db_path):
|
||||
current_dir_db = os.path.join(os.path.dirname(__file__), db_path)
|
||||
if os.path.exists(current_dir_db):
|
||||
self.db_path = current_dir_db
|
||||
logger.info(f"使用当前目录下的数据库文件: {current_dir_db}")
|
||||
else:
|
||||
raise FileNotFoundError(f"数据库文件不存在: {db_path} 和 {current_dir_db}")
|
||||
|
||||
logger.info(f"初始化产品难度评分器,数据库: {self.db_path}")
|
||||
|
||||
def connect_to_database(self) -> sqlite3.Connection:
|
||||
"""连接到SQLite数据库"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.db_path)
|
||||
logger.success(f"成功连接到数据库: {self.db_path}")
|
||||
return conn
|
||||
except Exception as e:
|
||||
logger.error(f"连接数据库失败: {e}")
|
||||
raise
|
||||
|
||||
def add_difficulty_score_column(self, conn: sqlite3.Connection):
|
||||
"""添加难度评分字段"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 检查字段是否已存在
|
||||
cursor.execute("PRAGMA table_info(product_analysis)")
|
||||
columns = [row[1] for row in cursor.fetchall()]
|
||||
|
||||
if 'difficulty_score' not in columns:
|
||||
cursor.execute("ALTER TABLE product_analysis ADD COLUMN difficulty_score INTEGER")
|
||||
conn.commit()
|
||||
logger.success("成功添加difficulty_score字段")
|
||||
else:
|
||||
logger.info("difficulty_score字段已存在")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"添加难度评分字段失败: {e}")
|
||||
raise
|
||||
|
||||
def get_unscored_products(self, conn: sqlite3.Connection) -> List[Tuple]:
|
||||
"""
|
||||
获取未评分的产品数据
|
||||
|
||||
Args:
|
||||
conn: 数据库连接
|
||||
|
||||
Returns:
|
||||
产品数据列表,每个元素为(id, ai_response)
|
||||
"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 查询未评分的产品
|
||||
cursor.execute("""
|
||||
SELECT id, ai_response
|
||||
FROM product_analysis
|
||||
WHERE difficulty_score IS NULL
|
||||
AND ai_response IS NOT NULL
|
||||
AND ai_response != ''
|
||||
""")
|
||||
|
||||
products = cursor.fetchall()
|
||||
logger.info(f"找到 {len(products)} 个未评分的产品")
|
||||
|
||||
return products
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"获取未评分产品数据失败: {e}")
|
||||
raise
|
||||
|
||||
def call_ollama_for_scoring(self, ai_response: str) -> Optional[int]:
|
||||
"""
|
||||
调用Ollama API进行难度评分
|
||||
|
||||
Args:
|
||||
ai_response: AI响应内容
|
||||
|
||||
Returns:
|
||||
Score (0-100); returns None on failure
|
||||
"""
|
||||
try:
|
||||
# 构建评分提示
|
||||
prompt = f"""
|
||||
请根据以下产品开发难度描述,给出一个0-100分的难度评分:
|
||||
|
||||
难度描述:{ai_response}
|
||||
|
||||
评分标准:
|
||||
- 90-100分:个人开发极其困难,需要大量专业知识和团队协作
|
||||
- 70-89分:相对困难,需要较强的技术能力和较多时间
|
||||
- 50-69分:中等难度,需要一定的技术基础
|
||||
- 30-49分:相对简单,有基础即可开发
|
||||
- 10-29分:非常简单,入门级别
|
||||
- 0-9分:极其简单,几乎无难度
|
||||
|
||||
请只返回一个数字,不要有任何其他文字。
|
||||
"""
|
||||
|
||||
data = {
|
||||
"model": "qwen3:8b",
|
||||
"prompt": prompt.strip(),
|
||||
"stream": False
|
||||
}
|
||||
|
||||
headers = {
|
||||
"Content-Type": "application/json"
|
||||
}
|
||||
|
||||
logger.info("调用Ollama API进行难度评分")
|
||||
|
||||
response = requests.post(
|
||||
self.api_url,
|
||||
headers=headers,
|
||||
data=json.dumps(data, ensure_ascii=False),
|
||||
timeout=60
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
result = response.json()
|
||||
score_text = result.get("response", "").strip()
|
||||
|
||||
# 尝试解析评分
|
||||
try:
|
||||
score = int(score_text)
|
||||
# 确保评分在有效范围内
|
||||
score = max(0, min(100, score))
|
||||
logger.success(f"获得评分: {score}")
|
||||
return score
|
||||
except ValueError:
|
||||
logger.error(f"无法解析评分: {score_text}")
|
||||
return None
|
||||
else:
|
||||
logger.error(f"API调用失败: {response.status_code}, {response.text}")
|
||||
return None
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"调用Ollama API时出错: {e}")
|
||||
return None
|
||||
|
||||
def update_difficulty_score(self, conn: sqlite3.Connection, product_id: int, score: int):
|
||||
"""更新产品难度评分"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute("""
|
||||
UPDATE product_analysis
|
||||
SET difficulty_score = ?
|
||||
WHERE id = ?
|
||||
""", (score, product_id))
|
||||
|
||||
conn.commit()
|
||||
logger.success(f"更新产品ID {product_id} 的难度评分为: {score}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"更新难度评分失败: {e}")
|
||||
raise
|
||||
|
||||
def score_products(self):
|
||||
"""评分所有未评分的产品"""
|
||||
logger.info("开始产品难度评分")
|
||||
|
||||
conn = None
|
||||
try:
|
||||
# 连接数据库
|
||||
conn = self.connect_to_database()
|
||||
|
||||
# 添加难度评分字段
|
||||
self.add_difficulty_score_column(conn)
|
||||
|
||||
# 获取未评分的产品
|
||||
products = self.get_unscored_products(conn)
|
||||
|
||||
if not products:
|
||||
logger.info("没有需要评分的产品")
|
||||
return
|
||||
|
||||
logger.info(f"准备评分 {len(products)} 个产品")
|
||||
|
||||
# 逐个评分
|
||||
success_count = 0
|
||||
for i, (product_id, ai_response) in enumerate(products, 1):
|
||||
logger.info(f"\n评分进度: {i}/{len(products)} - 产品ID: {product_id}")
|
||||
|
||||
# 调用AI进行评分
|
||||
score = self.call_ollama_for_scoring(ai_response)
|
||||
|
||||
if score is not None:
|
||||
# 更新数据库
|
||||
self.update_difficulty_score(conn, product_id, score)
|
||||
success_count += 1
|
||||
logger.success(f"评分完成: {score}分")
|
||||
else:
|
||||
logger.error(f"评分失败: 产品ID {product_id}")
|
||||
|
||||
# Pause between calls to avoid overloading the API
|
||||
logger.info("等待2秒后继续...")
|
||||
time.sleep(2)
|
||||
|
||||
logger.success(f"评分完成! 成功评分 {success_count} 个产品")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"评分过程中出错: {e}")
|
||||
finally:
|
||||
if conn:
|
||||
conn.close()
|
||||
logger.info("数据库连接已关闭")
|
||||
|
||||
def main():
|
||||
"""主函数"""
|
||||
# 配置日志
|
||||
logger.add("difficulty_scorer.log", rotation="10 MB", level="INFO")
|
||||
|
||||
# 创建评分器
|
||||
scorer = DifficultyScorer()
|
||||
|
||||
# 开始评分
|
||||
scorer.score_products()
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
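The scorer above converts the model reply with a bare `int(score_text)`, so any reply that wraps the number in extra text is logged as a parse failure. The following is a minimal sketch of a more tolerant parser. The helper name `parse_score` is hypothetical and not part of the original script, and the stripping of `<think>` blocks is an assumption about how some reasoning models format their replies.

```python
import re
from typing import Optional


def parse_score(text: str) -> Optional[int]:
    """Return the first integer in text, clamped to 0-100, or None if absent."""
    # Assumption: drop any <think>...</think> block before looking for digits.
    cleaned = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    match = re.search(r"-?\d+", cleaned)
    if match is None:
        return None
    # Same clamping as the original: keep the score inside the valid range.
    return max(0, min(100, int(match.group())))
```

Taking the first integer means a reply such as `85/100` is read as 85. A stricter variant could reject replies containing more than one number instead.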
1121
product/integrated_product_system.py
Normal file
File diff suppressed because it is too large
@@ -1,353 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
Full-featured ProductHunt data scraper
|
||||
Uses the specialized functions in playwright-get-data.py to get past Cloudflare challenges
|
||||
"""
|
||||
|
||||
import sqlite3
|
||||
import asyncio
|
||||
import os
|
||||
import argparse
|
||||
from datetime import datetime
|
||||
from loguru import logger
|
||||
from tqdm import tqdm
|
||||
import sys
|
||||
|
||||
# 导入playwright-get-data.py中的功能
|
||||
import importlib.util
|
||||
|
||||
# 动态导入playwright-get-data.py
|
||||
playwright_data_path = os.path.join(os.path.dirname(__file__), "playwright-get-data.py")
|
||||
spec = importlib.util.spec_from_file_location("playwright_get_data", playwright_data_path)
|
||||
playwright_get_data = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(playwright_get_data)
|
||||
ProductHuntScraper = playwright_get_data.ProductHuntScraper
|
||||
|
||||
# 配置日志
|
||||
logger.remove()
|
||||
logger.add(sys.stderr, level="INFO", format="<green>{time:YYYY-MM-DD HH:mm:ss}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>")
|
||||
|
||||
class ProductHuntScraperFull:
|
||||
"""全功能ProductHunt数据抓取器"""
|
||||
|
||||
def __init__(self, tophub_db_path=None, product_db_path=None, debug_port=9222, limit=0, skip_duplicates=True):
|
||||
"""
|
||||
初始化抓取器
|
||||
|
||||
Args:
|
||||
tophub_db_path: tophub数据库路径
|
||||
product_db_path: 产品数据库路径
|
||||
debug_port: Chrome调试端口
|
||||
limit: 抓取链接数量限制
|
||||
skip_duplicates: 是否跳过已存在的URL
|
||||
"""
|
||||
if tophub_db_path:
|
||||
self.tophub_db_path = tophub_db_path
|
||||
else:
|
||||
self.tophub_db_path = os.path.join(os.path.dirname(os.path.dirname(__file__)), "tophub_data.db")
|
||||
|
||||
if product_db_path:
|
||||
self.product_db_path = product_db_path
|
||||
else:
|
||||
self.product_db_path = os.path.join(os.path.dirname(__file__), "products.db")
|
||||
|
||||
self.debug_port = debug_port
|
||||
self.limit = limit
|
||||
self.skip_duplicates = skip_duplicates
|
||||
self.product_urls = []
|
||||
|
||||
def query_producthunt_urls(self, limit=None):
|
||||
"""查询包含producthunt.com的链接"""
|
||||
if limit is None:
|
||||
limit = self.limit
|
||||
|
||||
logger.info(f"正在查询tophub_data.db数据库,限制: {limit}条")
|
||||
|
||||
try:
|
||||
conn = sqlite3.connect(self.tophub_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# Query links containing producthunt.com (no LIMIT clause; the limit argument is logged but not applied)
|
||||
cursor.execute("SELECT url FROM articles WHERE url LIKE '%producthunt.com%'")
|
||||
|
||||
urls = [row[0] for row in cursor.fetchall()]
|
||||
|
||||
conn.close()
|
||||
|
||||
logger.success(f"找到 {len(urls)} 个包含producthunt.com的链接")
|
||||
return urls
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"查询数据库失败: {e}")
|
||||
return []
|
||||
|
||||
def init_product_database(self):
|
||||
"""初始化产品数据库"""
|
||||
logger.info("正在初始化产品数据库...")
|
||||
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 创建产品信息表
|
||||
cursor.execute('''
|
||||
CREATE TABLE IF NOT EXISTS products (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
url TEXT NOT NULL UNIQUE,
|
||||
name TEXT,
|
||||
introduction TEXT,
|
||||
user_count TEXT,
|
||||
maker_link TEXT,
|
||||
maker_statement TEXT,
|
||||
created_at TEXT NOT NULL,
|
||||
updated_at TEXT NOT NULL
|
||||
)
|
||||
''')
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
logger.success("产品数据库初始化完成")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"初始化数据库失败: {e}")
|
||||
|
||||
def check_duplicate(self, url):
|
||||
"""检查URL是否已存在"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute("SELECT COUNT(*) FROM products WHERE url = ?", (url,))
|
||||
count = cursor.fetchone()[0]
|
||||
|
||||
conn.close()
|
||||
return count > 0
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"检查重复失败: {e}")
|
||||
return False
|
||||
|
||||
def save_product_info(self, product_info):
|
||||
"""保存产品信息到数据库"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
current_time = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
|
||||
|
||||
# 检查是否已存在
|
||||
cursor.execute("SELECT id FROM products WHERE url = ?", (product_info['url'],))
|
||||
existing = cursor.fetchone()
|
||||
|
||||
if existing:
|
||||
# 更新现有记录
|
||||
cursor.execute('''
|
||||
UPDATE products SET
|
||||
name = ?, introduction = ?, user_count = ?,
|
||||
maker_link = ?, maker_statement = ?, updated_at = ?
|
||||
WHERE url = ?
|
||||
''', (
|
||||
product_info.get('name'),
|
||||
product_info.get('introduction'),
|
||||
product_info.get('user_count'),
|
||||
product_info.get('maker_link'),
|
||||
product_info.get('maker_statement'),
|
||||
current_time,
|
||||
product_info['url']
|
||||
))
|
||||
logger.info(f"更新产品信息: {product_info.get('name', '未知')}")
|
||||
else:
|
||||
# 插入新记录
|
||||
cursor.execute('''
|
||||
INSERT INTO products
|
||||
(url, name, introduction, user_count, maker_link, maker_statement, created_at, updated_at)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
|
||||
''', (
|
||||
product_info['url'],
|
||||
product_info.get('name'),
|
||||
product_info.get('introduction'),
|
||||
product_info.get('user_count'),
|
||||
product_info.get('maker_link'),
|
||||
product_info.get('maker_statement'),
|
||||
current_time,
|
||||
current_time
|
||||
))
|
||||
logger.info(f"新增产品信息: {product_info.get('name', '未知')}")
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"保存产品信息失败: {e}")
|
||||
return False
|
||||
|
||||
async def scrape_product_info(self, url):
|
||||
"""使用playwright-get-data.py中的专业功能抓取产品信息"""
|
||||
try:
|
||||
logger.info(f"开始抓取: {url}")
|
||||
|
||||
# 创建ProductHuntScraper实例
|
||||
scraper = ProductHuntScraper(debug_port=self.debug_port)
|
||||
|
||||
# 连接到已运行的Chrome实例
|
||||
connected = await scraper.connect_to_existing_chrome()
|
||||
if not connected:
|
||||
logger.error("连接Chrome失败,跳过此URL")
|
||||
return None
|
||||
|
||||
# 导航到ProductHunt页面
|
||||
navigated = await scraper.navigate_to_producthunt(url)
|
||||
if not navigated:
|
||||
logger.error("导航到页面失败,跳过此URL")
|
||||
await scraper.close()
|
||||
return None
|
||||
|
||||
# 提取产品信息
|
||||
product_info = await scraper.extract_product_info()
|
||||
if product_info:
|
||||
product_info['url'] = url
|
||||
logger.success(f"成功提取产品信息: {product_info.get('name', '未知')}")
|
||||
else:
|
||||
logger.error("提取产品信息失败")
|
||||
|
||||
# 关闭连接
|
||||
await scraper.close()
|
||||
|
||||
return product_info
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"抓取产品信息失败: {e}")
|
||||
return None
|
||||
|
||||
async def run_scraping(self, urls=None):
|
||||
"""运行抓取任务"""
|
||||
logger.info("=== 开始ProductHunt数据抓取 ===")
|
||||
|
||||
# 初始化数据库
|
||||
self.init_product_database()
|
||||
|
||||
# 获取要抓取的URL列表
|
||||
if urls is None:
|
||||
self.product_urls = self.query_producthunt_urls()
|
||||
else:
|
||||
self.product_urls = urls
|
||||
|
||||
if not self.product_urls:
|
||||
logger.error("未找到要抓取的ProductHunt链接")
|
||||
return False
|
||||
|
||||
logger.info(f"找到 {len(self.product_urls)} 个ProductHunt链接")
|
||||
|
||||
# 统计抓取结果
|
||||
success_count = 0
|
||||
skip_count = 0
|
||||
error_count = 0
|
||||
|
||||
# 使用进度条显示处理进度
|
||||
with tqdm(total=len(self.product_urls), desc="抓取ProductHunt链接") as pbar:
|
||||
for url in self.product_urls:
|
||||
logger.info(f"处理URL: {url}")
|
||||
|
||||
# 检查是否已存在
|
||||
if self.skip_duplicates and self.check_duplicate(url):
|
||||
logger.info(f"URL已存在,跳过: {url}")
|
||||
skip_count += 1
|
||||
pbar.update(1)
|
||||
continue
|
||||
|
||||
# 抓取产品信息
|
||||
product_info = await self.scrape_product_info(url)
|
||||
|
||||
if product_info:
|
||||
# 保存到数据库
|
||||
success = self.save_product_info(product_info)
|
||||
if success:
|
||||
logger.success(f"成功保存产品信息: {product_info.get('name', '未知')}")
|
||||
success_count += 1
|
||||
else:
|
||||
logger.error(f"保存产品信息失败: {url}")
|
||||
error_count += 1
|
||||
else:
|
||||
logger.error(f"抓取产品信息失败: {url}")
|
||||
error_count += 1
|
||||
|
||||
pbar.update(1)
|
||||
|
||||
# 显示抓取结果统计
|
||||
self.show_scraping_results(success_count, skip_count, error_count)
|
||||
|
||||
logger.success("=== ProductHunt数据抓取完成 ===")
|
||||
return True
|
||||
|
||||
def show_scraping_results(self, success_count, skip_count, error_count):
|
||||
"""显示抓取结果统计"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 统计数据库中的产品数量
|
||||
cursor.execute("SELECT COUNT(*) FROM products")
|
||||
total_count = cursor.fetchone()[0]
|
||||
|
||||
# 获取最新抓取的产品信息
|
||||
cursor.execute("SELECT name, url FROM products ORDER BY updated_at DESC LIMIT 10")
|
||||
recent_products = cursor.fetchall()
|
||||
|
||||
conn.close()
|
||||
|
||||
logger.info("=== 抓取结果统计 ===")
|
||||
logger.info(f"成功抓取: {success_count} 个产品")
|
||||
logger.info(f"跳过重复: {skip_count} 个链接")
|
||||
logger.info(f"抓取失败: {error_count} 个链接")
|
||||
logger.info(f"数据库中的产品总数: {total_count}")
|
||||
|
||||
if recent_products:
|
||||
logger.info("最新抓取的产品:")
|
||||
for name, url in recent_products:
|
||||
logger.info(f" - {name}: {url}")
|
||||
else:
|
||||
logger.info("数据库中暂无产品记录")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"显示抓取结果失败: {e}")
|
||||
|
||||
def parse_arguments():
|
||||
"""解析命令行参数"""
|
||||
parser = argparse.ArgumentParser(description="Full-featured ProductHunt data scraper")
|
||||
parser.add_argument("--tophub-db", help="Path to the tophub database", default=None)
|
||||
parser.add_argument("--product-db", help="Path to the product database", default=None)
|
||||
parser.add_argument("--debug-port", type=int, help="Chrome remote-debugging port", default=9222)
|
||||
parser.add_argument("--limit", type=int, help="Maximum number of links to scrape", default=0)
|
||||
parser.add_argument("--no-skip-duplicates", action="store_true", help="Do not skip duplicate URLs")
|
||||
parser.add_argument("--urls", nargs="+", help="Explicit list of URLs to scrape")
|
||||
parser.add_argument("--log-file", help="Path to the log file", default="producthunt_scraper.log")
|
||||
|
||||
return parser.parse_args()
|
||||
|
||||
async def main():
|
||||
"""主函数"""
|
||||
args = parse_arguments()
|
||||
|
||||
# 配置日志文件输出
|
||||
logger.add(args.log_file, level="INFO", rotation="10 MB")
|
||||
|
||||
# 创建抓取器实例
|
||||
scraper = ProductHuntScraperFull(
|
||||
tophub_db_path=args.tophub_db,
|
||||
product_db_path=args.product_db,
|
||||
debug_port=args.debug_port,
|
||||
limit=args.limit,
|
||||
skip_duplicates=not args.no_skip_duplicates
|
||||
)
|
||||
|
||||
# 运行抓取任务
|
||||
if args.urls:
|
||||
await scraper.run_scraping(urls=args.urls)
|
||||
else:
|
||||
await scraper.run_scraping()
|
||||
|
||||
if __name__ == "__main__":
|
||||
# 运行异步主函数
|
||||
asyncio.run(main())
|
||||
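Review note: in `scrape_product_info` above, `scraper.close()` runs only on some exit paths; it is skipped when the connection attempt fails and when an exception is raised during extraction. A minimal sketch of the same flow with `try`/`finally` follows. `FakeScraper` and `scrape_with_cleanup` are hypothetical names used only for illustration; the four method names are taken from the diff, and the sketch assumes `close()` is safe to call even when connecting failed.

```python
import asyncio


class FakeScraper:
    """Hypothetical stand-in for the real scraper; records whether close() ran."""

    def __init__(self, fail_on_extract=False):
        self.fail_on_extract = fail_on_extract
        self.closed = False

    async def connect_to_existing_chrome(self):
        return True

    async def navigate_to_producthunt(self, url):
        return True

    async def extract_product_info(self):
        if self.fail_on_extract:
            raise RuntimeError("simulated extraction failure")
        return {"name": "Raycast"}

    async def close(self):
        self.closed = True


async def scrape_with_cleanup(scraper, url):
    """Scrape one URL and close the scraper on every exit path."""
    try:
        if not await scraper.connect_to_existing_chrome():
            return None
        if not await scraper.navigate_to_producthunt(url):
            return None
        product_info = await scraper.extract_product_info()
        if product_info:
            product_info["url"] = url
        return product_info
    except Exception:
        # The original logs the error here; the sketch only reports failure.
        return None
    finally:
        # Assumption: close() tolerates being called after a failed connect.
        await scraper.close()
```

With this shape the caller no longer needs a separate `close()` call before each `return`.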
BIN
product/product.db
Normal file
Binary file not shown.
File diff suppressed because it is too large
@@ -1,342 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
Product AI analysis script
|
||||
Reads product records from the SQLite database, analyzes them through the Ollama AI API, and stores the results in a new table
|
||||
"""
|
||||
|
||||
import sqlite3
|
||||
import os
|
||||
import time
|
||||
from typing import List, Tuple, Optional
|
||||
from loguru import logger
|
||||
|
||||
# Ollama API dependencies (HTTP client and JSON serialization)
|
||||
import requests
|
||||
import json
|
||||
|
||||
class ProductAIAnalyzer:
|
||||
"""产品AI分析器"""
|
||||
|
||||
def __init__(self, api_key: str = "", db_path: str = "products.db"):
|
||||
"""
|
||||
初始化分析器
|
||||
|
||||
Args:
|
||||
api_key: API密钥(Ollama不需要,保留参数以保持兼容性)
|
||||
db_path: 数据库文件路径
|
||||
"""
|
||||
self.api_key = api_key
|
||||
self.db_path = db_path
|
||||
self.api_url = "http://localhost:11434/api/generate"
|
||||
|
||||
# 检查数据库文件是否存在,支持相对路径和绝对路径
|
||||
if not os.path.exists(db_path):
|
||||
# 尝试在当前目录下查找
|
||||
current_dir_db = os.path.join(os.path.dirname(__file__), db_path)
|
||||
if os.path.exists(current_dir_db):
|
||||
self.db_path = current_dir_db
|
||||
logger.info(f"使用当前目录下的数据库文件: {current_dir_db}")
|
||||
else:
|
||||
raise FileNotFoundError(f"数据库文件不存在: {db_path} 和 {current_dir_db}")
|
||||
|
||||
logger.info(f"初始化产品AI分析器,数据库: {self.db_path}")
|
||||
|
||||
def connect_to_database(self) -> sqlite3.Connection:
|
||||
"""连接到SQLite数据库"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.db_path)
|
||||
logger.success(f"成功连接到数据库: {self.db_path}")
|
||||
return conn
|
||||
except Exception as e:
|
||||
logger.error(f"连接数据库失败: {e}")
|
||||
raise
|
||||
|
||||
def get_product_data(self, conn: sqlite3.Connection) -> List[Tuple]:
|
||||
"""
|
||||
从数据库获取产品数据
|
||||
|
||||
Args:
|
||||
conn: 数据库连接
|
||||
|
||||
Returns:
|
||||
产品数据列表,每个元素为(id, name, introduction)
|
||||
"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 查询products表中的name和introduction字段
|
||||
cursor.execute("""
|
||||
SELECT id, name, introduction
|
||||
FROM products
|
||||
WHERE name IS NOT NULL AND introduction IS NOT NULL
|
||||
AND name != '' AND introduction != ''
|
||||
""")
|
||||
|
||||
products = cursor.fetchall()
|
||||
logger.info(f"从数据库获取到 {len(products)} 个产品")
|
||||
|
||||
# 显示前几个产品作为示例
|
||||
for i, (id, name, intro) in enumerate(products[:3], 1):
|
||||
logger.info(f"示例产品{i}: ID={id}, 名称='{name}', 简介='{intro[:50]}...'")
|
||||
|
||||
return products
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"获取产品数据失败: {e}")
|
||||
raise
|
||||
|
||||
def call_ollama_ai_api(self, name: str, introduction: str) -> Optional[str]:
|
||||
"""
|
||||
调用Ollama AI API进行分析
|
||||
|
||||
Args:
|
||||
name: 产品名称
|
||||
introduction: 产品简介
|
||||
|
||||
Returns:
|
||||
API响应内容,失败时返回None
|
||||
"""
|
||||
try:
|
||||
# 构建请求数据 - 使用Ollama API格式
|
||||
prompt = f"这个是【{name}】,简介内容是【{introduction}】。请把产品的简介翻译成中文,并返回假设一个人加上AI辅助能否开发这个产品,请详细回答。返回的内容是产品名称/产品简介/开发难度。返回的例子一:notion/这个是笔记产品等等/一个人开发难度较高"
|
||||
|
||||
data = {
|
||||
"model": "qwen3:8b",
|
||||
"prompt": prompt,
|
||||
"stream": False
|
||||
}
|
||||
|
||||
headers = {
|
||||
"Content-Type": "application/json"
|
||||
}
|
||||
|
||||
logger.info(f"调用Ollama AI API分析产品: {name}")
|
||||
|
||||
response = requests.post(
|
||||
self.api_url,
|
||||
headers=headers,
|
||||
data=json.dumps(data, ensure_ascii=False),
|
||||
timeout=60
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
result = response.json()
|
||||
content = result.get("response", "")
|
||||
logger.success(f"API调用成功: {name}")
|
||||
return content
|
||||
else:
|
||||
logger.error(f"API调用失败: {response.status_code}, {response.text}")
|
||||
return None
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"调用Ollama AI API时出错: {e}")
|
||||
return None
|
||||
|
||||
def parse_ai_response(self, response: str) -> Tuple[str, str, str]:
|
||||
"""
|
||||
解析AI响应内容
|
||||
|
||||
Args:
|
||||
response: AI响应内容
|
||||
|
||||
Returns:
|
||||
(产品名称, 产品简介, 开发难度)
|
||||
"""
|
||||
try:
|
||||
# 使用/分割响应内容
|
||||
parts = response.split('/', 2)  # at most three fields; the last keeps any extra '/'
|
||||
|
||||
if len(parts) >= 3:
|
||||
product_name = parts[0].strip()
|
||||
product_intro = parts[1].strip()
|
||||
difficulty = parts[2].strip()
|
||||
|
||||
logger.info(f"解析结果: 名称='{product_name}', 简介='{product_intro[:30]}...', 难度='{difficulty}'")
|
||||
return product_name, product_intro, difficulty
|
||||
else:
|
||||
logger.warning(f"响应格式不符合预期: {response}")
|
||||
# 如果格式不符合,返回原始内容
|
||||
return "", response, ""
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"解析AI响应失败: {e}")
|
||||
return "", response, ""
|
||||
|
||||
def create_analysis_table(self, conn: sqlite3.Connection):
|
||||
"""创建分析结果表"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 创建分析结果表
|
||||
cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS product_analysis (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
original_id INTEGER,
|
||||
original_name TEXT,
|
||||
product_name TEXT,
|
||||
product_intro TEXT,
|
||||
development_difficulty TEXT,
|
||||
ai_response TEXT,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
FOREIGN KEY (original_id) REFERENCES products (id)
|
||||
)
|
||||
""")
|
||||
|
||||
conn.commit()
|
||||
logger.success("创建分析结果表成功")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"创建分析结果表失败: {e}")
|
||||
raise
|
||||
|
||||
def save_analysis_result(self, conn: sqlite3.Connection,
|
||||
original_id: int, original_name: str,
|
||||
product_name: str, product_intro: str,
|
||||
difficulty: str, ai_response: str):
|
||||
"""保存分析结果到数据库"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute("""
|
||||
INSERT INTO product_analysis
|
||||
(original_id, original_name, product_name, product_intro, development_difficulty, ai_response)
|
||||
VALUES (?, ?, ?, ?, ?, ?)
|
||||
""", (original_id, original_name, product_name, product_intro, difficulty, ai_response))
|
||||
|
||||
conn.commit()
|
||||
logger.success(f"保存分析结果成功: {product_name}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"保存分析结果失败: {e}")
|
||||
raise
|
||||
|
||||
def check_product_exists(self, conn: sqlite3.Connection, original_name: str) -> bool:
|
||||
"""
|
||||
检查产品是否已存在于分析结果表中
|
||||
|
||||
Args:
|
||||
conn: 数据库连接
|
||||
original_name: 原始产品名称
|
||||
|
||||
Returns:
|
||||
如果产品已存在返回True,否则返回False
|
||||
"""
|
||||
try:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute("""
|
||||
SELECT COUNT(*) FROM product_analysis
|
||||
WHERE original_name = ?
|
||||
""", (original_name,))
|
||||
|
||||
count = cursor.fetchone()[0]
|
||||
exists = count > 0
|
||||
|
||||
if exists:
|
||||
logger.info(f"产品 '{original_name}' 已存在,跳过分析")
|
||||
|
||||
return exists
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"检查产品存在性失败: {e}")
|
||||
return False
|
||||
|
||||
def analyze_products(self, max_products: Optional[int] = None):
|
||||
"""
|
||||
分析产品数据
|
||||
|
||||
Args:
|
||||
max_products: 最大分析产品数量,None表示分析所有产品
|
||||
"""
|
||||
if max_products is None:
|
||||
logger.info("开始分析所有产品数据")
|
||||
else:
|
||||
logger.info(f"开始分析产品数据,最大数量: {max_products}")
|
||||
|
||||
conn = None
|
||||
try:
|
||||
# 连接数据库
|
||||
conn = self.connect_to_database()
|
||||
|
||||
# 创建分析结果表
|
||||
self.create_analysis_table(conn)
|
||||
|
||||
# 获取产品数据
|
||||
products = self.get_product_data(conn)
|
||||
|
||||
if not products:
|
||||
logger.warning("没有找到可分析的产品数据")
|
||||
return
|
||||
|
||||
# 限制分析数量
|
||||
if max_products is not None:
|
||||
products_to_analyze = products[:max_products]
|
||||
else:
|
||||
products_to_analyze = products
|
||||
|
||||
logger.info(f"准备分析 {len(products_to_analyze)} 个产品")
|
||||
|
||||
# 逐个分析产品
|
||||
success_count = 0
|
||||
skip_count = 0
|
||||
for i, (original_id, name, introduction) in enumerate(products_to_analyze, 1):
|
||||
logger.info(f"\n分析进度: {i}/{len(products_to_analyze)} - {name}")
|
||||
|
||||
# 检查产品是否已存在
|
||||
if self.check_product_exists(conn, name):
|
||||
skip_count += 1
|
||||
logger.info(f"跳过已存在产品,当前进度: {i}/{len(products_to_analyze)}")
|
||||
continue
|
||||
|
||||
# 显示API调用状态
|
||||
logger.info(f"正在提交API请求... 进度: {i}/{len(products_to_analyze)}")
|
||||
|
||||
# 调用AI API
|
||||
ai_response = self.call_ollama_ai_api(name, introduction)
|
||||
|
||||
if ai_response:
|
||||
# 显示数据处理状态
|
||||
logger.info("API call succeeded; processing data...")
|
||||
|
||||
# 解析响应
|
||||
product_name, product_intro, difficulty = self.parse_ai_response(ai_response)
|
||||
|
||||
# 保存结果
|
||||
self.save_analysis_result(conn, original_id, name,
|
||||
product_name, product_intro, difficulty, ai_response)
|
||||
success_count += 1
|
||||
|
||||
# 显示完成状态
|
||||
logger.success(f"产品 '{name}' 分析完成,进度: {i}/{len(products_to_analyze)}")
|
||||
else:
|
||||
logger.error(f"分析失败: {name}")
|
||||
|
||||
# 处理完数据后延时2秒
|
||||
logger.info("数据处理完成,等待2秒后继续...")
|
||||
time.sleep(2)
|
||||
|
||||
logger.success(f"分析完成! 成功分析 {success_count} 个产品,跳过 {skip_count} 个已存在产品")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"分析过程中出错: {e}")
|
||||
finally:
|
||||
if conn:
|
||||
conn.close()
|
||||
logger.info("数据库连接已关闭")
|
||||
|
||||
def main():
|
||||
"""主函数"""
|
||||
# 配置日志
|
||||
logger.add("product_ai_analysis.log", rotation="10 MB", level="INFO")
|
||||
|
||||
# Ollama不需要API密钥
|
||||
api_key = ""
|
||||
|
||||
# 创建分析器
|
||||
analyzer = ProductAIAnalyzer(api_key)
|
||||
|
||||
# 开始分析(默认分析所有产品)
|
||||
analyzer.analyze_products(max_products=None)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
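Review note: `parse_ai_response` above relies on the model answering in the form `name/intro/difficulty`, which breaks as soon as the intro itself contains a `/` (a URL, for instance). One possible alternative, sketched below, takes the name from the first `/` and the difficulty from the last `/`, so slashes inside the intro survive. `parse_slash_fields` is a hypothetical helper, not part of the diff, and it assumes the name and difficulty fields contain no `/`.

```python
from typing import Tuple


def parse_slash_fields(response: str) -> Tuple[str, str, str]:
    """Split 'name/intro/difficulty', keeping any '/' inside the intro."""
    text = response.strip()
    if text.count("/") < 2:
        # Same fallback as the original: keep the raw text as the intro.
        return "", response, ""
    name, rest = text.split("/", 1)          # name ends at the first '/'
    intro, difficulty = rest.rsplit("/", 1)  # difficulty starts at the last '/'
    return name.strip(), intro.strip(), difficulty.strip()
```

The raw response is still stored in `ai_response`, so a mis-split row can always be re-parsed later.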
@@ -1,7 +0,0 @@
|
||||
{
  "name": "Raycast",
  "introduction": "A collection of powerful productivity tools all within an extendable launcher. Fast, ergonomic and reliable.",
  "user_count": "17K followers",
  "maker_link": "https://www.producthunt.com/products/raycast/launches/product-hunt-for-raycast",
  "maker_statement": "Raycast for Windows"
}
|
||||
@@ -1,407 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
ProductHunt data scraper
|
||||
Queries tophub_data.db for links containing producthunt.com, scrapes the product information with Playwright, and saves it to product.db
|
||||
"""
|
||||
|
||||
import sqlite3
|
||||
import asyncio
|
||||
import os
|
||||
from datetime import datetime
|
||||
from loguru import logger
|
||||
from tqdm import tqdm
|
||||
import sys
|
||||
|
||||
# 配置日志
|
||||
logger.remove()
|
||||
logger.add(sys.stderr, level="INFO", format="<green>{time:YYYY-MM-DD HH:mm:ss}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>")
|
||||
|
||||
class ProductHuntScraper:
|
||||
"""ProductHunt数据抓取器"""
|
||||
|
||||
def __init__(self):
|
||||
self.tophub_db_path = os.path.join(os.path.dirname(os.path.dirname(__file__)), "tophub_data.db")
|
||||
self.product_db_path = os.path.join(os.path.dirname(__file__), "product.db")
|
||||
self.product_urls = []
|
||||
|
||||
def query_producthunt_urls(self):
|
||||
"""查询包含producthunt.com的链接"""
|
||||
logger.info("正在查询tophub_data.db数据库...")
|
||||
|
||||
try:
|
||||
conn = sqlite3.connect(self.tophub_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 查询包含producthunt.com的链接
|
||||
cursor.execute("SELECT url FROM articles WHERE url LIKE '%producthunt.com%'")
|
||||
urls = [row[0] for row in cursor.fetchall()]
|
||||
|
||||
conn.close()
|
||||
|
||||
logger.success(f"找到 {len(urls)} 个包含producthunt.com的链接")
|
||||
return urls
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"查询数据库失败: {e}")
|
||||
return []
|
||||
|
||||
def init_product_database(self):
|
||||
"""初始化product.db数据库"""
|
||||
logger.info("正在初始化product.db数据库...")
|
||||
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# 创建产品信息表
|
||||
cursor.execute('''
|
||||
CREATE TABLE IF NOT EXISTS products (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
url TEXT NOT NULL UNIQUE,
|
||||
name TEXT,
|
||||
introduction TEXT,
|
||||
user_count TEXT,
|
||||
maker_link TEXT,
|
||||
maker_statement TEXT,
|
||||
created_at TEXT NOT NULL,
|
||||
updated_at TEXT NOT NULL
|
||||
)
|
||||
''')
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
logger.success("product.db数据库初始化完成")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"初始化数据库失败: {e}")
|
||||
|
||||
def check_duplicate(self, url):
|
||||
"""检查URL是否已存在"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute("SELECT COUNT(*) FROM products WHERE url = ?", (url,))
|
||||
count = cursor.fetchone()[0]
|
||||
|
||||
conn.close()
|
||||
return count > 0
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"检查重复失败: {e}")
|
||||
return False
|
||||
|
||||
def save_product_info(self, product_info):
|
||||
"""保存产品信息到数据库"""
|
||||
try:
|
||||
conn = sqlite3.connect(self.product_db_path)
|
||||
cursor = conn.cursor()
|
||||
|
||||
current_time = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
|
||||
|
||||
# 检查是否已存在
|
||||
cursor.execute("SELECT id FROM products WHERE url = ?", (product_info['url'],))
|
||||
existing = cursor.fetchone()
|
||||
|
||||
if existing:
|
||||
# 更新现有记录
|
||||
cursor.execute('''
|
||||
UPDATE products SET
|
||||
name = ?, introduction = ?, user_count = ?,
|
||||
maker_link = ?, maker_statement = ?, updated_at = ?
|
||||
WHERE url = ?
|
||||
''', (
|
||||
product_info.get('name'),
|
||||
product_info.get('introduction'),
|
||||
product_info.get('user_count'),
|
||||
product_info.get('maker_link'),
|
||||
product_info.get('maker_statement'),
|
||||
current_time,
|
||||
product_info['url']
|
||||
))
|
||||
logger.info(f"更新产品信息: {product_info.get('name', '未知')}")
|
||||
else:
|
||||
# 插入新记录
|
||||
cursor.execute('''
|
||||
INSERT INTO products
|
||||
(url, name, introduction, user_count, maker_link, maker_statement, created_at, updated_at)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
|
||||
''', (
|
||||
product_info['url'],
|
||||
product_info.get('name'),
|
||||
product_info.get('introduction'),
|
||||
product_info.get('user_count'),
|
||||
product_info.get('maker_link'),
|
||||
product_info.get('maker_statement'),
|
||||
current_time,
|
||||
current_time
|
||||
))
|
||||
logger.info(f"新增产品信息: {product_info.get('name', '未知')}")
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"保存产品信息失败: {e}")
|
||||
return False
|
||||
|
||||
async def scrape_product_info(self, url):
|
||||
"""使用Playwright抓取产品信息"""
|
||||
try:
|
||||
# 导入Playwright相关模块
|
||||
from playwright.async_api import async_playwright
|
||||
|
||||
logger.info(f"开始抓取: {url}")
|
||||
|
||||
# 创建Playwright实例
|
||||
playwright = await async_playwright().start()
|
||||
browser = await playwright.chromium.launch(headless=True)
|
||||
page = await browser.new_page()
|
||||
|
||||
# 设置超时时间
|
||||
page.set_default_timeout(120000) # 增加超时时间以处理Cloudflare挑战
|
||||
|
||||
# 导航到页面
|
||||
await page.goto(url, wait_until="domcontentloaded")
|
||||
|
||||
# 检查是否是Cloudflare挑战页面
|
||||
page_title = await page.title()
|
||||
if "请稍候" in page_title or "Checking" in page_title or "Verifying" in page_title:
|
||||
logger.info("检测到Cloudflare挑战页面,等待验证完成...")
|
||||
|
||||
# 等待Cloudflare挑战完成
|
||||
try:
|
||||
# 等待页面标题变化或特定元素出现
|
||||
await page.wait_for_function(
|
||||
"""() => {
|
||||
const title = document.title;
|
||||
return !title.includes('请稍候') &&
|
||||
!title.includes('Checking') &&
|
||||
!title.includes('Verifying') &&
|
||||
title !== '请稍候…';
|
||||
}""",
|
||||
timeout=300000 # 5分钟
|
||||
)
|
||||
logger.info("Cloudflare挑战已完成")
|
||||
except Exception as e:
|
||||
logger.warning(f"等待Cloudflare挑战超时: {e}")
|
||||
|
||||
# 等待页面加载
|
||||
await page.wait_for_timeout(3000)
|
||||
|
||||
product_info = {'url': url}
|
||||
|
||||
# 提取产品名称 - 改进的XPath选择器
|
||||
try:
|
||||
# 尝试多种选择器
|
||||
name_selectors = [
|
||||
"xpath=//h1",
|
||||
"xpath=//h1[@data-test='product-name']",
|
||||
"xpath=//h1[contains(@class, 'text')]",
|
||||
"xpath=//title"
|
||||
]
|
||||
|
||||
for selector in name_selectors:
|
||||
name_element = await page.query_selector(selector)
|
||||
if name_element:
|
||||
name_text = (await name_element.text_content()).strip()
|
||||
# 过滤掉页面标题中的无关内容
|
||||
if name_text and 'Product Hunt' not in name_text and len(name_text) > 5:
|
||||
product_info['name'] = name_text
|
||||
logger.info(f"提取到产品名称: {product_info['name']}")
|
||||
break
|
||||
|
||||
if 'name' not in product_info:
|
||||
logger.warning("未找到有效的产品名称元素")
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"提取产品名称失败: {e}")
|
||||
|
||||
# 提取产品简介 - 改进的XPath选择器
|
||||
try:
|
||||
intro_selectors = [
|
||||
"xpath=//*[@class='relative text-16 font-normal text-gray-700']//div",
|
||||
"xpath=//p[contains(@class, 'description')]",
|
||||
"xpath=//div[contains(@class, 'description')]",
|
||||
"xpath=//meta[@name='description']"
|
||||
]
|
||||
|
||||
for selector in intro_selectors:
|
||||
intro_element = await page.query_selector(selector)
|
||||
if intro_element:
|
||||
intro_text = (await intro_element.text_content()).strip()
|
||||
if intro_text:
|
||||
product_info['introduction'] = intro_text
|
||||
logger.info(f"提取到产品简介: {product_info['introduction'][:100]}...")
|
||||
break
|
||||
|
||||
if 'introduction' not in product_info:
|
||||
logger.warning("未找到产品简介元素")
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"提取产品简介失败: {e}")
|
||||
|
||||
# 提取用户数 - 改进的XPath选择器
|
||||
try:
|
||||
user_count_selectors = [
|
||||
"xpath=//*[@class='flex flex-row gap-2']//div/div[2]/span/p",
|
||||
"xpath=//span[contains(text(), 'users')]",
|
||||
"xpath=//span[contains(text(), 'upvotes')]",
|
||||
"xpath=//div[contains(@class, 'stats')]"
|
||||
]
|
||||
|
||||
for selector in user_count_selectors:
|
||||
user_count_element = await page.query_selector(selector)
|
||||
if user_count_element:
|
||||
user_count_text = (await user_count_element.text_content()).strip()
|
||||
if user_count_text:
|
||||
product_info['user_count'] = user_count_text
|
||||
logger.info(f"提取到用户数: {product_info['user_count']}")
|
||||
break
|
||||
|
||||
if 'user_count' not in product_info:
|
||||
logger.warning("未找到用户数元素")
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"提取用户数失败: {e}")
|
||||
|
||||
# 提取制作人链接 - 改进的XPath选择器
|
||||
try:
|
||||
maker_link_selectors = [
|
||||
"xpath=//span[contains(@class, 'absolute')]",
|
||||
"xpath=//a[contains(@href, 'hunter')]",
|
||||
"xpath=//a[contains(text(), 'hunter')]",
|
||||
"xpath=//a[contains(@class, 'maker')]"
|
||||
]
|
||||
|
||||
for selector in maker_link_selectors:
|
||||
maker_element = await page.query_selector(selector)
|
||||
if maker_element:
|
||||
# For a span match, read the href from the closest enclosing <a>, if any.
# evaluate() returns a plain value (None when there is no <a>), so
# maker_link is always bound and never a JS handle.
if 'span' in selector:
    maker_link = await maker_element.evaluate(
        '(element) => { const a = element.closest("a"); return a ? a.getAttribute("href") : null; }'
    )
else:
    maker_link = await maker_element.get_attribute('href')
|
||||
|
||||
if maker_link and not maker_link.startswith('http'):
|
||||
base_url = "https://www.producthunt.com"
|
||||
if maker_link.startswith('/'):
|
||||
maker_link = base_url + maker_link
|
||||
else:
|
||||
maker_link = base_url + '/' + maker_link
|
||||
|
||||
if maker_link:
|
||||
product_info['maker_link'] = maker_link
|
||||
logger.info(f"提取到制作人链接: {maker_link}")
|
||||
break
|
||||
|
||||
if 'maker_link' not in product_info:
|
||||
logger.warning("未找到制作人链接元素")
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"提取制作人链接失败: {e}")
|
||||
|
||||
# 提取制作人发言(简化版本)
|
||||
try:
|
||||
if product_info.get('maker_link'):
|
||||
# 在新页面中打开制作人链接
|
||||
new_page = await browser.new_page()
|
||||
await new_page.goto(product_info['maker_link'], wait_until="domcontentloaded")
|
||||
await new_page.wait_for_timeout(5000)
|
||||
|
||||
# 尝试多种选择器提取发言内容
|
||||
statement_selectors = [
|
||||
"xpath=//*[@id='comment-4597755']/div/div[2]/div/div/div",
|
||||
"xpath=//div[contains(@class, 'comment')]",
|
||||
"xpath=//p[contains(@class, 'comment')]",
|
||||
"xpath=//article"
|
||||
]
|
||||
|
||||
for selector in statement_selectors:
|
||||
comment_element = await new_page.query_selector(selector)
|
||||
if comment_element:
|
||||
statement_text = (await comment_element.text_content()).strip()
|
||||
if statement_text and len(statement_text) > 10:
|
||||
product_info['maker_statement'] = statement_text
|
||||
logger.info(f"提取到制作人发言: {product_info['maker_statement'][:100]}...")
|
||||
break
|
||||
|
||||
await new_page.close()
|
||||
else:
|
||||
logger.warning("没有制作人链接,跳过提取制作人发言")
|
||||
except Exception as e:
|
||||
logger.warning(f"提取制作人发言失败: {e}")
|
||||
|
||||
# 关闭浏览器
|
||||
await browser.close()
|
||||
await playwright.stop()
|
||||
|
||||
logger.success(f"抓取完成: {product_info.get('name', '未知')}")
|
||||
return product_info
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"抓取产品信息失败: {e}")
|
||||
return {'url': url}
|
||||
|
||||
async def process_urls(self):
|
||||
"""处理所有URL"""
|
||||
# 查询URL
|
||||
self.product_urls = self.query_producthunt_urls()
|
||||
|
||||
if not self.product_urls:
|
||||
logger.warning("未找到包含producthunt.com的链接")
|
||||
return
|
||||
|
||||
# 初始化数据库
|
||||
self.init_product_database()
|
||||
|
||||
logger.info(f"开始处理 {len(self.product_urls)} 个产品链接")
|
||||
|
||||
# 创建进度条
|
||||
with tqdm(total=len(self.product_urls), desc="处理进度") as pbar:
|
||||
for url in self.product_urls:
|
||||
try:
|
||||
# 检查是否已存在
|
||||
if self.check_duplicate(url):
|
||||
logger.info(f"跳过已存在的链接: {url}")
|
||||
pbar.update(1)
|
||||
continue
|
||||
|
||||
# 抓取产品信息
|
||||
product_info = await self.scrape_product_info(url)
|
||||
|
||||
# 保存到数据库
|
||||
if product_info:
|
||||
self.save_product_info(product_info)
|
||||
|
||||
pbar.update(1)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"处理链接失败 {url}: {e}")
|
||||
pbar.update(1)
|
||||
|
||||
def run(self):
|
||||
"""运行主程序"""
|
||||
logger.info("开始ProductHunt数据抓取任务")
|
||||
|
||||
try:
|
||||
# 运行异步任务
|
||||
asyncio.run(self.process_urls())
|
||||
logger.success("任务完成")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"程序执行失败: {e}")
|
||||
|
||||
|
||||
def main():
|
||||
"""主函数"""
|
||||
scraper = ProductHuntScraper()
|
||||
scraper.run()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
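Review note: `save_product_info` above runs a `SELECT` and then either an `UPDATE` or an `INSERT`. Because `url` is declared `UNIQUE`, the same effect can be had in one statement with SQLite's `ON CONFLICT ... DO UPDATE` clause. The sketch below is an assumption about how that could look, not the author's code; `upsert_product`, `SCHEMA`, and `FIELDS` are hypothetical names, and the clause requires SQLite 3.24 or newer.

```python
import sqlite3
from datetime import datetime

# Same columns as the products table created in init_product_database.
SCHEMA = """
CREATE TABLE IF NOT EXISTS products (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE,
    name TEXT,
    introduction TEXT,
    user_count TEXT,
    maker_link TEXT,
    maker_statement TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
)
"""

FIELDS = ("name", "introduction", "user_count", "maker_link", "maker_statement")


def upsert_product(conn: sqlite3.Connection, product_info: dict) -> None:
    """Insert a product row, or update it in place when the URL already exists."""
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    params = {field: product_info.get(field) for field in FIELDS}
    params.update(url=product_info["url"], now=now)
    conn.execute(
        """
        INSERT INTO products
            (url, name, introduction, user_count, maker_link, maker_statement, created_at, updated_at)
        VALUES
            (:url, :name, :introduction, :user_count, :maker_link, :maker_statement, :now, :now)
        ON CONFLICT(url) DO UPDATE SET
            name = excluded.name,
            introduction = excluded.introduction,
            user_count = excluded.user_count,
            maker_link = excluded.maker_link,
            maker_statement = excluded.maker_statement,
            updated_at = excluded.updated_at
        """,
        params,
    )
    conn.commit()
```

On a conflict only `updated_at` changes among the timestamps; `created_at` keeps the value from the first insert, matching the original update branch.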
Binary file not shown.
127
product/run_system.py
Normal file
@@ -0,0 +1,127 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
Run script for the full-featured product system
|
||||
Provides a simplified command-line interface
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import os
|
||||
import sys
|
||||
from loguru import logger
|
||||
|
||||
# 导入主系统
|
||||
from integrated_product_system import IntegratedProductSystem
|
||||
from config import DATABASE_CONFIG, CHROME_CONFIG, AI_CONFIG, SCRAPING_CONFIG, LOGGING_CONFIG, ANALYSIS_CONFIG
|
||||
|
||||
|
||||
def setup_logging(log_file=None, log_level="INFO"):
|
||||
"""设置日志配置"""
|
||||
if log_file is None:
|
||||
log_file = LOGGING_CONFIG['log_file']
|
||||
|
||||
logger.remove()
|
||||
logger.add(sys.stderr, level=log_level, format=LOGGING_CONFIG['log_format'])
|
||||
logger.add(log_file, level=log_level, rotation=LOGGING_CONFIG['log_rotation'])
|
||||
|
||||
logger.info("日志系统初始化完成")
|
||||
|
||||
|
||||
def print_system_info():
|
||||
"""打印系统信息"""
|
||||
logger.info("=== 全功能产品抓取与分析系统 ===")
|
||||
logger.info(f"数据库路径: {DATABASE_CONFIG['product_db_path']}")
|
||||
logger.info(f"Chrome调试端口: {CHROME_CONFIG['debug_port']}")
|
||||
logger.info(f"AI模型: {AI_CONFIG['model']}")
|
||||
logger.info(f"API地址: {AI_CONFIG['api_url']}")
|
||||
logger.info("=" * 40)
|
||||
|
||||
|
||||
async def run_scraping_mode(args):
|
||||
"""运行抓取模式"""
|
||||
logger.info("运行抓取模式...")
|
||||
|
||||
system = IntegratedProductSystem(
|
||||
tophub_db_path=args.tophub_db or DATABASE_CONFIG['tophub_db_path'],
|
||||
product_db_path=args.product_db or DATABASE_CONFIG['product_db_path'],
|
||||
debug_port=args.debug_port or CHROME_CONFIG['debug_port'],
|
||||
limit=args.limit or SCRAPING_CONFIG['default_limit'],
|
||||
skip_duplicates=SCRAPING_CONFIG['skip_duplicates'] and not args.no_skip_duplicates
|
||||
)
|
||||
|
||||
# 初始化数据库
|
||||
system.init_database()
|
||||
|
||||
# 运行抓取
|
||||
await system.run_scraping(urls=args.urls)
|
||||
|
||||
|
||||
async def run_analysis_mode(args):
|
||||
"""运行分析模式"""
|
||||
logger.info("运行分析模式...")
|
||||
|
||||
system = IntegratedProductSystem(
|
||||
product_db_path=args.product_db or DATABASE_CONFIG['product_db_path']
|
||||
)
|
||||
|
||||
# 初始化数据库
|
||||
system.init_database()
|
||||
|
||||
# 运行分析
|
||||
system.analyze_products(max_products=args.max_products)
|
||||
|
||||
|
||||
async def run_full_mode(args):
|
||||
"""运行完整模式(抓取+分析)"""
|
||||
logger.info("运行完整模式(抓取+分析)...")
|
||||
|
||||
system = IntegratedProductSystem(
|
||||
tophub_db_path=args.tophub_db or DATABASE_CONFIG['tophub_db_path'],
|
||||
product_db_path=args.product_db or DATABASE_CONFIG['product_db_path'],
|
||||
debug_port=args.debug_port or CHROME_CONFIG['debug_port'],
|
||||
limit=args.limit or SCRAPING_CONFIG['default_limit'],
|
||||
skip_duplicates=SCRAPING_CONFIG['skip_duplicates'] and not args.no_skip_duplicates
|
||||
)
|
||||
|
||||
# 运行完整工作流程
|
||||
system.run_full_workflow(max_products=args.max_products)
|
||||
|
||||
|
||||
def main():
|
||||
"""主函数"""
|
||||
parser = argparse.ArgumentParser(description="Full-featured product scraping and analysis system")
|
||||
|
||||
# 通用参数
|
||||
parser.add_argument("--mode", choices=["scraping", "analysis", "full"], default="full",
|
||||
help="Run mode: scraping (scrape only), analysis (analyze only), full (scrape + analyze)")
|
||||
parser.add_argument("--tophub-db", help="Path to the tophub database")
|
||||
parser.add_argument("--product-db", help="Path to the product database")
|
||||
parser.add_argument("--debug-port", type=int, help="Chrome remote-debugging port")
|
||||
parser.add_argument("--limit", type=int, help="Maximum number of links to scrape")
|
||||
parser.add_argument("--max-products", type=int, help="Maximum number of products to analyze")
|
||||
parser.add_argument("--log-file", help="Path to the log file")
|
||||
parser.add_argument("--log-level", choices=["DEBUG", "INFO", "WARNING", "ERROR"],
|
||||
default="INFO", help="Log level")
|
||||
parser.add_argument("--no-skip-duplicates", action="store_true", help="Do not skip duplicate URLs")
|
||||
parser.add_argument("--urls", nargs="+", help="Explicit list of URLs to scrape")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# 设置日志
|
||||
setup_logging(args.log_file, args.log_level)
|
||||
|
||||
# 打印系统信息
|
||||
print_system_info()
|
||||
|
||||
# 根据模式运行
|
||||
if args.mode == "scraping":
|
||||
asyncio.run(run_scraping_mode(args))
|
||||
elif args.mode == "analysis":
|
||||
asyncio.run(run_analysis_mode(args))
|
||||
else: # full mode
|
||||
asyncio.run(run_full_mode(args))
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
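Review note: argparse stores the `--no-skip-duplicates` flag defined in `main()` under the attribute `args.no_skip_duplicates`. A small sketch of combining that flag with a configured default follows. `resolve_skip_duplicates` and `build_parser` are hypothetical helpers for illustration, and `config_default` stands in for `SCRAPING_CONFIG['skip_duplicates']`, whose actual value is not shown in this diff.

```python
import argparse


def resolve_skip_duplicates(args: argparse.Namespace, config_default: bool) -> bool:
    """Return False when --no-skip-duplicates was passed, else the configured default."""
    if getattr(args, "no_skip_duplicates", False):
        return False
    return config_default


def build_parser() -> argparse.ArgumentParser:
    """Minimal parser that defines only the flag under discussion."""
    parser = argparse.ArgumentParser(description="skip-duplicates demo")
    parser.add_argument("--no-skip-duplicates", action="store_true")
    return parser
```

Usage would be along the lines of `skip_duplicates=resolve_skip_duplicates(args, SCRAPING_CONFIG['skip_duplicates'])` when constructing `IntegratedProductSystem`.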
@@ -14,9 +14,92 @@ from PySide6.QtWidgets import (QApplication, QMainWindow, QVBoxLayout, QHBoxLayo
|
||||
QWidget, QPushButton, QTableWidget, QTableWidgetItem,
|
||||
QListWidget, QListWidgetItem, QSplitter, QFileDialog,
|
||||
QLabel, QStatusBar, QMessageBox, QHeaderView, QComboBox,
|
||||
-    QLineEdit, QGroupBox)
-from PySide6.QtCore import Qt
-from PySide6.QtGui import QAction
+    QLineEdit, QGroupBox, QTextEdit, QStyledItemDelegate, QMenu,
+    QInputDialog)
+from PySide6.QtCore import Qt, QSize
+from PySide6.QtGui import QAction, QFontMetrics
|
||||
|
||||
|
||||
class MultiLineDelegate(QStyledItemDelegate):
|
||||
"""多行文本委托,支持自动调整行高"""
|
||||
|
||||
def __init__(self, parent=None):
|
||||
super().__init__(parent)
|
||||
self.min_height = 30 # 最小行高
|
||||
self.max_height = 200 # 最大行高
|
||||
|
||||
def paint(self, painter, option, index):
|
||||
"""自定义绘制,支持多行文本"""
|
||||
# 保存原始选项
|
||||
opt = option
|
||||
|
||||
# 获取文本内容
|
||||
text = index.data(Qt.DisplayRole)
|
||||
if text is None:
|
||||
text = ""
|
||||
|
||||
# 设置文本换行
|
||||
text = str(text)
|
||||
|
||||
# 计算文本高度
|
||||
metrics = QFontMetrics(option.font)
|
||||
rect = option.rect
|
||||
|
||||
# 计算需要的行数
|
||||
lines = text.count('\n') + 1
|
||||
line_height = metrics.lineSpacing()
|
||||
text_height = lines * line_height + 10 # 添加一些边距
|
||||
|
||||
# 限制高度在最小和最大值之间
|
||||
if text_height < self.min_height:
|
||||
text_height = self.min_height
|
||||
elif text_height > self.max_height:
|
||||
text_height = self.max_height
|
||||
|
||||
# 调整绘制区域高度
|
||||
opt.rect.setHeight(text_height)
|
||||
|
||||
# 调用父类绘制方法
|
||||
super().paint(painter, opt, index)
|
||||
|
||||
def sizeHint(self, option, index):
|
||||
"""返回建议的单元格大小"""
|
||||
# 获取文本内容
|
||||
text = index.data(Qt.DisplayRole)
|
||||
if text is None:
|
||||
text = ""
|
||||
|
||||
text = str(text)
|
||||
|
||||
# 计算文本尺寸
|
||||
metrics = QFontMetrics(option.font)
|
||||
|
||||
# 计算行数
|
||||
lines = text.count('\n') + 1
|
||||
line_height = metrics.lineSpacing()
|
||||
text_height = lines * line_height + 10 # 添加边距
|
||||
|
||||
# 计算文本宽度(考虑换行)
|
||||
if '\n' in text:
|
||||
# 多行文本,计算最长行的宽度
|
||||
max_width = 0
|
||||
for line in text.split('\n'):
|
||||
line_width = metrics.horizontalAdvance(line) + 20
|
||||
max_width = max(max_width, line_width)
|
||||
else:
|
||||
# 单行文本
|
||||
max_width = metrics.horizontalAdvance(text) + 20
|
||||
|
||||
# 限制高度
|
||||
if text_height < self.min_height:
|
||||
text_height = self.min_height
|
||||
elif text_height > self.max_height:
|
||||
text_height = self.max_height
|
||||
|
||||
# 最小宽度设置为100像素
|
||||
max_width = max(max_width, 100)
|
||||
|
||||
return QSize(max_width, text_height)
|
||||
|
||||
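Review note: the row-height rule used by `MultiLineDelegate` can be checked without Qt. The sketch below is a minimal, hypothetical helper (`clamped_text_height` is not part of the diff) that assumes the same constants as the delegate: 10 px of padding and a 30 to 200 px range. Like the delegate, it counts only explicit line breaks, so lines produced by word wrapping are not included.

```python
def clamped_text_height(text: str, line_spacing: int,
                        min_height: int = 30, max_height: int = 200,
                        padding: int = 10) -> int:
    """Return a row height for `text`, clamped to [min_height, max_height].

    `line_spacing` stands in for QFontMetrics.lineSpacing(); it is passed in
    so the rule can be exercised without a running Qt application.
    """
    lines = text.count("\n") + 1           # explicit line breaks only
    wanted = lines * line_spacing + padding
    return max(min_height, min(wanted, max_height))
```

Keeping the arithmetic in one function would also let `paint` and `sizeHint` share it instead of repeating the clamp.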
|
||||
class SQLiteViewer(QMainWindow):
|
||||
@@ -148,6 +231,28 @@ class SQLiteViewer(QMainWindow):
|
||||
right_layout.addWidget(QLabel("表数据:"))
|
||||
self.data_table = QTableWidget()
|
||||
self.data_table.setAlternatingRowColors(True)
|
||||
|
||||
        # Configure the table for multi-line cells and resizable columns
        self.data_table.setItemDelegate(MultiLineDelegate(self.data_table))
        self.data_table.setWordWrap(True)  # enable word wrap
        self.data_table.setTextElideMode(Qt.ElideNone)  # never elide text

        # Let users drag column headers to reorder them
        header = self.data_table.horizontalHeader()
        header.setSectionsMovable(True)  # columns can be moved
        header.setStretchLastSection(False)  # do not stretch the last column

        # Size rows to their contents.
        # Note: this mode is replaced by the Interactive mode set a few lines below.
        self.data_table.verticalHeader().setSectionResizeMode(QHeaderView.ResizeToContents)

        # Let users resize rows manually by dragging
        self.data_table.verticalHeader().setSectionsMovable(False)  # rows cannot be moved
        self.data_table.verticalHeader().setSectionResizeMode(QHeaderView.Interactive)  # manual row resizing

        # Enable the custom right-click context menu
        self.data_table.setContextMenuPolicy(Qt.CustomContextMenu)
        self.data_table.customContextMenuRequested.connect(self.show_table_context_menu)
|
||||
|
||||
right_layout.addWidget(self.data_table)
|
||||
|
||||
splitter.addWidget(left_widget)
|
||||
@@ -273,11 +378,37 @@ class SQLiteViewer(QMainWindow):
|
||||
# 填充数据
|
||||
for row_idx, row_data in enumerate(data):
|
||||
for col_idx, cell_data in enumerate(row_data):
|
||||
item = QTableWidgetItem(str(cell_data) if cell_data is not None else "")
|
||||
# 处理None值和格式化数据
|
||||
if cell_data is None:
|
||||
display_text = ""
|
||||
elif isinstance(cell_data, (int, float)):
|
||||
# 数字类型保持原样,但转换为字符串
|
||||
display_text = str(cell_data)
|
||||
else:
|
||||
# 文本类型,保留原始格式,包括换行符
|
||||
display_text = str(cell_data)
|
||||
|
||||
item = QTableWidgetItem(display_text)
|
||||
item.setToolTip(display_text) # 添加悬停提示
|
||||
self.data_table.setItem(row_idx, col_idx, item)
|
||||
|
||||
# 调整列宽
|
||||
self.data_table.horizontalHeader().setSectionResizeMode(QHeaderView.ResizeToContents)
|
||||
# 调整列宽 - 使用Interactive模式让用户可以手动调整
|
||||
header = self.data_table.horizontalHeader()
|
||||
header.setSectionResizeMode(QHeaderView.Interactive)
|
||||
|
||||
# 设置初始列宽为内容宽度,但有最大宽度限制
|
||||
for col in range(len(column_names)):
|
||||
# 计算该列内容的最大宽度
|
||||
max_width = 0
|
||||
for row in range(min(100, len(data))): # 只检查前100行,避免性能问题
|
||||
item = self.data_table.item(row, col)
|
||||
if item and item.text():
|
||||
text_width = self.data_table.fontMetrics().horizontalAdvance(item.text()) + 20
|
||||
max_width = max(max_width, text_width)
|
||||
|
||||
# 设置列宽,最小100像素,最大400像素
|
||||
column_width = min(max(max_width, 100), 400)
|
||||
self.data_table.setColumnWidth(col, column_width)
|
||||
|
||||
logger.info(f"加载表 {table_name} 数据完成,共 {len(data)} 行")
|
||||
self.status_bar.showMessage(f"表 {table_name}: {len(data)} 行数据")
|
||||
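Review note: the initial column-width rule in this hunk (sample the first 100 rows, add 20 px of padding, clamp to 100 to 400 px) is repeated in the filter hunk and again in `auto_resize_columns` with a 500 px cap. A minimal sketch of the rule as one pure function follows; `initial_column_width` and its parameter names are hypothetical, and the measured widths are assumed to come from `fontMetrics().horizontalAdvance(...)`.

```python
def initial_column_width(text_widths, padding=20, min_width=100,
                         max_width=400, sample_rows=100):
    """Return a column width from the measured text widths of its cells.

    `text_widths` holds one pixel width per row (0 for an empty cell).
    Only the first `sample_rows` rows are inspected, mirroring the
    performance shortcut in the diff.
    """
    sampled = list(text_widths)[:sample_rows]
    widest = max((w + padding for w in sampled if w > 0), default=0)
    return min(max(widest, min_width), max_width)
```

Passing `max_width=500` would reproduce the variant used by the context-menu action.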
@@ -363,11 +494,37 @@ class SQLiteViewer(QMainWindow):
|
||||
# 填充筛选后的数据
|
||||
for row_idx, row_data in enumerate(data):
|
||||
for col_idx, cell_data in enumerate(row_data):
|
||||
item = QTableWidgetItem(str(cell_data) if cell_data is not None else "")
|
||||
# 处理None值和格式化数据
|
||||
if cell_data is None:
|
||||
display_text = ""
|
||||
elif isinstance(cell_data, (int, float)):
|
||||
# 数字类型保持原样,但转换为字符串
|
||||
display_text = str(cell_data)
|
||||
else:
|
||||
# 文本类型,保留原始格式,包括换行符
|
||||
display_text = str(cell_data)
|
||||
|
||||
item = QTableWidgetItem(display_text)
|
||||
item.setToolTip(display_text) # 添加悬停提示
|
||||
self.data_table.setItem(row_idx, col_idx, item)
|
||||
|
||||
# 调整列宽
|
||||
self.data_table.horizontalHeader().setSectionResizeMode(QHeaderView.ResizeToContents)
|
||||
# 调整列宽 - 使用Interactive模式让用户可以手动调整
|
||||
header = self.data_table.horizontalHeader()
|
||||
header.setSectionResizeMode(QHeaderView.Interactive)
|
||||
|
||||
# 设置初始列宽为内容宽度,但有最大宽度限制
|
||||
for col in range(len(column_names)):
|
||||
# 计算该列内容的最大宽度
|
||||
max_width = 0
|
||||
for row in range(min(100, len(data))): # 只检查前100行,避免性能问题
|
||||
item = self.data_table.item(row, col)
|
||||
if item and item.text():
|
||||
text_width = self.data_table.fontMetrics().horizontalAdvance(item.text()) + 20
|
||||
max_width = max(max_width, text_width)
|
||||
|
||||
# 设置列宽,最小100像素,最大400像素
|
||||
column_width = min(max(max_width, 100), 400)
|
||||
self.data_table.setColumnWidth(col, column_width)
|
||||
|
||||
# 启用清除筛选按钮
|
||||
self.clear_filter_button.setEnabled(True)
|
||||
@@ -413,6 +570,161 @@ class SQLiteViewer(QMainWindow):
|
||||
else:
|
||||
self.load_table_list()
|
||||
|
||||
def show_table_context_menu(self, position):
|
||||
"""显示表格右键菜单"""
|
||||
menu = QMenu()
|
||||
|
||||
# 添加菜单项
|
||||
auto_resize_action = menu.addAction("自动调整列宽")
|
||||
auto_resize_rows_action = menu.addAction("自动调整行高")
|
||||
|
||||
# 添加行高调整子菜单
|
||||
row_height_menu = menu.addMenu("设置行高")
|
||||
increase_height_action = row_height_menu.addAction("增加行高 (+10px)")
|
||||
decrease_height_action = row_height_menu.addAction("减少行高 (-10px)")
|
||||
reset_height_action = row_height_menu.addAction("重置行高为默认值")
|
||||
custom_height_action = row_height_menu.addAction("自定义行高...")
|
||||
|
||||
copy_action = menu.addAction("复制选中内容")
|
||||
|
||||
# 显示菜单
|
||||
        # For scroll-area widgets such as QTableWidget, the position is in viewport coordinates
        action = menu.exec(self.data_table.viewport().mapToGlobal(position))
|
||||
|
||||
if action == auto_resize_action:
|
||||
self.auto_resize_columns()
|
||||
elif action == auto_resize_rows_action:
|
||||
self.auto_resize_rows()
|
||||
elif action == increase_height_action:
|
||||
self.adjust_row_height(10)
|
||||
elif action == decrease_height_action:
|
||||
self.adjust_row_height(-10)
|
||||
elif action == reset_height_action:
|
||||
self.reset_row_height()
|
||||
elif action == custom_height_action:
|
||||
self.set_custom_row_height()
|
||||
elif action == copy_action:
|
||||
self.copy_selected_content()
|
||||
|
||||
def auto_resize_columns(self):
|
||||
"""自动调整所有列宽"""
|
||||
logger.info("自动调整列宽")
|
||||
|
||||
# 遍历所有列
|
||||
for col in range(self.data_table.columnCount()):
|
||||
# 计算该列内容的最大宽度
|
||||
max_width = 0
|
||||
for row in range(min(100, self.data_table.rowCount())): # 只检查前100行
|
||||
item = self.data_table.item(row, col)
|
||||
if item and item.text():
|
||||
text_width = self.data_table.fontMetrics().horizontalAdvance(item.text()) + 20
|
||||
max_width = max(max_width, text_width)
|
||||
|
||||
# 设置列宽,最小100像素,最大500像素
|
||||
column_width = min(max(max_width, 100), 500)
|
||||
self.data_table.setColumnWidth(col, column_width)
|
||||
|
||||
self.status_bar.showMessage("已自动调整列宽")
|
||||
|
||||
def auto_resize_rows(self):
|
||||
"""自动调整所有行高"""
|
||||
logger.info("自动调整行高")
|
||||
|
||||
# 触发重新计算行高
|
||||
self.data_table.resizeRowsToContents()
|
||||
|
||||
self.status_bar.showMessage("已自动调整行高")
|
||||
|
||||
def adjust_row_height(self, delta: int):
|
||||
"""调整选中行的行高"""
|
||||
selected_items = self.data_table.selectedItems()
|
||||
if not selected_items:
|
||||
# 如果没有选中行,调整所有行
|
||||
for row in range(self.data_table.rowCount()):
|
||||
current_height = self.data_table.rowHeight(row)
|
||||
new_height = max(current_height + delta, 20) # 最小行高20像素
|
||||
self.data_table.setRowHeight(row, new_height)
|
||||
self.status_bar.showMessage(f"所有行高已调整 {delta:+d} 像素")
|
||||
else:
|
||||
# 调整选中行
|
||||
selected_rows = set(item.row() for item in selected_items)
|
||||
for row in selected_rows:
|
||||
current_height = self.data_table.rowHeight(row)
|
||||
new_height = max(current_height + delta, 20) # 最小行高20像素
|
||||
self.data_table.setRowHeight(row, new_height)
|
||||
self.status_bar.showMessage(f"已调整 {len(selected_rows)} 行的行高 {delta:+d} 像素")
|
||||
|
||||
def reset_row_height(self):
|
||||
"""重置行高为默认值"""
|
||||
logger.info("重置行高为默认值")
|
||||
|
||||
# 重置为默认行高(30像素)
|
||||
default_height = 30
|
||||
for row in range(self.data_table.rowCount()):
|
||||
self.data_table.setRowHeight(row, default_height)
|
||||
|
||||
self.status_bar.showMessage("行高已重置为默认值")
|
||||
|
||||
def set_custom_row_height(self):
|
||||
"""设置自定义行高"""
|
||||
# 获取当前选中行的行高作为默认值
|
||||
selected_items = self.data_table.selectedItems()
|
||||
if selected_items:
|
||||
current_height = self.data_table.rowHeight(selected_items[0].row())
|
||||
else:
|
||||
current_height = 30
|
||||
|
||||
# 显示输入对话框
|
||||
height, ok = QInputDialog.getInt(
|
||||
self,
|
||||
"设置行高",
|
||||
"请输入行高(像素):",
|
||||
current_height, # 默认值
|
||||
20, # 最小值
|
||||
500 # 最大值
|
||||
)
|
||||
|
||||
if ok:
|
||||
if selected_items:
|
||||
# 设置选中行
|
||||
selected_rows = set(item.row() for item in selected_items)
|
||||
for row in selected_rows:
|
||||
self.data_table.setRowHeight(row, height)
|
||||
self.status_bar.showMessage(f"已设置 {len(selected_rows)} 行的行高为 {height} 像素")
|
||||
else:
|
||||
# 设置所有行
|
||||
for row in range(self.data_table.rowCount()):
|
||||
self.data_table.setRowHeight(row, height)
|
||||
self.status_bar.showMessage(f"所有行高已设置为 {height} 像素")
|
||||
|
||||
def copy_selected_content(self):
|
||||
"""复制选中的内容"""
|
||||
selected_items = self.data_table.selectedItems()
|
||||
if not selected_items:
|
||||
return
|
||||
|
||||
# 按行列组织数据
|
||||
rows = {}
|
||||
for item in selected_items:
|
||||
row = item.row()
|
||||
col = item.column()
|
||||
if row not in rows:
|
||||
rows[row] = {}
|
||||
rows[row][col] = item.text()
|
||||
|
||||
# 构建复制的文本
|
||||
text_lines = []
|
||||
for row in sorted(rows.keys()):
|
||||
row_data = []
|
||||
for col in sorted(rows[row].keys()):
|
||||
row_data.append(rows[row][col])
|
||||
text_lines.append('\t'.join(row_data))
|
||||
|
||||
# 复制到剪贴板
|
||||
clipboard = QApplication.clipboard()
|
||||
clipboard.setText('\n'.join(text_lines))
|
||||
|
||||
self.status_bar.showMessage(f"已复制 {len(selected_items)} 个单元格的内容")
|
||||
|
||||
def closeEvent(self, event):
|
||||
"""关闭事件处理"""
|
||||
logger.info("关闭应用程序")
|
||||
|
||||
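Review note: `copy_selected_content` groups the selected cells by row, orders them by column, and joins them with tabs and newlines. The sketch below restates that logic as a pure function so it can be tested without a clipboard; `cells_to_tsv` is a hypothetical name, and the input is assumed to be `(row, column, text)` tuples taken from the selected items.

```python
def cells_to_tsv(cells):
    """Build tab-separated text from (row, column, text) tuples.

    Rows are emitted in ascending row order and cells in ascending
    column order, regardless of the order in which they were selected.
    """
    rows = {}
    for row, col, text in cells:
        rows.setdefault(row, {})[col] = text
    return "\n".join(
        "\t".join(cols[c] for c in sorted(cols))
        for _, cols in sorted(rows.items())
    )
```

One caveat worth weighing: because the viewer now preserves line breaks inside cells, a multi-line cell copied this way is indistinguishable from several rows once pasted.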
670
product/templates/index.html
Normal file
@@ -0,0 +1,670 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="zh-CN">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>SQLite数据库查看器</title>
|
||||
<style>
|
||||
* {
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
box-sizing: border-box;
|
||||
}
|
||||
|
||||
body {
|
||||
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
|
||||
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
||||
min-height: 100vh;
|
||||
color: #333;
|
||||
}
|
||||
|
||||
.container {
|
||||
width: 100%;
|
||||
margin: 0 auto;
|
||||
padding: 20px;
|
||||
}
|
||||
|
||||
.header {
|
||||
background: rgba(255, 255, 255, 0.95);
|
||||
backdrop-filter: blur(10px);
|
||||
border-radius: 15px;
|
||||
padding: 25px;
|
||||
margin-bottom: 25px;
|
||||
box-shadow: 0 8px 32px rgba(0, 0, 0, 0.1);
|
||||
}
|
||||
|
||||
.header h1 {
|
||||
color: #2c3e50;
|
||||
font-size: 2.5em;
|
||||
margin-bottom: 10px;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
.controls {
|
||||
display: flex;
|
||||
gap: 20px;
|
||||
align-items: center;
|
||||
flex-wrap: wrap;
|
||||
margin-top: 20px;
|
||||
}
|
||||
|
||||
.control-group {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.control-group label {
|
||||
font-weight: 600;
|
||||
color: #34495e;
|
||||
font-size: 0.9em;
|
||||
}
|
||||
|
||||
select, input {
|
||||
padding: 12px 15px;
|
||||
border: 2px solid #e0e6ed;
|
||||
border-radius: 8px;
|
||||
font-size: 14px;
|
||||
transition: all 0.3s ease;
|
||||
background: white;
|
||||
}
|
||||
|
||||
select:focus, input:focus {
|
||||
outline: none;
|
||||
border-color: #667eea;
|
||||
box-shadow: 0 0 0 3px rgba(102, 126, 234, 0.1);
|
||||
}
|
||||
|
||||
.btn {
|
||||
padding: 12px 24px;
|
||||
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
||||
color: white;
|
||||
border: none;
|
||||
border-radius: 8px;
|
||||
cursor: pointer;
|
||||
font-weight: 600;
|
||||
transition: all 0.3s ease;
|
||||
text-decoration: none;
|
||||
display: inline-block;
|
||||
}
|
||||
|
||||
.btn:hover {
|
||||
transform: translateY(-2px);
|
||||
box-shadow: 0 5px 15px rgba(0, 0, 0, 0.2);
|
||||
}
|
||||
|
||||
.data-container {
|
||||
background: rgba(255, 255, 255, 0.95);
|
||||
backdrop-filter: blur(10px);
|
||||
border-radius: 15px;
|
||||
padding: 25px;
|
||||
box-shadow: 0 8px 32px rgba(0, 0, 0, 0.1);
|
||||
}
|
||||
|
||||
.table-info {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
margin-bottom: 20px;
|
||||
padding: 15px;
|
||||
background: #f8f9fa;
|
||||
border-radius: 10px;
|
||||
}
|
||||
|
||||
.table-info h2 {
|
||||
color: #2c3e50;
|
||||
font-size: 1.5em;
|
||||
}
|
||||
|
||||
.stats {
|
||||
display: flex;
|
||||
gap: 20px;
|
||||
font-size: 0.9em;
|
||||
color: #7f8c8d;
|
||||
}
|
||||
|
||||
.table-wrapper {
|
||||
overflow-x: auto;
|
||||
border-radius: 10px;
|
||||
box-shadow: 0 4px 16px rgba(0, 0, 0, 0.1);
|
||||
}
|
||||
|
||||
table {
|
||||
width: 100%;
|
||||
border-collapse: collapse;
|
||||
background: white;
|
||||
font-size: 14px;
|
||||
}
|
||||
|
||||
th {
|
||||
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
||||
color: white;
|
||||
padding: 15px 12px;
|
||||
text-align: left;
|
||||
font-weight: 600;
|
||||
position: sticky;
|
||||
top: 0;
|
||||
z-index: 10;
|
||||
}
|
||||
|
||||
td {
|
||||
padding: 12px;
|
||||
border-bottom: 1px solid #ecf0f1;
|
||||
vertical-align: top;
|
||||
}
|
||||
|
||||
tr:nth-child(even) {
|
||||
background-color: #f8f9fa;
|
||||
}
|
||||
|
||||
tr:hover {
|
||||
background-color: #e3f2fd;
|
||||
transition: background-color 0.3s ease;
|
||||
}
|
||||
|
||||
.multiline-cell {
|
||||
white-space: pre-wrap;
|
||||
line-height: 1.6;
|
||||
max-height: 200px;
|
||||
overflow-y: auto;
|
||||
padding: 8px;
|
||||
background: #fff3cd;
|
||||
border-radius: 6px;
|
||||
border-left: 4px solid #ffc107;
|
||||
}
|
||||
|
||||
.normal-cell {
|
||||
white-space: nowrap;
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
max-width: 300px;
|
||||
}
|
||||
|
||||
.empty-cell {
|
||||
color: #95a5a6;
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.pagination {
|
||||
display: flex;
|
||||
justify-content: center;
|
||||
align-items: center;
|
||||
gap: 10px;
|
||||
margin-top: 25px;
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
|
||||
.page-info {
|
||||
color: #7f8c8d;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.page-btn {
|
||||
padding: 8px 12px;
|
||||
border: 2px solid #e0e6ed;
|
||||
background: white;
|
||||
border-radius: 6px;
|
||||
cursor: pointer;
|
||||
transition: all 0.3s ease;
|
||||
min-width: 40px;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
.page-btn:hover {
|
||||
border-color: #667eea;
|
||||
background: #667eea;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.page-btn.active {
|
||||
background: #667eea;
|
||||
color: white;
|
||||
border-color: #667eea;
|
||||
}
|
||||
|
||||
.page-btn:disabled {
|
||||
opacity: 0.5;
|
||||
cursor: not-allowed;
|
||||
}
|
||||
|
||||
.loading {
|
||||
text-align: center;
|
||||
padding: 40px;
|
||||
color: #7f8c8d;
|
||||
font-size: 1.1em;
|
||||
}
|
||||
|
||||
.error {
|
||||
background: #f8d7da;
|
||||
color: #721c24;
|
||||
padding: 15px;
|
||||
border-radius: 8px;
|
||||
border: 1px solid #f5c6cb;
|
||||
margin: 20px 0;
|
||||
}
|
||||
|
||||
.no-data {
|
||||
text-align: center;
|
||||
padding: 40px;
|
||||
color: #7f8c8d;
|
||||
font-size: 1.1em;
|
||||
}
|
||||
|
||||
.analyze-btn {
|
||||
background: linear-gradient(135deg, #28a745, #20c997);
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 8px 16px;
|
||||
border-radius: 6px;
|
||||
font-size: 1em;
|
||||
cursor: pointer;
|
||||
transition: all 0.3s ease;
|
||||
}
|
||||
|
||||
.analyze-btn:hover {
|
||||
transform: translateY(-1px);
|
||||
box-shadow: 0 4px 8px rgba(40, 167, 69, 0.3);
|
||||
}
|
||||
|
||||
.analyze-btn:disabled {
|
||||
background: #6c757d;
|
||||
cursor: not-allowed;
|
||||
transform: none;
|
||||
box-shadow: none;
|
||||
}
|
||||
|
||||
.progress-container {
|
||||
margin-top: 20px;
|
||||
padding: 15px;
|
||||
background: rgba(255, 255, 255, 0.95);
|
||||
backdrop-filter: blur(10px);
|
||||
border-radius: 12px;
|
||||
box-shadow: 0 4px 20px rgba(0, 0, 0, 0.1);
|
||||
display: none;
|
||||
}
|
||||
|
||||
.progress-bar {
|
||||
width: 100%;
|
||||
height: 20px;
|
||||
background: #e9ecef;
|
||||
border-radius: 10px;
|
||||
overflow: hidden;
|
||||
margin: 10px 0;
|
||||
}
|
||||
|
||||
.progress-fill {
|
||||
height: 100%;
|
||||
background: linear-gradient(90deg, #667eea, #764ba2);
|
||||
transition: width 0.3s ease;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
color: white;
|
||||
font-size: 0.8em;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
@media (max-width: 768px) {
|
||||
.controls {
|
||||
flex-direction: column;
|
||||
align-items: stretch;
|
||||
}
|
||||
|
||||
.table-info {
|
||||
flex-direction: column;
|
||||
gap: 15px;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
.stats {
|
||||
justify-content: center;
|
||||
}
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="container">
|
||||
<div class="header">
|
||||
<h1>🗄️ SQLite数据库查看器</h1>
|
||||
<div class="controls">
|
||||
<div class="control-group">
|
||||
<label for="tableSelect">选择数据表:</label>
|
||||
<select id="tableSelect">
|
||||
<option value="">加载中...</option>
|
||||
</select>
|
||||
</div>
|
||||
<div class="control-group">
|
||||
                    <label for="analyzeScoresBtn">分析:</label>
|
||||
<button id="analyzeScoresBtn" class="analyze-btn">📊 分析缺失分数</button>
|
||||
</div>
|
||||
<div class="control-group">
|
||||
<label for="searchField">筛选字段:</label>
|
||||
<select id="searchField" multiple disabled style="min-height: 80px;">
|
||||
<option value="">所有文本字段</option>
|
||||
</select>
|
||||
</div>
|
||||
<div class="control-group">
|
||||
<label for="searchValue">筛选内容:</label>
|
||||
<input type="text" id="searchValue" placeholder="输入筛选内容..." disabled>
|
||||
</div>
|
||||
<button class="btn" onclick="loadData()">刷新数据</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="data-container">
|
||||
<div class="table-info">
|
||||
<h2 id="tableName">请选择数据表</h2>
|
||||
<div class="stats">
|
||||
<span id="recordCount">记录数: 0</span>
|
||||
<span id="pageInfo">第 0 页,共 0 页</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="dataContainer">
|
||||
<div class="no-data">请选择数据表以查看内容</div>
|
||||
</div>
|
||||
|
||||
<div id="pagination" class="pagination" style="display: none;">
|
||||
<button class="page-btn" onclick="changePage('prev')" id="prevBtn">上一页</button>
|
||||
<span class="page-info" id="pageInfoDetail"></span>
|
||||
<button class="page-btn" onclick="changePage('next')" id="nextBtn">下一页</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="progressSection" class="progress-container">
|
||||
<h3>📊 分数分析进度</h3>
|
||||
<div class="progress-bar">
|
||||
<div id="progressFill" class="progress-fill" style="width: 0%;">0%</div>
|
||||
</div>
|
||||
<p id="progressText">等待分析开始...</p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
let currentTable = '';
|
||||
let currentPage = 1;
|
||||
let perPage = 50;
|
||||
let totalPages = 1;
|
||||
let currentData = null;
|
||||
|
||||
        // Load the table list and wire up event handlers once the DOM is ready
|
||||
document.addEventListener('DOMContentLoaded', function() {
|
||||
loadTables();
|
||||
|
||||
            // Reload the structure and data whenever the selected table changes
|
||||
document.getElementById('tableSelect').addEventListener('change', function() {
|
||||
currentTable = this.value;
|
||||
currentPage = 1;
|
||||
if (currentTable) {
|
||||
loadTableStructure();
|
||||
loadData();
|
||||
}
|
||||
});
|
||||
|
||||
// 分析缺失分数按钮事件
|
||||
document.getElementById('analyzeScoresBtn').addEventListener('click', analyzeMissingScores);
|
||||
});
|
||||
|
||||
// 分析缺失分数
|
||||
async function analyzeMissingScores() {
|
||||
const analyzeBtn = document.getElementById('analyzeScoresBtn');
|
||||
const progressSection = document.getElementById('progressSection');
|
||||
const progressFill = document.getElementById('progressFill');
|
||||
const progressText = document.getElementById('progressText');
|
||||
|
||||
try {
|
||||
// 禁用按钮
|
||||
analyzeBtn.disabled = true;
|
||||
analyzeBtn.textContent = '分析中...';
|
||||
|
||||
// 显示进度条
|
||||
progressSection.style.display = 'block';
|
||||
progressFill.style.width = '0%';
|
||||
progressFill.textContent = '0%';
|
||||
progressText.textContent = '正在启动分析任务...';
|
||||
|
||||
// 启动分析任务
|
||||
const response = await fetch('/api/analyze_missing_scores');
|
||||
const data = await response.json();
|
||||
|
||||
if (data.task_id) {
|
||||
// 定期查询任务状态
|
||||
const interval = setInterval(async () => {
|
||||
try {
|
||||
const statusResponse = await fetch(`/api/update_task_status/${data.task_id}`);
|
||||
const statusData = await statusResponse.json();
|
||||
|
||||
// 更新进度
|
||||
progressFill.style.width = `${statusData.progress}%`;
|
||||
progressFill.textContent = `${statusData.progress}%`;
|
||||
|
||||
if (statusData.status === 'running') {
|
||||
progressText.textContent = `正在分析: ${statusData.completed}/${statusData.total} 个产品`;
|
||||
} else if (statusData.status === 'completed') {
|
||||
progressText.textContent = '🎉 所有缺失分数分析完成!';
|
||||
clearInterval(interval);
|
||||
analyzeBtn.disabled = false;
|
||||
analyzeBtn.textContent = '📊 分析缺失分数';
|
||||
|
||||
// 如果当前正在查看product_analysis表,自动刷新
|
||||
if (currentTable === 'product_analysis') {
|
||||
loadData();
|
||||
}
|
||||
} else if (statusData.status === 'failed') {
|
||||
progressText.textContent = `❌ 分析失败: ${statusData.error}`;
|
||||
clearInterval(interval);
|
||||
analyzeBtn.disabled = false;
|
||||
analyzeBtn.textContent = '📊 分析缺失分数';
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('查询任务状态失败:', error);
|
||||
progressText.textContent = '查询任务状态失败';
|
||||
}
|
||||
}, 2000);
|
||||
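                } else {
                    // Without a task ID there is nothing to poll; let the outer catch reset the button
                    throw new Error('服务器未返回任务ID');
                }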
|
||||
} catch (error) {
|
||||
console.error('启动分析任务失败:', error);
|
||||
progressText.textContent = `启动分析失败: ${error.message}`;
|
||||
analyzeBtn.disabled = false;
|
||||
analyzeBtn.textContent = '📊 分析缺失分数';
|
||||
}
|
||||
}
|
||||
|
||||
document.getElementById('searchField').addEventListener('change', loadData);
|
||||
document.getElementById('searchValue').addEventListener('input', debounce(loadData, 500));
|
||||
|
||||
// 防抖函数
|
||||
function debounce(func, wait) {
|
||||
let timeout;
|
||||
return function executedFunction(...args) {
|
||||
const later = () => {
|
||||
clearTimeout(timeout);
|
||||
func(...args);
|
||||
};
|
||||
clearTimeout(timeout);
|
||||
timeout = setTimeout(later, wait);
|
||||
};
|
||||
}
|
||||
|
||||
// 加载表列表
|
||||
async function loadTables() {
|
||||
try {
|
||||
const response = await fetch('/api/tables');
|
||||
const data = await response.json();
|
||||
const select = document.getElementById('tableSelect');
|
||||
select.innerHTML = '<option value="">选择数据表...</option>';
|
||||
|
||||
data.tables.forEach(table => {
|
||||
const option = document.createElement('option');
|
||||
option.value = table;
|
||||
option.textContent = table;
|
||||
select.appendChild(option);
|
||||
});
|
||||
} catch (error) {
|
||||
console.error('加载表列表失败:', error);
|
||||
showError('加载表列表失败: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
// 加载表结构
|
||||
async function loadTableStructure() {
|
||||
if (!currentTable) return;
|
||||
|
||||
try {
|
||||
                const response = await fetch(`/api/table/${encodeURIComponent(currentTable)}/structure`);
|
||||
const data = await response.json();
|
||||
const searchField = document.getElementById('searchField');
|
||||
|
||||
searchField.innerHTML = '<option value="">所有文本字段</option>';
|
||||
data.structure.forEach(field => {
|
||||
const option = document.createElement('option');
|
||||
option.value = field.name;
|
||||
option.textContent = field.name;
|
||||
searchField.appendChild(option);
|
||||
});
|
||||
|
||||
searchField.disabled = false;
|
||||
document.getElementById('searchValue').disabled = false;
|
||||
} catch (error) {
|
||||
console.error('加载表结构失败:', error);
|
||||
}
|
||||
}
|
||||
|
||||
// 加载数据
|
||||
async function loadData() {
|
||||
if (!currentTable) return;
|
||||
|
||||
const container = document.getElementById('dataContainer');
|
||||
container.innerHTML = '<div class="loading">📊 数据加载中...</div>';
|
||||
|
||||
const searchFieldSelect = document.getElementById('searchField');
|
||||
const searchValue = document.getElementById('searchValue').value;
|
||||
|
||||
try {
|
||||
                let url = `/api/table/${encodeURIComponent(currentTable)}/data?page=${currentPage}&per_page=${perPage}`;
|
||||
if (searchValue) {
|
||||
// 获取所有选中的字段
|
||||
const selectedFields = Array.from(searchFieldSelect.selectedOptions)
|
||||
.map(option => option.value)
|
||||
.filter(value => value !== '');
|
||||
|
||||
if (selectedFields.length > 0) {
|
||||
// 如果选择了特定字段,传递所有选中的字段
|
||||
selectedFields.forEach(field => {
|
||||
url += `&search_field=${encodeURIComponent(field)}`;
|
||||
});
|
||||
} else {
|
||||
// 否则使用"all"表示所有文本字段
|
||||
url += '&search_field=all';
|
||||
}
|
||||
url += `&search_value=${encodeURIComponent(searchValue)}`;
|
||||
}
|
||||
|
||||
const response = await fetch(url);
|
||||
currentData = await response.json();
|
||||
|
||||
displayData(currentData);
|
||||
updatePagination();
|
||||
|
||||
} catch (error) {
|
||||
console.error('加载数据失败:', error);
|
||||
showError('加载数据失败: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
// 显示数据
|
||||
function displayData(data) {
|
||||
const container = document.getElementById('dataContainer');
|
||||
|
||||
if (!data.rows || data.rows.length === 0) {
|
||||
container.innerHTML = '<div class="no-data">📭 没有找到数据</div>';
|
||||
return;
|
||||
}
|
||||
|
||||
let html = '<div class="table-wrapper"><table><thead><tr>';
|
||||
|
||||
// 表头
|
||||
data.columns.forEach(col => {
|
||||
                html += `<th>${escapeHtml(col)}</th>`;
|
||||
});
|
||||
html += '</tr></thead><tbody>';
|
||||
|
||||
// 数据行
|
||||
data.rows.forEach(row => {
|
||||
html += '<tr>';
|
||||
row.forEach((cell, index) => {
|
||||
const colName = data.columns[index];
|
||||
if (cell.type === 'multiline') {
|
||||
html += `<td><div class="multiline-cell">${escapeHtml(cell.value)}</div></td>`;
|
||||
} else if (cell.type === 'empty') {
|
||||
html += '<td><div class="empty-cell">空</div></td>';
|
||||
} else if (colName === 'product_link' && cell.value) {
|
||||
// 渲染为链接
|
||||
html += `<td><div class="normal-cell"><a href="${escapeHtml(cell.value)}" target="_blank" rel="noopener noreferrer">${escapeHtml(cell.value)}</a></div></td>`;
|
||||
} else {
|
||||
html += `<td><div class="normal-cell">${escapeHtml(cell.value)}</div></td>`;
|
||||
}
|
||||
});
|
||||
html += '</tr>';
|
||||
});
|
||||
|
||||
html += '</tbody></table></div>';
|
||||
container.innerHTML = html;
|
||||
|
||||
// 更新统计信息
|
||||
document.getElementById('tableName').textContent = `📋 ${currentTable}`;
|
||||
document.getElementById('recordCount').textContent = `记录数: ${data.total_count}`;
|
||||
document.getElementById('pageInfo').textContent = `第 ${currentPage} 页,共 ${data.total_pages} 页`;
|
||||
}
|
||||
|
||||
// 更新分页
|
||||
function updatePagination() {
|
||||
if (!currentData) return;
|
||||
|
||||
totalPages = currentData.total_pages;
|
||||
const pagination = document.getElementById('pagination');
|
||||
const prevBtn = document.getElementById('prevBtn');
|
||||
const nextBtn = document.getElementById('nextBtn');
|
||||
const pageInfo = document.getElementById('pageInfoDetail');
|
||||
|
||||
if (totalPages <= 1) {
|
||||
pagination.style.display = 'none';
|
||||
return;
|
||||
}
|
||||
|
||||
pagination.style.display = 'flex';
|
||||
|
||||
prevBtn.disabled = currentPage <= 1;
|
||||
nextBtn.disabled = currentPage >= totalPages;
|
||||
|
||||
pageInfo.textContent = `${currentPage} / ${totalPages}`;
|
||||
}
|
||||
|
||||
// 翻页
|
||||
function changePage(direction) {
|
||||
if (direction === 'prev' && currentPage > 1) {
|
||||
currentPage--;
|
||||
loadData();
|
||||
} else if (direction === 'next' && currentPage < totalPages) {
|
||||
currentPage++;
|
||||
loadData();
|
||||
}
|
||||
}
|
||||
|
||||
// HTML转义
|
||||
function escapeHtml(text) {
|
||||
const div = document.createElement('div');
|
||||
div.textContent = text;
|
||||
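            // innerHTML escapes &, < and > only; escape quotes too so the result is safe inside attributes such as href
            return div.innerHTML.replace(/"/g, '&quot;').replace(/'/g, '&#39;');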
|
||||
}
|
||||
|
||||
// 显示错误
|
||||
function showError(message) {
|
||||
const container = document.getElementById('dataContainer');
|
||||
            container.innerHTML = `<div class="error">❌ ${escapeHtml(message)}</div>`;
|
||||
}
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
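Review note: `loadData()` builds its request URL by hand, repeating `search_field` once per selected field and falling back to `all` when none is selected. The sketch below shows the same construction in Python for reference, for example when writing a test client against the endpoint. `build_data_url` is a hypothetical helper; the path and parameter names are taken from the template, and how the server interprets them is an assumption.

```python
from urllib.parse import quote, urlencode


def build_data_url(table, page=1, per_page=50, search_value="", search_fields=()):
    """Build the /api/table/<table>/data URL the page requests."""
    params = [("page", page), ("per_page", per_page)]
    if search_value:
        # Empty selections fall back to "all", as in the template's script
        fields = [f for f in search_fields if f] or ["all"]
        params += [("search_field", f) for f in fields]
        params.append(("search_value", search_value))
    # quote_via=quote encodes spaces as %20, matching encodeURIComponent
    query = urlencode(params, quote_via=quote)
    return f"/api/table/{quote(table, safe='')}/data?{query}"
```

Building the query from a list of pairs keeps repeated keys in order and leaves percent-encoding to the library rather than to string concatenation.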
1138
product/web_sqlite_viewer.py
Normal file
File diff suppressed because it is too large
@@ -1,7 +0,0 @@
|
||||
{
|
||||
"name": "PaletteBrain",
|
||||
"introduction": "PaletteBrain is a macOS application that lets you use ChatGPT with any application by using a shortcut. Create custom templates with your own shortcuts or use the default templates provided.",
|
||||
"user_count": "225 followers",
|
||||
"maker_link": "https://www.producthunt.com/products/raycast",
|
||||
"maker_statement": "Hey Product Hunt 👋If you're new to Raycast: Think of it as your command center for Windows. Hit \"Alt + Space\", type what you need, and go. Launch apps, search files, run extensions like GitHub or Notion, ask AI. All without touching your mouse.Why we built this:We spent the last 5 years building Raycast for Mac. Hundreds of thousands of people now use it daily to cut through the noise. But Windows users deserve the same. Your tools shouldn't get in the way. They should help you focus and get things done. So today, we bring the same experience to your PC.What makes Raycast different:• Custom file search - Windows doesn't have proper indexing, so we built our own. Fast and accurate.• Hundreds of extensions - Control your smart home, search Notion, manage GitHub, translate text, find GIFs. Or build your own!• Free AI during beta - Ask questions to get answers with citations without a subscription needed.• Built for Windows - Familiar keyboard shortcuts, design that fits right in, and you can even launch your favorite games.What's coming next:We'll add AI Chat, Notes, and more features in the months ahead. We ship fast and want your feedback. So please let us know what you miss.Yesterday was Windows' 40th anniversary. It felt like the right moment to launch this. Try it and let us know what you think. What should we build next?"
|
||||
}
|
||||
Binary file not shown.
|
Before Width: | Height: | Size: 477 KiB After Width: | Height: | Size: 717 KiB |
File diff suppressed because it is too large
@@ -1,8 +0,0 @@
|
||||
# SQLite数据库查看器依赖包
|
||||
PySide6>=6.5.0
|
||||
loguru>=0.7.0
|
||||
pyqt-test>=1.0.0
|
||||
|
||||
# 可选:用于GUI测试的额外依赖
|
||||
pytest>=7.0.0
|
||||
pytest-qt>=4.0.0
|
||||
@@ -1,42 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
playwright_stealth usage example
Demonstrates how to run the ProductHunt scraper in stealth mode
"""

import asyncio
from loguru import logger
from product.new_data_stealth import ProductHuntScraper

async def run_stealth_scraper():
    """Run the stealth version of the scraper"""
    logger.info("Starting the stealth version of the ProductHunt scraper")

    # Create the scraper instance
    scraper = ProductHuntScraper()

    # Run the scrape
    success = await scraper.scrape()

    if success:
        logger.success("Stealth scraper finished successfully!")
        logger.info("Generated files:")
        logger.info("- product_info_stealth.json: product information data")
        logger.info("- product_page_stealth.html: page HTML content")
        logger.info("- product_screenshot_stealth.png: page screenshot")
    else:
        logger.error("Stealth scraper failed")

    return success

def main():
    """Main function"""
    logger.info("=== playwright_stealth usage example ===")
    logger.info("This example shows how to use the playwright_stealth module to strengthen the browser's anti-detection capability")

    # Run the async task
    asyncio.run(run_stealth_scraper())

if __name__ == "__main__":
    main()
@@ -1,91 +0,0 @@
2025-11-26 23:10:59.326 | INFO | __main__:__init__:27 - Initializing SQLite database viewer
2025-11-26 23:10:59.327 | INFO | __main__:init_ui:34 - Setting up main window UI
2025-11-26 23:10:59.327 | INFO | __main__:create_top_buttons:64 - Creating top buttons
2025-11-26 23:10:59.327 | INFO | __main__:create_filter_section:87 - Creating filter controls area
2025-11-26 23:10:59.334 | INFO | __main__:create_splitter:132 - Creating splitter UI
2025-11-26 23:10:59.335 | INFO | __main__:create_status_bar:161 - Creating status bar
2025-11-26 23:10:59.335 | INFO | __main__:create_menubar:168 - Creating menu bar
2025-11-26 23:10:59.344 | INFO | __main__:init_ui:60 - UI initialization complete
2025-11-26 23:10:59.494 | INFO | __main__:main:426 - Application startup complete
2025-11-26 23:11:01.008 | INFO | __main__:open_database:184 - Opening database file dialog
2025-11-26 23:11:03.377 | INFO | __main__:open_database:198 - Opening database file: C:/Users/xiaji/Documents/个人文件夹/夏骥/hothub的抓取/product/products.db
2025-11-26 23:11:03.378 | INFO | __main__:connect_to_database:208 - Database connection successful
2025-11-26 23:11:03.379 | INFO | __main__:load_table_list:236 - Loaded 3 tables
2025-11-26 23:11:06.700 | INFO | __main__:on_table_selected:246 - Selected table: product_analysis
2025-11-26 23:11:06.701 | INFO | __main__:load_table_data:282 - Finished loading data for table product_analysis, 5 rows in total
2025-11-26 23:11:06.701 | INFO | __main__:update_field_combo:312 - Updated field dropdown: product_analysis, 8 fields in total
2025-11-26 23:11:49.497 | INFO | __main__:closeEvent:404 - Closing application
2025-11-27 21:17:24.999 | INFO | __main__:__init__:27 - Initializing SQLite database viewer
2025-11-27 21:17:25.000 | INFO | __main__:init_ui:34 - Setting up main window UI
2025-11-27 21:17:25.002 | INFO | __main__:create_top_buttons:64 - Creating top buttons
2025-11-27 21:17:25.007 | INFO | __main__:create_filter_section:87 - Creating filter controls area
2025-11-27 21:17:25.022 | INFO | __main__:create_splitter:132 - Creating splitter UI
2025-11-27 21:17:25.036 | INFO | __main__:create_status_bar:161 - Creating status bar
2025-11-27 21:17:25.038 | INFO | __main__:create_menubar:168 - Creating menu bar
2025-11-27 21:17:25.061 | INFO | __main__:init_ui:60 - UI initialization complete
2025-11-27 21:17:25.250 | INFO | __main__:main:426 - Application startup complete
2025-11-27 21:17:28.396 | INFO | __main__:open_database:184 - Opening database file dialog
2025-11-27 21:17:29.780 | INFO | __main__:open_database:198 - Opening database file: C:/Users/xiaji/Documents/个人文件夹/夏骥/hothub的抓取/tophub_data.db
2025-11-27 21:17:29.786 | INFO | __main__:connect_to_database:208 - Database connection successful
2025-11-27 21:17:29.792 | INFO | __main__:load_table_list:236 - Loaded 2 tables
2025-11-27 21:17:33.220 | INFO | __main__:on_table_selected:246 - Selected table: articles
2025-11-27 21:17:33.539 | INFO | __main__:load_table_data:282 - Finished loading data for table articles, 16942 rows in total
2025-11-27 21:17:33.542 | INFO | __main__:update_field_combo:312 - Updated field dropdown: articles, 8 fields in total
2025-11-27 21:17:39.131 | INFO | __main__:closeEvent:404 - Closing application
2025-11-27 22:10:53.621 | INFO | __main__:__init__:27 - Initializing SQLite database viewer
2025-11-27 22:10:53.622 | INFO | __main__:init_ui:34 - Setting up main window UI
2025-11-27 22:10:53.624 | INFO | __main__:create_top_buttons:64 - Creating top buttons
2025-11-27 22:10:53.629 | INFO | __main__:create_filter_section:87 - Creating filter controls area
2025-11-27 22:10:53.643 | INFO | __main__:create_splitter:132 - Creating splitter UI
2025-11-27 22:10:53.656 | INFO | __main__:create_status_bar:161 - Creating status bar
2025-11-27 22:10:53.658 | INFO | __main__:create_menubar:168 - Creating menu bar
2025-11-27 22:10:53.678 | INFO | __main__:init_ui:60 - UI initialization complete
2025-11-27 22:10:53.867 | INFO | __main__:main:426 - Application startup complete
2025-11-27 22:10:59.059 | INFO | __main__:open_database:184 - Opening database file dialog
2025-11-27 22:11:00.561 | INFO | __main__:open_database:198 - Opening database file: C:/Users/xiaji/Documents/个人文件夹/夏骥/hothub的抓取/tophub_data.db
2025-11-27 22:11:00.562 | INFO | __main__:connect_to_database:208 - Database connection successful
2025-11-27 22:11:00.563 | INFO | __main__:load_table_list:236 - Loaded 2 tables
2025-11-27 22:11:02.513 | INFO | __main__:on_table_selected:246 - Selected table: articles
2025-11-27 22:11:02.853 | INFO | __main__:load_table_data:282 - Finished loading data for table articles, 16942 rows in total
2025-11-27 22:11:02.857 | INFO | __main__:update_field_combo:312 - Updated field dropdown: articles, 8 fields in total
2025-11-27 22:11:19.598 | INFO | __main__:open_database:184 - Opening database file dialog
2025-11-27 22:11:22.448 | INFO | __main__:open_database:198 - Opening database file: C:/Users/xiaji/Documents/个人文件夹/夏骥/hothub的抓取/product/products.db
2025-11-27 22:11:22.449 | INFO | __main__:connect_to_database:208 - Database connection successful
2025-11-27 22:11:22.452 | INFO | __main__:load_table_list:236 - Loaded 3 tables
2025-11-27 22:11:24.895 | INFO | __main__:on_table_selected:246 - Selected table: product_analysis
2025-11-27 22:11:25.053 | INFO | __main__:load_table_data:282 - Finished loading data for table product_analysis, 251 rows in total
2025-11-27 22:11:25.054 | INFO | __main__:update_field_combo:312 - Updated field dropdown: product_analysis, 8 fields in total
2025-11-27 22:14:44.131 | INFO | __main__:closeEvent:404 - Closing application
2025-11-27 22:48:07.339 | INFO | __main__:__init__:27 - Initializing SQLite database viewer
2025-11-27 22:48:07.340 | INFO | __main__:init_ui:34 - Setting up main window UI
2025-11-27 22:48:07.342 | INFO | __main__:create_top_buttons:64 - Creating top buttons
2025-11-27 22:48:07.347 | INFO | __main__:create_filter_section:87 - Creating filter controls area
2025-11-27 22:48:07.363 | INFO | __main__:create_splitter:132 - Creating splitter UI
2025-11-27 22:48:07.375 | INFO | __main__:create_status_bar:161 - Creating status bar
2025-11-27 22:48:07.377 | INFO | __main__:create_menubar:168 - Creating menu bar
2025-11-27 22:48:07.397 | INFO | __main__:init_ui:60 - UI initialization complete
2025-11-27 22:48:07.565 | INFO | __main__:main:426 - Application startup complete
2025-11-27 22:48:08.529 | INFO | __main__:open_database:184 - Opening database file dialog
2025-11-27 22:48:10.594 | INFO | __main__:open_database:198 - Opening database file: C:/Users/xiaji/Documents/个人文件夹/夏骥/hothub的抓取/product/products.db
2025-11-27 22:48:10.595 | INFO | __main__:connect_to_database:208 - Database connection successful
2025-11-27 22:48:10.596 | INFO | __main__:load_table_list:236 - Loaded 3 tables
2025-11-27 22:48:12.872 | INFO | __main__:on_table_selected:246 - Selected table: product_analysis
2025-11-27 22:48:12.882 | INFO | __main__:load_table_data:282 - Finished loading data for table product_analysis, 251 rows in total
2025-11-27 22:48:12.883 | INFO | __main__:update_field_combo:312 - Updated field dropdown: product_analysis, 9 fields in total
2025-11-27 22:49:47.902 | INFO | __main__:apply_filter:365 - Applied filter: difficulty_score LIKE '%<75%', matched 0 rows
2025-11-27 22:50:04.651 | INFO | __main__:apply_filter:365 - Applied filter: difficulty_score LIKE '%difficulty_score<75%', matched 0 rows
2025-11-27 22:50:44.808 | INFO | __main__:closeEvent:404 - Closing application
2025-11-27 22:53:01.583 | INFO | __main__:__init__:27 - Initializing SQLite database viewer
2025-11-27 22:53:01.583 | INFO | __main__:init_ui:34 - Setting up main window UI
2025-11-27 22:53:01.583 | INFO | __main__:create_top_buttons:64 - Creating top buttons
2025-11-27 22:53:01.584 | INFO | __main__:create_filter_section:87 - Creating filter controls area
2025-11-27 22:53:01.590 | INFO | __main__:create_splitter:132 - Creating splitter UI
2025-11-27 22:53:01.591 | INFO | __main__:create_status_bar:161 - Creating status bar
2025-11-27 22:53:01.591 | INFO | __main__:create_menubar:168 - Creating menu bar
2025-11-27 22:53:01.600 | INFO | __main__:init_ui:60 - UI initialization complete
2025-11-27 22:53:01.727 | INFO | __main__:main:440 - Application startup complete
2025-11-27 22:53:03.101 | INFO | __main__:open_database:184 - Opening database file dialog
2025-11-27 22:53:04.822 | INFO | __main__:open_database:198 - Opening database file: C:/Users/xiaji/Documents/个人文件夹/夏骥/hothub的抓取/product/products.db
2025-11-27 22:53:04.823 | INFO | __main__:connect_to_database:208 - Database connection successful
2025-11-27 22:53:04.824 | INFO | __main__:load_table_list:236 - Loaded 3 tables
2025-11-27 22:53:42.968 | INFO | __main__:closeEvent:418 - Closing application
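The two `apply_filter` entries logged at 22:49 and 22:50 show the filter text `<75` being wrapped into a `LIKE '%...%'` pattern. That is a substring match, so it finds 0 rows even if scores below 75 exist. The following is only a sketch of one way to tell a numeric comparison from a substring search. `build_filter`, `count_matches`, and the sample rows are hypothetical and are not taken from the viewer's source.

```python
import re
import sqlite3

# Hypothetical helper (not from the viewer's source): turn filter text into
# a parameterized WHERE clause for a single column.
_COMPARISON = re.compile(r"^\s*(<=|>=|<|>|=)\s*(-?\d+(?:\.\d+)?)\s*$")


def build_filter(column, text):
    """Return (where_clause, params) for the given column and filter text."""
    quoted = '"' + column.replace('"', '""') + '"'  # quote the identifier
    match = _COMPARISON.match(text)
    if match:
        operator, number = match.groups()
        # Numeric comparison such as "<75"; the operator comes from a fixed set.
        return f"{quoted} {operator} ?", [float(number)]
    # Fallback: substring match, as in the logged LIKE filter.
    return f"{quoted} LIKE ?", [f"%{text}%"]


# Assumed sample data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_analysis (name TEXT, difficulty_score INTEGER)")
conn.executemany(
    "INSERT INTO product_analysis VALUES (?, ?)",
    [("a", 60), ("b", 75), ("c", 90)],
)


def count_matches(text):
    where, params = build_filter("difficulty_score", text)
    sql = f"SELECT COUNT(*) FROM product_analysis WHERE {where}"
    return conn.execute(sql, params).fetchone()[0]


# The behaviour seen in the log: the comparison text is searched as a substring.
like_only = conn.execute(
    "SELECT COUNT(*) FROM product_analysis WHERE difficulty_score LIKE ?",
    ("%<75%",),
).fetchone()[0]
```

With the sample rows, `like_only` is 0, while `count_matches("<75")` counts the single row whose score is below 75. Values are always passed as parameters; only the quoted column name and a whitelisted operator are interpolated into the SQL text.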
@@ -1,11 +1,11 @@
 === Product Hunt Product Information ===

-Product name: NoSho.app
+Product name: Greta

-Product description: One simple link for businesses to grow their waitlist and fill last-minute availability with deposits fast. Make the availability you want booked fast visible to customers with one click, stop promoting slots you have hidden in a booking system maze. No more chasing messages or posting Stories that vanish in 24 hours. Just share your NoSho profile, automatically notify customers when new slots are added and get booked securely.
+Product description: Not retrieved

-Maker statement: Not retrieved
+Maker statement: This is first first proposed project. If you want to support Santiago getting his project built, here are the details.https://onemillionlines.com/proj...

-User count: 60 followers
+User count: 664 followers

-Extracted at: 2025-11-27 20:18:48
+Extracted at: 2026-03-08 20:40:13
670
templates/index.html
Normal file
@@ -0,0 +1,670 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>SQLite Database Viewer</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            min-height: 100vh;
            color: #333;
        }

        .container {
            width: 100%;
            margin: 0 auto;
            padding: 20px;
        }

        .header {
            background: rgba(255, 255, 255, 0.95);
            backdrop-filter: blur(10px);
            border-radius: 15px;
            padding: 25px;
            margin-bottom: 25px;
            box-shadow: 0 8px 32px rgba(0, 0, 0, 0.1);
        }

        .header h1 {
            color: #2c3e50;
            font-size: 2.5em;
            margin-bottom: 10px;
            text-align: center;
        }

        .controls {
            display: flex;
            gap: 20px;
            align-items: center;
            flex-wrap: wrap;
            margin-top: 20px;
        }

        .control-group {
            display: flex;
            flex-direction: column;
            gap: 8px;
        }

        .control-group label {
            font-weight: 600;
            color: #34495e;
            font-size: 0.9em;
        }

        select, input {
            padding: 12px 15px;
            border: 2px solid #e0e6ed;
            border-radius: 8px;
            font-size: 14px;
            transition: all 0.3s ease;
            background: white;
        }

        select:focus, input:focus {
            outline: none;
            border-color: #667eea;
            box-shadow: 0 0 0 3px rgba(102, 126, 234, 0.1);
        }

        .btn {
            padding: 12px 24px;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            color: white;
            border: none;
            border-radius: 8px;
            cursor: pointer;
            font-weight: 600;
            transition: all 0.3s ease;
            text-decoration: none;
            display: inline-block;
        }

        .btn:hover {
            transform: translateY(-2px);
            box-shadow: 0 5px 15px rgba(0, 0, 0, 0.2);
        }

        .data-container {
            background: rgba(255, 255, 255, 0.95);
            backdrop-filter: blur(10px);
            border-radius: 15px;
            padding: 25px;
            box-shadow: 0 8px 32px rgba(0, 0, 0, 0.1);
        }

        .table-info {
            display: flex;
            justify-content: space-between;
            align-items: center;
            margin-bottom: 20px;
            padding: 15px;
            background: #f8f9fa;
            border-radius: 10px;
        }

        .table-info h2 {
            color: #2c3e50;
            font-size: 1.5em;
        }

        .stats {
            display: flex;
            gap: 20px;
            font-size: 0.9em;
            color: #7f8c8d;
        }

        .table-wrapper {
            overflow-x: auto;
            border-radius: 10px;
            box-shadow: 0 4px 16px rgba(0, 0, 0, 0.1);
        }

        table {
            width: 100%;
            border-collapse: collapse;
            background: white;
            font-size: 14px;
        }

        th {
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            color: white;
            padding: 15px 12px;
            text-align: left;
            font-weight: 600;
            position: sticky;
            top: 0;
            z-index: 10;
        }

        td {
            padding: 12px;
            border-bottom: 1px solid #ecf0f1;
            vertical-align: top;
        }

        tr:nth-child(even) {
            background-color: #f8f9fa;
        }

        tr:hover {
            background-color: #e3f2fd;
            transition: background-color 0.3s ease;
        }

        .multiline-cell {
            white-space: pre-wrap;
            line-height: 1.6;
            max-height: 200px;
            overflow-y: auto;
            padding: 8px;
            background: #fff3cd;
            border-radius: 6px;
            border-left: 4px solid #ffc107;
        }

        .normal-cell {
            white-space: nowrap;
            overflow: hidden;
            text-overflow: ellipsis;
            max-width: 300px;
        }

        .empty-cell {
            color: #95a5a6;
            font-style: italic;
        }

        .pagination {
            display: flex;
            justify-content: center;
            align-items: center;
            gap: 10px;
            margin-top: 25px;
            flex-wrap: wrap;
        }

        .page-info {
            color: #7f8c8d;
            font-weight: 600;
        }

        .page-btn {
            padding: 8px 12px;
            border: 2px solid #e0e6ed;
            background: white;
            border-radius: 6px;
            cursor: pointer;
            transition: all 0.3s ease;
            min-width: 40px;
            text-align: center;
        }

        .page-btn:hover {
            border-color: #667eea;
            background: #667eea;
            color: white;
        }

        .page-btn.active {
            background: #667eea;
            color: white;
            border-color: #667eea;
        }

        .page-btn:disabled {
            opacity: 0.5;
            cursor: not-allowed;
        }

        .loading {
            text-align: center;
            padding: 40px;
            color: #7f8c8d;
            font-size: 1.1em;
        }

        .error {
            background: #f8d7da;
            color: #721c24;
            padding: 15px;
            border-radius: 8px;
            border: 1px solid #f5c6cb;
            margin: 20px 0;
        }

        .no-data {
            text-align: center;
            padding: 40px;
            color: #7f8c8d;
            font-size: 1.1em;
        }

        .analyze-btn {
            background: linear-gradient(135deg, #28a745, #20c997);
            color: white;
            border: none;
            padding: 8px 16px;
            border-radius: 6px;
            font-size: 1em;
            cursor: pointer;
            transition: all 0.3s ease;
        }

        .analyze-btn:hover {
            transform: translateY(-1px);
            box-shadow: 0 4px 8px rgba(40, 167, 69, 0.3);
        }

        .analyze-btn:disabled {
            background: #6c757d;
            cursor: not-allowed;
            transform: none;
            box-shadow: none;
        }

        .progress-container {
            margin-top: 20px;
            padding: 15px;
            background: rgba(255, 255, 255, 0.95);
            backdrop-filter: blur(10px);
            border-radius: 12px;
            box-shadow: 0 4px 20px rgba(0, 0, 0, 0.1);
            display: none;
        }

        .progress-bar {
            width: 100%;
            height: 20px;
            background: #e9ecef;
            border-radius: 10px;
            overflow: hidden;
            margin: 10px 0;
        }

        .progress-fill {
            height: 100%;
            background: linear-gradient(90deg, #667eea, #764ba2);
            transition: width 0.3s ease;
            display: flex;
            align-items: center;
            justify-content: center;
            color: white;
            font-size: 0.8em;
            font-weight: 600;
        }

        @media (max-width: 768px) {
            .controls {
                flex-direction: column;
                align-items: stretch;
            }

            .table-info {
                flex-direction: column;
                gap: 15px;
                text-align: center;
            }

            .stats {
                justify-content: center;
            }
        }
    </style>
</head>
<body>
    <div class="container">
        <div class="header">
            <h1>🗄️ SQLite Database Viewer</h1>
            <div class="controls">
                <div class="control-group">
                    <label for="tableSelect">Select table:</label>
                    <select id="tableSelect">
                        <option value="">Loading...</option>
                    </select>
                </div>
                <div class="control-group">
                    <label for="analyzeScoresBtn">Analyze:</label>
                    <button id="analyzeScoresBtn" class="analyze-btn">📊 Analyze missing scores</button>
                </div>
                <div class="control-group">
                    <label for="searchField">Filter fields:</label>
                    <select id="searchField" multiple disabled style="min-height: 80px;">
                        <option value="">All text fields</option>
                    </select>
                </div>
                <div class="control-group">
                    <label for="searchValue">Filter text:</label>
                    <input type="text" id="searchValue" placeholder="Enter filter text..." disabled>
                </div>
                <button class="btn" onclick="loadData()">Refresh data</button>
            </div>
        </div>
    </div>

    <div class="data-container">
        <div class="table-info">
            <h2 id="tableName">Select a table</h2>
            <div class="stats">
                <span id="recordCount">Records: 0</span>
                <span id="pageInfo">Page 0 of 0</span>
            </div>
        </div>

        <div id="dataContainer">
            <div class="no-data">Select a table to view its contents</div>
        </div>

        <div id="pagination" class="pagination" style="display: none;">
            <button class="page-btn" onclick="changePage('prev')" id="prevBtn">Previous</button>
            <span class="page-info" id="pageInfoDetail"></span>
            <button class="page-btn" onclick="changePage('next')" id="nextBtn">Next</button>
        </div>
    </div>

    <div id="progressSection" class="progress-container">
        <h3>📊 Score analysis progress</h3>
        <div class="progress-bar">
            <div id="progressFill" class="progress-fill" style="width: 0%;">0%</div>
        </div>
        <p id="progressText">Waiting for analysis to start...</p>
    </div>
    <script>
        let currentTable = '';
        let currentPage = 1;
        let perPage = 50;
        let totalPages = 1;
        let currentData = null;

        // Initialize once the DOM is ready
        document.addEventListener('DOMContentLoaded', function() {
            loadTables();

            // Bind events
            document.getElementById('tableSelect').addEventListener('change', function() {
                currentTable = this.value;
                currentPage = 1;
                if (currentTable) {
                    loadTableStructure();
                    loadData();
                }
            });

            // Click handler for the "analyze missing scores" button
            document.getElementById('analyzeScoresBtn').addEventListener('click', analyzeMissingScores);
        });

        // Analyze missing scores
        async function analyzeMissingScores() {
            const analyzeBtn = document.getElementById('analyzeScoresBtn');
            const progressSection = document.getElementById('progressSection');
            const progressFill = document.getElementById('progressFill');
            const progressText = document.getElementById('progressText');

            try {
                // Disable the button
                analyzeBtn.disabled = true;
                analyzeBtn.textContent = 'Analyzing...';

                // Show the progress bar
                progressSection.style.display = 'block';
                progressFill.style.width = '0%';
                progressFill.textContent = '0%';
                progressText.textContent = 'Starting analysis task...';

                // Start the analysis task
                const response = await fetch('/api/analyze_missing_scores');
                const data = await response.json();

                if (data.task_id) {
                    // Poll the task status periodically
                    const interval = setInterval(async () => {
                        try {
                            const statusResponse = await fetch(`/api/update_task_status/${data.task_id}`);
                            const statusData = await statusResponse.json();

                            // Update progress
                            progressFill.style.width = `${statusData.progress}%`;
                            progressFill.textContent = `${statusData.progress}%`;

                            if (statusData.status === 'running') {
                                progressText.textContent = `Analyzing: ${statusData.completed}/${statusData.total} products`;
                            } else if (statusData.status === 'completed') {
                                progressText.textContent = '🎉 All missing scores have been analyzed!';
                                clearInterval(interval);
                                analyzeBtn.disabled = false;
                                analyzeBtn.textContent = '📊 Analyze missing scores';

                                // Refresh automatically if the product_analysis table is being viewed
                                if (currentTable === 'product_analysis') {
                                    loadData();
                                }
                            } else if (statusData.status === 'failed') {
                                progressText.textContent = `❌ Analysis failed: ${statusData.error}`;
                                clearInterval(interval);
                                analyzeBtn.disabled = false;
                                analyzeBtn.textContent = '📊 Analyze missing scores';
                            }
                        } catch (error) {
                            console.error('Failed to query task status:', error);
                            progressText.textContent = 'Failed to query task status';
                        }
                    }, 2000);
                } else {
                    // Without a task ID there is nothing to poll; let the
                    // outer catch re-enable the button instead of leaving it disabled
                    throw new Error('response did not include a task_id');
                }
            } catch (error) {
                console.error('Failed to start analysis task:', error);
                progressText.textContent = `Failed to start analysis: ${error.message}`;
                analyzeBtn.disabled = false;
                analyzeBtn.textContent = '📊 Analyze missing scores';
            }
        }

        // Reload from the first page whenever the filter changes, so a
        // filtered result with fewer pages is not requested at a stale page number
        function reloadFromFirstPage() {
            currentPage = 1;
            loadData();
        }

        document.getElementById('searchField').addEventListener('change', reloadFromFirstPage);
        document.getElementById('searchValue').addEventListener('input', debounce(reloadFromFirstPage, 500));

        // Debounce helper
        function debounce(func, wait) {
            let timeout;
            return function executedFunction(...args) {
                const later = () => {
                    clearTimeout(timeout);
                    func(...args);
                };
                clearTimeout(timeout);
                timeout = setTimeout(later, wait);
            };
        }

        // Load the table list
        async function loadTables() {
            try {
                const response = await fetch('/api/tables');
                const data = await response.json();
                const select = document.getElementById('tableSelect');
                select.innerHTML = '<option value="">Select a table...</option>';

                data.tables.forEach(table => {
                    const option = document.createElement('option');
                    option.value = table;
                    option.textContent = table;
                    select.appendChild(option);
                });
            } catch (error) {
                console.error('Failed to load table list:', error);
                showError('Failed to load table list: ' + error.message);
            }
        }

        // Load the table structure
        async function loadTableStructure() {
            if (!currentTable) return;

            try {
                const response = await fetch(`/api/table/${currentTable}/structure`);
                const data = await response.json();
                const searchField = document.getElementById('searchField');

                searchField.innerHTML = '<option value="">All text fields</option>';
                data.structure.forEach(field => {
                    const option = document.createElement('option');
                    option.value = field.name;
                    option.textContent = field.name;
                    searchField.appendChild(option);
                });

                searchField.disabled = false;
                document.getElementById('searchValue').disabled = false;
            } catch (error) {
                console.error('Failed to load table structure:', error);
            }
        }

        // Load data
        async function loadData() {
            if (!currentTable) return;

            const container = document.getElementById('dataContainer');
            container.innerHTML = '<div class="loading">📊 Loading data...</div>';

            const searchFieldSelect = document.getElementById('searchField');
            const searchValue = document.getElementById('searchValue').value;

            try {
                let url = `/api/table/${currentTable}/data?page=${currentPage}&per_page=${perPage}`;
                if (searchValue) {
                    // Collect all selected fields
                    const selectedFields = Array.from(searchFieldSelect.selectedOptions)
                        .map(option => option.value)
                        .filter(value => value !== '');

                    if (selectedFields.length > 0) {
                        // Specific fields were chosen: pass each selected field
                        selectedFields.forEach(field => {
                            url += `&search_field=${encodeURIComponent(field)}`;
                        });
                    } else {
                        // Otherwise use "all" to mean every text field
                        url += '&search_field=all';
                    }
                    url += `&search_value=${encodeURIComponent(searchValue)}`;
                }

                const response = await fetch(url);
                currentData = await response.json();

                displayData(currentData);
                updatePagination();

            } catch (error) {
                console.error('Failed to load data:', error);
                showError('Failed to load data: ' + error.message);
            }
        }

        // Render data
        function displayData(data) {
            const container = document.getElementById('dataContainer');

            if (!data.rows || data.rows.length === 0) {
                container.innerHTML = '<div class="no-data">📭 No data found</div>';
                return;
            }

            let html = '<div class="table-wrapper"><table><thead><tr>';

            // Table header (column names are escaped like cell values)
            data.columns.forEach(col => {
                html += `<th>${escapeHtml(col)}</th>`;
            });
            html += '</tr></thead><tbody>';

            // Data rows
            data.rows.forEach(row => {
                html += '<tr>';
                row.forEach((cell, index) => {
                    const colName = data.columns[index];
                    if (cell.type === 'multiline') {
                        html += `<td><div class="multiline-cell">${escapeHtml(cell.value)}</div></td>`;
                    } else if (cell.type === 'empty') {
                        html += '<td><div class="empty-cell">empty</div></td>';
                    } else if (colName === 'product_link' && cell.value) {
                        // Render as a link
                        html += `<td><div class="normal-cell"><a href="${escapeHtml(cell.value)}" target="_blank" rel="noopener noreferrer">${escapeHtml(cell.value)}</a></div></td>`;
                    } else {
                        html += `<td><div class="normal-cell">${escapeHtml(cell.value)}</div></td>`;
                    }
                });
                html += '</tr>';
            });

            html += '</tbody></table></div>';
            container.innerHTML = html;

            // Update the summary information
            document.getElementById('tableName').textContent = `📋 ${currentTable}`;
            document.getElementById('recordCount').textContent = `Records: ${data.total_count}`;
            document.getElementById('pageInfo').textContent = `Page ${currentPage} of ${data.total_pages}`;
        }

        // Update pagination
        function updatePagination() {
            if (!currentData) return;

            totalPages = currentData.total_pages;
            const pagination = document.getElementById('pagination');
            const prevBtn = document.getElementById('prevBtn');
            const nextBtn = document.getElementById('nextBtn');
            const pageInfo = document.getElementById('pageInfoDetail');

            if (totalPages <= 1) {
                pagination.style.display = 'none';
                return;
            }

            pagination.style.display = 'flex';

            prevBtn.disabled = currentPage <= 1;
            nextBtn.disabled = currentPage >= totalPages;

            pageInfo.textContent = `${currentPage} / ${totalPages}`;
        }

        // Change page
        function changePage(direction) {
            if (direction === 'prev' && currentPage > 1) {
                currentPage--;
                loadData();
            } else if (direction === 'next' && currentPage < totalPages) {
                currentPage++;
                loadData();
            }
        }

        // HTML escaping
        function escapeHtml(text) {
            const div = document.createElement('div');
            div.textContent = text;
            return div.innerHTML;
        }

        // Show an error (the message is escaped before being inserted as HTML)
        function showError(message) {
            const container = document.getElementById('dataContainer');
            container.innerHTML = `<div class="error">❌ ${escapeHtml(message)}</div>`;
        }
    </script>
</body>
</html>
File diff suppressed because it is too large
@@ -205,9 +205,13 @@ def process_temp_files():
                 continue

             # Process each article
-            for article in tqdm(articles, desc=f"Processing {file_path}"):
+            for i, article in tqdm(enumerate(articles), desc=f"Processing {file_path}", total=len(articles)):
                 total_processed += 1

+                # Log progress every 10 articles
+                if i % 10 == 0 and i > 0:
+                    logger.info(f"Processed {i}/{len(articles)} articles, {i/len(articles)*100:.1f}% complete")
+
                 # Check for duplicates
                 if check_duplicate(article['title'], source_date):
                     logger.info(f"Skipping duplicate article (already present within the last three days): {article['title']}")
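The hunk above logs progress whenever the 0-based index is a positive multiple of 10, so the last partial batch is never reported. As a rough sketch of the same idea, the generator below counts from 1 and also reports the final item. `with_progress` and its `log` parameter are hypothetical names for illustration, and `tqdm` is left out so the example runs with the standard library only.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def with_progress(
    items: Iterable[T],
    total: int,
    every: int = 10,
    log: Callable[[str], None] = print,
) -> Iterator[T]:
    """Yield items unchanged and report progress after every `every` items."""
    for done, item in enumerate(items, start=1):
        yield item
        # This runs when the consumer asks for the next item, that is,
        # after the current one has been handled.
        if done % every == 0 or done == total:
            log(f"Processed {done}/{total} articles ({done / total * 100:.1f}%)")


# Usage with made-up data: collect the messages instead of printing them.
messages = []
articles = [f"article-{n}" for n in range(25)]
for article in with_progress(articles, total=len(articles), log=messages.append):
    pass  # per-article work would go here
```

With 25 items and the default interval, three messages are produced: after items 10, 20, and 25.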
BIN
tophub_data.db
Binary file not shown.
64902
tophub_scraper.log
File diff suppressed because it is too large
@@ -262,7 +262,7 @@ class TopHubScraper:
         # Read the output in real time to avoid encoding problems
         try:
-            stdout, stderr = process.communicate(timeout=300)  # 5-minute timeout
+            stdout, stderr = process.communicate(timeout=3600)  # 1-hour timeout
         except subprocess.TimeoutExpired:
             process.kill()
             logger.error("tophub_add_data_to_db.py timed out")
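The hunk above shows only `process.kill()` and a log call in the timeout branch; whatever follows is not visible in this diff. The Python documentation for `Popen.communicate` notes that the child is not killed automatically when the timeout expires, and that a well-behaved application should kill it and then finish communication. A minimal sketch of that pattern is below. `run_with_timeout` is a hypothetical helper and the commands are stand-ins, not the scraper's real invocation.

```python
import subprocess
import sys
from typing import List, Tuple


def run_with_timeout(cmd: List[str], timeout: float) -> Tuple[int, str, str, bool]:
    """Run cmd and return (returncode, stdout, stderr, timed_out)."""
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        stdout, stderr = process.communicate(timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        process.kill()
        # After kill(), call communicate() again to reap the child and
        # collect any output it produced before it was stopped.
        stdout, stderr = process.communicate()
        timed_out = True
    return process.returncode, stdout, stderr, timed_out


# Stand-in commands: one finishes quickly, one outlives a short timeout.
fast = run_with_timeout([sys.executable, "-c", "print('done')"], timeout=30)
slow = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5
)
```

Here `fast` completes normally, while `slow` is killed after half a second and reported with `timed_out` set to `True`.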
@@ -287,6 +287,8 @@ class TopHubScraper:
 if __name__ == "__main__":
     scraper = TopHubScraper()
+
+
     try:
         # Scrape the data
         scraped_data = scraper.scrape_by_node_ids()