Initial commit.

Scraping is done by tophub_scraper.py, loading data into the database by tophub_add_data_to_db.py, and viewing the current data by db_viewer.py.
2025-11-09 17:20:44 +08:00
commit 25da264413
29 changed files with 28508 additions and 0 deletions
--- a/2025年11月9日131545.txt
+++ b/2025年11月9日131545.txt
--- a/README.md
+++ b/README.md
@@ -0,0 +1,149 @@
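 # TopHub Data Processing System
 This project processes the temporary files scraped from the TopHub website, classifies the entries, and stores them in a SQLite database.
 ## Features
 1. **File parsing**: reads temporary files (named "date+time.txt") and treats every 5 lines as one data unit
 2. **Data extraction**: extracts the title and link from each data unit
 3. **Smart classification**: calls a local API (Ollama) to classify titles automatically
 4. **Deduplication**: checks whether the title + date pair already exists in the database to avoid duplicate entries
 5. **Progress display**: shows processing progress with a progress bar
 6. **Category standardization**: merges similar categories into standard categories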
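 ## Files
 ### Core scripts
 1. **process_temp_files.py** - main processing script
   - Parses the temporary files
   - Calls the API for classification
   - Stores the results in the database
 2. **cleanup_categories.py** - category cleanup script
   - Removes special characters from categories
   - Unifies the category format
 3. **standardize_categories.py** - category standardization script
   - Merges similar categories into standard categories
   - Provides the category mapping rules
 ### Helper scripts
 1. **check_db.py** - database structure check script
 2. **test_api.py** - API test script
 3. **view_categories.py** - script for viewing category examples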
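 ## Usage
 ### 1. Process temporary files
 ```bash
 python process_temp_files.py
 ```
 The script will:
 - Scan the current directory for all temporary files (named "date+time.txt")
 - Parse the file contents and extract titles and links
 - Call the local API to classify each title
 - Check for and skip duplicate data
 - Store the results in the tophub_data.db database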
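
The duplicate check in the steps above could look roughly like this. It is a minimal sketch, not the project's actual code: it assumes the `tophub_entries` table and column names listed in this README's database structure section, and the helper names `insert_if_new` and `demo_dedup` are hypothetical.

```python
import sqlite3


def insert_if_new(conn, title, link, category, scrape_date):
    """Insert one entry unless the same title + date is already stored.

    Hypothetical helper; table and column names follow the README schema.
    Returns True when a row was inserted, False when it was a duplicate.
    """
    cur = conn.execute(
        "SELECT 1 FROM tophub_entries WHERE text_content = ? AND scrape_time = ?",
        (title, scrape_date),
    )
    if cur.fetchone() is not None:
        return False
    conn.execute(
        "INSERT INTO tophub_entries (text_content, link, category, scrape_time) "
        "VALUES (?, ?, ?, ?)",
        (title, link, category, scrape_date),
    )
    conn.commit()
    return True


def demo_dedup():
    """Insert the same entry twice into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tophub_entries ("
        "id INTEGER PRIMARY KEY, text_content TEXT NOT NULL, "
        "link TEXT, category TEXT, scrape_time TEXT)"
    )
    entry = ("女机器人", "http://club.kdslife.com/t_11502693.html", "科技", "2025-11-09")
    first = insert_if_new(conn, *entry)
    second = insert_if_new(conn, *entry)
    count = conn.execute("SELECT COUNT(*) FROM tophub_entries").fetchone()[0]
    conn.close()
    return first, second, count
```

A unique index on `(text_content, scrape_time)` combined with `INSERT OR IGNORE` would be an alternative that pushes the check into SQLite itself.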
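 ### 2. Clean up and standardize categories
 ```bash
 # Remove special characters from categories
 python cleanup_categories.py
 # Standardize categories
 python standardize_categories.py
 ```
 ### 3. View data
 ```bash
 # View category examples
 python view_categories.py
 # Check the database structure
 python check_db.py
 ```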
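 ## Database Structure
 The database file is `tophub_data.db`, and it contains the following tables:
 1. **tophub_entries** - main data table
   - id: primary key
   - text_content: title text (not null)
   - link: link
   - category: category
   - scrape_time: scrape time
 2. **classification_progress** - classification progress table
   - id: primary key
   - total_count: total count
   - processed_count: processed count
   - last_updated: last update time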
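
As an illustration, the two tables could be created as shown below. The column names come from the list above; the column types, `AUTOINCREMENT`, and the helper names `create_schema` and `list_columns` are assumptions, since the README does not state them.

```python
import sqlite3

# Column names follow the README; the types are assumed for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tophub_entries (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    text_content TEXT NOT NULL,
    link TEXT,
    category TEXT,
    scrape_time TEXT
);
CREATE TABLE IF NOT EXISTS classification_progress (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    total_count INTEGER,
    processed_count INTEGER,
    last_updated TEXT
);
"""


def create_schema(conn):
    """Create both tables if they do not exist yet."""
    conn.executescript(SCHEMA)


def list_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Calling `create_schema(sqlite3.connect("tophub_data.db"))` is safe to repeat because of `IF NOT EXISTS`.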
 ## API Configuration
 The scripts use the local Ollama API for classification:
 - API endpoint: http://localhost:11434/api/generate
 - Model: gemma3:4b
 - Request format: JSON
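
A request body for this endpoint could be built and its reply parsed as sketched below. The `model`/`prompt`/`stream` fields and the `response` key mirror what `db_modify.py` in this commit sends and reads; the prompt wording and the function names are hypothetical.

```python
import json

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gemma3:4b"


def build_classify_request(title):
    """Build the JSON payload for a non-streaming /api/generate call."""
    # Hypothetical prompt; the real script's wording is not shown in the README.
    prompt = (
        "Classify the following title into one category "
        f"and reply with the category name only: {title}"
    )
    return {"model": MODEL, "prompt": prompt, "stream": False}


def parse_classify_response(body):
    """Extract the generated text from the JSON body of a non-streaming reply."""
    data = json.loads(body)
    return data.get("response", "").strip()
```

The payload would be sent with something like `requests.post(OLLAMA_URL, json=payload, timeout=30)`, as `db_modify.py` does, and `response.text` passed to `parse_classify_response`.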
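 ## Standard Categories
 The system supports the following standard categories:
 1. 科技 (Technology) - emerging technology, internet, etc.
 2. 社会 (Society) - social news, everyday services, etc.
 3. 体育 (Sports) - sports news, football, etc.
 4. 历史 (History) - historical events, historical figures, etc.
 5. 安全 (Security) - security vulnerabilities, security technology, etc.
 6. 军事 (Military) - military news, national defense, etc.
 7. 金融 (Finance) - financial news, market analysis, etc.
 8. 购物 (Shopping) - e-commerce, shopping, etc.
 9. 游戏 (Games) - gaming news, etc.
 10. 娱乐 (Entertainment) - celebrity gossip, music, etc.
 11. 健康 (Health) - healthcare, healthy living, etc.
 12. 其他 (Other) - anything not covered above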
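
Merging raw labels into these standard categories might be done with a lookup table like the one below. This is only a sketch: the mapping entries are taken from the examples in the list above, while the real rules live in `standardize_categories.py` and are not shown here; the names `CATEGORY_MAP` and `standardize_category` are hypothetical.

```python
STANDARD_CATEGORIES = [
    "科技", "社会", "体育", "历史", "安全", "军事",
    "金融", "购物", "游戏", "娱乐", "健康", "其他",
]

# Hypothetical mapping from raw labels to standard categories.
CATEGORY_MAP = {
    "互联网": "科技",
    "足球": "体育",
    "历史人物": "历史",
    "安全漏洞": "安全",
    "国防": "军事",
    "电商": "购物",
    "音乐": "娱乐",
}


def standardize_category(raw):
    """Map a raw label to a standard category, falling back to 其他 (Other)."""
    name = (raw or "").strip()
    if name in STANDARD_CATEGORIES:
        return name
    return CATEGORY_MAP.get(name, "其他")
```

Falling back to 其他 keeps the category column within the fixed set even when the model returns an unexpected label.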
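 ## Notes
 1. Make sure the local Ollama service is running and reachable
 2. Temporary files must be named "date+time.txt"
 3. Each data unit consists of 5 lines: node ID, category, title, link, and a separator line
 4. The database file is created automatically; there is no need to create it by hand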
 ## Log Files
 The system generates the following log files:
 - process_temp_files.log - main processing log
 - cleanup_categories.log - category cleanup log
 - standardize_categories.log - category standardization log
 ## Examples
 ### Temporary file format example
 ```
 节点ID: 102
 分类: 宽带山
 标题: 女机器人
 链接: http://club.kdslife.com/t_11502693.html
 --------------------------------------------------
 节点ID: 103
 分类: 宽带山
 标题: 这个应该属于底盘不行吗
 链接: http://club.kdslife.com/t_11502686.html
 --------------------------------------------------
 ```
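
A file in this format could be parsed as follows. This is a minimal sketch rather than the code in `process_temp_files.py`: it assumes every unit is exactly 5 non-empty lines, that keys and values are separated by an ASCII colon as in the example, and the function name `parse_units` is hypothetical.

```python
SAMPLE = """\
节点ID: 102
分类: 宽带山
标题: 女机器人
链接: http://club.kdslife.com/t_11502693.html
--------------------------------------------------
节点ID: 103
分类: 宽带山
标题: 这个应该属于底盘不行吗
链接: http://club.kdslife.com/t_11502686.html
--------------------------------------------------
"""


def parse_units(text):
    """Split file content into 5-line units and return (title, link) pairs."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    entries = []
    # Step through complete units only; a trailing partial unit is ignored.
    for start in range(0, len(lines) - 4, 5):
        unit = lines[start:start + 5]
        fields = {}
        # The fifth line is the separator, so only the first four carry data.
        for line in unit[:4]:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "标题" in fields and "链接" in fields:
            entries.append((fields["标题"], fields["链接"]))
    return entries
```

Splitting on the first colon only keeps the `http://` in the link intact.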
 ### Processing result example
 ```
 标题 '女机器人' 分类为: 科技
 标题 '这个应该属于底盘不行吗' 分类为: 其他
 ```
--- a/pycache/db_viewer.cpython-38.pyc
+++ b/pycache/db_viewer.cpython-38.pyc
--- a/pycache/tophub_add_data_to_db.cpython-38.pyc
+++ b/pycache/tophub_add_data_to_db.cpython-38.pyc
--- a/pycache/tophub_scraper.cpython-38.pyc
+++ b/pycache/tophub_scraper.cpython-38.pyc
--- a/add_interested_field.py
+++ b/add_interested_field.py
@@ -0,0 +1,72 @@
 #!/usr/bin/env python3
 """
 Script that adds an "interested" flag column.
 Adds an is_interested column (default 0) to the articles table.
 """
 import sqlite3
 import os
 from loguru import logger
 def add_interested_field():
    """为articles表添加is_interested字段"""
    # 获取当前脚本所在目录的数据库文件路径
    script_dir = os.path.dirname(os.path.abspath(__file__))
    db_path = os.path.join(script_dir, "tophub_data.db")
    # 检查数据库文件是否存在
    if not os.path.exists(db_path):
        logger.error(f"数据库文件不存在: {db_path}")
        return False
    try:
        # 连接数据库
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        # 检查is_interested字段是否已存在
        cursor.execute("PRAGMA table_info(articles)")
        columns = cursor.fetchall()
        column_names = [column[1] for column in columns]
        if "is_interested" in column_names:
            logger.info("is_interested字段已存在，无需添加")
            conn.close()
            return True
        # 添加is_interested字段，默认值为0
        logger.info("正在添加is_interested字段...")
        cursor.execute("ALTER TABLE articles ADD COLUMN is_interested INTEGER DEFAULT 0")
        # 提交更改
        conn.commit()
        logger.info("成功添加is_interested字段")
        # 验证字段是否添加成功
        cursor.execute("PRAGMA table_info(articles)")
        columns = cursor.fetchall()
        column_names = [column[1] for column in columns]
        if "is_interested" in column_names:
            logger.info("验证成功：is_interested字段已添加到articles表")
        else:
            logger.error("验证失败：is_interested字段未成功添加")
            conn.close()
            return False
        conn.close()
        return True
    except sqlite3.Error as e:
        logger.error(f"数据库操作出错: {str(e)}")
        return False
    except Exception as e:
        logger.error(f"添加字段时出错: {str(e)}")
        return False
 if __name__ == "__main__":
    logger.add("db_modify.log", rotation="10 MB", level="INFO")
    if add_interested_field():
        logger.info("数据库修改完成")
    else:
        logger.error("数据库修改失败")
--- a/check_db.py
+++ b/check_db.py
@@ -0,0 +1,23 @@
 #!/usr/bin/env python3
 import sqlite3
 # 连接数据库
 conn = sqlite3.connect('tophub_data.db')
 cursor = conn.cursor()
 # 查看所有表
 cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
 tables = cursor.fetchall()
 print('Tables:', tables)
 # 查看表结构
 for table in tables:
    table_name = table[0]
    print(f'\nTable {table_name}:')
    cursor.execute(f'PRAGMA table_info({table_name});')
    columns = cursor.fetchall()
    for col in columns:
        print(col)
 # 关闭连接
 conn.close()
--- a/check_db_structure.py
+++ b/check_db_structure.py
@@ -0,0 +1,25 @@
 #!/usr/bin/env python3
 import sqlite3
 # 连接到数据库
 conn = sqlite3.connect('tophub_data.db')
 cursor = conn.cursor()
 # 获取所有表名
 cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
 tables = cursor.fetchall()
 print("数据库中的表:")
 for table in tables:
    print(f"  - {table[0]}")
 # 获取每个表的结构
 for table in tables:
    table_name = table[0]
    print(f"\n表 '{table_name}' 的结构:")
    cursor.execute(f"PRAGMA table_info({table_name});")
    columns = cursor.fetchall()
    for column in columns:
        print(f"  {column[1]} ({column[2]})")
 conn.close()
--- a/check_interested_values.py
+++ b/check_interested_values.py
@@ -0,0 +1,50 @@
 #!/usr/bin/env python3
 import sqlite3
 from loguru import logger
 def check_interested_values():
    """检查is_interested字段的值范围"""
    try:
        # 连接数据库
        conn = sqlite3.connect('tophub_data.db')
        cursor = conn.cursor()
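        # Query the minimum, maximum, and average of the is_interested column
        cursor.execute("SELECT MIN(is_interested), MAX(is_interested), AVG(is_interested) FROM articles")
        result = cursor.fetchone()
        min_val, max_val, avg_val = result
        logger.info(f"is_interested字段统计:")
        logger.info(f"  最小值: {min_val}")
        logger.info(f"  最大值: {max_val}")
        # AVG() returns NULL (None) for an empty table, so guard the float format spec
        avg_text = f"{avg_val:.2f}" if avg_val is not None else "N/A"
        logger.info(f"  平均值: {avg_text}")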
        # 查询不同值的分布
        cursor.execute("SELECT is_interested, COUNT(*) FROM articles GROUP BY is_interested ORDER BY is_interested")
        distribution = cursor.fetchall()
        logger.info("\nis_interested值分布:")
        for value, count in distribution:
            logger.info(f"  {value}: {count} 条记录")
        # 查询一些示例记录
        cursor.execute("SELECT id, title, is_interested FROM articles ORDER BY is_interested DESC LIMIT 5")
        examples = cursor.fetchall()
        logger.info("\n示例记录:")
        for example in examples:
            logger.info(f"  ID: {example[0]}, 标题: {example[1][:30]}..., is_interested: {example[2]}")
        conn.close()
        return True
    except sqlite3.Error as e:
        logger.error(f"数据库操作出错: {str(e)}")
        return False
    except Exception as e:
        logger.error(f"查询数据时出错: {str(e)}")
        return False
 if __name__ == "__main__":
    logger.add("check_interested_values.log", rotation="10 MB", level="INFO")
    check_interested_values()
--- a/db_modify.log
+++ b/db_modify.log
--- a/db_modify.py
+++ b/db_modify.py
@@ -0,0 +1,220 @@
 #!/usr/bin/env python3
 # -*- coding: utf-8 -*-
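 """
 Opens the tophub_data.db database, reads the articles table, and extracts all categories.
 Calls the local Ollama API to shorten each category name to 3-6 Chinese characters (the range the prompt below asks for), removing spaces and special characters.
 """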
 import requests
 import sqlite3
 import re
 import time
 from loguru import logger
 # 配置日志
 logger.add("db_modify.log", rotation="10 MB", level="INFO")
 class CategoryModifier:
    """类别修改器，用于优化数据库中的类别名称"""
    def __init__(self, db_path="tophub_data.db"):
        """
        初始化类别修改器
        Args:
            db_path (str): 数据库路径
        """
        self.db_path = db_path
        self.ollama_url = "http://localhost:11434/api/generate"
        self.model = "qwen3:8b"
    def get_all_categories(self):
        """
        从数据库中获取所有唯一的类别
        Returns:
            list: 包含所有唯一类别的列表
        """
        try:
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            cursor.execute("SELECT DISTINCT category FROM articles")
            categories = [row[0] for row in cursor.fetchall() if row[0]]
            conn.close()
            logger.info(f"成功获取 {len(categories)} 个唯一类别")
            return categories
        except Exception as e:
            logger.error(f"获取类别时出错: {e}")
            return []
    def clean_category_name(self, category):
        """
        清理类别名称，移除特殊字符和多余空格
        Args:
            category (str): 原始类别名称
        Returns:
            str: 清理后的类别名称
        """
        # Keep only Chinese characters, ASCII letters, and digits.
        # Whitespace is outside that set, so it is removed by the same substitution.
        cleaned = re.sub(r'[^\u4e00-\u9fa5a-zA-Z0-9]', '', category)
        return cleaned
    def optimize_category_with_ollama(self, category):
        """
        使用Ollama API优化类别名称
        Args:
            category (str): 原始类别名称
        Returns:
            str: 优化后的类别名称
        """
        try:
            # 构造提示词
            prompt = f"请将以下类别名称简化为3-6个汉字，去除空格和特殊符号，更容易理解，并保持原意：'{category}'。" + \
                     "例子一：'新科科技'，优化为'新质生产力'。例子二：'产设'，优化为'产品设计'。例子三：'史人'，优化为'历史人物'。"
            # 准备请求数据
            data = {
                "model": self.model,
                "prompt": prompt,
                "stream": False
            }
            # 发送请求到Ollama API
            response = requests.post(self.ollama_url, json=data, timeout=30)
            response.raise_for_status()
            # 解析响应
            result = response.json()
            optimized = result.get("response", "").strip()
            # 清理优化后的名称
            optimized = self.clean_category_name(optimized)
            logger.info(f"类别 '{category}' 优化为 '{optimized}'")
            return optimized
        except Exception as e:
            logger.error(f"优化类别 '{category}' 时出错: {e}")
            # 如果API调用失败，返回清理后的原始名称
            return self.clean_category_name(category)
    def update_category_in_db(self, old_category, new_category):
        """
        更新数据库中的类别名称
        Args:
            old_category (str): 原始类别名称
            new_category (str): 新的类别名称
        """
        try:
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            cursor.execute(
                "UPDATE articles SET category = ? WHERE category = ?",
                (new_category, old_category)
            )
            count = cursor.rowcount
            conn.commit()
            conn.close()
            logger.info(f"成功更新类别 '{old_category}' 为 '{new_category}'，影响 {count} 条记录")
        except Exception as e:
            logger.error(f"更新类别 '{old_category}' 时出错: {e}")
    def process_all_categories(self):
        """
        处理所有类别
        """
        logger.info("开始处理所有类别...")
        # 获取所有类别
        categories = self.get_all_categories()
        if not categories:
            logger.warning("未找到任何类别")
            return
        # 初始化进度统计
        total_categories = len(categories)
        processed_count = 0
        unchanged_count = 0
        updated_count = 0
        start_time = time.time()
        logger.info(f"总共需要处理 {total_categories} 个类别")
        # 处理每个类别
        for i, category in enumerate(categories, 1):
            category_start_time = time.time()
            logger.info(f"处理进度: {i}/{total_categories} ({i/total_categories*100:.1f}%) - 类别: {category}")
            # 使用Ollama API优化类别名称
            optimized_category = self.optimize_category_with_ollama(category)
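            # Update the database only when the optimized name is non-empty and differs
            # from the original; an empty reply must not blank out the category
            if optimized_category and optimized_category != category: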
                self.update_category_in_db(category, optimized_category)
                updated_count += 1
                logger.info(f"类别 '{category}' 已更新为 '{optimized_category}'")
            else:
                unchanged_count += 1
                logger.info(f"类别 '{category}' 无需更改")
            processed_count += 1
            category_end_time = time.time()
            category_duration = category_end_time - category_start_time
            # 显示当前类别处理时间和平均处理时间
            elapsed_time = time.time() - start_time
            avg_time_per_category = elapsed_time / processed_count
            estimated_remaining = avg_time_per_category * (total_categories - processed_count)
            logger.info(f"类别 '{category}' 处理完成，耗时: {category_duration:.2f}秒")
            logger.info(f"累计处理: {processed_count}/{total_categories} | "
                       f"已更新: {updated_count} | 未更改: {unchanged_count} | "
                       f"平均耗时: {avg_time_per_category:.2f}秒/类别 | "
                       f"预计剩余时间: {estimated_remaining:.2f}秒")
        # 显示总体统计信息
        total_duration = time.time() - start_time
        logger.info("="*60)
        logger.info("所有类别处理完成!")
        logger.info(f"总计处理类别数: {total_categories}")
        logger.info(f"更新类别数: {updated_count}")
        logger.info(f"未更改类别数: {unchanged_count}")
        logger.info(f"总耗时: {total_duration:.2f}秒")
        logger.info(f"平均每类别处理时间: {total_duration/total_categories:.2f}秒")
        logger.info("="*60)
 def main():
    """主函数"""
    modifier = CategoryModifier()
    # 检查Ollama服务是否可用
    try:
        response = requests.get("http://localhost:11434/api/tags", timeout=5)
        if response.status_code == 200:
            logger.info("Ollama服务可用")
        else:
            logger.warning("Ollama服务不可用，请确保服务已启动")
            return
    except Exception as e:
        logger.warning(f"无法连接到Ollama服务: {e}")
        logger.info("请确保Ollama服务已在本地运行")
        return
    # 处理所有类别
    modifier.process_all_categories()
 if __name__ == "__main__":
    main()
--- a/db_modify_score.log
+++ b/db_modify_score.log
@@ -0,0 +1,6 @@
 2025-11-07 23:49:35.277 | INFO     | __main__:modify_database_structure:44 - 正在添加score字段...
 2025-11-07 23:49:35.281 | INFO     | __main__:modify_database_structure:48 - 正在转换is_interested数据到score字段...
 2025-11-07 23:49:35.288 | INFO     | __main__:modify_database_structure:63 - 成功添加score字段并转换数据
 2025-11-07 23:49:35.289 | INFO     | __main__:modify_database_structure:71 - 验证成功：score字段已添加到articles表
 2025-11-07 23:49:35.289 | INFO     | __main__:modify_database_structure:84 - 数据转换结果: score=7的记录数: 1, score=5的记录数: 1196
 2025-11-07 23:49:35.290 | INFO     | __main__:<module>:99 - 数据库结构修改完成
--- a/db_modify_zhipu.log
+++ b/db_modify_zhipu.log
--- a/db_modify_zhipu.py
+++ b/db_modify_zhipu.py
@@ -0,0 +1,101 @@
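 # Calls the Zhipu API to revise the category of every entry.
 # Reads the table from the db file, takes the second field (the title), submits it to the API, and writes the returned category back to the db file.
 import os
 import sqlite3
 import time
 from loguru import logger
 from zhipuai import ZhipuAI
 # Configure logging
 logger.add("db_modify_zhipu.log", rotation="10 MB", level="INFO")
 # Initialize the client; the API key is read from the ZHIPUAI_API_KEY environment variable instead of being hardcoded in the source
 client = ZhipuAI(api_key=os.environ["ZHIPUAI_API_KEY"])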
 def get_simplified_category(title):
    """
    调用智谱API获取简化的分类名称
    """
    try:
        # 创建聊天完成请求
        response = client.chat.completions.create(
            model="glm-4-flash",
            messages=[
                {
                    "role": "system",
                    "content": "你是一个专业的分类助手。请根据文章标题，提供一个3-6个汉字的简化分类名称，去除空格和特殊符号，更容易理解，并保持原意。"
                },
                {
                    "role": "user",
                    "content": f"对以下文字内容进行分类，返回结果为类别，如\"社会新闻\"，\"机器人\"，\"金融\"，\"历史\"，\"购物\"，\"新质生产力\"等等。目的：只返回2-6个汉字，不返回其它内容。内容：'{title}'"
                }
            ],
            temperature=0.7
        )
        # 提取回复内容
        category = response.choices[0].message.content.strip()
        logger.info(f"标题: {title[:30]}... -> 分类: {category}")
        return category
    except Exception as e:
        logger.error(f"获取分类失败: {str(e)}")
        return None
 def update_database_categories():
    """
    更新数据库中的分类信息
    """
    # 连接到数据库
    conn = sqlite3.connect('tophub_data.db')
    cursor = conn.cursor()
    try:
        # 获取所有记录
        cursor.execute("SELECT id, title, category FROM articles")
        records = cursor.fetchall()
        logger.info(f"共找到 {len(records)} 条记录需要处理")
        updated_count = 0
        failed_count = 0
        # 处理每条记录
        for record in records:
            record_id, title, current_category = record
            # 跳过已经简化的分类（长度<=6且不包含特殊字符）
            if current_category and len(current_category) <= 6 and not any(c in current_category for c in " ,.!?;:，。！？；："):
                logger.info(f"跳过记录 {record_id}，分类已简化: {current_category}")
                continue
            logger.info(f"处理记录 {record_id}: {title[:30]}...")
            # 获取新的分类
            new_category = get_simplified_category(title)
            if new_category:
                # 更新数据库
                cursor.execute("UPDATE articles SET category = ? WHERE id = ?", (new_category, record_id))
                conn.commit()
                updated_count += 1
                logger.info(f"已更新记录 {record_id} 的分类为: {new_category}")
            else:
                failed_count += 1
                logger.error(f"无法获取记录 {record_id} 的新分类")
            # 添加延迟，避免API调用过于频繁
            time.sleep(1)
        logger.info(f"处理完成! 成功更新 {updated_count} 条记录，失败 {failed_count} 条记录")
    except Exception as e:
        logger.error(f"更新数据库时出错: {str(e)}")
        conn.rollback()
    finally:
        conn.close()
 if __name__ == "__main__":
    logger.info("开始更新数据库分类...")
    update_database_categories()
    logger.info("程序执行完成")
--- a/db_viewer.py
+++ b/db_viewer.py
@@ -0,0 +1,835 @@
 #!/usr/bin/env python3
 """
 TopHub data viewer - a PySide6 GUI application
 Displays the TopHub scraped data stored in a SQLite database
 """
 import sys
 import os
 import sqlite3
 import webbrowser
 from datetime import datetime
 from loguru import logger
 from PySide6.QtWidgets import (
    QApplication, QMainWindow, QTableWidget, QTableWidgetItem, QVBoxLayout, 
    QHBoxLayout, QWidget, QLabel, QLineEdit, QPushButton, QComboBox, 
    QGroupBox, QStatusBar, QMenuBar, QMenu, QMessageBox, QHeaderView,
    QAbstractItemView, QDialog, QFormLayout, QTextEdit, QInputDialog
 )
 from PySide6.QtCore import Qt, QUrl, QTimer, QEvent
 from PySide6.QtGui import QAction, QFont, QIcon, QDesktopServices, QClipboard
 class DatabaseViewer(QMainWindow):
    """主窗口类，用于显示数据库内容"""
    def __init__(self):
        super().__init__()
        # 获取当前脚本所在目录的数据库文件路径
        script_dir = os.path.dirname(os.path.abspath(__file__))
        self.db_path = os.path.join(script_dir, "tophub_data.db")
        # 检查数据库文件是否存在
        if not os.path.exists(self.db_path):
            QMessageBox.critical(self, "错误", f"数据库文件不存在: {self.db_path}")
            sys.exit(1)
        self.init_ui()
        self.load_data()
    def init_ui(self):
        """初始化用户界面"""
        # 设置窗口属性
        self.setWindowTitle("TopHub数据查看器")
        self.setGeometry(100, 100, 1200, 800)
        # 创建中央部件
        central_widget = QWidget()
        self.setCentralWidget(central_widget)
        # 创建主布局
        main_layout = QVBoxLayout(central_widget)
        # 创建搜索和筛选区域
        filter_group = QGroupBox("搜索和筛选")
        filter_layout = QHBoxLayout(filter_group)
        # 搜索框
        self.search_edit = QLineEdit()
        self.search_edit.setPlaceholderText("输入搜索关键词...")
        self.search_edit.textChanged.connect(self.filter_data)
        filter_layout.addWidget(QLabel("搜索:"))
        filter_layout.addWidget(self.search_edit)
        # 分类筛选
        self.category_combo = QComboBox()
        self.category_combo.addItem("全部分类")
        self.category_combo.currentTextChanged.connect(self.filter_data)
        filter_layout.addWidget(QLabel("分类:"))
        filter_layout.addWidget(self.category_combo)
        # 刷新按钮
        self.refresh_button = QPushButton("刷新数据")
        self.refresh_button.clicked.connect(self.load_data)
        filter_layout.addWidget(self.refresh_button)
        # 批量删除相关控件
        self.select_by_keyword_button = QPushButton("按关键字选中")
        self.select_by_keyword_button.clicked.connect(self.select_by_keyword)
        filter_layout.addWidget(self.select_by_keyword_button)
        self.delete_selected_button = QPushButton("删除选中项")
        self.delete_selected_button.clicked.connect(self.delete_selected_items)
        filter_layout.addWidget(self.delete_selected_button)
        # 标记感兴趣按钮
        self.mark_interested_button = QPushButton("标记为感兴趣")
        self.mark_interested_button.clicked.connect(self.mark_as_interested)
        filter_layout.addWidget(self.mark_interested_button)
        # 添加筛选区域到主布局
        main_layout.addWidget(filter_group)
        # 创建分类统计显示区域
        self.category_stats_group = QGroupBox("分类统计")
        self.category_stats_layout = QHBoxLayout(self.category_stats_group)
        self.category_stats_label = QLabel("暂无数据")
        self.category_stats_layout.addWidget(self.category_stats_label)
        main_layout.addWidget(self.category_stats_group)
        # 创建表格
        self.table = QTableWidget()
        self.table.setColumnCount(6)  # 保留6列，最后一列显示评分
        self.table.setHorizontalHeaderLabels(["ID", "标题", "链接", "分类", "来源日期", "评分"])
        # 设置表格属性
        self.table.setAlternatingRowColors(True)
        self.table.setSelectionBehavior(QAbstractItemView.SelectRows)
        self.table.setEditTriggers(QAbstractItemView.NoEditTriggers)
        self.table.setSortingEnabled(True)
        # 设置表格选择模式
        self.table.setSelectionMode(QAbstractItemView.SingleSelection)
        # 设置列宽
        header = self.table.horizontalHeader()
        header.setSectionResizeMode(0, QHeaderView.ResizeToContents)  # ID列
        header.setSectionResizeMode(1, QHeaderView.Stretch)  # 文本内容列
        header.setSectionResizeMode(2, QHeaderView.ResizeToContents)  # 链接列
        header.setSectionResizeMode(3, QHeaderView.ResizeToContents)  # 分类列
        header.setSectionResizeMode(4, QHeaderView.ResizeToContents)  # 时间列
        header.setSectionResizeMode(5, QHeaderView.ResizeToContents)  # 评分列
        # 启用链接点击
        self.table.cellClicked.connect(self.on_cell_clicked)
        # 安装事件过滤器以处理链接点击
        self.table.viewport().installEventFilter(self)
        # 启用右键菜单
        self.table.setContextMenuPolicy(Qt.CustomContextMenu)
        self.table.customContextMenuRequested.connect(self.show_context_menu)
        # 添加表格到主布局
        main_layout.addWidget(self.table)
        # 创建状态栏
        self.status_bar = QStatusBar()
        self.setStatusBar(self.status_bar)
        # 创建菜单栏
        self.create_menu_bar()
    def create_menu_bar(self):
        """创建菜单栏"""
        menubar = self.menuBar()
        # 文件菜单
        file_menu = menubar.addMenu("文件")
        # 刷新动作
        refresh_action = QAction("刷新数据", self)
        refresh_action.setShortcut("F5")
        refresh_action.triggered.connect(self.load_data)
        file_menu.addAction(refresh_action)
        # 退出动作
        exit_action = QAction("退出", self)
        exit_action.setShortcut("Ctrl+Q")
        exit_action.triggered.connect(self.close)
        file_menu.addAction(exit_action)
        # 帮助菜单
        help_menu = menubar.addMenu("帮助")
        # 关于动作
        about_action = QAction("关于", self)
        about_action.triggered.connect(self.show_about)
        help_menu.addAction(about_action)
    def load_data(self):
        """从数据库加载数据"""
        try:
            # 连接数据库
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # 检查表是否存在
            cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='articles'")
            if not cursor.fetchone():
                QMessageBox.critical(self, "错误", "数据库中不存在articles表")
                conn.close()
                return
            # 查询数据 - 修改为查询score字段而不是is_interested
            cursor.execute('''
            SELECT id, title, url, category, source_date, score 
            FROM articles 
            ORDER BY id DESC
            ''')
            rows = cursor.fetchall()
            conn.close()
            # Update the table. Sorting is switched off while the table is filled,
            # because inserting items into a sorted QTableWidget can move rows mid-fill
            self.table.setSortingEnabled(False)
            self.table.setRowCount(len(rows))
            # Collect all categories and their counts
            categories = set()
            category_counts = {}  # number of entries per category
            for row_idx, row in enumerate(rows):
                id_val, title, url, category, source_date, score = row
                # Add to the category set and the count dictionary
                if category:
                    categories.add(category)
                    category_counts[category] = category_counts.get(category, 0) + 1
                else:
                    # Entries with an empty category are counted as "未分类" (uncategorized)
                    category_counts["未分类"] = category_counts.get("未分类", 0) + 1
                # Set the table items
                self.table.setItem(row_idx, 0, QTableWidgetItem(str(id_val)))
                self.table.setItem(row_idx, 1, QTableWidgetItem(title))
                # Link item - shown in blue and bold
                link_item = QTableWidgetItem(url if url else "")
                if url:
                    link_item.setForeground(Qt.blue)
                    link_item.setFont(QFont("", -1, QFont.Bold))
                self.table.setItem(row_idx, 2, link_item)
                self.table.setItem(row_idx, 3, QTableWidgetItem(category if category else "未分类"))
                self.table.setItem(row_idx, 4, QTableWidgetItem(source_date))
                # Score item; a NULL score is shown as empty text and left uncolored
                score_item = QTableWidgetItem("" if score is None else str(score))
                # Color the score by value
                if score is None:
                    pass
                elif score >= 8:
                    score_item.setForeground(Qt.green)
                    score_item.setFont(QFont("", -1, QFont.Bold))
                elif score >= 6:
                    score_item.setForeground(Qt.blue)
                elif score <= 3:
                    score_item.setForeground(Qt.red)
                self.table.setItem(row_idx, 5, score_item)
            # Re-enable sorting now that the table is fully populated
            self.table.setSortingEnabled(True)
            # 更新分类下拉框
            current_category = self.category_combo.currentText()
            self.category_combo.clear()
            self.category_combo.addItem("全部分类")
            for cat in sorted(categories):
                self.category_combo.addItem(cat)
            # 恢复之前选择的分类
            index = self.category_combo.findText(current_category)
            if index >= 0:
                self.category_combo.setCurrentIndex(index)
            # 更新分类统计显示
            self.update_category_stats(category_counts)
            # 更新状态栏
            self.status_bar.showMessage(f"已加载 {len(rows)} 条记录")
        except sqlite3.Error as e:
            logger.error(f"数据库操作出错: {str(e)}")
            QMessageBox.critical(self, "数据库错误", f"数据库操作出错: {str(e)}")
            self.status_bar.showMessage("加载数据失败")
        except Exception as e:
            logger.error(f"加载数据时出错: {str(e)}")
            QMessageBox.critical(self, "错误", f"加载数据时出错: {str(e)}")
            self.status_bar.showMessage("加载数据失败")
    def update_category_stats(self, category_counts):
        """更新分类统计显示"""
        if not category_counts:
            self.category_stats_label.setText("暂无数据")
            return
        # 按数量降序排列分类
        sorted_categories = sorted(category_counts.items(), key=lambda x: x[1], reverse=True)
        # 构建统计信息文本
        stats_text = " | ".join([f"{category}: {count}" for category, count in sorted_categories])
        # 如果文本过长，进行截断并添加提示
        if len(stats_text) > 200:
            stats_text = stats_text[:200] + "... (更多分类请查看完整数据)"
        self.category_stats_label.setText(stats_text)
        self.category_stats_label.setToolTip(" | ".join([f"{category}: {count}" for category, count in sorted_categories]))
    def update_category_stats_after_filter(self):
        """在筛选后更新分类统计显示"""
        # 统计可见行的分类
        category_counts = {}
        for row in range(self.table.rowCount()):
            # 跳过隐藏的行
            if self.table.isRowHidden(row):
                continue
            # 获取分类项
            category_item = self.table.item(row, 3)
            if category_item:
                category = category_item.text()
                category_counts[category] = category_counts.get(category, 0) + 1
            else:
                category_counts["未分类"] = category_counts.get("未分类", 0) + 1
        # 更新分类统计显示
        self.update_category_stats(category_counts)
    def filter_data(self):
        """根据搜索条件和分类筛选数据"""
        search_text = self.search_edit.text().lower()
        selected_category = self.category_combo.currentText()
        # 遍历所有行
        for row in range(self.table.rowCount()):
            show_row = True
            # 检查搜索条件
            if search_text:
                text_match = False
                for col in range(1, 6):  # check the title, link, category, date, and score columns
                    item = self.table.item(row, col)
                    if item and search_text in item.text().lower():
                        text_match = True
                        break
                show_row = show_row and text_match
            # 检查分类条件
            if selected_category != "全部分类":
                category_item = self.table.item(row, 3)
                category_match = category_item and category_item.text() == selected_category
                show_row = show_row and category_match
            # 显示或隐藏行
            self.table.setRowHidden(row, not show_row)
        # 计算可见行数
        visible_count = sum(1 for row in range(self.table.rowCount()) 
                           if not self.table.isRowHidden(row))
        self.status_bar.showMessage(f"显示 {visible_count}/{self.table.rowCount()} 条记录")
        # 重新计算并显示分类统计
        self.update_category_stats_after_filter()
    def eventFilter(self, obj, event):
        """事件过滤器，用于处理链接点击而不触发行选择"""
        if obj == self.table.viewport() and event.type() == QEvent.MouseButtonPress:
            # 获取点击位置
            pos = event.position()
            # 获取点击位置的行和列
            row = self.table.rowAt(int(pos.y()))
            column = self.table.columnAt(int(pos.x()))
            # If the click landed on the link column (the third column, index 2)
            if column == 2 and row >= 0:
                item = self.table.item(row, column)
                if item and item.text() and item.text().startswith("http"):
                    # 直接打开链接
                    webbrowser.open(item.text())
                    # 返回True表示事件已处理，不再传递给原始处理器
                    # 这样就不会触发行选择，避免鼠标跳动
                    return True
        # 其他事件交给原始处理器处理
        return super().eventFilter(obj, event)
    def on_cell_clicked(self, row, column):
        """处理单元格点击事件"""
        # 链接列的点击已经由eventFilter处理，这里不再处理
        # 只处理非链接列的点击，保持原有选择行为
        if column != 2:
            # 可以在这里添加其他列的点击处理逻辑
            pass
    def show_context_menu(self, position):
        """显示右键菜单"""
        # 获取点击位置的行
        row = self.table.rowAt(position.y())
        if row < 0:
            return
        # 选中该行
        self.table.selectRow(row)
        # 创建右键菜单
        menu = QMenu(self)
        # 添加"增加评分(+1)"动作
        increase_score_action = QAction("增加评分(+1)", self)
        increase_score_action.triggered.connect(self.increase_score)
        menu.addAction(increase_score_action)
        # 添加"减少评分(-1)"动作
        decrease_score_action = QAction("减少评分(-1)", self)
        decrease_score_action.triggered.connect(self.decrease_score)
        menu.addAction(decrease_score_action)
        # 添加分隔线
        menu.addSeparator()
        # 添加"复制信息"动作
        copy_info_action = QAction("复制信息", self)
        copy_info_action.triggered.connect(self.copy_info)
        menu.addAction(copy_info_action)
        # 添加分隔线
        menu.addSeparator()
        # 添加"删除"动作
        delete_action = QAction("删除选中项", self)
        delete_action.triggered.connect(self.delete_selected_items)
        menu.addAction(delete_action)
        # Show the menu (exec_() is deprecated in PySide6 in favor of exec())
        menu.exec(self.table.mapToGlobal(position))
    def copy_info(self):
        """复制选中行的标题、链接、日期等信息"""
        # 获取选中的行
        selected_rows = set()
        for item in self.table.selectedItems():
            selected_rows.add(item.row())
        # 如果没有选中的行，直接返回
        if not selected_rows:
            QMessageBox.information(self, "提示", "请先选中要复制信息的行")
            return
        # 收集所有选中行的信息
        all_info = []
        for row in sorted(selected_rows):
            # 获取标题、链接、日期
            title_item = self.table.item(row, 1)
            url_item = self.table.item(row, 2)
            date_item = self.table.item(row, 4)
            title = title_item.text() if title_item else ""
            url = url_item.text() if url_item else ""
            date = date_item.text() if date_item else ""
            # 用空格组合信息
            info = f"{title} {url} {date}".strip()
            all_info.append(info)
        # 将所有信息用换行符连接
        clipboard_text = "\n".join(all_info)
        # 复制到剪贴板
        clipboard = QApplication.clipboard()
        clipboard.setText(clipboard_text)
        # 更新状态栏
        self.status_bar.showMessage(f"已复制 {len(selected_rows)} 行信息到剪贴板")
    def show_about(self):
        """显示关于对话框"""
        about_text = """
        <h3>TopHub数据查看器</h3>
        <p>版本: 1.0</p>
        <p>用于查看TopHub网站抓取数据的PySide6应用程序</p>
        <p>功能特性:</p>
        <ul>
            <li>显示SQLite数据库中的抓取数据</li>
            <li>支持点击链接在浏览器中打开</li>
            <li>支持搜索和分类筛选</li>
            <li>支持排序功能</li>
            <li>支持标记感兴趣的项目</li>
        </ul>
        """
        QMessageBox.about(self, "关于", about_text)
    def select_by_keyword(self):
        """按关键字选中行"""
        # 弹出输入对话框获取关键字
        keyword, ok = QInputDialog.getText(self, "按关键字选中", "请输入关键字:")
        if not ok or not keyword:
            return
        keyword = keyword.lower()
        selected_count = 0
        # 遍历所有可见行
        for row in range(self.table.rowCount()):
            # 跳过隐藏的行
            if self.table.isRowHidden(row):
                continue
            # 检查该行是否包含关键字
            match = False
            for col in range(self.table.columnCount()):
                item = self.table.item(row, col)
                if item and keyword in item.text().lower():
                    match = True
                    break
            # 如果匹配，则选中该行
            if match:
                self.table.selectRow(row)
                selected_count += 1
        # 更新状态栏
        self.status_bar.showMessage(f"已选中 {selected_count} 行")
    def delete_selected_items(self):
        """删除选中的项目"""
        # 获取选中的行
        selected_rows = set()
        for item in self.table.selectedItems():
            selected_rows.add(item.row())
        # 如果没有选中的行，直接返回
        if not selected_rows:
            QMessageBox.information(self, "提示", "请先选中要删除的行")
            return
        # 弹出确认对话框
        reply = QMessageBox.question(
            self, 
            "确认删除", 
            f"确定要删除选中的 {len(selected_rows)} 行数据吗？此操作不可撤销！",
            QMessageBox.Yes | QMessageBox.No,
            QMessageBox.No
        )
        if reply == QMessageBox.No:
            return
        try:
            # 连接数据库
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # 删除选中的行
            deleted_count = 0
            for row in sorted(selected_rows, reverse=True):  # 从后往前删除，避免索引变化
                # 获取ID
                id_item = self.table.item(row, 0)
                if id_item:
                    article_id = id_item.text()
                    # 从数据库中删除
                    cursor.execute("DELETE FROM articles WHERE id = ?", (article_id,))
                    # 从表格中移除行
                    self.table.removeRow(row)
                    deleted_count += 1
            # 提交更改
            conn.commit()
            conn.close()
            # 更新状态栏
            self.status_bar.showMessage(f"已删除 {deleted_count} 行数据")
            # 重新加载数据以更新分类统计
            self.load_data()
        except sqlite3.Error as e:
            logger.error(f"删除数据时出错: {str(e)}")
            QMessageBox.critical(self, "数据库错误", f"删除数据时出错: {str(e)}")
            self.status_bar.showMessage("删除失败")
        except Exception as e:
            logger.error(f"删除数据时出错: {str(e)}")
            QMessageBox.critical(self, "错误", f"删除数据时出错: {str(e)}")
            self.status_bar.showMessage("删除失败")
    def mark_as_interested(self):
        """将选中的项目标记为感兴趣"""
        # 获取选中的行
        selected_rows = set()
        for item in self.table.selectedItems():
            selected_rows.add(item.row())
        # 如果没有选中的行，直接返回
        if not selected_rows:
            QMessageBox.information(self, "提示", "请先选中要标记的行")
            return
        # 弹出确认对话框
        reply = QMessageBox.question(
            self, 
            "确认标记", 
            f"确定要将选中的 {len(selected_rows)} 行标记为感兴趣吗？",
            QMessageBox.Yes | QMessageBox.No,
            QMessageBox.Yes
        )
        if reply == QMessageBox.No:
            return
        try:
            # 连接数据库
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # 更新选中的行
            updated_count = 0
            for row in selected_rows:
                # 获取ID
                id_item = self.table.item(row, 0)
                if id_item:
                    article_id = id_item.text()
                    # 更新数据库中的is_interested字段
                    cursor.execute("UPDATE articles SET is_interested = 1 WHERE id = ?", (article_id,))
                    # 更新表格中的显示
                    interested_item = QTableWidgetItem("是")
                    interested_item.setForeground(Qt.green)
                    interested_item.setFont(QFont("", -1, QFont.Bold))
                    self.table.setItem(row, 5, interested_item)
                    updated_count += 1
            # 提交更改
            conn.commit()
            conn.close()
            # 更新状态栏
            self.status_bar.showMessage(f"已标记 {updated_count} 行为感兴趣")
        except sqlite3.Error as e:
            logger.error(f"标记数据时出错: {str(e)}")
            QMessageBox.critical(self, "数据库错误", f"标记数据时出错: {str(e)}")
            self.status_bar.showMessage("标记失败")
        except Exception as e:
            logger.error(f"标记数据时出错: {str(e)}")
            QMessageBox.critical(self, "错误", f"标记数据时出错: {str(e)}")
            self.status_bar.showMessage("标记失败")
    def mark_as_not_interested(self):
        """将选中的项目标记为不感兴趣"""
        # 获取选中的行
        selected_rows = set()
        for item in self.table.selectedItems():
            selected_rows.add(item.row())
        # 如果没有选中的行，直接返回
        if not selected_rows:
            QMessageBox.information(self, "提示", "请先选中要标记的行")
            return
        # 弹出确认对话框
        reply = QMessageBox.question(
            self, 
            "确认标记", 
            f"确定要将选中的 {len(selected_rows)} 行标记为不感兴趣吗？",
            QMessageBox.Yes | QMessageBox.No,
            QMessageBox.Yes
        )
        if reply == QMessageBox.No:
            return
        try:
            # 连接数据库
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # 更新选中的行
            updated_count = 0
            for row in selected_rows:
                # 获取ID
                id_item = self.table.item(row, 0)
                if id_item:
                    article_id = id_item.text()
                    # 更新数据库中的is_interested字段
                    cursor.execute("UPDATE articles SET is_interested = 0 WHERE id = ?", (article_id,))
                    # 更新表格中的显示
                    interested_item = QTableWidgetItem("否")
                    # 不感兴趣项使用普通字体和颜色
                    self.table.setItem(row, 5, interested_item)
                    updated_count += 1
            # 提交更改
            conn.commit()
            conn.close()
            # 更新状态栏
            self.status_bar.showMessage(f"已标记 {updated_count} 行为不感兴趣")
        except sqlite3.Error as e:
            logger.error(f"标记数据时出错: {str(e)}")
            QMessageBox.critical(self, "数据库错误", f"标记数据时出错: {str(e)}")
            self.status_bar.showMessage("标记失败")
        except Exception as e:
            logger.error(f"标记数据时出错: {str(e)}")
            QMessageBox.critical(self, "错误", f"标记数据时出错: {str(e)}")
            self.status_bar.showMessage("标记失败")
    def increase_score(self):
        """增加选中项目的评分(+1)"""
        # 获取选中的行
        selected_rows = set()
        for item in self.table.selectedItems():
            selected_rows.add(item.row())
        # 如果没有选中的行，直接返回
        if not selected_rows:
            QMessageBox.information(self, "提示", "请先选中要增加评分的行")
            return
        try:
            # 连接数据库
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # 更新选中的行
            updated_count = 0
            for row in selected_rows:
                # 获取ID
                id_item = self.table.item(row, 0)
                if id_item:
                    article_id = id_item.text()
                    # 获取当前分数
                    cursor.execute("SELECT score FROM articles WHERE id = ?", (article_id,))
                    result = cursor.fetchone()
                    if result:
                        current_score = result[0]
                        # 增加分数，但不超过10
                        new_score = min(current_score + 1, 10)
                        # 更新数据库中的score字段
                        cursor.execute("UPDATE articles SET score = ? WHERE id = ?", (new_score, article_id))
                        # 更新表格中的显示
                        score_item = QTableWidgetItem(str(new_score))
                        # 根据分数设置颜色
                        if new_score >= 8:
                            score_item.setForeground(Qt.green)
                            score_item.setFont(QFont("", -1, QFont.Bold))
                        elif new_score >= 6:
                            score_item.setForeground(Qt.blue)
                        elif new_score <= 3:
                            score_item.setForeground(Qt.red)
                        self.table.setItem(row, 5, score_item)
                        updated_count += 1
            # 提交更改
            conn.commit()
            conn.close()
            # 更新状态栏
            self.status_bar.showMessage(f"已增加 {updated_count} 行的评分")
        except sqlite3.Error as e:
            logger.error(f"增加评分时出错: {str(e)}")
            QMessageBox.critical(self, "数据库错误", f"增加评分时出错: {str(e)}")
            self.status_bar.showMessage("增加评分失败")
        except Exception as e:
            logger.error(f"增加评分时出错: {str(e)}")
            QMessageBox.critical(self, "错误", f"增加评分时出错: {str(e)}")
            self.status_bar.showMessage("增加评分失败")
    def decrease_score(self):
        """减少选中项目的评分(-1)"""
        # 获取选中的行
        selected_rows = set()
        for item in self.table.selectedItems():
            selected_rows.add(item.row())
        # 如果没有选中的行，直接返回
        if not selected_rows:
            QMessageBox.information(self, "提示", "请先选中要减少评分的行")
            return
        try:
            # 连接数据库
            conn = sqlite3.connect(self.db_path)
            cursor = conn.cursor()
            # 更新选中的行
            updated_count = 0
            for row in selected_rows:
                # 获取ID
                id_item = self.table.item(row, 0)
                if id_item:
                    article_id = id_item.text()
                    # 获取当前分数
                    cursor.execute("SELECT score FROM articles WHERE id = ?", (article_id,))
                    result = cursor.fetchone()
                    if result:
                        current_score = result[0]
                        # 减少分数，但不低于0
                        new_score = max(current_score - 1, 0)
                        # 更新数据库中的score字段
                        cursor.execute("UPDATE articles SET score = ? WHERE id = ?", (new_score, article_id))
                        # 更新表格中的显示
                        score_item = QTableWidgetItem(str(new_score))
                        # 根据分数设置颜色
                        if new_score >= 8:
                            score_item.setForeground(Qt.green)
                            score_item.setFont(QFont("", -1, QFont.Bold))
                        elif new_score >= 6:
                            score_item.setForeground(Qt.blue)
                        elif new_score <= 3:
                            score_item.setForeground(Qt.red)
                        self.table.setItem(row, 5, score_item)
                        updated_count += 1
            # 提交更改
            conn.commit()
            conn.close()
            # 更新状态栏
            self.status_bar.showMessage(f"已减少 {updated_count} 行的评分")
        except sqlite3.Error as e:
            logger.error(f"减少评分时出错: {str(e)}")
            QMessageBox.critical(self, "数据库错误", f"减少评分时出错: {str(e)}")
            self.status_bar.showMessage("减少评分失败")
        except Exception as e:
            logger.error(f"减少评分时出错: {str(e)}")
            QMessageBox.critical(self, "错误", f"减少评分时出错: {str(e)}")
            self.status_bar.showMessage("减少评分失败")
 def main():
    """主函数"""
    app = QApplication(sys.argv)
    # 设置应用程序属性
    app.setApplicationName("TopHub数据查看器")
    app.setOrganizationName("TopHub")
    # 创建并显示主窗口
    viewer = DatabaseViewer()
    viewer.show()
    # 运行应用程序
    sys.exit(app.exec())
 if __name__ == "__main__":
    main()
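
The `increase_score` and `decrease_score` handlers above repeat the same clamping and color-tier logic. A minimal sketch of how that logic could be factored into pure helpers, assuming the 0-10 range and the thresholds used above. The names `clamp_score` and `score_tier` are hypothetical and do not exist in db_viewer.py:

```python
def clamp_score(current, delta, low=0, high=10):
    """Apply delta to current and keep the result within [low, high]."""
    return max(low, min(current + delta, high))


def score_tier(score):
    """Map a score to the display tier used by the handlers above.

    Returns (color_name, bold): >= 8 is bold green, >= 6 is blue,
    <= 3 is red, and anything else keeps the default style.
    """
    if score >= 8:
        return "green", True
    if score >= 6:
        return "blue", False
    if score <= 3:
        return "red", False
    return None, False
```

Keeping these helpers free of Qt types would let both handlers share one code path and makes the thresholds testable without a running GUI.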
--- a/fix_db_viewer.py
+++ b/fix_db_viewer.py
@@ -0,0 +1,55 @@
 #!/usr/bin/env python3
 """
 修复db_viewer.py文件中的方法位置问题
 将increase_score和decrease_score方法从文件末尾移动到DatabaseViewer类内部
 """
 import re
 def fix_db_viewer():
    """修复db_viewer.py文件"""
    try:
        # 读取原始文件
        with open('db_viewer.py', 'r', encoding='utf-8') as f:
            content = f.read()
        # 找到increase_score和decrease_score方法
        increase_score_match = re.search(r'\n\s*def increase_score\(self\):.*?(?=\n\s*def|\n\nclass|\n\ndef|\n\nif __name__|\Z)', content, re.DOTALL)
        decrease_score_match = re.search(r'\n\s*def decrease_score\(self\):.*?(?=\n\s*def|\n\nclass|\n\ndef|\n\nif __name__|\Z)', content, re.DOTALL)
        if not increase_score_match or not decrease_score_match:
            print("未找到increase_score或decrease_score方法")
            return False
        # 提取方法内容
        increase_score_method = increase_score_match.group(0)
        decrease_score_method = decrease_score_match.group(0)
        # 从文件末尾移除这两个方法
        content = re.sub(r'\n\s*def increase_score\(self\):.*?(?=\n\s*def|\n\nclass|\n\ndef|\n\nif __name__|\Z)', '', content, flags=re.DOTALL)
        content = re.sub(r'\n\s*def decrease_score\(self\):.*?(?=\n\s*def|\n\nclass|\n\ndef|\n\nif __name__|\Z)', '', content, flags=re.DOTALL)
        # 找到mark_as_not_interested方法的结束位置，在其后插入新方法
        mark_as_not_interested_match = re.search(r'(\n\s*def mark_as_not_interested\(self\):.*?(?=\n\s*def|\n\nclass|\n\ndef|\n\nif __name__|\Z))', content, re.DOTALL)
        if not mark_as_not_interested_match:
            print("未找到mark_as_not_interested方法")
            return False
        # 在mark_as_not_interested方法后插入新方法
        insertion_point = mark_as_not_interested_match.end(1)
        new_content = content[:insertion_point] + increase_score_method + decrease_score_method + content[insertion_point:]
        # 写入修复后的文件
        with open('db_viewer.py', 'w', encoding='utf-8') as f:
            f.write(new_content)
        print("成功修复db_viewer.py文件")
        return True
    except Exception as e:
        print(f"修复文件时出错: {str(e)}")
        return False
 if __name__ == "__main__":
    fix_db_viewer()
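
The script above locates each method with a lazy match that stops at the next `def`, `class`, or `if __name__` line. A small sketch of that extraction step on an in-memory string, using the same lookahead. The helper name `extract_method` and the sample source are illustrative assumptions:

```python
import re


def extract_method(source, name):
    """Return the text of method `name` (with its leading newline), or None."""
    pattern = (
        r'\n\s*def ' + re.escape(name) + r'\(self\):.*?'
        r'(?=\n\s*def|\n\nclass|\n\ndef|\n\nif __name__|\Z)'
    )
    match = re.search(pattern, source, re.DOTALL)
    return match.group(0) if match else None


# Illustrative source text only.
sample = (
    "class A:\n"
    "    def a(self):\n"
    "        pass\n"
    "    def increase_score(self):\n"
    "        return 1\n"
    "    def b(self):\n"
    "        pass\n"
)
found = extract_method(sample, "increase_score")
```

Because the lookahead stops at any line that starts with `def`, a method containing a nested function would be cut short. Parsing with the `ast` module would be a sturdier alternative if that case matters.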
--- a/gui_test.log
+++ b/gui_test.log
@@ -0,0 +1,2 @@
 2025-11-07 23:39:42.157 | INFO     | __main__:<module>:42 - 开始GUI测试
 2025-11-07 23:39:47.875 | INFO     | __main__:close_app:30 - 测试完成，关闭应用程序
--- a/modify_db_to_score.py
+++ b/modify_db_to_score.py
@@ -0,0 +1,101 @@
 #!/usr/bin/env python3
 """
 修改数据库结构脚本
 将is_interested字段改为score字段，实现10分评分制度
 """
 import sqlite3
 import os
 from loguru import logger
 def modify_database_structure():
    """修改数据库结构，将is_interested字段改为score字段"""
    # 获取当前脚本所在目录的数据库文件路径
    script_dir = os.path.dirname(os.path.abspath(__file__))
    db_path = os.path.join(script_dir, "tophub_data.db")
    # 检查数据库文件是否存在
    if not os.path.exists(db_path):
        logger.error(f"数据库文件不存在: {db_path}")
        return False
    try:
        # 连接数据库
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        # 检查is_interested字段是否存在
        cursor.execute("PRAGMA table_info(articles)")
        columns = cursor.fetchall()
        column_names = [column[1] for column in columns]
        if "is_interested" not in column_names:
            logger.info("is_interested字段不存在，无需修改")
            conn.close()
            return True
        # 检查score字段是否已存在
        if "score" in column_names:
            logger.info("score字段已存在，无需添加")
            conn.close()
            return True
        # 添加score字段，默认值为5
        logger.info("正在添加score字段...")
        cursor.execute("ALTER TABLE articles ADD COLUMN score INTEGER DEFAULT 5")
        # 将is_interested的值转换为score
        logger.info("正在转换is_interested数据到score字段...")
        # 获取所有记录
        cursor.execute("SELECT id, is_interested FROM articles")
        records = cursor.fetchall()
        # 转换数据
        for record in records:
            article_id, is_interested = record
            # 转换逻辑：is_interested=1转为score=7，is_interested=0转为score=5
            score = 7 if is_interested == 1 else 5
            cursor.execute("UPDATE articles SET score = ? WHERE id = ?", (score, article_id))
        # 提交更改
        conn.commit()
        logger.info("成功添加score字段并转换数据")
        # 验证字段是否添加成功
        cursor.execute("PRAGMA table_info(articles)")
        columns = cursor.fetchall()
        column_names = [column[1] for column in columns]
        if "score" in column_names:
            logger.info("验证成功：score字段已添加到articles表")
        else:
            logger.error("验证失败：score字段未成功添加")
            conn.close()
            return False
        # 检查数据转换结果
        cursor.execute("SELECT COUNT(*) FROM articles WHERE score = 7")
        count_7 = cursor.fetchone()[0]
        cursor.execute("SELECT COUNT(*) FROM articles WHERE score = 5")
        count_5 = cursor.fetchone()[0]
        logger.info(f"数据转换结果: score=7的记录数: {count_7}, score=5的记录数: {count_5}")
        conn.close()
        return True
    except sqlite3.Error as e:
        logger.error(f"数据库操作出错: {str(e)}")
        return False
    except Exception as e:
        logger.error(f"修改数据库结构时出错: {str(e)}")
        return False
 if __name__ == "__main__":
    logger.add("db_modify_score.log", rotation="10 MB", level="INFO")
    if modify_database_structure():
        logger.info("数据库结构修改完成")
    else:
        logger.error("数据库结构修改失败")
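
The per-row conversion loop in `modify_database_structure` could also be written as a single SQL statement. A sketch on an in-memory database, assuming the same mapping (`is_interested = 1` becomes 7, everything else 5). The function name `convert_interest_to_score` and the reduced table layout are illustrative only:

```python
import sqlite3


def convert_interest_to_score(conn):
    """Add a score column and fill it from is_interested in one UPDATE."""
    cursor = conn.cursor()
    cursor.execute("ALTER TABLE articles ADD COLUMN score INTEGER DEFAULT 5")
    cursor.execute(
        "UPDATE articles SET score = CASE WHEN is_interested = 1 THEN 7 ELSE 5 END"
    )
    conn.commit()
    # Summarize the result as {score: row_count}
    cursor.execute("SELECT score, COUNT(*) FROM articles GROUP BY score")
    return dict(cursor.fetchall())


# Illustrative data only: a reduced articles table with three rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, title TEXT, is_interested INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO articles (title, is_interested) VALUES (?, ?)",
    [("a", 1), ("b", 0), ("c", 1)],
)
counts = convert_interest_to_score(conn)
conn.close()
```

A single statement avoids fetching every row into Python and keeps the conversion atomic within one transaction.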
--- a/ollama_model_viewer.py
+++ b/ollama_model_viewer.py
@@ -0,0 +1,79 @@
 import sys
 import requests
 import json
 from PySide6.QtWidgets import QApplication, QMainWindow, QListWidget, QVBoxLayout, QWidget, QLabel, QPushButton
 from PySide6.QtCore import Qt
 from loguru import logger
 class OllamaModelViewer(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Ollama 模型查看器")
        self.setGeometry(100, 100, 600, 400)
        # 创建主窗口部件
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)
        # 创建布局
        self.layout = QVBoxLayout()
        self.central_widget.setLayout(self.layout)
        # 创建标题标签
        self.title_label = QLabel("当前安装的Ollama模型:")
        self.title_label.setStyleSheet("font-weight: bold; font-size: 14px;")
        self.layout.addWidget(self.title_label)
        # 创建列表部件
        self.model_list = QListWidget()
        self.model_list.setStyleSheet("font-family: monospace;")
        self.layout.addWidget(self.model_list)
        # 创建刷新按钮
        self.refresh_button = QPushButton("刷新模型列表")
        self.refresh_button.clicked.connect(self.fetch_models)
        self.layout.addWidget(self.refresh_button)
        # 初始加载模型
        self.fetch_models()
    def fetch_models(self):
        """从Ollama API获取模型列表"""
        self.model_list.clear()
        try:
            logger.info("正在获取Ollama模型列表...")
            response = requests.get("http://localhost:11434/api/tags", timeout=5)
            if response.status_code == 200:
                data = response.json()
                models = data.get("models", [])
                if models:
                    for model in models:
                        model_name = model.get("model", "")
                        if model_name:
                            self.model_list.addItem(model_name)
                            logger.info(f"找到模型: {model_name}")
                else:
                    self.model_list.addItem("未找到任何模型")
                    logger.info("未找到任何模型")
            else:
                self.model_list.addItem(f"API请求失败，状态码: {response.status_code}")
                logger.error(f"API请求失败，状态码: {response.status_code}")
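        except json.JSONDecodeError as e:
            # Checked first: newer requests versions raise a JSONDecodeError that is
            # also a RequestException, which the handler below would otherwise catch
            self.model_list.addItem("API响应格式错误")
            logger.error(f"API响应格式错误: {str(e)}")
        except requests.exceptions.RequestException as e:
            self.model_list.addItem("无法连接到Ollama API")
            logger.error(f"无法连接到Ollama API: {str(e)}")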
        except Exception as e:
            self.model_list.addItem(f"发生错误: {str(e)}")
            logger.error(f"发生未知错误: {str(e)}")
 if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = OllamaModelViewer()
    window.show()
    sys.exit(app.exec())
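
The name extraction inside `fetch_models` can be separated from the network call so it is testable on its own. A minimal sketch, assuming the response body is a dict with a `models` list whose entries carry a `model` key (as read above) and possibly a `name` key. The helper `extract_model_names` and the sample payload are hypothetical:

```python
def extract_model_names(payload):
    """Return model identifiers from a decoded /api/tags response body."""
    names = []
    for entry in payload.get("models") or []:
        # Prefer "model" as fetch_models does; fall back to "name" (assumed field).
        name = entry.get("model") or entry.get("name") or ""
        if name:
            names.append(name)
    return names


# Hypothetical payload for illustration; not captured from a real server.
sample = {
    "models": [
        {"name": "gemma3:4b", "model": "gemma3:4b"},
        {"name": "demo:latest"},
        {},
    ]
}
names = extract_model_names(sample)
```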
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,6 @@
 requests>=2.25.1
 lxml>=4.6.3
 tqdm>=4.61.2
 loguru>=0.5.3
 zhipuai>=2.1.0
 PySide6>=6.0.0
--- a/tophub_add_data_to_db.log
+++ b/tophub_add_data_to_db.log
--- a/tophub_add_data_to_db.py
+++ b/tophub_add_data_to_db.py
@@ -0,0 +1,213 @@
 #!/usr/bin/env python3
 """
 处理临时文件并写入数据库的脚本
 读取指定格式的临时文件，提取标题和链接，调用API进行分类，然后写入SQLite数据库
 """
 import sqlite3
 import requests
 import os
 import re
 from datetime import datetime
 from tqdm import tqdm
 from loguru import logger
 import glob
 # 配置日志
 logger.add("tophub_add_data_to_db.log", rotation="10 MB", level="INFO")
 # API配置
 API_URL = "http://localhost:11434/api/generate"
 API_MODEL = "gemma3:4b"
 def init_database():
    """初始化数据库，创建表结构"""
    conn = sqlite3.connect('tophub_data.db')
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS articles (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            title TEXT NOT NULL,
            url TEXT NOT NULL,
            category TEXT,
            source_date TEXT NOT NULL,
            created_at TEXT NOT NULL,
            UNIQUE(title, source_date)
        )
    ''')
    conn.commit()
    conn.close()
    logger.info("数据库初始化完成")
 def find_temp_files():
    """查找符合格式的临时文件"""
    pattern = "*年*月*日*.txt"
    files = glob.glob(pattern)
    logger.info(f"找到 {len(files)} 个临时文件: {files}")
    return files
 def parse_file_content(file_path):
    """解析文件内容，按5行一个循环提取数据"""
    articles = []
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            lines = f.readlines()
        # 按5行一组进行解析
        for i in range(0, len(lines), 5):
            if i + 4 < len(lines):
                node_id = lines[i].strip()
                category = lines[i+1].strip()
                title = lines[i+2].strip()
                url = lines[i+3].strip()
                separator = lines[i+4].strip()  # separator line; the bounds check above guarantees the index
                # 提取关键信息
                title_match = re.search(r'标题: (.+)', title)
                url_match = re.search(r'链接: (.+)', url)
                if title_match and url_match:
                    articles.append({
                        'title': title_match.group(1),
                        'url': url_match.group(1),
                        'category': category.split(': ')[1] if ': ' in category else '未知'
                    })
        logger.info(f"从文件 {file_path} 解析出 {len(articles)} 条数据")
        return articles
    except Exception as e:
        logger.error(f"解析文件 {file_path} 失败: {e}")
        return []
 def check_duplicate(title, date_str):
    """检查标题+日期是否已存在"""
    conn = sqlite3.connect('tophub_data.db')
    cursor = conn.cursor()
    cursor.execute('''
        SELECT COUNT(*) FROM articles 
        WHERE title = ? AND source_date = ?
    ''', (title, date_str))
    count = cursor.fetchone()[0]
    conn.close()
    return count > 0
 def classify_title(title):
    """调用API对标题进行分类"""
    try:
        prompt = f"目标：对以下文字内容进行分类，返回结果为类别，如\"社会新闻\"，\"金融\"，\"历史\"，\"购物\"，\"新质科技\"等等。目的：只返回2-4个字，不返回其它内容。内容：{title}"
        data = {
            "model": API_MODEL,
            "prompt": prompt,
            "stream": False
        }
        response = requests.post(API_URL, json=data, timeout=30)
        response.raise_for_status()
        result = response.json()
        category = result.get('response', '').strip()
        # 验证分类结果长度
        if len(category) < 2 or len(category) > 8:
            category = '其他'
        logger.info(f"标题 '{title}' 分类为: {category}")
        return category
    except Exception as e:
        logger.error(f"API调用失败，标题 '{title}': {e}")
        return '其他'
 def insert_article(title, url, category, source_date):
    """插入文章到数据库"""
    conn = sqlite3.connect('tophub_data.db')
    cursor = conn.cursor()
    try:
        created_at = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        cursor.execute('''
            INSERT INTO articles (title, url, category, source_date, created_at)
            VALUES (?, ?, ?, ?, ?)
        ''', (title, url, category, source_date, created_at))
        conn.commit()
        logger.info(f"成功插入文章: {title}")
        return True
    except sqlite3.IntegrityError:
        logger.warning(f"文章已存在，跳过: {title}")
        return False
    except Exception as e:
        logger.error(f"插入文章失败: {e}")
        return False
    finally:
        conn.close()
 def process_temp_files():
    """主处理函数"""
    logger.info("开始处理临时文件...")
    # 初始化数据库
    init_database()
    # 查找临时文件
    temp_files = find_temp_files()
    if not temp_files:
        logger.warning("未找到临时文件")
        return
    total_processed = 0
    total_inserted = 0
    # 处理每个文件
    for file_path in temp_files:
        logger.info(f"处理文件: {file_path}")
        # 从文件名提取日期
        date_match = re.search(r'(\d{4})年(\d{1,2})月(\d{1,2})日', file_path)
        if date_match:
            source_date = f"{date_match.group(1)}-{int(date_match.group(2)):02d}-{int(date_match.group(3)):02d}"
        else:
            source_date = datetime.now().strftime('%Y-%m-%d')
        # 解析文件内容
        articles = parse_file_content(file_path)
        if not articles:
            continue
        # 处理每篇文章
        for article in tqdm(articles, desc=f"处理 {file_path}"):
            total_processed += 1
            # 检查重复
            if check_duplicate(article['title'], source_date):
                logger.info(f"跳过重复文章: {article['title']}")
                continue
            # 分类标题
            category = classify_title(article['title'])
            # 插入数据库
            if insert_article(article['title'], article['url'], category, source_date):
                total_inserted += 1
    logger.info(f"处理完成! 总计处理: {total_processed}, 成功插入: {total_inserted}")
 if __name__ == "__main__":
    try:
        process_temp_files()
    except Exception as e:
        logger.error(f"程序执行失败: {e}")
        raise
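
Since the `articles` table declares `UNIQUE(title, source_date)`, the database itself can reject duplicates at insert time. A sketch of that approach with `INSERT OR IGNORE`, assuming the schema from `init_database`. The function name `insert_if_new` and the sample row are illustrative:

```python
import sqlite3
from datetime import datetime


def insert_if_new(conn, title, url, category, source_date):
    """Insert one article; return True only if a new row was written."""
    created_at = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    cursor = conn.execute(
        """
        INSERT OR IGNORE INTO articles (title, url, category, source_date, created_at)
        VALUES (?, ?, ?, ?, ?)
        """,
        (title, url, category, source_date, created_at),
    )
    conn.commit()
    # rowcount is 0 when the UNIQUE constraint caused the row to be ignored
    return cursor.rowcount == 1


# In-memory database with the same table definition as init_database().
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        url TEXT NOT NULL,
        category TEXT,
        source_date TEXT NOT NULL,
        created_at TEXT NOT NULL,
        UNIQUE(title, source_date)
    )
    """
)
first = insert_if_new(conn, "demo title", "https://example.com/1", "其他", "2025-11-09")
second = insert_if_new(conn, "demo title", "https://example.com/1", "其他", "2025-11-09")
conn.close()
```

The separate `check_duplicate` call in the script still has a purpose: it runs before `classify_title`, so known titles never reach the API. The sketch only shows that the final insert does not need its own exception handling for duplicates.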
--- a/tophub_ban_column.txt
+++ b/tophub_ban_column.txt
@@ -0,0 +1,14 @@
 淘宝
 音乐
 电影
 猫眼
 IMDB
 视频
 七猫
 读书
 TapTap
 Music
 即刻
 站酷
 App
 彩票
--- a/tophub_data.db
+++ b/tophub_data.db
--- a/tophub_scraper.log
+++ b/tophub_scraper.log
--- a/tophub_scraper.py
+++ b/tophub_scraper.py
@@ -0,0 +1,209 @@
 #!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 """
 TopHub网站数据抓取脚本
 负责从tophub.today网站抓取数据，根据指定规则过滤并保存
 """
 import requests
 from lxml import html
 import json
 import time
 import os
 import re
 from datetime import datetime
 from loguru import logger
 # 配置日志
 logger.add("tophub_scraper.log", rotation="10 MB", level="INFO")
 class TopHubScraper:
    """TopHub网站数据抓取器"""
    def __init__(self):
        """
        初始化抓取器
        """
        self.base_url = "https://tophub.today/"
        self.ban_list_file = "tophub_ban_column.txt"
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
        })
        self.ban_list = self.load_ban_list()
    def load_ban_list(self):
        """
        加载需要过滤的栏目列表
        Returns:
            set: 需要过滤的栏目集合
        """
        ban_list = set()
        try:
            if os.path.exists(self.ban_list_file):
                with open(self.ban_list_file, 'r', encoding='utf-8') as f:
                    for line in f:
                        line = line.strip()
                        if line:
                            ban_list.add(line)
                logger.info(f"已加载 {len(ban_list)} 个需要过滤的栏目")
            else:
                logger.warning(f"过滤文件 {self.ban_list_file} 不存在，将不过滤任何栏目")
        except Exception as e:
            logger.error(f"加载过滤文件失败: {e}")
        return ban_list
    def fetch_webpage(self):
        """
        获取网页内容
        Returns:
            str: 网页HTML内容
        """
        logger.info(f"正在获取网页内容: {self.base_url}")
        try:
            response = self.session.get(self.base_url, timeout=10)
            response.raise_for_status()
            logger.info("网页内容获取成功")
            return response.text
        except requests.RequestException as e:
            logger.error(f"获取网页内容失败: {e}")
            raise
    def scrape_by_node_ids(self):
        """
        根据节点ID范围抓取数据
        Returns:
            list: 包含已抓取数据的列表
        """
        try:
            # 1. 获取网页内容
            html_content = self.fetch_webpage()
            tree = html.fromstring(html_content)
            # 2. Build the output filename from the current date and time.
            #    The time part is zero-padded (HHMMSS) so filenames are unambiguous.
            now = datetime.now()
            output_file = f"{now.year}年{now.month}月{now.day}日{now:%H%M%S}.txt"
            scraped_data = []
            # 3. 遍历节点ID范围
            for node_id in range(1, 1000):  # 从1到999
                xpath = f'//*[@id="node-{node_id}"]'
                logger.debug(f"正在查找节点: {xpath}")
                # 查找节点
                nodes = tree.xpath(xpath)
                if not nodes:
                    continue  # no such node; move on to the next ID
                node = nodes[0]
                # 查找span标签
                spans = node.xpath('.//span')
                if not spans:
                    logger.info(f"节点 {node_id} 中未找到span标签，跳过")
                    continue
                # 获取第一个span的文本内容
                span_text = spans[0].text_content().strip()
                if not span_text:
                    logger.info(f"节点 {node_id} 的span标签为空，跳过")
                    continue
                # 检查是否在过滤列表中（部分匹配）
                should_skip = False
                for ban_word in self.ban_list:
                    if ban_word in span_text:
                        logger.info(f"节点 {node_id} 的内容 '{span_text}' 包含过滤词 '{ban_word}'，跳过")
                        should_skip = True
                        break
                if should_skip:
                    continue
                logger.info(f"节点 {node_id} 的内容 '{span_text}' 通过过滤，继续处理")
                # 查找a元素
                links = node.xpath('.//a')
                if not links:
                    logger.info(f"节点 {node_id} 中未找到a元素，跳过")
                    continue
                # 提取所有链接和文本
                for link in links:
                    link_text = link.text_content().strip()
                    href = link.get('href', '')
                    if link_text and href:
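                        # Complete relative links against the site root,
                        # whether or not href starts with a slash
                        if not href.startswith('http'):
                            href = f"{self.base_url.rstrip('/')}/{href.lstrip('/')}"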
                        # 当category和text的值相同时，跳过当前循环
                        if span_text == link_text:
                            logger.info(f"节点 {node_id} 的分类和标题相同 ({span_text})，跳过")
                            continue
                        scraped_data.append({
                            'node_id': node_id,
                            'category': span_text,
                            'text': link_text,
                            'link': href
                        })
            # 4. 保存数据到文件
            if scraped_data:
                self.save_to_file(scraped_data, output_file)
                logger.info(f"成功抓取 {len(scraped_data)} 条数据，保存到 {output_file}")
            else:
                logger.warning("未抓取到任何数据")
            return scraped_data
        except Exception as e:
            logger.error(f"抓取数据时出错: {e}")
            raise
    def save_to_file(self, data, filename):
        """
        Save the scraped data to a file.
        Args:
            data (list): Records to save
            filename (str): Output file name
        """
        try:
            with open(filename, 'w', encoding='utf-8') as f:
                for item in data:
                    f.write(f"节点ID: {item['node_id']}\n")
                    f.write(f"分类: {item['category']}\n")
                    # Clean the title: drop the leading rank number and extra whitespace
                    title_text = item['text']
                    # Multi-line titles: pick out the line holding the actual title
                    lines = title_text.strip().split('\n')
                    if len(lines) >= 2:
                        # The second line usually holds the actual title
                        cleaned_title = lines[1].strip()
                    else:
                        # Single line: strip a leading "<number> " prefix with a regex
                        match = re.match(r'^\d+\s+(.+)$', title_text.strip(), re.DOTALL)
                        if match:
                            cleaned_title = match.group(1).strip()
                        else:
                            cleaned_title = title_text.strip()
                    f.write(f"标题: {cleaned_title}\n")
                    f.write(f"链接: {item['link']}\n")
                    f.write("-" * 50 + "\n")
            logger.info(f"数据已保存到 {filename}")
        except Exception as e:
            logger.error(f"保存文件失败: {e}")
            raise
 if __name__ == "__main__":
    scraper = TopHubScraper()
    scraper.scrape_by_node_ids()
--- a/右键菜单功能说明.md
+++ b/右键菜单功能说明.md
@@ -0,0 +1,155 @@
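 # Context Menu Feature Guide
 ## Overview
 The context menu in the TopHub data viewer lets users right-click an item in the table to run common operations quickly.
 ## New Features
 ### 1. Mark as interested (标记为感兴趣)
 - **Description**: marks the selected items as interested
 - **Database operation**: sets the `is_interested` field of the matching records to 1
 - **Display**: the "感兴趣" (Interested) column shows "是" (Yes) in bold green
 ### 2. Mark as not interested (标记为不感兴趣)
 - **Description**: marks the selected items as not interested
 - **Database operation**: sets the `is_interested` field of the matching records to 0
 - **Display**: the "感兴趣" (Interested) column shows "否" (No) in the default font and color
 ### 3. Delete selected items (删除选中项)
 - **Description**: deletes the selected items
 - **Database operation**: deletes the matching records from the database
 - **Display**: removes the matching rows from the table
 ## Usage
 1. Open the TopHub data viewer
 2. Right-click any item in the table
 3. Choose the action you need from the context menu:
   - "标记为感兴趣" marks the item as interested
   - "标记为不感兴趣" marks the item as not interested
   - "删除选中项" deletes the selected items
 ## Technical Implementation
 ### Context menu implementation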
 ```python
 # Enable the custom context menu (runs during widget setup)
 self.table.setContextMenuPolicy(Qt.CustomContextMenu)
 self.table.customContextMenuRequested.connect(self.show_context_menu)
 def show_context_menu(self, position):
    """Show the context menu."""
    # Row under the click position; -1 means the click hit no row
    row = self.table.rowAt(position.y())
    if row < 0:
        return
    # Select that row
    self.table.selectRow(row)
    # Build the context menu
    menu = QMenu(self)
    # "Mark as interested" action
    mark_action = QAction("标记为感兴趣", self)
    mark_action.triggered.connect(self.mark_as_interested)
    menu.addAction(mark_action)
    # "Mark as not interested" action
    unmark_action = QAction("标记为不感兴趣", self)
    unmark_action.triggered.connect(self.mark_as_not_interested)
    menu.addAction(unmark_action)
    # Separator
    menu.addSeparator()
    # "Delete selected items" action
    delete_action = QAction("删除选中项", self)
    delete_action.triggered.connect(self.delete_selected_items)
    menu.addAction(delete_action)
    # Show the menu at the click position, in global coordinates
    menu.exec_(self.table.mapToGlobal(position))
 ```
 ### Implementation of the "Mark as not interested" method
 ```python
 def mark_as_not_interested(self):
    """Mark the selected items as not interested."""
    # Collect the selected rows
    selected_rows = set()
    for item in self.table.selectedItems():
        selected_rows.add(item.row())
    # Nothing selected: tell the user and return
    if not selected_rows:
        QMessageBox.information(self, "提示", "请先选中要标记的行")
        return
    # Ask for confirmation
    reply = QMessageBox.question(
        self,
        "确认标记",
        f"确定要将选中的 {len(selected_rows)} 行标记为不感兴趣吗？",
        QMessageBox.Yes | QMessageBox.No,
        QMessageBox.Yes
    )
    if reply == QMessageBox.No:
        return
    conn = None
    try:
        # Connect to the database
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        # Update every selected row
        updated_count = 0
        for row in selected_rows:
            # The record ID is in column 0
            id_item = self.table.item(row, 0)
            if id_item:
                article_id = id_item.text()
                # Update the is_interested field in the database
                cursor.execute("UPDATE articles SET is_interested = 0 WHERE id = ?", (article_id,))
                # Update the table display; "not interested" uses the default font and color
                interested_item = QTableWidgetItem("否")
                self.table.setItem(row, 5, interested_item)
                updated_count += 1
        # Commit the changes
        conn.commit()
        # Update the status bar
        self.status_bar.showMessage(f"已标记 {updated_count} 行为不感兴趣")
    except sqlite3.Error as e:
        logger.error(f"标记数据时出错: {str(e)}")
        QMessageBox.critical(self, "数据库错误", f"标记数据时出错: {str(e)}")
        self.status_bar.showMessage("标记失败")
    except Exception as e:
        logger.error(f"标记数据时出错: {str(e)}")
        QMessageBox.critical(self, "错误", f"标记数据时出错: {str(e)}")
        self.status_bar.showMessage("标记失败")
    finally:
        # Close the connection even when an update fails
        if conn is not None:
            conn.close()
 ```
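 The database half of the method above can also be kept separate from the GUI code. The following is a minimal sketch under assumptions, not the viewer's actual code: the helper name `set_interested` is hypothetical, and only the `articles` table and its `id` and `is_interested` columns come from the snippet above.

```python
import sqlite3


def set_interested(conn, article_ids, value):
    """Set is_interested for the given ids and return the number of rows changed.

    Hypothetical helper; assumes the articles(id, is_interested) layout shown above.
    """
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    # "with conn" commits on success and rolls back if the UPDATE raises
    with conn:
        cur = conn.executemany(
            "UPDATE articles SET is_interested = ? WHERE id = ?",
            [(value, article_id) for article_id in article_ids],
        )
        # For executemany, rowcount is the total number of modified rows
        return cur.rowcount
```

 A caller would open the connection with `contextlib.closing(sqlite3.connect(db_path))` so that it is closed even on errors, and would refresh the table cells only after the helper returns.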
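 ## Testing
 The test script `test_mark_not_interested.py` verifies the "Mark as not interested" feature. The test results show that it works as expected: items are marked as not interested, and both the database and the table display are updated.
 ## Notes
 1. Select the items you want to act on before using the context menu
 2. Deletion cannot be undone; use it with care
 3. Marking updates the database directly, so confirm your selection first
 4. Batch operations process all selected items at once
 ## Changelog
 - 2023-11-07: added "Mark as not interested" to the context menu
 - 2023-11-07: finished feature testing and documentation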
--- a/数据库字段添加总结.md
+++ b/数据库字段添加总结.md
@@ -0,0 +1,44 @@
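 # Database Field Addition Summary
 ## Task Overview
 Add an "interested" field to the TopHub database viewer so that users can mark the articles they are interested in.
 ## Implementation Steps
 ### 1. Database schema change
 - Created the `add_interested_field.py` script, which adds the `is_interested` field to the `articles` table
 - Field type: INTEGER, default value: 0
 - The script checks whether the field already exists, adds it if not, and verifies the result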
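 The check-then-add logic described above could look roughly like this. It is a sketch under assumptions, not the contents of `add_interested_field.py`: the function name `add_column_if_missing` is made up for illustration, and only the table name, field name, type, and default come from this summary.

```python
import sqlite3


def add_column_if_missing(conn, table, column, ddl):
    """Add `column` to `table` unless it already exists; return True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # Identifiers cannot be bound as SQL parameters, so pass trusted names only
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    conn.commit()
    return True
```

 For the field described here the call would be `add_column_if_missing(conn, "articles", "is_interested", "INTEGER DEFAULT 0")`. Running it a second time returns `False` and leaves the table unchanged.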
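 ### 2. Database verification
 - Created the `check_db_structure.py` script to inspect the database schema
 - Created the `test_interested_field.py` script to verify that the field works
 - Created the `show_data_with_interested.py` script to display records together with their interested status
 ### 3. GUI changes
 - Modified `db_viewer.py` to add the following:
  - An "感兴趣" (Interested) column in the table that shows the `is_interested` value
  - A "标记为感兴趣" (Mark as interested) button that marks the selected articles as interested
  - Updated queries that include the `is_interested` field
  - Updated filtering that covers the interested column
 ## Test Results
 - The field was added successfully, with a default value of 0
 - Records can be marked as interested (value 1)
 - The GUI application displays and edits the interested field correctly
 - The statistics feature works and shows the number of interested and not-interested records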
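 The interested / not-interested counts mentioned in the last result can be obtained with a single grouped query. This is an illustrative sketch: the function name `interest_counts` and the returned dictionary keys are assumptions, and NULL values are counted as not interested.

```python
import sqlite3


def interest_counts(conn):
    """Return how many articles are interested and not interested."""
    counts = {"interested": 0, "not_interested": 0}
    # COALESCE treats a missing flag as 0 (not interested)
    query = (
        "SELECT COALESCE(is_interested, 0) AS flag, COUNT(*) "
        "FROM articles GROUP BY flag"
    )
    for flag, n in conn.execute(query):
        key = "interested" if flag == 1 else "not_interested"
        counts[key] += n
    return counts
```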
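 ## Usage
 1. Run `python db_viewer.py` to start the application
 2. Select a record in the table
 3. Click the "标记为感兴趣" (Mark as interested) button to mark the record as interested
 4. Use the filter to view only the interested records
 5. The statistics panel shows the number of interested and not-interested records
 ## File List
 - `add_interested_field.py` - script that adds the database field
 - `check_db_structure.py` - script that inspects the database schema
 - `test_interested_field.py` - script that tests the field
 - `show_data_with_interested.py` - command-line tool that displays records
 - `test_gui.py` - GUI test script
 - `db_viewer.py` - the modified main application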
--- a/评分系统使用说明.md
+++ b/评分系统使用说明.md
@@ -0,0 +1,70 @@
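 # TopHub Data Viewer - Scoring System Guide
 ## Overview
 The TopHub data viewer has been upgraded from a simple "interested / not interested" flag to a 10-point scoring system. The new system allows finer-grained ratings, so you can mark and manage scraped content more precisely.
 ## Scoring System
 ### Score range
 - **Minimum**: 0 (not interested at all)
 - **Default**: 5 (neutral)
 - **Maximum**: 10 (very interested)
 ### Color coding
 To make content quality easy to spot, the viewer colors each score automatically:
 - **Bold green**: 8 and above (high-value content)
 - **Blue**: 6-7 (medium-value content)
 - **Default color**: 4-5 (ordinary content)
 - **Red**: 3 and below (low-value content)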
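 The color bands above amount to a small lookup. The sketch below is illustrative only: the function name `score_style` and the `(color, bold)` return shape are assumptions, not the viewer's actual API.

```python
def score_style(score):
    """Map a 0-10 score to (color, bold); a color of None means the default color."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 8:
        return ("green", True)   # high-value content
    if score >= 6:
        return ("blue", False)   # medium-value content
    if score >= 4:
        return (None, False)     # ordinary content, default color
    return ("red", False)        # low-value content
```

 Keeping the mapping in a pure function like this makes the band boundaries (8, 6, 4) easy to test without starting the GUI.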
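 ## Usage
 ### Increase a score
 1. Select one or more rows in the table
 2. Right-click the selected rows
 3. Choose "增加评分(+1)" (Increase score) from the menu
 4. The score of each selected item goes up by 1, capped at 10
 ### Decrease a score
 1. Select one or more rows in the table
 2. Right-click the selected rows
 3. Choose "减少评分(-1)" (Decrease score) from the menu
 4. The score of each selected item goes down by 1, with a floor of 0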
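 The +1 / -1 adjustments with their 0 and 10 limits can be expressed as one clamped UPDATE. This is a sketch under assumptions: the function name `adjust_score` is hypothetical, and the `articles` table name is carried over from the earlier documents rather than confirmed for this version.

```python
import sqlite3


def adjust_score(conn, article_ids, delta):
    """Add delta to the score of each given article, clamped to the 0-10 range."""
    # "with conn" commits on success and rolls back on an exception
    with conn:
        conn.executemany(
            # SQLite's two-argument MIN/MAX are scalar functions, used here as a clamp
            "UPDATE articles SET score = MAX(0, MIN(10, score + ?)) WHERE id = ?",
            [(delta, article_id) for article_id in article_ids],
        )
```

 Doing the clamp inside the SQL statement means a row that is already at 10 or 0 stays there, with no need to read the current score first.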
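 ### Batch operations
 - You can select several rows and adjust their scores together
 - The "按关键字选中" (Select by keyword) feature quickly selects the rows that contain a given keyword
 - The context menu then adjusts the scores of all selected rows at once
 ## Data Migration
 Existing "interested / not interested" data has been converted to the new scoring system automatically:
 - Items previously marked "interested" were converted to a score of 7
 - Items previously marked "not interested" were converted to a score of 5 (the default)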
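 A migration following these two rules might look like the sketch below. It is not the project's actual migration script: the function name `migrate_to_score` is an assumption, the `articles` table name comes from the earlier documents, and the sketch leaves the old `is_interested` column in place.

```python
import sqlite3


def migrate_to_score(conn):
    """Add a score column and convert is_interested flags; return rows set to 7."""
    columns = {row[1] for row in conn.execute("PRAGMA table_info(articles)")}
    if "score" in columns:
        return 0  # already migrated, nothing to do
    with conn:
        # Every existing row receives the default of 5 (former "not interested")
        conn.execute("ALTER TABLE articles ADD COLUMN score INTEGER DEFAULT 5")
        # Former "interested" rows are raised to 7
        cur = conn.execute("UPDATE articles SET score = 7 WHERE is_interested = 1")
        return cur.rowcount
```

 The column check at the top makes the function safe to run more than once.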
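 ## Technical Details
 ### Database schema
 - A new `score` field (INTEGER) replaces the old `is_interested` field
 - `score` defaults to 5 and is limited to the range 0-10
 ### Interface updates
 - The "感兴趣" (Interested) column in the table is now a "评分" (Score) column that shows the numeric score
 - The context menu now offers "增加评分(+1)" and "减少评分(-1)"
 - Color coding is applied automatically based on the score
 ## FAQ
 **Q: Why is the default score 5 rather than 0?**
 A: A score of 5 represents a neutral stance, which matches everyday rating habits. A score of 0 is usually reserved for content that is completely irrelevant or of very poor quality.
 **Q: How can I find high-scoring content quickly?**
 A: High-scoring content (8 and above) is shown in bold green, which stands out. You can also sort the table by the score column.
 **Q: Can I set an arbitrary score directly?**
 A: The current version only supports adjusting scores in steps of +1 / -1, which keeps scoring consistent and traceable.
 ---
 If you have other questions or suggestions, feedback is welcome.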
@@ -0,0 +1,2 @@
 2025-11-07 23:39:42.157 | INFO | __main__:<module>:42 - 开始GUI测试
 2025-11-07 23:39:47.875 | INFO | __main__:close_app:30 - 测试完成，关闭应用程序