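Product decision: search suggestions show only ts_stat high-frequency word continuations (e.g. '美' → 美国/美军/美国政府), not real article-ID hints (users consider entries like '文章#566871' noise with no continuity).

Changes:
- SearchSuggestionsResponse: drop title; only query + keywords remain
- SearchService: query only search_keywords; the fallback path also targets keywords only
- Feed.vue: remove the suggestTitles state and the SuggestTitleOption type union; renderSuggestion is simplified to a '词' tag + the word text + the weight number on the right
- Migration 0011: drop the search_title_suggestions table + 3 indexes + trigger + function (the trigger ran on every article INSERT/UPDATE, so dropping it removes needless overhead)
- Deleted: app/models/search_title_suggestion.py + backfill_search_suggestions.py
  Replaced by: app/scripts/refresh_search_keywords.py (runs a single word-frequency refresh)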
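For orientation, the trimmed response shape and the prefix-continuation behaviour described above could be sketched roughly as follows. This is an illustrative stand-in, not the project's code: the `KeywordSuggestion` class, its `word` and `weight` field names, the `suggest` helper, and the in-memory frequency dict are all assumptions (the real service reads from search_keywords in the database).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class KeywordSuggestion:
    word: str    # hypothetical field name: the suggested term
    weight: int  # hypothetical field name: the frequency shown on the right


@dataclass
class SearchSuggestionsResponse:
    # Only query + keywords remain; there is no title field any more.
    query: str
    keywords: list[KeywordSuggestion] = field(default_factory=list)


def suggest(query: str, freq: dict[str, int], limit: int = 5) -> SearchSuggestionsResponse:
    """Return the most frequent words that start with `query` (hypothetical helper)."""
    matches = [KeywordSuggestion(w, n) for w, n in freq.items() if w.startswith(query)]
    # Highest weight first; ties broken alphabetically for a stable order.
    matches.sort(key=lambda k: (-k.weight, k.word))
    return SearchSuggestionsResponse(query=query, keywords=matches[:limit])


if __name__ == "__main__":
    sample = {"美国": 120, "美军": 45, "美国政府": 30, "中国": 200}
    resp = suggest("美", sample)
    print([k.word for k in resp.keywords])  # ['美国', '美军', '美国政府']
```

The sample frequencies are made up for the demonstration; only the ordering by weight reflects the behaviour the commit describes.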
49 lines
1.3 KiB
Python
"""Refresh search_keywords (run once immediately, without waiting for the worker's 03:00 schedule).

History:
- The original version backfilled search_title_suggestions (real titles maintained by a trigger on articles)
- Migration 0011 dropped search_title_suggestions (product decision: show keyword continuations only)
- The script now does one thing: run refresh_search_keywords() once, immediately

Usage:
    docker compose exec api python -m app.scripts.refresh_search_keywords
    # Expected: search_keywords refreshed

Performance: a full ts_stat aggregation over 1545 articles takes ~88s (the worker runs it automatically every day at 03:00, so a manual run is usually unnecessary)
"""
from __future__ import annotations

import asyncio
import logging
import sys

from sqlalchemy import text

from app.database import AsyncSessionLocal

logger = logging.getLogger("news.refresh_keywords")
logging.basicConfig(
    level="INFO",
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
)


async def refresh() -> None:
    # Call the database-side refresh function and commit its result.
    async with AsyncSessionLocal() as s:
        await s.execute(text("SELECT refresh_search_keywords()"))
        await s.commit()
    logger.info("search_keywords refreshed")


def main() -> int:
    # Exit code 1 on Ctrl+C, 0 on success.
    try:
        asyncio.run(refresh())
    except KeyboardInterrupt:
        logger.warning("interrupted")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())