feat(ingest): data layer for API Push short news
- alembic 0008: add is_short_news/external_id/source_ref/content_hash (UNIQUE) to articles; add 'api_push' to sources.kind; add purpose + source_id to api_tokens
- SourceKind.API_PUSH enum; new fields on the Article/ApiToken models
- enrichment_article skips format/image for short news; the enrichment_loop SQL gains an is_short_news path (folded into the 'enrichable' condition)
- The ingest side is handled by commit 2 (the ingest endpoint): it writes body_zh_text=body_text, format/image/commentary_meituan_status='n/a', classify/commentary_status='pending' (classify='ok' when tags are supplied)

No migration blast radius: articles.url stays NOT NULL, and short news synthesizes an api-push:// placeholder URL.
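The ingest-side defaults described above can be sketched as a plain function. This is a minimal illustration, not the code from commit 2. The function name `build_short_news_row`, the SHA-1 fingerprint, the whitespace normalization, the layout of the `api-push://` URL, and the expanded column names (`format_status`, `image_status`, `classify_status`) are all assumptions made for the example. Only the status values and the `api-push://` scheme come from the message.

```python
import hashlib


def build_short_news_row(source_id, title, body_text, external_id=None,
                         source_ref=None, tags=None):
    """Hypothetical helper: build the initial column values for one short-news item."""
    # Assumption: fingerprint = SHA-1 over whitespace-normalized title + body.
    # SHA-1 hex digests are 40 characters, which fits a String(40) column.
    normalized = " ".join(f"{title}\n{body_text}".split())
    content_hash = hashlib.sha1(normalized.encode("utf-8")).hexdigest()

    # articles.url stays NOT NULL, so a placeholder URL is synthesized.
    # Assumption: the placeholder is derived from the source and the fingerprint.
    url = f"api-push://{source_id}/{content_hash}"

    return {
        "is_short_news": True,
        "url": url,
        "url_hash": hashlib.sha1(url.encode("utf-8")).hexdigest(),
        "external_id": external_id,
        "source_ref": source_ref,
        "content_hash": content_hash,
        "title": title,
        "body_text": body_text,
        "body_zh_text": body_text,  # short news is stored as-is
        # Stages that do not apply to short news are marked 'n/a'.
        "format_status": "n/a",
        "image_status": "n/a",
        "commentary_meituan_status": "n/a",
        # Classification is already done when the caller supplies tags.
        "classify_status": "ok" if tags else "pending",
        "commentary_status": "pending",
    }
```

Deriving the placeholder URL from the fingerprint keeps `url_hash` unique for distinct content without a second source of randomness; the real endpoint may well choose a different scheme.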
@@ -5,6 +5,7 @@ from datetime import datetime
 
 from sqlalchemy import (
     BigInteger,
+    Boolean,
     DateTime,
     Float,
     ForeignKey,
@@ -36,6 +37,16 @@ class Article(Base):
     url_hash: Mapped[str] = mapped_column(String(40), unique=True, nullable=False, index=True)
     guid: Mapped[str | None] = mapped_column(String(255), index=True) # ID provided by the source site
 
+    # === API Push short-news specific ===
+    is_short_news: Mapped[bool] = mapped_column(
+        Boolean, default=False, nullable=False, index=True
+    )
+    external_id: Mapped[str | None] = mapped_column(String(128), index=True) # caller's idempotency key
+    source_ref: Mapped[str | None] = mapped_column(String(64), index=True) # finer-grained source within short news
+    content_hash: Mapped[str | None] = mapped_column(
+        String(40), unique=True, index=True
+    ) # content fingerprint, the core dedup key (NULLs do not take part in the unique constraint)
+
     # === Original content ===
     title: Mapped[str] = mapped_column(Text, nullable=False)
     body_html: Mapped[str | None] = mapped_column(Text) # structure preserved after extraction
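The comment on `content_hash` relies on NULLs not taking part in a UNIQUE constraint, which is why existing articles without a fingerprint do not collide. A small sketch of that behavior, using an in-memory SQLite table only because `sqlite3` ships with Python (the project's actual database is not shown in this commit; PostgreSQL treats NULLs the same way by default). The table is a stripped-down stand-in for `articles`, and the function name `demo_nullable_unique` is hypothetical.

```python
import sqlite3


def demo_nullable_unique():
    """Show that a nullable UNIQUE column accepts many NULLs but rejects duplicate values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL,"
        " content_hash TEXT UNIQUE)"
    )
    insert = "INSERT INTO articles (url, content_hash) VALUES (?, ?)"

    # Two regular articles with no fingerprint: both inserts succeed.
    conn.execute(insert, ("https://example.com/a", None))
    conn.execute(insert, ("https://example.com/b", None))

    # A short-news row with a fingerprint, then a second row with the same one.
    conn.execute(insert, ("api-push://1", "abc"))
    try:
        conn.execute(insert, ("api-push://2", "abc"))
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    null_rows = conn.execute(
        "SELECT COUNT(*) FROM articles WHERE content_hash IS NULL"
    ).fetchone()[0]
    conn.close()
    return {"null_rows": null_rows, "duplicate_rejected": duplicate_rejected}
```

An ingest endpoint could catch the integrity error on `content_hash` and treat it as "already ingested", which is one way to use the column as a dedup key.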