Compare commits

33 Commits

Author SHA1 Message Date
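498d5110e9 chore: update binary database file tophub_data.db 2026-03-11 19:23:14 +08:00
851d536b59 docs: update the contents of the product info file
Update the product name, maker comment, user count, and other fields in temp_product_info.txt to reflect the latest product data
2026-03-08 20:43:42 +08:00
adc9c76864 chore: update binary database file 2026-03-07 23:42:14 +08:00
624e158be9 chore: update product info and debug screenshot files
Updated the product name, user count, and extraction time in temp_product_info.txt
Synced the database and screenshot file changes
2026-03-01 16:45:10 +08:00
5bc40abbc1 docs: update temporary data in the product info file
Update the product name, maker comment, user count, and extraction time in temp_product_info.txt
2026-02-26 20:56:40 +08:00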
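bd2c457f54 docs: update Thinglo data in the product info file
Update the product name, user count, and extraction time in temp_product_info.txt
2026-02-25 18:51:55 +08:00
179bfa327b docs: update TypeBoost data in the product info file
Update the product info in temp_product_info.txt, replacing Lums with TypeBoost and updating the corresponding description, maker comment, user count, and extraction time. The description field is currently empty and displays "not retrieved".
2026-02-24 22:49:48 +08:00
c2357ffb67 docs: update the contents of the product info file
Update the product name, description, user count, and extraction time in temp_product_info.txt to reflect the latest data
2026-02-07 23:18:12 +08:00
0d287e7c1f docs: update temporary data in the product info file
Update the product name, description, maker comment, user count, and extraction time in temp_product_info.txt to reflect the latest product info
2026-02-06 23:04:33 +08:00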
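674ee1e1e2 Today's update 2026-02-06 22:21:22 +08:00
0cf231f9f7 feat: update product info and scrape logs
Updated the product info in temp_product_info.txt, changing it from Sheetsbase to Yomio
Added new scrape-node logs and recorded the data save operations
2026-02-01 19:28:13 +08:00
f82da3bab1 chore: update binary database file 2026-02-01 17:55:18 +08:00
22a50ad5c6 docs: update Sheetsbase data in the product info file
Update the product info in temp_product_info.txt, replacing GameCutAI with the Sheetsbase details, including the description, user count, and extraction time
2026-01-31 19:04:04 +08:00
0d9e427a34 docs: update GameCutAI data in the product info file
Update the product info in temp_product_info.txt, replacing Notte with the GameCutAI data, including the description, user count, and extraction time
2026-01-28 18:16:57 +08:00
ec68b83827 chore: update binary database file tophub_data.db 2026-01-27 22:25:48 +08:00
130bbfb090 docs: update Notte product data in the product info file
Update the product info in temp_product_info.txt, including the product name, description, maker comment, user count, and extraction time
2026-01-22 20:26:32 +08:00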
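6e83136dc6 chore: update binary files and debug screenshots 2026-01-21 22:03:32 +08:00
f6f4da7d07 docs: update Dewdrop product details in the product info file
Update the product info in temp_product_info.txt, replacing Boom video with the Dewdrop details, including the description, user count, and extraction time
2026-01-18 15:55:15 +08:00
a2be43d42a Update today's data 2026-01-17 17:19:03 +08:00
a4c106fa5a docs: update the contents of the product info file
Update the product name, description, user count, and other fields in temp_product_info.txt
2026-01-15 20:37:22 +08:00
f24ca9aa29 Update today's data 2026-01-13 21:03:19 +08:00
a537d3825b Update today's data 2026-01-13 18:59:21 +08:00
e67931c3ca Update today's data 2026-01-12 20:36:44 +08:00
b7cd03434d fix: update product info and fix script timeout error
Update the product info in temp_product_info.txt to Clear for Slack
Fix the execution timeout of the tophub_add_data_to_db.py script
2026-01-11 19:12:18 +08:00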
a9d6c4699d Update today's data 2026-01-03 22:51:43 +08:00
3984b81f86 Update today's data, 1 2026-01-01 17:22:51 +08:00
d62cd2fcca Update today's data 2025-12-30 23:03:59 +08:00
d44a294bf7 Update today's data 2025-12-30 22:19:55 +08:00
57e0029eb1 Update today's data 2025-12-29 23:36:06 +08:00
a2ecc7f451 Update today's data 2025-12-26 22:26:29 +08:00
6ae10c9d36 Update today's data 2025-12-24 22:06:13 +08:00
20b2f46533 Update yesterday's data (2025-12-19) 2025-12-20 17:15:17 +08:00
43ec564daa Add .gitignore file for ignoring test, bug, and other unnecessary files 2025-12-20 17:13:54 +08:00
25 changed files with 136129 additions and 157414 deletions

68
.gitignore vendored Normal file
@@ -0,0 +1,68 @@
# Python
__pycache__/
*.py[cod]
*$py.class
# Logs
*.log
integrated_product_system.log
# Databases
*.db
*.sqlite
# IDE
.trae/
.vscode/
.idea/
*.swp
*.swo
*~
# OS
.DS_Store
Thumbs.db
# Test files
*test*.py
*Test*.py
pytest_cache/
.tox/
.coverage
coverage.xml
# Temporary files
*.tmp
*.temp
temp*.txt
*.bak
*.swp
*.swo
*~
temp_*.txt
# Bug and debug files
*debug*.png
*bug*.txt
# Batch files
*.bat
# Output files
*.out
*.output
# Environment
.env
.env.local
.env.*.local
# Documentation build
_build/
build/
dist/
*.egg-info/
# Other
2025年12月*.txt
*.png
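
Several of these patterns are broad, so it can help to preview which filenames they catch. The sketch below uses Python's `fnmatch` module, which only approximates Git's matching for simple patterns without slashes; it is an illustration, not how Git itself evaluates `.gitignore`. The names `latest_news.py` and `main.py` are hypothetical. Note also that ignore rules do not affect files Git already tracks, which is consistent with `tophub_data.db` and `temp_product_info.txt` still showing up in the commits above.

```python
from fnmatch import fnmatchcase

# A few patterns copied from the .gitignore above.
PATTERNS = ["*test*.py", "temp*.txt", "*debug*.png", "*.db"]


def matching_patterns(filename, patterns=PATTERNS):
    """Return the patterns that match a bare filename (no directory part)."""
    return [p for p in patterns if fnmatchcase(filename, p)]


# "latest_news.py" and "main.py" are hypothetical names used for illustration.
for name in ["latest_news.py", "temp_product_info.txt", "tophub_data.db", "main.py"]:
    print(name, "->", matching_patterns(name))
```

In this sketch `*test*.py` also matches `latest_news.py`, because "latest" contains "test", so a pattern meant for test files can hide unrelated source files.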

File diff suppressed because it is too large (3 files)

Changed lines, file (each diff suppressed because it is too large):
5790  2026年1月15日1991.txt  Normal file
5820  2026年1月17日16419.txt  Normal file
5800  2026年1月18日9249.txt  Normal file
5840  2026年1月21日19238.txt  Normal file
5795  2026年1月22日18556.txt  Normal file
5855  2026年1月29日20470.txt  Normal file
5795  2026年1月31日91239.txt  Normal file
5800  2026年3月10日183431.txt  Normal file
5810  2026年3月8日18119.txt  Normal file
Binary file not shown (image). Size before: 257 KiB; after: 526 KiB.

File diff suppressed because it is too large

Binary file not shown (image). Size before: 261 KiB; after: 231 KiB.

Binary file not shown.

Binary file not shown (image). Size before: 437 KiB; after: 717 KiB.

@@ -1,13 +1,11 @@
 === Product Hunt Product Info ===
-Product name: Croct
+Product name: Greta
-Description: Croct is a conversion optimization platform that includes AI-powered audience segmentation, content personalization, AB testing, feature flag, real-time website analytics, and a component-based CMS for modern frameworks like Next.js and React.
-It helps product and growth teams scale UI and website optimization, enabling faster growth without over-relying on developers.
+Description: not retrieved
-Maker comment: AI-powered website segmentation
+Maker comment: This is first first proposed project. If you want to support Santiago getting his project built, here are the details.https://onemillionlines.com/proj...
-User count: 619 followers
+User count: 664 followers
-Extraction time: 2025-12-17 18:36:11
+Extraction time: 2026-03-08 20:40:13
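
The file in this diff is a flat list of labeled fields, and the old description spans two lines. A minimal sketch of reading such a file into a dictionary follows. The function `parse_product_info`, the label set, and the sample text are all assumptions made for illustration; nothing here is taken from the repository's own scripts.

```python
def parse_product_info(text, labels):
    """Parse 'label: value' lines; lines without a known label continue the previous field."""
    fields = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("==="):
            continue  # skip blank lines and the banner line
        label, sep, value = line.partition(":")
        if sep and label.strip() in labels:
            current = label.strip()
            fields[current] = value.strip()
        elif current is not None:
            # Continuation of a multi-line value, such as a two-line description.
            fields[current] += "\n" + line
    return fields


# Hypothetical sample with hypothetical labels.
SAMPLE = """=== Product Hunt ===
name: Croct
description: First line of the description.
Second line of the description.
followers: 619 followers
"""

info = parse_product_info(SAMPLE, labels={"name", "description", "followers"})
print(info["description"])
```

Splitting only at the first colon keeps values such as timestamps and URLs intact, and passing the label set explicitly avoids mistaking a colon inside a value for a new field.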

File diff suppressed because it is too large (2 files)

@@ -205,9 +205,13 @@ def process_temp_files():
             continue
         # Process each article
-        for article in tqdm(articles, desc=f"Processing {file_path}"):
+        for i, article in tqdm(enumerate(articles), desc=f"Processing {file_path}", total=len(articles)):
             total_processed += 1
+            # Log progress every 10 articles
+            if i % 10 == 0 and i > 0:
+                logger.info(f"Processed {i}/{len(articles)} articles, {i/len(articles)*100:.1f}% complete")
             # Check for duplicates
             if check_duplicate(article['title'], source_date):
                 logger.info(f"Skipping duplicate article (already present in the last three days): {article['title']}")

Binary file not shown.

File diff suppressed because it is too large

@@ -262,7 +262,7 @@ class TopHubScraper:
         # Read output in real time to avoid encoding issues
         try:
-            stdout, stderr = process.communicate(timeout=300)  # 5-minute timeout
+            stdout, stderr = process.communicate(timeout=3600)  # 1-hour timeout
         except subprocess.TimeoutExpired:
             process.kill()
             logger.error("tophub_add_data_to_db.py timed out")
@@ -287,6 +287,8 @@ class TopHubScraper:
 if __name__ == "__main__":
     scraper = TopHubScraper()
     try:
         # Scrape data
         scraped_data = scraper.scrape_by_node_ids()
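
Regarding the timeout hunk above: `Popen.communicate(timeout=...)` does not kill the child when the timeout expires, and the Python documentation recommends killing the process and then calling `communicate()` once more to finish cleanup. The hunk shows `process.kill()` but not what follows the log call, so whether the handler already does this is not visible here. A minimal sketch of the documented pattern, with the hypothetical helper `run_with_timeout`:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd; on timeout, kill the child and drain its pipes before returning."""
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        stdout, stderr = process.communicate(timeout=timeout)
        return process.returncode, stdout, stderr
    except subprocess.TimeoutExpired:
        process.kill()
        # A second communicate() reaps the child and collects any buffered output.
        stdout, stderr = process.communicate()
        return None, stdout, stderr  # None signals that the timeout was hit


# A child that sleeps longer than the timeout, to exercise the timeout branch.
code, out, err = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5
)
print(code)
```

Returning `None` for the exit code is a choice made for this sketch; a caller could equally re-raise the exception after cleanup.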