Weekly AI Digest v3: Automated Content Curation Script
Reviewed: June 4, 2026
Python script that auto-curates arXiv papers, GitHub trending repos, and AI news RSS feeds into a ready-to-publish digest.
Python 3 / stdlib only
No API keys required
What It Does
The weekly_digest_v3_curator.py script automatically gathers content from multiple sources and produces a formatted HTML digest ready to publish to WordPress via the dg/v1/publish endpoint.
Data Sources
- arXiv API: Fetches top papers from the cs.AI, cs.CL, cs.LG, cs.RO, and cs.NE categories, sorted by submission date. Selects a Paper of the Week plus 3 runners-up.
- GitHub Search API: Finds trending AI/ML repos created in the past week with 50+ stars. Filters by topics: machine-learning, artificial-intelligence, deep-learning, llm, transformers.
- RSS Feeds: Aggregates from 6 sources: arXiv CL Blog, arXiv AI Blog, Hugging Face Blog, OpenAI Blog, Google AI Blog, MIT Technology Review. Deduplicates by title hash.
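The arXiv and GitHub queries described above could be built along these lines. This is a minimal sketch, not the script's actual code: the function names, the one-query-per-topic approach, and the `per_page` value are assumptions. The query parameters themselves (`search_query`, `sortBy`, `sortOrder`, `max_results` for arXiv; the `topic:`, `created:`, and `stars:` qualifiers for GitHub) are the documented ones for each API.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
GITHUB_SEARCH_API = "https://api.github.com/search/repositories"


def arxiv_query_url(categories, max_results=8):
    """Build an arXiv API URL for the newest submissions in the given categories."""
    # OR the category filters together, e.g. "cat:cs.AI OR cat:cs.CL"
    search = " OR ".join(f"cat:{c}" for c in categories)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def github_query_url(topic, today, min_stars=50, days=7):
    """Build a GitHub search URL for repos on one topic created in the last `days` days."""
    since = (today - timedelta(days=days)).isoformat()
    query = f"topic:{topic} created:>={since} stars:>={min_stars}"
    # per_page=5 is an assumed page size, not a value taken from the script
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": 5}
    return f"{GITHUB_SEARCH_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(arxiv_query_url(["cs.AI", "cs.CL", "cs.LG", "cs.RO", "cs.NE"]))
    print(github_query_url("llm", date.today()))
```

Passing `today` explicitly keeps the date window reproducible, which makes the URL builder easy to check without touching the network.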
Architecture
┌─────────────┐  ┌──────────────┐  ┌─────────────┐
│  arXiv API  │  │ GitHub Search│  │  RSS Feeds  │
│  (Atom XML) │  │  (JSON API)  │  │  (RSS/Atom) │
└──────┬──────┘  └──────┬───────┘  └──────┬──────┘
       │                │                 │
       ▼                ▼                 ▼
┌────────────────────────────────────────────────┐
│          weekly_digest_v3_curator.py           │
│  • Parse & deduplicate                         │
│  • Generate HTML (Paper of Week, Papers,       │
│    GitHub Trending, News, Trend Analysis)      │
│  • Output JSON + HTML                          │
└───────────────────────┬────────────────────────┘
                        │
                        ▼
               ┌─────────────────┐
               │  dg/v1/publish  │
               │   (WordPress)   │
               └─────────────────┘
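The "Parse & deduplicate" step in the diagram, which the Data Sources section describes as deduplication by title hash, might look like the sketch below. The choice of SHA-256, the case and whitespace normalization, and the function names are assumptions made for illustration.

```python
import hashlib


def title_key(title):
    """Hash a normalized title so trivially different spellings collide."""
    # Assumed normalization: lowercase and collapse runs of whitespace
    normalized = " ".join(title.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_by_title(items):
    """Keep the first item seen for each title hash, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        key = title_key(item["title"])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

Keeping the first occurrence means the feed listed earliest wins when two sources carry the same headline.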
Usage
# Basic run: outputs JSON with embedded HTML
python3 weekly_digest_v3_curator.py

# Specify output file and also generate standalone HTML
python3 weekly_digest_v3_curator.py --output /tmp/digest_v3_latest.json --format html

# Set a custom week-ending date
python3 weekly_digest_v3_curator.py --week-ending "June 3, 2026"
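The three flags used above could be wired up with `argparse` roughly as follows. This is a sketch under assumptions: the default values, the allowed `--format` choices, and the `build_parser` name are hypothetical and not taken from the script.

```python
import argparse


def build_parser():
    """Command-line parser covering the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="weekly_digest_v3_curator.py")
    # Default output path is an assumption
    parser.add_argument("--output", default="digest.json",
                        help="path of the JSON file to write")
    # Assumed choices: 'html' additionally writes a standalone HTML file
    parser.add_argument("--format", choices=["json", "html"], default="json",
                        help="output format")
    # None means 'derive the week-ending date from the current date' (assumed)
    parser.add_argument("--week-ending", default=None,
                        help='label for the digest, e.g. "June 3, 2026"')
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```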
Output Format
The script outputs a JSON file with the following structure:
{
  "version": "3.0",
  "week_ending": "May 27, 2026",
  "generated_at": "2026-05-27T10:30:00+00:00",
  "stats": {
    "papers": 4,
    "repos": 5,
    "news_items": 6
  },
  "data": {
    "papers": [...],   // Array of paper objects
    "repos": [...],    // Array of repo objects
    "news": [...]      // Array of news item objects
  },
  "html": "<div class='dg-digest'>...</div>"   // Ready-to-publish HTML
}
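One way to assemble a document of this shape is sketched below. The `build_output` name and its signature are assumptions. Only the keys and the timestamp format come from the structure shown above. Deriving `stats` from the lengths of the data arrays is also an assumption, although it keeps the counts and the data from drifting apart.

```python
import json
from datetime import datetime, timezone


def build_output(papers, repos, news, html, week_ending, now=None):
    """Assemble the digest dictionary in the documented layout."""
    now = now or datetime.now(timezone.utc)
    return {
        "version": "3.0",
        "week_ending": week_ending,
        # ISO 8601 with a UTC offset, e.g. 2026-05-27T10:30:00+00:00
        "generated_at": now.isoformat(timespec="seconds"),
        "stats": {
            "papers": len(papers),
            "repos": len(repos),
            "news_items": len(news),
        },
        "data": {"papers": papers, "repos": repos, "news": news},
        "html": html,
    }


if __name__ == "__main__":
    digest = build_output([], [], [], "<div class='dg-digest'></div>", "May 27, 2026")
    print(json.dumps(digest, indent=2))
```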
Live Test Results
Script tested successfully on May 27, 2026:
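- ✅ arXiv: 4 papers fetched from the cs.AI, cs.CL, and cs.LG categories
- ✅ RSS News: 6 items aggregated from 6 feeds, deduplicated
- ⚠️ GitHub: The unauthenticated request failed with HTTP 422, recorded here as rate limiting. GitHub normally signals rate limits with 403 or 429 and uses 422 for query validation errors, so the query itself may be worth checking. The script falls back gracefully with an empty repos list.
- ✅ HTML Generation: Full digest HTML produced with all sections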
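The graceful GitHub fallback reported above can be sketched as a fetch helper that returns an empty list on any network, HTTP, or parsing error. The `fetch_repos` name and the injectable `opener` parameter are hypothetical. The `items` key is the one the GitHub search response uses for its result array.

```python
import json
import urllib.request


def fetch_repos(url, opener=urllib.request.urlopen):
    """Return the 'items' list from a GitHub search URL, or [] on failure."""
    try:
        with opener(url, timeout=15) as resp:
            return json.load(resp).get("items", [])
    except (OSError, ValueError) as exc:
        # OSError covers URLError/HTTPError and timeouts; ValueError covers bad JSON
        print(f"GitHub fetch failed, continuing without repos: {exc}")
        return []
```

Accepting the opener as a parameter lets the failure path be exercised with a stub instead of a live request.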
Integration with Digest Publishing
To publish a digest edition, send the html field from the JSON output to the WordPress dg/v1/publish endpoint, which creates a new post. The payload can be built like this:
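import json

# Load the digest produced by weekly_digest_v3_curator.py
with open('/tmp/digest_v3_latest.json') as f:
    data = json.load(f)

# "May 27, 2026" -> "may-27-2026": drop the comma so the slug stays URL-safe
date_slug = data['week_ending'].lower().replace(',', '').replace(' ', '-')

payload = {
    "posts": [{
        "title": f"Weekly AI Digest v3 - {data['week_ending']}",
        "content": data["html"],
        "status": "publish",
        "slug": f"weekly-ai-digest-v3-{date_slug[:20]}"
    }]
}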
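The payload still has to be sent to the endpoint. A minimal sketch of that step with the standard library is shown below. The `/wp-json/` prefix follows the usual WordPress REST convention but is an assumption for this site, as are the `build_publish_request` name and the optional `auth_header` parameter. The source does not say how dg/v1/publish authenticates requests.

```python
import json
import urllib.request


def build_publish_request(payload, site_url, auth_header=None):
    """Prepare (but do not send) a JSON POST to the dg/v1/publish route."""
    # Assumed URL layout: <site>/wp-json/<namespace>/<route>
    url = site_url.rstrip("/") + "/wp-json/dg/v1/publish"
    headers = {"Content-Type": "application/json"}
    if auth_header:
        headers["Authorization"] = auth_header
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Sending is a separate, explicit step:
#   with urllib.request.urlopen(build_publish_request(payload, site_url)) as resp:
#       print(resp.status)
```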
Notes
- Requires only the Python 3 stdlib; no pip install needed
- arXiv API has rate limits; script fetches max 8 papers to stay within bounds
- GitHub Search API rate-limits unauthenticated requests (10 per minute for search; the 60/hour figure applies to GitHub's other REST endpoints)
- RSS feeds that fail to load are skipped gracefully
- Output HTML is self-contained with inline styles, so it works in WordPress without external CSS
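The note about skipping feeds that fail to load can be illustrated with the sketch below. It handles only RSS 2.0 `<item>` elements, since Atom feeds use namespaced `<entry>` elements and would need extra handling. The function names, the per-feed `limit`, and the injected `fetch` callable are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text, limit=5):
    """Extract title/link pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title:
            items.append({"title": title, "link": link})
        if len(items) >= limit:
            break
    return items


def collect_news(feeds, fetch):
    """feeds maps a source name to a URL; fetch(url) returns the feed's XML text."""
    news = []
    for name, url in feeds.items():
        try:
            for entry in parse_rss_items(fetch(url)):
                entry["source"] = name
                news.append(entry)
        except (OSError, ET.ParseError) as exc:
            # A feed that is unreachable or malformed is skipped, not fatal
            print(f"Skipping {name}: {exc}")
    return news
```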
