Weekly AI Digest v3: Automated Content Curation Script

Reviewed: June 4, 2026

Python script that auto-curates arXiv papers, GitHub trending repos, and AI news RSS feeds into a ready-to-publish digest.

✅ Operational
Python 3 / stdlib only
No API keys required

📋 What It Does

The weekly_digest_v3_curator.py script automatically gathers content from multiple sources and produces a formatted HTML digest ready to publish to WordPress via the dg/v1/publish endpoint.

Data Sources

  • 📄 arXiv API: Fetches top papers from the cs.AI, cs.CL, cs.LG, cs.RO, and cs.NE categories, sorted by submission date. Selects a Paper of the Week plus 3 runners-up.
  • ⭐ GitHub Search API: Finds trending AI/ML repos created in the past week with 50+ stars. Filters by topics: machine-learning, artificial-intelligence, deep-learning, llm, transformers.
  • 📰 RSS Feeds: Aggregates from 6 sources: arXiv CL Blog, arXiv AI Blog, Hugging Face Blog, OpenAI Blog, Google AI Blog, MIT Technology Review. Deduplicates by title hash.
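
A minimal sketch of how the arXiv step could work with the standard library only. The function names, the default of 8 results, and the selection rule (the newest paper becomes Paper of the Week, the next three are runners-up) are assumptions for illustration; the source does not say how the script actually ranks papers.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
CATEGORIES = ("cs.AI", "cs.CL", "cs.LG", "cs.RO", "cs.NE")


def build_arxiv_url(categories=CATEGORIES, max_results=8):
    """Build a query URL for the newest submissions in the given categories."""
    query = " OR ".join(f"cat:{c}" for c in categories)
    params = {
        "search_query": query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_arxiv_feed(atom_xml):
    """Extract title, URL, and published date from an Atom response."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        papers.append({
            # arXiv wraps long titles across lines, so collapse whitespace.
            "title": " ".join(entry.findtext("atom:title", "", ATOM_NS).split()),
            "url": entry.findtext("atom:id", "", ATOM_NS).strip(),
            "published": entry.findtext("atom:published", "", ATOM_NS).strip(),
        })
    return papers


def pick_paper_of_week(papers):
    """Hypothetical rule: the first paper wins, the next three are runners-up."""
    return papers[:1], papers[1:4]
```

The URL from build_arxiv_url() can be fetched with urllib.request.urlopen and the response text passed to parse_arxiv_feed().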

πŸ—οΈ Architecture

┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│  arXiv API  │ │ GitHub Search│ │  RSS Feeds  │
│ (Atom XML)  │ │  (JSON API)  │ │ (RSS/Atom)  │
└──────┬──────┘ └──────┬───────┘ └──────┬──────┘
       │               │                │
       ▼               ▼                ▼
┌─────────────────────────────────────────────┐
│         weekly_digest_v3_curator.py         │
│  • Parse & deduplicate                      │
│  • Generate HTML (Paper of Week, Papers,    │
│    GitHub Trending, News, Trend Analysis)   │
│  • Output JSON + HTML                       │
└────────────────────┬────────────────────────┘
                     │
                     ▼
            ┌─────────────────┐
            │  dg/v1/publish  │
            │   (WordPress)   │
            └─────────────────┘

🚀 Usage

# Basic run: outputs JSON with embedded HTML
python3 weekly_digest_v3_curator.py

# Specify output file and also generate standalone HTML
python3 weekly_digest_v3_curator.py --output /tmp/digest_v3_latest.json --format html

# Set a custom week-ending date
python3 weekly_digest_v3_curator.py --week-ending "June 3, 2026"
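
The three flags above could be wired up with argparse roughly as follows. This is a sketch, not the script's actual parser: the default output path, the json/html choices, and the fallback to today's date when --week-ending is omitted are assumptions.

```python
import argparse
from datetime import date


def build_parser():
    """Command-line interface mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Curate the Weekly AI Digest")
    parser.add_argument("--output", default="/tmp/digest_v3_latest.json",
                        help="path of the JSON file to write (assumed default)")
    parser.add_argument("--format", choices=["json", "html"], default="json",
                        help="'html' also writes a standalone HTML file")
    parser.add_argument("--week-ending", default=None,
                        help='label such as "June 3, 2026"')
    return parser


def resolve_week_ending(value, today=None):
    """Use the given label, or format today's date like 'June 3, 2026'."""
    if value:
        return value
    today = today or date.today()
    return f"{today:%B} {today.day}, {today.year}"
```

With this layout, build_parser().parse_args() exposes the values as args.output, args.format, and args.week_ending.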

📊 Output Format

The script outputs a JSON file with the following structure:

{
  "version": "3.0",
  "week_ending": "May 27, 2026",
  "generated_at": "2026-05-27T10:30:00+00:00",
  "stats": {
    "papers": 4,
    "repos": 5,
    "news_items": 6
  },
  "data": {
    "papers": [...],   // Array of paper objects
    "repos": [...],    // Array of repo objects
    "news": [...]      // Array of news item objects
  },
  "html": "<div class='dg-digest'>...</div>"  // Ready-to-publish HTML
}
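
The // annotations in the sample are explanatory only; the file itself has to be plain JSON. A minimal sketch of how such a document could be assembled, assuming a hypothetical build_digest() helper that derives the stats block from the lengths of the data arrays:

```python
import json
from datetime import datetime, timezone


def build_digest(week_ending, papers, repos, news, html, now=None):
    """Assemble the output document in the structure shown above."""
    now = now or datetime.now(timezone.utc)
    return {
        "version": "3.0",
        "week_ending": week_ending,
        # A timezone-aware UTC timestamp renders with a +00:00 offset.
        "generated_at": now.isoformat(timespec="seconds"),
        "stats": {
            "papers": len(papers),
            "repos": len(repos),
            "news_items": len(news),
        },
        "data": {"papers": papers, "repos": repos, "news": news},
        "html": html,
    }


def write_digest(digest, path):
    """Write the digest as UTF-8 JSON without escaping non-ASCII characters."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(digest, f, ensure_ascii=False, indent=2)
```

Deriving the counts from the arrays keeps stats and data from drifting apart.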

✅ Live Test Results

Script tested successfully on May 27, 2026:

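  • ✅ arXiv: 4 papers fetched from cs.AI, cs.CL, cs.LG categories
  • ✅ RSS News: 6 items aggregated from 6 feeds, deduplicated
  • ⚠ GitHub: Request rejected by the unauthenticated API with HTTP 422. The script falls back gracefully to an empty repos list. GitHub normally signals rate limiting with 403 or 429, so the 422 may point to a rejected search query rather than a rate limit.
  • ✅ HTML Generation: Full digest HTML produced with all sections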
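
The graceful fallback for GitHub could look roughly like this. The helper names are hypothetical, and building one query per topic is an assumption: as far as the search syntax goes, several topic: qualifiers in a single query narrow the results rather than widen them.

```python
import json
import urllib.parse
import urllib.request
from datetime import date, timedelta

GITHUB_SEARCH = "https://api.github.com/search/repositories"
TOPICS = ("machine-learning", "artificial-intelligence", "deep-learning",
          "llm", "transformers")


def build_search_url(topic, today, min_stars=50, per_page=5):
    """Search URL for repos with this topic created in the past 7 days."""
    since = (today - timedelta(days=7)).isoformat()
    query = f"topic:{topic} created:>={since} stars:>={min_stars}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{GITHUB_SEARCH}?{urllib.parse.urlencode(params)}"


def fetch_repos(url, opener=urllib.request.urlopen):
    """Return matching repos, or [] if the request or the parsing fails."""
    request = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "User-Agent": "weekly-digest-sketch",  # hypothetical client name
    })
    try:
        with opener(request, timeout=15) as response:
            return json.load(response).get("items", [])
    except (OSError, ValueError) as exc:
        # HTTPError and URLError are OSError subclasses; bad JSON is ValueError.
        print(f"GitHub search failed ({exc}); continuing with no repos")
        return []
```

Calling fetch_repos(build_search_url(topic, date.today())) for each entry in TOPICS and merging the results by full_name would give one combined list.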

🔧 Integration with Digest Publishing

To publish a digest edition, read the html field from the JSON output and send it to the WordPress dg/v1/publish endpoint, which creates a new post. The payload is built like this:

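import json
import re

# Load the output written by weekly_digest_v3_curator.py
with open('/tmp/digest_v3_latest.json') as f:
    data = json.load(f)

# Build a URL-safe slug part: "June 3, 2026" becomes "june-3-2026"
# (a plain replace(' ', '-') would leave the comma in the slug)
date_slug = re.sub(r'[^a-z0-9]+', '-', data['week_ending'].lower())[:20].strip('-')

payload = {
    "posts": [{
        # \u2014 is the dash that separates the title from the date
        "title": f"Weekly AI Digest v3 \u2014 {data['week_ending']}",
        "content": data["html"],
        "status": "publish",
        "slug": f"weekly-ai-digest-v3-{date_slug}"
    }]
}
# Sending the payload to dg/v1/publish is not shown here.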
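
The source does not show how the payload reaches the endpoint. One possible sketch, assuming the route is exposed under WordPress's default /wp-json/ REST prefix and that the caller supplies whatever Authorization header the site expects (both are assumptions, as is the build_publish_request() name):

```python
import json
import urllib.request


def build_publish_request(site_url, payload, auth_header=None):
    """Prepare a JSON POST to the dg/v1/publish route without sending it."""
    url = f"{site_url.rstrip('/')}/wp-json/dg/v1/publish"
    headers = {"Content-Type": "application/json"}
    if auth_header:
        # The authentication scheme is site-specific and not given in the source.
        headers["Authorization"] = auth_header
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Sending is left to the caller, for example:
#     request = build_publish_request("https://example.com", payload)
#     with urllib.request.urlopen(request, timeout=30) as response:
#         print(response.status)
```

Keeping request construction separate from sending makes it possible to inspect the URL and body before anything is published.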

📌 Notes

  • Requires only the Python 3 stdlib; no pip install needed
  • arXiv API has rate limits; script fetches max 8 papers to stay within bounds
  • GitHub Search API rate-limits unauthenticated requests (about 10 search requests per minute; the 60/hour limit applies to other REST endpoints)
  • RSS feeds that fail to load are skipped gracefully
  • Output HTML is self-contained with inline styles, so it works in WordPress without external CSS
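
Skipping failed feeds and deduplicating by title hash could be combined along these lines. This is a sketch under assumptions: SHA-256 over a lowercased, whitespace-normalized title stands in for the unspecified hash, only RSS 2.0 item elements are parsed (Atom feeds would need namespace handling), and fetch is any caller-supplied function that returns the feed text for a URL.

```python
import hashlib
import xml.etree.ElementTree as ET


def title_key(title):
    """Hash a normalized title so trivially different copies collide."""
    normalized = " ".join(title.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def parse_rss_items(xml_text):
    """Return (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title", "").strip(), item.findtext("link", "").strip())
            for item in root.iter("item")]


def aggregate(feeds, fetch):
    """Collect items from each feed, skipping failures and duplicate titles."""
    seen, items = set(), []
    for name, url in feeds.items():
        try:
            entries = parse_rss_items(fetch(url))
        except (OSError, ET.ParseError) as exc:
            # Network errors and malformed XML both skip just this feed.
            print(f"Skipping {name}: {exc}")
            continue
        for title, link in entries:
            key = title_key(title)
            if title and key not in seen:
                seen.add(key)
                items.append({"source": name, "title": title, "link": link})
    return items
```

Because the first occurrence wins, the order of the feeds dictionary decides which source is credited for a duplicated headline.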
