Weekly AI Digest — Week 1, June 2026

Reviewed: June 4, 2026

June 2–6, 2026 | DataGate.ch AI Industry Roundup

Welcome to the first weekly AI digest from DataGate.ch. Each week we round up the most important developments in AI infrastructure, agent frameworks, safety research, and developer tools — so you don’t have to scan 50 RSS feeds yourself.

🏗️ Infrastructure

NVIDIA GTC Fall: Rubin Architecture Details Emerge

NVIDIA released additional details about its next-generation Rubin GPU architecture, expected to ship in H2 2027. Rubin promises 3x the inference throughput of Blackwell NVLink 72, with native support for FP4 quantization. For AI teams running large-scale inference, this signals that per-token costs could drop another 50-60% within 18 months.

AWS Launches Trainium3 Instances in GA

Amazon Web Services made Trainium3 instances generally available in us-east-1 and eu-west-1. The new chips deliver 2x training throughput compared to Trainium2, and AWS is positioning them as a cost-effective alternative to NVIDIA H100s for fine-tuning workloads. Early benchmarks show competitive performance on Llama and Mistral fine-tuning at roughly 40% lower cost per token than equivalently-sized GPU clusters.

Cloud GPU Spot Pricing Hits New Lows

Competition among cloud providers drove NVIDIA A100 spot pricing below $1.20/hour on both GCP and Lambda Labs. For teams doing non-urgent training runs or batch inference, this makes 80GB-class hardware accessible for the first time at near-commodity prices.

🤖 Agents & Frameworks

LangGraph 1.0 Released

LangChain officially released LangGraph 1.0 with a stable API, built-in persistence, and human-in-the-loop support. The release includes a new visual debugger, improved streaming for multi-agent workflows, and official deployment guides for production. If you’ve been on the fence about adopting LangGraph, the stable API commitment makes this the right time to migrate from the 0.x releases.

CrewAI Surpasses 25k GitHub Stars

CrewAI, the multi-agent orchestration framework, crossed 25,000 stars on GitHub on the back of its v0.90 release. The update adds native support for custom tool caching, improved delegation logic between agents, and a new CLI for scaffolding multi-agent projects. The framework has become the go-to choice for teams building role-based multi-agent systems.

OpenAI Codex CLI Becomes Default in Cursor

Cursor announced that OpenAI Codex CLI is now the default agentic coding backend in its editor, replacing the previous custom integration. The change brings improved multi-file reasoning and faster diff application. For developers already using Cursor, this is a transparent backend swap — but it signals deeper OpenAI tooling integration across the IDE landscape.

🔒 Safety & Alignment

Anthropic Publishes Constitutional AI 2.0 Paper

Anthropic released a detailed paper on Constitutional AI 2.0, describing a new red-teaming pipeline that combines automated adversarial testing with human evaluation. The key innovation is a „constitution distillation“ technique that transfers safety principles from larger to smaller models with minimal performance degradation. This has immediate implications for teams fine-tuning smaller models who want safety guardrails without the overhead of running a large evaluator model.

EU AI Enforcement Guidelines Published

The European Commission published its first set of enforcement guidelines for the AI Act, clarifying timelines for compliance across risk categories. High-risk AI systems (including those used in hiring, credit scoring, and critical infrastructure) have until August 2027 for full compliance. The guidelines include a useful self-assessment checklist that AI teams should run against their current deployments.

🛠️ Tools & Libraries

vLLM v0.8 Adds KV Cache Quantization

The vLLM serving library released v0.8 with support for KV cache quantization, reducing memory usage by up to 4x with less than 1% throughput penalty on most benchmarks. This is a significant win for teams serving long-context models where KV cache memory is the primary bottleneck.

DSPy 3.0: Declarative LM Programming

DSPy released version 3.0 with a redesigned compiler that can optimize multi-stage LM pipelines end-to-end. The new version includes built-in support for RAG optimization, automatic bootstrapping of few-shot examples, and integration with W&B for experiment tracking.

Ollama Adds Native Tool Calling

Ollama v0.7 introduced native function calling support for all OpenAI-compatible models served locally. This removes the biggest gap between local and cloud-hosted model capabilities and makes it practical to run full agent workflows entirely on local hardware.

📊 By the Numbers

🔮 What to Watch Next Week

This digest is published weekly by DataGate.ch. Subscribe to the newsletter to get it delivered to your inbox.

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert