
Weekly AI Digest — Week 2, June 2026

Reviewed: June 4, 2026

June 9–13, 2026 | DataGate.ch AI Industry Roundup

The second week of June brought major releases in model serving, agent tooling, and a surprising open-source milestone. Here’s what matters.

🏗️ Infrastructure

Google Cloud TPU v6 "Trillium" Now Available for Fine-Tuning

Google made TPU v6 pods available for fine-tuning workloads on Vertex AI, offering competitive pricing against NVIDIA A100s for transformer training. Early adopters report 30% faster training times for models in the 7B-70B range compared to comparably priced GPU clusters. The catch: you’ll need JAX or PyTorch/XLA, which adds friction for teams standardized on vanilla PyTorch.

Modal Raises $160M Series B for Serverless GPU

Modal, the serverless GPU platform popular with AI researchers, raised $160M to expand its infrastructure. The platform now supports A100, H100, and L40S GPUs with per-second billing and automatic scaling to zero. For teams running sporadic training jobs or batch inference, Modal’s pricing model can cut cloud GPU costs by 60-80%.
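To see where savings in that range could come from, here is a back-of-the-envelope comparison between an always-on GPU instance and per-second billing with scale-to-zero. The hourly rate and the utilization levels are illustrative assumptions, not Modal's published prices, and the sketch assumes the same rate under both billing models.

```python
# Illustrative assumptions only: the rate and usage figures are made up
# for the arithmetic and are not Modal's published prices.
HOURLY_RATE_USD = 4.00   # assumed price of one GPU-hour under both billing models
HOURS_PER_MONTH = 730    # average number of hours in a month


def always_on_cost(hourly_rate: float) -> float:
    """Cost of keeping one GPU instance running for the whole month."""
    return hourly_rate * HOURS_PER_MONTH


def per_second_cost(hourly_rate: float, busy_seconds: int) -> float:
    """Cost when billed only for the seconds the GPU is actually busy."""
    return hourly_rate / 3600 * busy_seconds


def savings_fraction(busy_hours: float, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Fraction saved by per-second billing relative to an always-on instance."""
    reserved = always_on_cost(hourly_rate)
    serverless = per_second_cost(hourly_rate, int(busy_hours * 3600))
    return 1 - serverless / reserved


if __name__ == "__main__":
    # 146, 219 and 292 busy hours are 20%, 30% and 40% of a 730-hour month.
    for busy_hours in (146, 219, 292):
        print(f"{busy_hours} busy hours -> {savings_fraction(busy_hours):.0%} saved")
```

Under these assumptions, a GPU that is busy 20-40% of the time lands in the 60-80% savings range. If the serverless rate per GPU-hour is higher than the reserved rate, the savings shrink accordingly.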

Cloudflare Launches Workers AI Model Gateway

Cloudflare introduced a model gateway for Workers AI that provides unified API access to 50+ open-source models with built-in caching, rate limiting, and failover. The gateway sits at Cloudflare’s edge, reducing latency for global applications. Free tier includes 10,000 neurons/day (roughly 100,000 tokens).
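The caching and failover behavior described above can be sketched in a few lines. This is a toy illustration of the pattern, not Cloudflare's API: the `GatewaySketch` class and the backend callables are hypothetical, and a real gateway would make network calls and apply rate limits as well.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a model backend: takes a prompt, returns a completion.
Backend = Callable[[str], str]


class GatewaySketch:
    """Toy gateway: an exact-match cache in front of an ordered list of backends."""

    def __init__(self, backends: Sequence[Backend]) -> None:
        self.backends = list(backends)
        self.cache: Dict[str, str] = {}

    def complete(self, prompt: str) -> str:
        # Serve repeated prompts from the cache without touching any backend.
        if prompt in self.cache:
            return self.cache[prompt]

        errors: List[Exception] = []
        for backend in self.backends:
            try:
                result = backend(prompt)
            except Exception as exc:  # fail over to the next backend in the list
                errors.append(exc)
                continue
            self.cache[prompt] = result
            return result

        raise RuntimeError(f"all {len(errors)} backends failed")
```

Trying backends in a fixed order keeps the sketch simple; a production gateway would more likely weigh latency and health checks when picking the next backend.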

🤖 Agents & Frameworks

AutoGen 0.4 Released with New Agent Runtime

Microsoft’s AutoGen framework released version 0.4 with a completely redesigned actor-based runtime. The new architecture supports distributed multi-agent systems across multiple machines, built-in message serialization, and a new declarative workflow DSL. Migration from 0.3 requires some code changes, but the performance improvements (3-5x faster message passing) make it worthwhile.
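For readers unfamiliar with the actor model, the following is a minimal sketch of the idea using only Python's standard library. It does not use AutoGen's API; the `Actor` and `Message` classes are hypothetical. Each actor owns an inbox, and messages are serialized to bytes before delivery, which is what makes it possible to move an actor to another machine later.

```python
import asyncio
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Message:
    sender: str
    body: str


class Actor:
    """Hypothetical actor: private state plus an inbox, no shared memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: asyncio.Queue = asyncio.Queue()
        self.received: List[Message] = []

    async def send(self, target: "Actor", body: str) -> None:
        # Serialize to bytes, as a distributed runtime would before sending.
        payload = json.dumps(asdict(Message(self.name, body))).encode("utf-8")
        await target.inbox.put(payload)

    async def stop(self) -> None:
        # A None payload tells the actor's loop to exit.
        await self.inbox.put(None)

    async def run(self) -> None:
        # Process messages one at a time, in arrival order.
        while True:
            payload: Optional[bytes] = await self.inbox.get()
            if payload is None:
                break
            self.received.append(Message(**json.loads(payload)))


async def demo() -> List[Message]:
    planner = Actor("planner")
    worker = Actor("worker")
    worker_task = asyncio.create_task(worker.run())
    await planner.send(worker, "summarize report")
    await planner.send(worker, "draft email")
    await worker.stop()
    await worker_task
    return worker.received


if __name__ == "__main__":
    for message in asyncio.run(demo()):
        print(f"{message.sender} -> worker: {message.body}")
```

Because actors only interact through serialized messages, swapping the in-process queue for a network transport would not change the actor code itself.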

OpenAI Function Calling Gets Structured Outputs by Default

OpenAI updated its API to return structured outputs by default for function calling, eliminating the need for response_format: {"type": "json_object"} in most cases. The change is backward-compatible and reduces a common source of parsing errors in agent workflows.

Browser-Use v0.2: Web Agents Get Reliable

The browser-use library, which lets AI agents control web browsers, released v0.2 with dramatically improved reliability. The new version uses accessibility trees instead of screenshots for element detection, reducing token usage by 80% and improving action success rates from 72% to 94% on the WebArena benchmark.
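The token savings are easier to picture with an example. The sketch below flattens a small, made-up accessibility tree into a short indexed list of interactive elements that a model could refer to by number. The tree shape, the role set, and the `flatten_interactive` helper are assumptions for illustration, not browser-use's actual implementation.

```python
from typing import Any, Dict, List, Optional

# Assumed set of roles an agent can act on; real trees expose many more.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def flatten_interactive(
    node: Dict[str, Any], out: Optional[List[str]] = None
) -> List[str]:
    """Walk the tree depth-first and list interactive elements with an index."""
    if out is None:
        out = []
    if node.get("role") in INTERACTIVE_ROLES:
        label = node.get("name", "")
        out.append(f'[{len(out)}] {node["role"]} "{label}"')
    for child in node.get("children", []):
        flatten_interactive(child, out)
    return out


# Hypothetical accessibility tree for a checkout page.
PAGE = {
    "role": "WebArea",
    "name": "Checkout",
    "children": [
        {"role": "heading", "name": "Your cart"},
        {"role": "textbox", "name": "Promo code"},
        {
            "role": "generic",
            "children": [
                {"role": "button", "name": "Apply"},
                {"role": "link", "name": "Continue shopping"},
            ],
        },
    ],
}

if __name__ == "__main__":
    print("\n".join(flatten_interactive(PAGE)))
```

The model sees three short lines instead of an image, and an action such as "click element 1" maps to a specific node rather than to pixel coordinates.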

🔒 Safety & Alignment

NIST AI RMF 2.0 Published

NIST released version 2.0 of its AI Risk Management Framework, adding specific guidance for generative AI and autonomous systems. The update includes a new "Agent Safety" section covering tool use risks, multi-agent coordination failures, and emergent behaviors. Organizations building AI agents should map their safety practices against the new framework.

Open-Source Red Teaming Toolkit Released

A consortium of AI safety organizations released an open-source red teaming toolkit covering prompt injection, jailbreak detection, and output filtering. The toolkit includes 2,000+ pre-built attack scenarios and integrates with popular LLM evaluation harnesses. Available on GitHub under Apache 2.0.

🛠️ Tools & Libraries

Llama.cpp Adds Multi-GPU Inference

Llama.cpp added support for automatic multi-GPU inference, allowing models to be split across multiple consumer GPUs without NVLink. This means you can run a quantized 70B model on two RTX 4090s (48 GB total VRAM) with near-linear scaling. The feature is experimental but functional for GGUF-format models.
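A rough memory estimate shows why two 24 GB cards are enough only for a quantized 70B model. The sketch below counts weight memory alone and splits layers in proportion to each GPU's VRAM. The 4.5 bits per weight, the 80-layer count, and both helper functions are assumptions for illustration; they are not llama.cpp's actual allocation logic, and KV cache and activation memory are ignored.

```python
from typing import List


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def split_layers(n_layers: int, vram_gib: List[float]) -> List[int]:
    """Assign whole layers to GPUs in proportion to their VRAM."""
    total = sum(vram_gib)
    counts = [int(n_layers * v / total) for v in vram_gib]
    # Hand out any layers lost to rounding, one per GPU in order.
    for i in range(n_layers - sum(counts)):
        counts[i % len(counts)] += 1
    return counts


if __name__ == "__main__":
    gpus = [24.0, 24.0]  # two RTX 4090s, treated as 24 GiB each
    for label, bits in (("fp16", 16.0), ("~4-bit quantization", 4.5)):
        need = weight_gib(70, bits)
        verdict = "fits" if need < sum(gpus) else "does not fit"
        print(f"70B at {label}: {need:.1f} GiB of weights, {verdict} in {sum(gpus):.0f} GiB")
    print("layers per GPU:", split_layers(80, gpus))
```

At 16 bits per weight the weights alone need about 130 GiB, far more than 48 GB. At roughly 4.5 bits they need about 37 GiB, which leaves some headroom for the KV cache.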

Weights & Biases Launches W&B Prompts

Weights & Biases released W&B Prompts, a dedicated prompt engineering and evaluation platform. Features include prompt versioning, A/B testing, automatic evaluation against custom metrics, and integration with 15+ LLM providers. Free tier supports up to 1,000 evaluations/month.

Guardrails AI Reaches 1.0

Guardrails AI, the open-source library for validating LLM outputs, hit version 1.0 with a stable API. The library supports Pydantic-based output validation, regex constraints, and custom validators. It integrates with LangChain, LlamaIndex, and direct API calls.
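To make the idea of output validation concrete, here is a minimal standard-library sketch that checks a model's JSON output against a regex constraint and two custom validators. It deliberately does not use the Guardrails API or Pydantic; the `validate_output` function, the `TICKET_RULES` table, and the ticket fields are all hypothetical.

```python
import json
import re
from typing import Any, Callable, Dict, List

Validator = Callable[[Any], bool]

# Hypothetical rules for a support-ticket payload produced by an LLM.
TICKET_RULES: Dict[str, Validator] = {
    # Regex constraint: IDs must look like TCK-0042.
    "ticket_id": lambda v: isinstance(v, str)
    and re.fullmatch(r"TCK-\d{4}", v) is not None,
    # Custom validators: a closed vocabulary and a length limit.
    "priority": lambda v: isinstance(v, str) and v in {"low", "medium", "high"},
    "summary": lambda v: isinstance(v, str) and 0 < len(v) <= 120,
}


def validate_output(raw: str, rules: Dict[str, Validator]) -> List[str]:
    """Return a list of problems; an empty list means the output passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["expected a JSON object"]

    problems: List[str] = []
    for field, check in rules.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not check(data[field]):
            problems.append(f"invalid value for {field}: {data[field]!r}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easy to feed every failure back to the model in a single retry prompt.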

📊 By the Numbers

94%: Browser-use v0.2 action success rate on WebArena (up from 72%)
$160M: Modal Series B funding round
50+: Models available through Cloudflare Workers AI gateway
3-5x: AutoGen 0.4 message passing speedup over 0.3

🔮 What to Watch Next Week

Expected PyTorch 2.6 release with improved compile() performance
Meta reportedly preparing Llama 4 announcement for late June
Anthropic safety audit results for Claude 4 expected to be published

This digest is published weekly by DataGate.ch. Subscribe to the newsletter to get it delivered to your inbox.

