Weekly AI Digest — Week 2, June 2026

Reviewed: June 4, 2026

June 9–13, 2026 | DataGate.ch AI Industry Roundup

The second week of June brought major releases in model serving, agent tooling, and a surprising open-source milestone. Here’s what matters.

🏗️ Infrastructure

Google Cloud TPU v6 “Trillium” Now Available for Fine-Tuning

Google made TPU v6 pods available for fine-tuning workloads on Vertex AI, offering competitive pricing against NVIDIA A100s for transformer training. Early adopters report 30% faster training times for models in the 7B-70B range compared to equivalently priced GPU clusters. The catch: you’ll need JAX or PyTorch/XLA, which adds friction for teams standardized on vanilla PyTorch.

Modal Raises $160M Series B for Serverless GPU

Modal, the serverless GPU platform popular with AI researchers, raised $160M to expand its infrastructure. The platform now supports A100, H100, and L40S GPUs with per-second billing and automatic scaling to zero. For teams running sporadic training jobs or batch inference, Modal’s pricing model can cut cloud GPU costs by 60-80%.
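To see where savings of that size can come from, here is a rough back-of-the-envelope sketch comparing an always-on instance with per-second billing that scales to zero. The hourly rate and usage pattern are hypothetical, and using the same rate for both options is a simplifying assumption; this is not Modal’s actual pricing.

```python
def always_on_cost(hourly_rate: float, hours_in_period: float) -> float:
    """Cost of a GPU instance that stays up for the whole period."""
    return hourly_rate * hours_in_period


def per_second_cost(hourly_rate: float, busy_seconds: float) -> float:
    """Cost when billed only for the seconds a job actually runs."""
    return hourly_rate * busy_seconds / 3600


# Hypothetical numbers: $4.00/hour for both options, a 30-day month,
# and 6 hours of actual GPU work per day.
RATE = 4.00
reserved = always_on_cost(RATE, hours_in_period=30 * 24)
serverless = per_second_cost(RATE, busy_seconds=30 * 6 * 3600)
savings = 1 - serverless / reserved

print(f"reserved: ${reserved:.2f}, serverless: ${serverless:.2f}, savings: {savings:.0%}")
```

With these assumed inputs the GPU is busy a quarter of the time, so the bill drops by three quarters. The savings shrink as utilization rises or if the per-second rate carries a premium.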

Cloudflare Launches Workers AI Model Gateway

Cloudflare introduced a model gateway for Workers AI that provides unified API access to 50+ open-source models with built-in caching, rate limiting, and failover. The gateway sits at Cloudflare’s edge, reducing latency for global applications. Free tier includes 10,000 neurons/day (roughly 100,000 tokens).
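The caching and failover behavior described above can be pictured with a small in-process sketch. Everything here (`ModelGateway`, `primary`, `fallback`) is a hypothetical stand-in, not Cloudflare’s API, and rate limiting is left out for brevity.

```python
class ModelGateway:
    """Try backends in order and cache successful responses by prompt."""

    def __init__(self, backends):
        self.backends = list(backends)  # ordered: primary first, fallbacks after
        self.cache = {}

    def complete(self, prompt: str) -> str:
        if prompt in self.cache:
            return self.cache[prompt]
        failures = []
        for name, call in self.backends:
            try:
                answer = call(prompt)
            except Exception as exc:  # a real gateway would catch narrower errors
                failures.append(f"{name}: {exc}")
                continue
            self.cache[prompt] = answer
            return answer
        raise RuntimeError("all backends failed: " + "; ".join(failures))


def primary(prompt):
    raise TimeoutError("primary model timed out")  # simulated outage


def fallback(prompt):
    return f"echo: {prompt}"


gateway = ModelGateway([("primary", primary), ("fallback", fallback)])
print(gateway.complete("hello"))  # served by the fallback backend
print(gateway.complete("hello"))  # served from the cache, no backend call
```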

🤖 Agents & Frameworks

AutoGen 0.4 Released with New Agent Runtime

Microsoft’s AutoGen framework released version 0.4 with a completely redesigned actor-based runtime. The new architecture supports distributed multi-agent systems across multiple machines, built-in message serialization, and a new declarative workflow DSL. Migration from 0.3 requires some code changes, but the performance improvements (3-5x faster message passing) make it worthwhile.
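For readers new to the actor model, the following is a minimal illustration of the general pattern: each actor owns a mailbox, processes one message at a time, and receives messages in serialized form. It uses only `asyncio` and `json` from the standard library; the `Actor` class and `send` helper are made up for this sketch and are not AutoGen’s API.

```python
import asyncio
import json


class Actor:
    """A minimal actor: owns a mailbox and handles one message at a time."""

    def __init__(self, name):
        self.name = name
        self.mailbox = asyncio.Queue()
        self.received = []

    async def run(self):
        while True:
            raw = await self.mailbox.get()
            message = json.loads(raw)  # messages arrive serialized
            if message["type"] == "stop":
                break
            self.received.append(message["body"])


async def send(actor, type_, body=None):
    """Serialize a message and place it in the actor's mailbox."""
    await actor.mailbox.put(json.dumps({"type": type_, "body": body}))


async def main():
    worker = Actor("worker")
    task = asyncio.create_task(worker.run())
    await send(worker, "task", "summarize report")
    await send(worker, "task", "draft reply")
    await send(worker, "stop")
    await task
    return worker.received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because actors share nothing but serialized messages, the same design can in principle move a mailbox behind a network socket, which is what makes the pattern a natural fit for distributed agents.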

OpenAI Function Calling Gets Structured Outputs by Default

OpenAI updated its API to return structured outputs by default for function calling, eliminating the need for response_format: {"type": "json_object"} in most cases. The change is backward-compatible and reduces a common source of parsing errors in agent workflows.
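Even with structured outputs, agent code typically still parses the function-call arguments, which arrive as a JSON string, before acting on them. A defensive parser might look like the sketch below. The tool schema and the `parse_tool_arguments` helper are hypothetical, and the check is a simple name-and-type test rather than full JSON Schema validation.

```python
import json

# Hypothetical tool schema: required argument names mapped to expected Python types.
GET_WEATHER_ARGS = {"city": str, "days": int}


def parse_tool_arguments(raw_arguments: str, expected: dict) -> dict:
    """Parse a function call's JSON argument string and check it against `expected`."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    for name, expected_type in expected.items():
        if name not in args:
            raise ValueError(f"missing argument: {name}")
        if not isinstance(args[name], expected_type):
            raise ValueError(f"argument {name!r} should be {expected_type.__name__}")
    return args


print(parse_tool_arguments('{"city": "Zurich", "days": 3}', GET_WEATHER_ARGS))
```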

Browser-Use v0.2: Web Agents Get Reliable

The browser-use library, which lets AI agents control web browsers, released v0.2 with dramatically improved reliability. The new version uses accessibility trees instead of screenshots for element detection, reducing token usage by 80% and improving action success rates from 72% to 94% on the WebArena benchmark.
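The token savings are easier to picture with a toy example. The sketch below walks a hypothetical accessibility tree (nested dicts with `role`, `name`, and `children`) and renders only the interactive elements as short numbered lines a model could act on. The tree shape and helper names are assumptions for illustration, not browser-use’s internal format.

```python
# Hypothetical accessibility tree, shaped as nested dicts with role/name/children.
TREE = {
    "role": "document", "name": "Checkout",
    "children": [
        {"role": "heading", "name": "Your cart", "children": []},
        {"role": "textbox", "name": "Promo code", "children": []},
        {"role": "group", "name": "", "children": [
            {"role": "button", "name": "Apply", "children": []},
            {"role": "link", "name": "Continue shopping", "children": []},
        ]},
    ],
}

INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def interactive_elements(node):
    """Yield (role, name) for every interactive node, in document order."""
    if node["role"] in INTERACTIVE_ROLES:
        yield node["role"], node["name"]
    for child in node.get("children", []):
        yield from interactive_elements(child)


def render_for_prompt(tree):
    """Number each interactive element so a model can refer to it by index."""
    return "\n".join(
        f"[{i}] {role}: {name}"
        for i, (role, name) in enumerate(interactive_elements(tree))
    )


print(render_for_prompt(TREE))
```

The whole page collapses to three short lines of text, and the model can answer with an index such as `1` instead of guessing pixel coordinates from an image.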

🔒 Safety & Alignment

NIST AI RMF 2.0 Published

NIST released version 2.0 of its AI Risk Management Framework, adding specific guidance for generative AI and autonomous systems. The update includes a new “Agent Safety” section covering tool use risks, multi-agent coordination failures, and emergent behaviors. Organizations building AI agents should map their safety practices against the new framework.

Open-Source Red Teaming Toolkit Released

A consortium of AI safety organizations released an open-source red teaming toolkit covering prompt injection, jailbreak detection, and output filtering. The toolkit includes 2,000+ pre-built attack scenarios and integrates with popular LLM evaluation harnesses. Available on GitHub under Apache 2.0.

🛠️ Tools & Libraries

Llama.cpp Adds Multi-GPU Inference

Llama.cpp added support for automatic multi-GPU inference, allowing models to be split across multiple consumer GPUs without NVLink. This means you can run a 70B model on two RTX 4090s (48GB total VRAM) with near-linear scaling. The feature is experimental but functional for GGUF-format models.
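Whether a 70B model fits in 48GB depends heavily on quantization. The rough arithmetic below estimates weight size at a few bit widths; the 4.5 bits-per-weight figure for a 4-bit quantization and the flat 4 GiB reserve for KV cache and runtime overhead are assumptions for illustration, not measured values.

```python
def weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB."""
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30


def fits(params_billions, bits_per_weight, gpu_vram_gib, reserve_gib=4.0):
    """True if weights plus a flat reserve fit in the combined VRAM.

    Simplification: treats all GPUs as one pool and ignores per-GPU limits.
    """
    needed = weights_gib(params_billions, bits_per_weight) + reserve_gib
    return needed <= sum(gpu_vram_gib)


two_4090s = [24, 24]  # VRAM per card, treated as GiB
for label, bits in [("16-bit", 16), ("8-bit", 8), ("~4.5-bit", 4.5)]:
    size = weights_gib(70, bits)
    print(f"{label}: {size:.1f} GiB weights, fits: {fits(70, bits, two_4090s)}")
```

Under these assumptions only the roughly 4-bit variant fits on two cards; 8-bit and 16-bit weights alone exceed 48GB.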

Weights & Biases Launches W&B Prompts

Weights & Biases released W&B Prompts, a dedicated prompt engineering and evaluation platform. Features include prompt versioning, A/B testing, automatic evaluation against custom metrics, and integration with 15+ LLM providers. Free tier supports up to 1,000 evaluations/month.

Guardrails AI Reaches 1.0

Guardrails AI, the open-source library for validating LLM outputs, hit version 1.0 with a stable API. The library supports Pydantic-based output validation, regex constraints, and custom validators. It integrates with LangChain, LlamaIndex, and direct API calls.
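As a rough picture of what output validation involves, the sketch below combines a structural check, a regex constraint, and a custom rule using only the standard library. It deliberately does not use Guardrails’ or Pydantic’s APIs; `TicketSummary`, the ticket-ID pattern, and the 200-character limit are all invented for the example.

```python
import json
import re
from dataclasses import dataclass

TICKET_ID = re.compile(r"[A-Z]{2,5}-\d+")  # hypothetical format, e.g. "OPS-142"
PRIORITIES = {"low", "medium", "high"}


@dataclass
class TicketSummary:
    ticket_id: str
    priority: str
    summary: str


def validate_output(raw: str) -> TicketSummary:
    """Parse an LLM's JSON reply and reject anything that breaks the rules."""
    try:
        result = TicketSummary(**json.loads(raw))  # structural check
    except (json.JSONDecodeError, TypeError) as exc:
        raise ValueError(f"malformed output: {exc}") from exc
    if not TICKET_ID.fullmatch(result.ticket_id):  # regex constraint
        raise ValueError(f"bad ticket id: {result.ticket_id!r}")
    if result.priority not in PRIORITIES:  # enumerated values
        raise ValueError(f"bad priority: {result.priority!r}")
    if len(result.summary) > 200:  # custom validator
        raise ValueError("summary longer than 200 characters")
    return result


reply = '{"ticket_id": "OPS-142", "priority": "high", "summary": "Disk nearly full."}'
print(validate_output(reply))
```

A caller would typically catch the `ValueError` and either retry the model call with the error message appended or fall back to a safe default.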

📊 By the Numbers

🔮 What to Watch Next Week

This digest is published weekly by DataGate.ch. Subscribe to the newsletter to get it delivered to your inbox.
