the dam brief #15 - AI BEAVERS

The Headline

the week's biggest story

Anthropic launches Claude Sonnet 5 for cheaper agentic runs

Anthropic launched Claude Sonnet 5 on June 30, calling it its most agentic Sonnet model yet. the company says the model can make plans, use browsers and terminals, and run autonomously at a level that until recently required larger, more expensive models.

the release landed alongside Claude Science, a research workbench for scientists and pharma teams, and a July 1 return for Fable 5 after US export controls were lifted.

(Source)

The Build

tools, models, and dev releases

quick signal before the rabbit hole: these six repos add up to 105k+ GitHub stars. the useful part is what each one lets you try this weekend.

Strix stress-tests apps with AI penetration testing - a runnable path for finding and fixing app vulnerabilities before a human red team gets expensive (GitHub)

OmniRoute gives coding agents one gateway across model providers - a TypeScript gateway with MCP, A2A, and provider fallback, for when your agent stack is glued together by API keys (GitHub)

video-use edits video with coding agents - a browser-use project with a README small enough to try if your content pipeline still treats video as manual timeline work (GitHub)

cognee gives agents self-hosted long-term memory - a knowledge-graph memory platform with a concrete install path for teams tired of agents forgetting the last run (GitHub)

herdr multiplexes terminal agents - a Rust terminal pattern for supervising multiple agent runs without turning your desktop into a window graveyard (GitHub)

CubeSandbox hardens agent execution - TencentCloud's Rust sandbox for instant, concurrent, lightweight isolation around agent code (GitHub)

Founder Fuel

funding, acquisitions, and market moves

Microsoft launches an AI deployment company with a $2.5B commitment - TechCrunch reports that the new group follows Amazon, OpenAI, and Anthropic into the business of taking AI from models into enterprise operations (More)

OpenAI reportedly proposes giving 5% of its equity to a US sovereign wealth fund - the proposal would turn private frontier-lab upside into a public-finance question instead of a normal cap-table story (More)

Anthropic discusses a custom chip with Samsung - the talks add another frontier lab to the custom-silicon race after OpenAI's Broadcom partnership moved compute strategy closer to the model roadmap (More)

Bending Spoons jumps 40% on its first trading day - the Milan-based software buyer went public after building a portfolio around brands including Evernote, Meetup, Vimeo, AOL, and Eventbrite (More)

Longevity & Health

aging, biotech, and health research

OpenAI releases GeneBench-Pro for genomics and biology evaluation - the benchmark tests AI systems on complex real-world datasets in genomics, biology, and scientific research rather than generic multiple-choice biology prompts (More)

Anthropic ships Claude Science for research and pharma teams - the workbench integrates common scientific tools, produces audit trails, and targets repeatable research workflows rather than one-off chat answers (More)

scientists trace a possible route for Alzheimer's spread through the brain - ScienceDaily reports that in lab work, blocking toxic tau-carrying packages before they reach healthy cells curbed a suspected disease-spread mechanism (More)

The World

regulation, policy, and geopolitics

Google must pay a €4.1B Android fine in Europe - the BBC reports that the court rejected Google's challenge to the penalty for using Android to block rivals, keeping platform bundling inside Europe's competition fight (More)

UK police warn parents about public child photos and AI abuse - the National Crime Agency says ordinary images can be repurposed into child abuse material, turning family sharing into a live safety and platform-governance issue (More)

AWS publishes its frontier-model release process for Bedrock - the post details security, governance, and customer-release checks for new frontier models as cloud providers become gatekeepers for model access (More)

Go Deeper

long reads, podcasts, and documentaries

standard benchmarks underestimate what agents can do - The Decoder | staff. the UK's AI Security Institute found that capped compute budgets hide agent capability, especially on software engineering tasks. (Read)

Google DeepMind unionization talks are off to a rocky start - Wired | Paresh Dave. a useful labor-side view of what happens when frontier AI work becomes normal workplace politics. (Read)

paradise revisited - Longreads | staff. a Galapagos and Darwin piece for anyone who needs one long read that is not another model benchmark. (Read)

constructions, concoctions, conjurations - Longreads | staff. a reading list on constructed languages, useful if you like watching humans build protocols without calling them protocols. (Read)

The Colony

events and community

next friday: July 10 - build fridays Hamburg - bring a laptop, pick one thing, and make visible progress with other builders in the room (RSVP)

next sunday: July 12 - agentic coding studio, Hamburg - a focused session for builders working with coding agents, hosted by AI BEAVERS with Malte Sussdorff and Rodrigo R. Espitia (RSVP)

new for members: the community perks hub is live with Notion, Linear, and SAGEOBOT benefits. eligible AI BEAVERS members get access after newsletter confirmation and profile completion (Open perks)

beavers have transparent eyelids called nictitating membranes, so they can see while swimming underwater. agent teams are now asking for the same thing: let the model dive into terminals, browsers, sandboxes, and long runs, but please keep one weird little eyelid on the audit log.

was this useful? anything you'd like to hear more about? just reply.

stay curious,
Alex & Vlady
