Google Built a TPU for the Age of Inference. Meet Ironwood.

TPU Ironwood is Google’s 7th-generation custom AI chip, and unlike its predecessors, it was built for inference first. Here’s what that means and why it matters.

AI Inference · AI Infrastructure · Custom Silicon · Google Cloud · Ironwood · ISV · TPU

GPU Inference Without the Cluster. Cloud Run Finally Makes That Real.

Cloud Run now supports GPUs with scale-to-zero billing. For AI inference workloads that are bursty, sporadic, or just getting started, that changes the math entirely.

AI Inference · Cloud Run · Google Cloud · GPU · ISV · LLM Inference · Serverless
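
To make the scale-to-zero pattern concrete, here is a minimal sketch of the kind of service that benefits: the model loads once per cold start into a module-level global, and you pay nothing while no requests arrive. `load_model` is a hypothetical stand-in for whatever loads your weights onto the GPU.

```python
# Minimal sketch of a scale-to-zero GPU inference service (Flask).
# load_model() is hypothetical; the point is the lazy, per-instance load.
import os
from flask import Flask, request, jsonify

app = Flask(__name__)
_model = None  # loaded on first request, reused for the instance's lifetime


def get_model():
    global _model
    if _model is None:
        _model = load_model("my-llm")  # hypothetical loader; runs once per cold start
    return _model


@app.route("/infer", methods=["POST"])
def infer():
    prompt = request.get_json()["prompt"]
    return jsonify({"output": get_model().generate(prompt)})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```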

MCP Is the New REST. Google Cloud Just Made It Enterprise-Ready.

The Model Context Protocol is becoming the standard for how AI agents call external tools. Google Cloud Managed MCP Servers turn your existing APIs into agent tools — governed, discoverable, no new infrastructure.

Agentic AI · AI agents · API Management · Apigee · Google Cloud · ISV · MCP
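
For a sense of what "your API as an agent tool" looks like at the protocol level, here is a minimal sketch using the open-source MCP Python SDK (FastMCP). This illustrates plain MCP, not the managed Google Cloud offering; the tool name and invoice logic are hypothetical.

```python
# A minimal MCP server exposing one API operation as an agent-callable tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("billing-tools")


@mcp.tool()
def get_invoice(invoice_id: str) -> dict:
    """Fetch an invoice by ID from the billing API."""
    # In a real server this would call your existing REST endpoint.
    return {"invoice_id": invoice_id, "status": "paid", "amount_usd": 120.0}


if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```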

Generated Video, Images, and Music. One API.

Veo 3 generates video with native audio. Imagen 4 renders at 2K. Lyria 3 Pro writes full songs. All on Vertex AI, all under one bill.

Agentic AI · Generative AI · Google Cloud · Imagen · Veo · Vertex AI · Video Generation
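
As a rough sketch of the "one API" claim, the snippet below calls an Imagen-family model through the google-genai SDK pointed at Vertex AI. The project, location, and model ID are assumptions; check Model Garden for the current names.

```python
from google import genai

# Placeholder project/location; the same client shape covers Veo and Lyria calls.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

result = client.models.generate_images(
    model="imagen-4.0-generate-001",  # assumed model ID
    prompt="A watercolor city skyline at dusk",
)
with open("skyline.png", "wb") as f:
    f.write(result.generated_images[0].image.image_bytes)
```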

Apigee Got a New Job: The Control Plane for Your AI.

Apigee evolved from API gateway to the control plane for LLM traffic, agent actions, and MCP tools. Here is why that matters for anyone building AI features at scale.

Agentic AI · AI agents · API Management · Apigee · Google Cloud · ISV · LLM Inference

Wait, Oracle Runs Inside Google Cloud Now?

Oracle and Google Cloud put actual Exadata hardware inside GCP data centers. That is a strange sentence to type, and it has some interesting implications.

AI agents · Cloud Database · Cloud Migration · Google Cloud · ISV · Multicloud · Oracle

AI That Understands Your Entire Codebase?

Gemini Code Assist Enterprise gives your engineering team an AI that understands your private codebase, your GCP infrastructure, and your org’s coding standards. For ISVs, it is the difference between faster typing and actually shipping faster.

AI coding · developer productivity · Enterprise AI · Gemini Code Assist · Google Cloud

A2A Is How AI Agents Finally Learn to Play Nicely

Google’s Agent2Agent protocol, now under Linux Foundation stewardship, gives AI agents a standard way to find, authenticate, and collaborate with each other across any vendor or framework. For ISVs, it changes what a multi-agent product architecture can look like.

A2A protocol · Agent2Agent · AI agents · Google Cloud · multi-agent
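
To ground what "find, authenticate, and collaborate" means in practice, here is a sketch of the discovery step: fetching a peer agent's agent card from its well-known URL. The path and field names follow the A2A spec as published, but treat both as assumptions to verify against the current protocol docs.

```python
# Illustrative A2A-style discovery: read another agent's capability card.
import requests

AGENT_BASE = "https://agent.example.com"  # placeholder peer agent

card = requests.get(f"{AGENT_BASE}/.well-known/agent.json", timeout=10).json()

print(card["name"])  # human-readable agent name
for skill in card.get("skills", []):
    print("-", skill["id"], skill.get("description", ""))
```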

Google Just Open-Sourced Its Best Small Models. Now What?

Gemma 4 is Google’s first fully open-source multimodal model family, released under Apache 2.0. For ISVs, that changes the calculus on how you build and what you ship.

Gemma 4 · Google Cloud · ISV · open source AI · Vertex AI
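
One concrete consequence of open weights under Apache 2.0: you can pull the model and run it on your own hardware with stock tooling. A minimal sketch with Hugging Face transformers, where the model ID is a placeholder for the actual Gemma 4 repository name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-4-9b-it"  # assumed/placeholder model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain vector search in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```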

Embeddings, Index, and Hybrid Search. No Assembly Required.

Vertex AI Vector Search 2.0 went GA in March 2026. For ISVs building on Google Cloud, it collapses embedding pipelines, indexing, feature stores, and hybrid search into one managed service, so you can ship AI features instead of building infrastructure.

Enterprise AI · Google Cloud · RAG · Vector Search · Vertex AI
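
To unpack the "hybrid search" part: services like this fuse a dense (embedding) ranking with a sparse (keyword) ranking into one result list. The sketch below shows Reciprocal Rank Fusion, one common fusion method, purely as a generic illustration rather than the Vector Search 2.0 API.

```python
# Generic Reciprocal Rank Fusion: combine dense and sparse rankings.
def rrf(dense_ranking, sparse_ranking, k=60):
    """Fuse two ranked lists of doc IDs into one hybrid ranking."""
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)


print(rrf(["d3", "d1", "d2"], ["d1", "d4", "d3"]))  # fused hybrid order
```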

Google’s Classified AI Play: Running Gemini Where the Internet Can’t Go

Google Distributed Cloud Air-Gapped puts Gemini LLMs inside fully air-gapped defense and sovereign environments – the first hyperscaler to pull this off. Here’s why it matters and who should care.

air-gapped AI · defense cloud · Gemini · Google Distributed Cloud · sovereign cloud

Your AI Model Needs a Bouncer. Google Built One.

Prompt injection is the SQL injection of the AI era. Model Armor is the first cloud-native solution that protects any LLM, on any cloud, without locking you into a single vendor.

AI Security · Google Cloud · LLM Security · Model Armor · Prompt Injection
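
The "bouncer" framing is easiest to see in the call path: untrusted input gets screened before it ever reaches the model. The sketch below is a generic illustration of that pattern, not Model Armor's API; real services use trained classifiers rather than keyword lists, and `call_model` is a hypothetical stand-in.

```python
# Generic pre-inference screening pattern (illustrative only).
SUSPECT_PHRASES = ("ignore previous instructions", "reveal your system prompt")


def screen(user_input: str) -> str:
    lowered = user_input.lower()
    if any(phrase in lowered for phrase in SUSPECT_PHRASES):
        raise ValueError("blocked: possible prompt injection")
    return user_input


def ask_llm(user_input: str) -> str:
    prompt = screen(user_input)  # the bouncer runs before the model does
    return call_model(prompt)    # hypothetical model call
```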

LLM Traffic Is Weird. Your Infrastructure Needs to Know That.

Standard load balancers treat LLM inference like any other HTTP traffic. That is expensive and slow. GKE Inference Gateway knows the difference.

AI Infrastructure · GKE · Google Cloud · Kubernetes · LLM Inference
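
Why round-robin fails here: two requests with identical HTTP cost can differ by orders of magnitude in GPU cost, so an inference-aware gateway routes on model-server signals like queue depth and KV-cache utilization instead of connection counts. The sketch below is purely illustrative, with made-up field names.

```python
# Illustrative LLM-aware routing decision (field names are made up).
replicas = [
    {"name": "pod-a", "queue_depth": 3, "kv_cache_util": 0.92},
    {"name": "pod-b", "queue_depth": 5, "kv_cache_util": 0.41},
]


def pick_replica(replicas):
    # Prefer replicas with headroom in KV cache, then the shortest queue.
    return min(replicas, key=lambda r: (r["kv_cache_util"] > 0.8, r["queue_depth"]))


print(pick_replica(replicas)["name"])  # -> pod-b, despite its longer queue
```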

PostgreSQL Got a Supercharger. Your Database Bill Didn’t Get the Memo.

AlloyDB collapses your transactional, analytical, and vector database systems into one managed PostgreSQL service. The benchmarks are embarrassing for Aurora.

AlloyDB · Cloud Database · Enterprise AI · Google Cloud · PostgreSQL
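
"Managed PostgreSQL" is doing real work in that sentence: because AlloyDB is wire-compatible with PostgreSQL, a stock driver connects unchanged. A minimal sketch with psycopg2, with placeholder connection details:

```python
import psycopg2

conn = psycopg2.connect(
    host="10.0.0.5",     # AlloyDB instance IP (placeholder)
    dbname="app",
    user="postgres",
    password="...",      # use IAM auth or a secret manager in practice
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])  # reports a standard PostgreSQL version string
```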

The AI Pilot Trap, and How to Get Out of It

Most enterprise AI projects don’t fail because the technology doesn’t work. They fail because no one built the infrastructure to let it work at scale.

AI Strategy · Digital Transformation · Enterprise AI · Google Cloud · Vertex AI

The ETL Pipeline You’re Running Probably Shouldn’t Exist

BigQuery can now run AI models directly inside SQL. The implications for how you’ve been architecting your data stack are a little uncomfortable.

BigQuery · Data Engineering · Enterprise AI · ETL · Google Cloud
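
Here is what "AI models inside SQL" looks like in practice: BigQuery ML's ML.GENERATE_TEXT table function runs a remote generative model over query results. A hedged sketch via the Python client, where the project, dataset, model, and table names are placeholders and the remote model must already exist (CREATE MODEL ... REMOTE WITH CONNECTION):

```python
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL `my_project.my_dataset.gemini_model`,
  (SELECT ticket_text AS prompt FROM `my_project.my_dataset.tickets` LIMIT 10),
  STRUCT(TRUE AS flatten_json_output)
)
"""
for row in client.query(sql).result():
    print(row.summary)
```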

Most Enterprise AI Is Blind to 80% of Your Data

Your AI reads text just fine. It’s the contracts, recordings, and images it can’t touch that are going to cost you.

Enterprise AI · Google Cloud · Multimodal AI · Unstructured Data · Vertex AI
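
Closing that gap mostly means pointing a multimodal model at the files you already have. A sketch using the google-genai SDK on Vertex AI, where the bucket path and model ID are placeholders:

```python
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

pdf = types.Part.from_uri(
    file_uri="gs://my-bucket/contracts/msa.pdf",  # placeholder path
    mime_type="application/pdf",
)
resp = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed model ID
    contents=[pdf, "List the termination clauses in this contract."],
)
print(resp.text)
```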

TurboQuant: Kind of a Big Deal.

Google Research just published a way to cut AI serving costs by 50% with zero accuracy loss. The interesting part is what happens to the ISVs who figure this out first.

AI Infrastructure · Cost Optimization · Google Cloud · LLM Inference · TPU
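
The cost mechanics are easy to demonstrate even without the paper's method: a quantized tensor moves a fraction of the bytes per token, and memory bandwidth is the bottleneck in LLM serving. The numpy sketch below is plain round-to-nearest int8 quantization, shown only to illustrate the savings, not the TurboQuant algorithm itself:

```python
import numpy as np

w = np.random.randn(4096, 4096).astype(np.float32)

# Round-to-nearest symmetric int8 quantization: a quarter of the bytes.
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_back = w_int8.astype(np.float32) * scale

print(f"fp32: {w.nbytes / 1e6:.0f} MB, int8: {w_int8.nbytes / 1e6:.0f} MB")
print(f"mean abs error: {np.abs(w - w_back).mean():.5f}")
```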