
As VP of product at Google Cloud, Michael Gerstenhaber focuses on Vertex AI, the company's integrated platform for deploying enterprise AI. The role gives him a broad view of how businesses are actually using AI models and what still needs to happen to unlock agentic AI.
During my conversation with Gerstenhaber, one idea in particular caught my attention, one I hadn't come across before. He explained that AI models are pushing on three frontiers at once: raw intelligence, latency, and a third that has less to do with pure capability than with cost: whether a model is cheap enough to run at large, unpredictable scale. It's a useful lens on model capability, particularly for anyone trying to steer frontier models in a new direction.
This interview has been edited for length and clarity.
To start, can you talk about your journey in AI so far and your role at Google?
I've been working in AI for about two years. I spent a year and a half at Anthropic and have been at Google for almost six months. I lead Vertex AI, which is Google's developer platform. Most of our customers are developers building their own applications. They want access to agentic patterns and platforms, and to inference from the best models in the world. I provide that, but the applications themselves are built by Shopify, Thomson Reuters, and our other customers in their own domains.
What attracted you to Google?
I think Google is unique in that it offers everything from the user interface down to the infrastructure. We can build data centers, procure electricity, and build power plants. We have our own chips and our own models, and we run the inference layer. We run the agentic layer as well: we have APIs for memory and for writing code, along with an agent engine that handles compliance and governance. And we even have the chat surface, with Gemini Enterprise and the Gemini consumer chat app. So one of the reasons I joined is that I see Google as uniquely vertically integrated, and that's an advantage for us.
It's interesting because, despite the differences between the companies, all three of the major labs seem quite similar in capability. Is it simply a race for more intelligence, or is it more nuanced than that?
I see three frontiers. Models like Gemini Pro are optimized for raw intelligence. Take writing code: all you want is the best code possible, even if it takes 45 minutes, because that code has to be maintained and deployed. The goal is simply to get the best.
Then there's a second frontier around latency. If I'm doing customer support and need guidance on applying a policy, I need intelligence to apply that policy. Are you authorized to process a return? Can I change my seat on a flight? But it doesn't matter how correct the answer is if it takes 45 minutes to arrive. For those scenarios, you want the most intelligent model within a latency threshold, because extra intelligence is worthless once the person gets frustrated and hangs up.
Finally, there's a third category, where companies like Reddit or Meta are trying to moderate the entire internet. They have substantial budgets, but they can't take enterprise risk on something whose scale is unpredictable. They can't know how many harmful posts will appear today or tomorrow. So they have to budget for the most intelligent model they can sustain while it scales across an effectively unbounded range of topics. There, cost becomes critical.
I've been wondering why agentic systems have been slow to catch on. The models seem to be there, and I've seen astonishing demos, yet the big changes I expected a year ago haven't materialized. What do you think is holding things back?
This technology is fundamentally about two years old, and a lot of infrastructure is still missing. We don't have methodologies for auditing agent actions. We don't have frameworks for authorizing an agent's access to data. Those will need to be developed before agents can run in production, and production always lags what the technology can do. Two years just isn't enough time to see what the intelligence can support in a production environment, and that's where the challenges lie.
Progress has been remarkably fast in software engineering because it fits neatly into the software development lifecycle. We have a development environment where it's fine to experiment, then a path from dev to a testing environment. At Google, code requires two people to review it, and both have to confirm it meets the bar to carry Google's name and go in front of our customers. So we have many of those human-in-the-loop procedures that dramatically reduce deployment risk. We need to build equivalent methodologies in other areas and for other professions.

