Google Cloud AI lead on the three frontiers of model capability

As VP of business at Google Cloud, Michael Gerstenhaber works primarily on Vertex AI, the company’s unified platform for deploying enterprise AI. It gives him a high-level view of how companies are using AI models, and what still needs to be done to unleash the potential of useful AI.

After speaking with Gerstenhaber, I was struck by an idea I had never heard before. As he put it, AI models are pushing against three frontiers at the same time: raw intelligence, response time, and a third quality that has less to do with raw power than with cost, namely whether a model can be deployed cheaply enough to run at large, unpredictable scale. It’s a new way of thinking about model capabilities, and a valuable one for anyone trying to push frontier models in a new direction.

This interview has been edited for length and clarity.

Why don’t you start by walking us through your experience in AI so far, and what you do at Google?

I have been in AI for about two years now. I was at Anthropic for a year and a half, and I’ve been at Google for about half a year. I run Vertex AI, Google’s developer platform. Most of our customers are engineers who build their own applications. They want access to agents. They want access to an agent platform. They want access to the world’s most intelligent models. I give them that, but I don’t provide the applications themselves. That’s for Shopify, Thomson Reuters, and our various customers to provide in their respective fields.

What drew you to Google?

Google, I think, is unique in the world because we have everything from the interface to the infrastructure layer. We can build a data center. We can buy electricity and build power plants. We have our own chips. We have our own model. We have an inference layer that we control. We have an agent layer that we control. We have APIs for memory and for writing interleaved code. We have an agent engine on top of that which ensures compliance and governance. And then we have Gemini for businesses and Gemini chat for consumers, right? So one of the reasons I came here is that I saw Google as uniquely vertically integrated, and saw that as a strength for us.

It’s odd because, despite all the differences between the companies, it seems like the big three labs are really close in capability. Is it just a race for more intelligence, or is it more complicated than that?

I see three frontiers. Models like Gemini Pro are tuned for raw intelligence. Think about writing code. You just want the best code you can get, and it doesn’t matter if it takes 45 minutes, because I have to maintain it, I have to put it in production. I just want the best.

Then there is this other frontier, latency. If I’m doing customer support and I need to know how to apply a policy, the model needs enough intelligence to apply that policy. Are you allowed to make a return? Can I upgrade my seat on the plane? But it doesn’t matter how correct you are if it took 45 minutes to find the answer. So for those cases, you want the most intelligent model that fits within the latency budget, because more intelligence no longer matters once the person gets tired of waiting and hangs up.

And there’s this last bucket, where someone like Reddit or Meta wants to moderate the entire internet. They have large budgets, but they can’t take on a business risk if they don’t know how it will scale. They don’t know how many toxic posts there will be today or tomorrow. So they have to constrain their budget to the smartest model they can afford, but in a way that scales to an unpredictable volume of content. And because of that, cost becomes very important.

One of the things I’ve been puzzled about is why agentic systems are taking so long to take off. It feels like the models are there and I’ve seen some amazing demos, but we’re not seeing the big changes I would have expected a year ago. What do you think is holding it back?

The technology is two years old, and there is still a lot of missing infrastructure. We don’t have a way to audit what agents are doing. We don’t have a way to authorize an agent’s access to data. These are patterns that still take work to put into production. And production is always a lagging indicator of what the technology is capable of. So two years is not long enough to see what the intelligence can support in production, and that’s where people are struggling.

I think it has moved quickly in software engineering because it fits well into the software development life cycle. We have a dev environment where it’s safe to break things, and then we promote from the dev environment to the test environment. Writing code at Google requires two people to review the code and confirm that it is good enough to put the Google brand behind it and give it to our customers. So we have a lot of human-in-the-loop processes that make adoption much lower risk. But we have to build those patterns in other places and for other professions.
