From Add Me and Pixel Screenshots to Call Notes, the Google Pixel 9 series has a bunch of AI upgrades, and none of them would be possible without the new Tensor G4. Tom's Guide sits down with Jesse Seed, group product manager for Google Silicon, and Zach Gleicher, product manager for Google DeepMind, to discuss what the Tensor G4 chip can do and how it stands out from Apple's A series and Qualcomm's Snapdragon.
Transcript
00:00 The Google Pixel 9 is here, and a lot of people are going to be talking about new features like the design, the display, and battery life.
00:07 But what really makes this phone stand out is the Tensor G4 chip and all the AI experiences it enables on this new wave of devices.
00:14 And to walk us through some of those scenarios, and how they can actually make your life a little bit better, we have Jesse Seed,
00:20 who is on the team behind Google Silicon, as well as Zach Gleicher, who is at DeepMind, which collaborates deeply with the Tensor G4 team.
00:29 So, Jesse, what do you think makes the Tensor G4 chip stand out in a sea of smartphones?
00:34 I think the biggest innovation we made this year was being the first silicon, the first phone, to run Gemini Nano with multimodality.
00:41 And that unlocks some very cool use cases, one of which is Pixel Screenshots.
00:47 So that's very handy if you're trying to remember things.
00:51 I'm sure you got a chance to play with that.
00:54 Another feature, not related to the Gemini Nano model, that I really love is Add Me.
00:59 Those of us who are the photographers of our family or our crew definitely appreciate being able to go back in and, you know, dynamically add the photographer in.
01:09 That's something we worked a lot on, tuning over 15 different machine learning models and also using Google's augmented reality SDK.
01:19 So, yeah, I think those are my top two favorite Tensor-enabled Pixel experiences this year.
01:24 So how do you get something like Gemini Nano to fit on something as compact as a phone?
01:30 At DeepMind, we collaborate with a whole bunch of teams across Google, and we want to make sure that we're building Gemini models that meet the needs of all Google products.
01:40 So as we were developing Gemini in collaboration with Android and Pixel, we realized there was this need for on-device models.
01:52 We saw this as a challenge because, you know, on the server side, everyone was pushing for more capable models that were potentially bigger.
02:05 We, on the other hand, had all these interesting constraints that weren't present before: memory constraints, power consumption constraints.
02:13 So in partnership with the Tensor team and Pixel, we were able to come together and understand: what are the core use cases for these on-device models?
02:24 What are the constraints for these on-device models? And we actually co-developed a model together, which was a really exciting experience and made it possible to build something that was so capable and able to power these use cases.
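To make those memory constraints concrete, here is a minimal back-of-envelope sketch. The 3-billion parameter count and the precisions below are illustrative assumptions, not actual Gemini Nano figures; the point is simply that weight precision largely decides whether a model fits in a phone's RAM at all.

```python
# Back-of-envelope memory footprint for an on-device LLM's weights.
# Parameter count and precisions are illustrative, not Gemini Nano specs.

def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate RAM needed just to hold the model weights."""
    return num_params * bits_per_weight / 8 / 1e9

PARAMS = 3e9  # hypothetical 3-billion-parameter model

for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(PARAMS, bits):.1f} GB")

# 32-bit: ~12.0 GB -> hopeless on a phone
# 16-bit: ~ 6.0 GB
#  8-bit: ~ 3.0 GB
#  4-bit: ~ 1.5 GB -> why aggressive quantization matters on-device
```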
02:39 For someone who hasn't upgraded their phone in, let's say, three or four years, what do you think is going to stand out for them when it comes to the Tensor G4 chip?
02:46 Improving what we call the fundamentals, like power and performance, is very important for us.
02:51 And so the Tensor G4, which is our fourth-generation chip, is our most efficient and our most performant.
02:58 We believe users will see that in everyday experiences like web browsing, as well as app launch and the overall snappiness of the user interface.
03:09 So definitely, I think they'll be able to experience that in hand.
03:12 And what about gaming performance? Because we know that's really important these days for people who are buying a new phone.
03:17 In our testing, we've actually seen improvements in both peak and sustained gaming performance in common games that run on the platform.
03:25 So, yeah.
03:25 So I feel like we're almost past the phase where people are afraid of AI; they're more interested in how it's going to help them.
03:32 So what are some of the features within Gemini Nano coming to the phone that you're most excited about?
03:36 One of the main motivations we see for the Tensor and Pixel teams coming to us with on-device use cases is better reliability.
03:48 Because you don't have to rely on an internet connection, the experience can be reliable and work no matter where you are.
03:57 Another is privacy: if developers don't want the data to leave the device, it can be fully processed on-device with an on-device LLM.
04:11 Some of the features I'm excited about... I think Pixel Screenshots is a really great one.
04:18 I think that really showcases how we're able to get these multimodal features working on the device.
04:26 As you can see in the demos, it was really snappy, low latency, and it's going to be reliable in how it works.
04:33 But it's also a super capable model.
04:35 And all this information is stored locally on your device and can be processed locally.
04:41 So we're really excited that it can enable experiences like that.
04:45 I think we're seeing traction for summarization use cases, smart reply, and some of these common themes.
04:53 And those are some of the use cases we're really trying to make sure the model works especially well for.
05:00 So the models are continuing to get better and more capable, and we're going to see what's possible on-device continue to expand.
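The pattern Gleicher describes, a local model that works offline and keeps data on the device, can be sketched in a few lines. The `OnDeviceLLM` class below is a hypothetical stand-in, not a real Google SDK; it just illustrates that nothing in the inference path touches the network.

```python
# Minimal sketch of the on-device pattern described above. `OnDeviceLLM`
# is a hypothetical stand-in, not a real Google API; the point is that
# the prompt and the output never cross a network boundary.

from dataclasses import dataclass

@dataclass
class OnDeviceLLM:
    """Stand-in for a locally loaded model such as Gemini Nano."""
    name: str

    def generate(self, prompt: str) -> str:
        # Inference would run entirely on local hardware (e.g. the SoC's
        # ML accelerator); there is no network call anywhere in this path.
        return f"[{self.name} output for: {prompt[:40]}...]"

def summarize_locally(model: OnDeviceLLM, note: str) -> str:
    # Works with airplane mode on: no connectivity dependency, and the
    # note's contents never leave the device.
    return model.generate(f"Summarize this note in two sentences:\n{note}")

nano = OnDeviceLLM(name="nano")
print(summarize_locally(nano, "Meeting moved to 3pm; bring the Q3 deck."))
```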
05:08 So now that the G4 chip is in all of these phones, how do you balance higher performance against thermals and battery life?
05:16 Something like thermal performance, and indeed even battery life, is a full-system design challenge, right?
05:21 It's not just about any one component, like only the chip or only something else.
05:25 It's about the entire system.
05:27 What we're so lucky to have is, you know, control of the full stack.
05:31 Everything from the silicon all the way up to the higher-level user applications and everything in between.
05:35 So that means we can tweak and refine year over year.
05:39 And so, yes, as you mentioned, the addition of the vapor chamber is one concrete thing we did in the Pro line this year to give a little extra headroom in those high sustained use cases where, you know, you're burning more power.
05:52 But, yeah, that's the way we think about it.
05:54 It's really the full-system design and how we improve that year over year.
05:57 I think to a certain degree, users sometimes get intimidated by AI on phones, especially since we're still at the early stages.
06:04 So how do you make sure, with the Pixel 9 in particular, that people are excited and that they actually find these features to begin with?
06:11 So I'm sure you've used a Pixel phone through all this process.
06:14 There's this very cool thing called Pixel Tips, which I love to use when I get my new Pixel.
06:19 And it will actually guide you through some of the new applications, new use cases, or new ways that a particular app will work.
06:27 So I think that's one way we can help communicate to users what's the new cool stuff to play with on your new Pixel phone.
06:34 I think we saw with Microsoft and Recall, which they had to recall themselves, that people are a little bit nervous about their devices knowing everything about them.
06:42 But I think Screenshots is a little bit different, if you can go into that, because I know it's more manual.
06:47 You're deciding what you want your phone to capture, so at the same time, it still doesn't have to know a lot about you.
06:52 So how do you make sure that that information stays private and only on the phone?
06:56 So, I mean, one of the ways we do it is indeed by having a capable on-device model, right?
07:02 That means the analysis being done on that screenshot, none of it leaves the device.
07:07 So that's one way we're able to address that privacy concern.
07:10 I think the other thing is just empowering users to decide what they want to do, like how they want to use something like Gemini, right?
07:18 And what use cases they feel comfortable interacting with and what they don't.
07:23 So I think it really comes down to user choice.
07:25 But in the case of Pixel Screenshots in particular, that is a fully on-device use case.
07:30 Right.
07:31 Yeah.
07:32 So I don't think third-party benchmarks are going away, because we're going to use them to test these phones and the Tensor G4 chip.
07:38 But at the same time, I think we have to start thinking about performance a little bit differently now that the AI era is here.
07:44 So from your perspective, how should we be thinking about performance when it comes to this chip?
07:49 That's a great question.
07:50 I think it really all comes down to real-world use cases, right?
07:53 Like, how does this thing actually perform in hand, the way you're actually going to use it?
07:57 So I do think things like how fast web browsing responds, how fast apps launch, and the quickness and responsiveness of the user interface are, for everyday use cases, good standard things to look at, right?
08:10 And then also things like how fast you can capture a picture.
08:15 These are all reasonable things that people really do through the course of the day.
08:18 And those are just a few examples, but I think they're much more representative than, you know, the oftentimes semi-synthetic benchmarks we see out there in the industry.
08:31 Yeah.
08:31 So when it comes to the Gemini Nano model on Pixel phones, from your perspective, when does that phone pass the test in terms of performance?
08:38 As we think about benchmarks for LLMs and Gemini, and especially as we think about Gemini Nano, we've seen, as an industry, a large focus on academic benchmarks.
08:55 Academic benchmarks like MMLU are, you know, great in that they give a common metric, but they can be gamed: people can optimize for them, and they might not capture what you really care about.
09:07 So, for example, MMLU is a popular academic benchmark that asks the LLM to answer a very diverse set of questions.
09:19 But some of the questions it asks are just history questions.
09:27 Right.
09:27 For an on-device model, we don't really care whether it can answer history questions.
09:32 We think that's probably a better use case for a server-side model. On-device, we care about use cases like summarization, where it's not important whether the model knows when Rome fell or something like that.
09:45 So what we try to do is work closely with the partner teams building these on-device experiences, and we gather the benchmarks and use cases they care about so that we can evaluate against those.
10:01 That's how we think about quality. But what also becomes super important as we think about Gemini Nano versus our server-side Flash and Pro models is constraints like battery consumption.
10:15 We have to make sure the model performs well and doesn't consume too much battery, and that the latency is good.
10:25 So we actually partner with the Tensor team to profile our models as we co-design them, to make sure we're getting an architecture that works well and meets their efficiency and power-consumption constraints. Then we collect data for the use cases they care about and make sure we can hill-climb on those use cases.
10:45 Yes, MMLU and other metrics like that are great for making sure we have automated metrics to hill-climb against, because creating good evals is often a very difficult task. But we do a lot of that co-development together.
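Here is a minimal sketch of the loop Gleicher describes: gather use-case-specific test cases, score each candidate model on them, and "hill-climb" by keeping the best scorer. The test cases and the keyword-overlap metric below are toy assumptions, not Google's actual evals.

```python
# Toy use-case eval: does a summarizer retain the details that matter?
# "Hill climbing" here just means keeping whichever candidate scores best.

from typing import Callable

Model = Callable[[str], str]

EVAL_SET = [
    # (input text, keywords a good summary should retain)
    ("Meeting moved from Tue to Thu 3pm in room 4B.", {"thu", "3pm"}),
    ("Flight AA12 delayed 2 hours, new gate C7.", {"delayed", "c7"}),
]

def score(model: Model) -> float:
    """Fraction of required keywords retained, averaged over cases."""
    totals = []
    for text, keywords in EVAL_SET:
        summary = model(text).lower()
        totals.append(sum(k in summary for k in keywords) / len(keywords))
    return sum(totals) / len(totals)

def hill_climb(candidates: list[Model]) -> Model:
    # Keep whichever candidate scores best on the use-case eval.
    return max(candidates, key=score)

# Toy "model versions" standing in for successive checkpoints.
v1: Model = lambda t: t[:20]  # naive truncation, drops details
v2: Model = lambda t: t       # identity, keeps everything
best = hill_climb([v1, v2])
print(f"best candidate scores {score(best):.2f}")
```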
11:02 That's great.
11:03 And I would also just add, I think Zach's really onto something here.
11:05 There's also something to be said for the fact that it's not just about traditional performance metrics, but also quality.
11:10 So if you look at things like the quality of responses coming out of the model, or even things like the quality of the photo, right?
11:16 That's what real-world users are going to care about more than, you know, some number on the side of a box.
11:23 Yeah, it's more subjective.
11:25 That's true.
11:26 And as we think about quality, sometimes we have human raters who are evaluating quality.
11:32 But what I think is really exciting about the development of Gemini is that we can actually use larger Gemini models, what we call auto-raters, to evaluate the quality of the model's output.
11:46 Yes, self-grading.
11:47 That can be a very powerful way for us to iterate more quickly and make sure we're getting the model to perform well.
11:54 Of course, these have their mistakes and issues as well, and that's why doing sanity checks with human raters can be helpful too.
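To make the auto-rater idea concrete, here is a minimal sketch: a larger "judge" model scores a smaller model's outputs against a rubric, and a random slice is routed to human raters as the sanity check Gleicher mentions. Both model functions are hypothetical stand-ins, not real Gemini APIs.

```python
# Auto-rater sketch: a larger model grades a smaller model's outputs.
# `small_model` and `judge_model` are hypothetical stand-ins.

import random

def small_model(prompt: str) -> str:
    return f"summary of: {prompt}"  # stand-in for an on-device model

def judge_model(prompt: str) -> float:
    return random.uniform(0, 1)     # stand-in for a larger judge model

RUBRIC = (
    "Rate the candidate summary from 0 to 1 for faithfulness and "
    "completeness against the source text.\n"
    "Source: {source}\nCandidate: {candidate}\nScore:"
)

def auto_rate(sources: list[str], human_check_rate: float = 0.1):
    scores, human_queue = [], []
    for source in sources:
        candidate = small_model(source)
        scores.append(judge_model(RUBRIC.format(source=source,
                                                candidate=candidate)))
        # Auto-raters have their own failure modes, so route a random
        # sample of examples to human raters as a sanity check.
        if random.random() < human_check_rate:
            human_queue.append((source, candidate))
    return sum(scores) / len(scores), human_queue

avg, to_review = auto_rate(["note one", "note two", "note three"])
print(f"auto-rater average: {avg:.2f}; {len(to_review)} queued for humans")
```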
12:02 All right. I just want to say thank you to you both for taking the time to talk about this new chip, what's happening with DeepMind behind the scenes, and how this is all coming to life.
12:11 We're going to test out these phones to see how good they are.
12:13 But now we know a little bit more about how much smarter these things are getting.