Google’s Project Soli: the tech behind Pixel 4’s Motion Sense radar
By now, you’ve heard: the new Google Pixel 4 has a tiny radar chip inside it, which allows you to swipe or wave your hand to do a few things. More importantly, Motion Sense (as Google has branded it) is designed to detect your presence. It knows if you’re there. The technology comes from Project Soli, which was first demonstrated publicly in 2015 and is now inside the Pixel 4 as its first major commercial implementation. Responding to a few air gestures is fairly minor, but Google sees the potential for it to eventually become much more.
That’s always the way with new computing interfaces. The mouse and the touchscreen led to giant revolutions in computing, so it’s tempting to see the same potential in every new one. It’s a trap Apple CEO Tim Cook himself fell into when he introduced the Digital Crown on the Apple Watch, saying it was as important as the mouse. (It wasn’t.)
Luckily, Google isn’t claiming quite so much for Motion Sense, but it does have a similar problem. The gap between things that Motion Sense could do and what it actually does in this first version is huge. In theory, putting radar on a phone is a revolution. In practice, it could be seen as just a gimmick.
Motion Sense has essentially three types of interactions, according to Brandon Barbello, Pixel product manager at Google. Understanding why each of them matters to the overall experience of using a Pixel 4 is key to understanding why Google thinks Motion Sense is more than the sum of its features — more than just a gimmick.
The first type is presence. The Project Soli radar chip inside the Pixel 4 creates a small bubble of awareness around the phone when it’s sitting on a table. (It’s only on when it’s facing up or out.) It’s a hemisphere with a radius of a foot or two. It just keeps an eye out to ensure you’re nearby and turns off the always-on display if you’re not.
The second type is reach. This isn’t much: the phone just pays attention to see if you’re reaching for it and then quickly turns on the screen and activates the face unlock sensors. If an alarm or ringtone is going, the phone automatically quiets down a bit when it sees you’re reaching for it.
Finally, there are the actual gestures, of which there are only two. You can give the phone a quick wave to dismiss a call or snooze an alarm. You can also swipe left or right while music is playing to skip forward or back a track. There are a couple more specific things you can do, but Google isn’t opening up gestures to third-party developers for a while.
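Taken together, the three tiers form a simple escalation: presence wakes or sleeps the ambient display, reach preps the phone, and a gesture triggers an action. Here is a minimal Kotlin sketch of that layering, purely as an illustration. Google has not published a Motion Sense API, so every type and function name below is hypothetical.

// Hypothetical event model for the three Motion Sense tiers described above.
// None of these types come from a real Google API; they only illustrate the layering.
sealed class MotionSenseEvent {
    object PresenceEntered : MotionSenseEvent() // someone is inside the one-to-two-foot bubble
    object PresenceLeft : MotionSenseEvent()    // the bubble is empty again
    object Reach : MotionSenseEvent()           // a hand is moving toward the phone
    object Wave : MotionSenseEvent()            // a quick wave over the screen
    data class Swipe(val toTheRight: Boolean) : MotionSenseEvent()
}

class MotionSenseHandler(
    private val setAmbientDisplay: (on: Boolean) -> Unit,
    private val prepareFaceUnlock: () -> Unit,
    private val quietRinger: () -> Unit,
    private val dismissAlert: () -> Unit,
    private val skipTrack: (forward: Boolean) -> Unit
) {
    fun onEvent(event: MotionSenseEvent) {
        when (event) {
            MotionSenseEvent.PresenceEntered -> setAmbientDisplay(true)
            MotionSenseEvent.PresenceLeft -> setAmbientDisplay(false)
            MotionSenseEvent.Reach -> { prepareFaceUnlock(); quietRinger() }
            MotionSenseEvent.Wave -> dismissAlert()
            is MotionSenseEvent.Swipe -> skipTrack(event.toTheRight)
        }
    }
}

Note how small the vocabulary is: the two gestures map to the alert-dismissing and track-skipping actions above and nothing else, which turns out to be the whole idea.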
Most of these features have been done before with other sensors. You’ve been able to wave at phones and have their cameras detect it. When you pick up an iPhone, the accelerometer feels that happen and starts up Face ID. So I asked: will using radar for these features make the experience so much better that people will really notice it?
“It isn’t that it’s so much better that you’re going to notice it,” says Barbello. “It’s that it’s so much better, and you’re not going to notice it. [You will just think] that it’s supposed to be this way.”
There are technical reasons to prefer a radar chip to a camera, says Ivan Poupyrev, director of engineering for Google’s ATAP division. Radar draws much less power than a camera, for one thing. For another, it’s... not a camera, and it can’t personally identify you. “If you look at the radar signal, there is no discerning human characteristics,” Poupyrev says.
Project Soli also doesn’t need line-of-sight like a camera. It can even work through other materials. There’s no in-principle reason Soli couldn’t work while the phone is face-down or even in your bag. The Soli technology could also, theoretically, work up to seven meters away (though power would be an issue).
But Soli, instantiated as Motion Sense on the Pixel 4, is much more limited to start. It’s not a strictly technical limitation. Barbello says that “we can sense motions as precise as a butterfly’s wings.” I believe that, by the way, at least in theory. Three years ago, when Poupyrev first showed me Project Soli, I was able to turn a virtual dial on a smartwatch with the tiniest movement of my thumb and finger hovering above it.
Still, Google has had to overcome some technical issues to get radar to work at this miniature scale. Poupyrev admits that, fairly late in the project, his team had to junk its original machine learning models and start over from scratch. “It was down to the wire.” Plus, he adds, “it’s not like I can go buy a book … We had to invent every single thing. We had to invent from scratch. Every time you go read a paper about it, it assumes this gigantic radar to track satellites with an aperture of 10 meters.”
“Getting this new technology into the phone is nothing short of a technology miracle, from my perspective,” Poupyrev says. So while, in theory, Project Soli could detect anything from a butterfly’s wings to a person standing seven meters away behind a wall, in practice, it probably needs more training to get there.
Mainly, though, Google is limiting what Motion Sense can do because it is creating an entirely new interaction language, and it needs to limit the vocabulary at the beginning. Otherwise, it would be overwhelming — or at least super annoying.
Take the swipe gesture: waving your hand over the phone to go forward or back a track when music is playing. “What is swiping?” Poupyrev asks. “We spent weeks trying to figure out swiping ... We have dozens and dozens of ways people do swiping.” Some people do a “these aren’t the droids” gesture, some just flit their hand, some hold it flat, some sideways.
If Google had to show you a tutorial detailing exactly how to hold your hand when you do a swipe, the jig would be up. You’d hate it. There is a tutorial — featuring Pokémon — but it just handles the basics.
Google even learned that we don’t actually have a common language for what “swipe left” and “swipe right” mean. Different people move their hands in different directions depending on whether their mental model of the direction refers to their hand or to the thing they’re virtually interacting with. Google ended up having to add a preference setting for those who wanted to flip the defaults.
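To make that concrete, here is a hedged Kotlin sketch of what such a preference has to resolve: take the physical direction the radar detected and turn it into a “next” or “previous” track action, flipping the mapping for people with the opposite mental model. The names, and which direction counts as “next” by default, are assumptions for illustration, not Google’s actual implementation.

// Hypothetical mapping from a detected hand direction to a media action.
// Which physical direction means "next" by default is an assumption here;
// the preference setting exists because users disagree on exactly that.
enum class HandDirection { LEFT, RIGHT }
enum class TrackAction { NEXT, PREVIOUS }

fun trackActionFor(direction: HandDirection, flipDirections: Boolean): TrackAction {
    val defaultAction = when (direction) {
        HandDirection.RIGHT -> TrackAction.NEXT     // assumed default: a rightward swipe skips ahead
        HandDirection.LEFT -> TrackAction.PREVIOUS
    }
    if (!flipDirections) return defaultAction
    return when (defaultAction) {
        TrackAction.NEXT -> TrackAction.PREVIOUS
        TrackAction.PREVIOUS -> TrackAction.NEXT
    }
}

Either way, the gesture itself doesn’t change; only its interpretation does, which is exactly the ambiguity the setting exists to paper over.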
I keep going back to the idea that none of this is strictly necessary and therefore kind of gimmicky. Do I really need my phone to wake up a half-second before I touch it, saving me a tap on the screen? Is it really that difficult to hit the snooze button?
It’s not. But then again, Google isn’t promising the world here. It’s just promising a slightly nicer, more seamless experience. Poupyrev points out that lots of people use the little autoreply buttons instead of just typing out “yes” manually. “At the end of the day, the technology that wins is what’s easy to use,” he argues. “It’s just as simple as that. And removing a small amount of friction is what gets people more and more adapted to the technology.”
If you just look at “what it does for you,” what Poupyrev calls “the toothbrush use case,” he argues you’re missing the point. The point isn’t to layer on a ton of new features; it’s to improve “the emotional components of interaction.”
Some of that is just fun. In addition to that Pokémon tutorial, there’s a Pokémon wallpaper you can wave at or even tickle, and a game co-developed with UsTwo. Most of it, though, is what Barbello calls “making the core device functions a lot more natural.”
Poupyrev puts it a different way: “How does the toothbrush make you happy? You know it makes your life better, but it doesn’t make you happy. A pokémon makes you happy.”