Snapchat knew it from the start, but in recent months Google and Facebook have all but confirmed it: The keyboard, slowly but surely, is fading into obscurity.
Last week at Google’s annual developer conference, the company presented its vision for how it expects its users—more than a billion people—to interact with technology in the coming years. And for the most part, it didn’t involve typing into a search box. Instead, Google’s brass spent its time on stage touting the company’s speech recognition skills and showing off Google Lens, a new computer vision technology that essentially turns your phone camera into a search engine.
Technology has once again reached an inflection point. For years, smartphones relied on hardware keyboards, a holdover from the early days of cell phones. Then came multitouch. Spurred by the wonders of the first smartphone screens, people swiped, typed, and pinched. Now, the way we engage with our phones is changing once again thanks to AI. Snapping a photo works as well as, if not better than, writing a descriptive sentence in a search box. Casually chatting with Google Assistant, the company’s omnipresent virtual helper, gets results as fast as, if not faster than, opening Chrome and navigating from there. The upshot, as Google CEO Sundar Pichai explained, is that we’re increasingly interacting with our computers in more natural and emotive ways, which could mean using your keyboard a lot less.
Ask the people who build your technology, and they’ll tell you: The camera is the new keyboard. The catchy phrase is becoming something of an industry-wide mantra to describe the constant march towards more visual forms of communication. Just look at Snapchat. The company bet its business on the fact that people would rather trade pictures than strings of words. The idea proved so compelling that Facebook and Instagram unabashedly developed their own versions of the feature. “The camera has already become a pervasive form of communication,” says Roman Kalantari, the head creative technologist at the design studio Fjord. “But what’s the next step after that?”
For Facebook and Snapchat, it was funhouse mirror effects and goofy augmented reality overlays—ways of building on top of photos that you simply can’t replicate with text. Meanwhile, Google took a decidedly more utilitarian approach with Lens, turning the camera into an input device much like the keyboard itself. Point your camera at a tree, and it’ll tell you the variety. Snap a pic of the new restaurant on your block, and it’ll pull up the menu and hours, even help you book a reservation. Perhaps the single most effective demonstration of the technology was also its dullest—focus the lens on a router’s SSID and password, and Google’s image recognition will scan the information, pass it along to your Android phone, and automatically log you into the network.
This simplicity is a big deal. No longer does finding information require typing into a search box. Suddenly the world, in all its complexity, can be understood just by aiming your camera at something. Google isn’t the only company buying into this vision of the future. Amazon’s Fire Phone from 2014 enabled image-based search, which meant you could point the camera at a book, or a Blu-ray disc, or a box of cereal, and have the item shipped to you instantly via Amazon Prime. Earlier this year, Pinterest launched the beta version of Lens, a tool that allows users to take a photo of an object in the real world and surface related objects on the Pinterest platform. “We’re getting to the point where using your camera to discover new ideas is as fast and easy as typing,” says Albert Pereta, a creative lead at Pinterest, who led the development of Lens.
Translation: Words can be hard, and it often works better to show than to tell. It’s easier to find the mid-century modern chair with a mahogany leather seat you’re looking for when you can share what it looks like, rather than typing a string of precise keywords. “With a camera, you can complete the task by taking a photo or video of the thing,” explains Gierad Laput, who studies human-computer interaction at Carnegie Mellon. “Whereas with a keyboard, you complete this task by typing a description of the thing. You have to come up with the right description and type them accordingly.”
The caveat, of course, is that the image recognition needs to be accurate in order to work. You have agency when you type something into a search box—you can delete, revise, retype. But with a camera, the device decides what you’re looking at and, even more crucially, assumes what information you want to see in return. The good (or potentially creepy) news is that with every photo taken, search query typed, and command spoken, Google learns more about you, which means over time your results grow increasingly accurate. With its deep trove of knowledge in hand, Google seems determined to smooth out the remaining rough edges of technology. It’ll probably still be a while before the keyboard goes extinct, but with every shot you take on your camera, it’s getting one step closer.