Yesterday, the 46-year-old Google veteran who oversees its search engine, Amit Singhal, announced his retirement. And in short order, Google revealed that Singhal’s rather enormous shoes would be filled by a man named John Giannandrea. On one level, these are just two guys doing something new with their lives. But you can also view the pair as the ideal metaphor for a momentous shift in the way things work inside Google—and across the tech world as a whole.
Giannandrea, you see, oversees Google’s work in artificial intelligence. This includes deep neural networks, networks of hardware and software that approximate the web of neurons in the human brain. By analyzing vast amounts of digital data, these neural nets can learn all sorts of useful tasks, like identifying photos, recognizing commands spoken into a smartphone, and, as it turns out, responding to Internet search queries. In some cases, they can learn a task so well that they outperform humans. They can do it better. They can do it faster. And they can do it at a much larger scale.
This approach, called deep learning, is rapidly reinventing so many of the Internet’s most popular services, from Facebook to Twitter to Skype. Over the past year, it has also reinvented Google Search, where the company generates most of its revenue. Early in 2015, as Bloomberg recently reported, Google began rolling out a deep learning system called RankBrain that helps generate responses to search queries. As of October, RankBrain played a role in “a very large fraction” of the millions of queries that go through the search engine with each passing second.
As Bloomberg says, it was Singhal who approved the roll-out of RankBrain. And before that, he and his team may have explored other, simpler forms of machine learning. But for a time, some say, he represented a steadfast resistance to the use of machine learning inside Google Search. In the past, Google relied mostly on algorithms that followed a strict set of rules set by humans. The concern, as described by some former Google employees, was that it was more difficult to understand why neural nets behaved the way they did, and more difficult to tweak their behavior.
These concerns still hover over the world of machine learning. The truth is that even the experts don’t completely understand how neural nets work. But they do work. If you feed enough photos of a platypus into a neural net, it can learn to identify a platypus. If you show it enough computer malware code, it can learn to recognize a virus. If you give it enough raw language—words or phrases that people might type into a search engine—it can learn to understand search queries and help respond to them. In some cases, it can handle queries better than algorithmic rules hand-coded by human engineers. Artificial intelligence is the future of Google Search, and if it’s the future of Google Search, it’s the future of so much more.
Sticking to the Rules
This past fall, I sat down with a former Googler who asked that I withhold his name because he wasn’t authorized to talk about the company’s inner workings, and we discussed the role of neural networks inside the company’s search engine. At one point, he said, the Google ads team had adopted neural nets to help target ads, but the “organic search” team was reluctant to use this technology. Indeed, over the years, discussions of this dynamic have popped up every now and again on Quora, the popular question-and-answer site.
Edmond Lau, who worked on Google’s search team and is the author of the book The Effective Engineer, wrote in a Quora post that Singhal carried a philosophical bias against machine learning. With machine learning, he wrote, the trouble was that “it’s hard to explain and ascertain why a particular search result ranks more highly than another result for a given query.” And, he added: “It’s difficult to directly tweak a machine learning-based system to boost the importance of certain signals over others.” Other ex-Googlers agreed with this characterization.
Yes, Google’s search engine was always driven by algorithms that automatically generate a response to each query. But these algorithms amounted to a set of definite rules. Google engineers could readily change and refine these rules. And unlike neural nets, these algorithms didn’t learn on their own. As Lau put it: “Rule-based scoring metrics, while still complex, provide a greater opportunity for engineers to directly tweak weights in specific situations.”
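To make Lau's point concrete, here is a minimal sketch of rule-based scoring in Python. It is an illustration only, not Google's actual ranking code: the signal names, the weights, and the helper functions `rule_score` and `explain` are all hypothetical. The sketch shows why such a system is easy to inspect and adjust: each signal's contribution to the final score is a single visible number.

```python
# Hypothetical signals and weights, invented for illustration only.
WEIGHTS = {"title_match": 3.0, "body_match": 1.0, "inbound_links": 0.5}


def rule_score(signals, weights=WEIGHTS):
    """Score a page as a weighted sum of hand-picked signals."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def explain(signals, weights=WEIGHTS):
    """Return each signal's contribution, so a ranking can be explained."""
    return {name: weights[name] * signals.get(name, 0.0) for name in weights}


# Two made-up pages competing for the same query.
page_a = {"title_match": 1, "body_match": 2, "inbound_links": 4}
page_b = {"title_match": 0, "body_match": 5, "inbound_links": 2}

# An engineer "directly tweaks a weight": body text now counts double.
boosted = dict(WEIGHTS, body_match=2.0)

print(rule_score(page_a), rule_score(page_b))                    # default weights
print(rule_score(page_a, boosted), rule_score(page_b, boosted))  # after the tweak
print(explain(page_b, boosted))                                  # why page_b scores as it does
```

With the default weights, `page_a` outscores `page_b`. After the single weight change, the order flips, and `explain` shows exactly which signal caused it. A trained neural net offers no comparably direct knob.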
But now, Google has incorporated deep learning into its search engine. And with its head of AI taking over search, the company seems to believe this is the way forward.
It’s true that with neural nets, you lose some control. But you don’t lose all of it, says Chris Nicholson, the founder of the deep learning startup Skymind. Neural networks are really just math—linear algebra—and engineers can certainly trace how the numbers behave inside these multi-layered creations. The trouble is that it’s hard to understand why a neural net classifies a photo or spoken word or snippet of natural language in a certain way.
“People understand the linear algebra behind deep learning. But the models it produces are less human-readable. They’re machine-readable,” Nicholson says. “They can retrieve very accurate results, but we can’t always explain, on an individual basis, what led them to those accurate results.”
What this means is that, in order to tweak the behavior of these neural nets, you must adjust the math through intuition, trial, and error. You must retrain them on new data, with still more trial and error. That’s doable, but complicated. And as Google moves search to this AI model, it’s unclear how the move will affect its ability to defend its search results against claims of unfairness or change the results in the face of complaints.
These concerns aren’t trivial. Today, Google is facing a European antitrust investigation into whether it unfairly demoted the pages of certain competitors. What happens when it’s really the machines making these decisions, and their rationale is indecipherable? Humans will still guide these machines, but not in the same way they were guided in the past.
In any event, deep learning has arrived on Google Search. And the company may have used other forms of machine learning in recent years, as well. Though these technologies sacrifice some control, Google believes, the benefits outweigh that sacrifice.
To be sure, deep learning is still just a part of how Google Search works. According to Bloomberg, RankBrain helps Google deal with about 15 percent of its daily queries: the ones the system has never seen before. Basically, this machine learning engine is adept at analyzing the words and phrases that make up a search query and deciding what other words and phrases carry much the same meaning. As a result, it’s better than the old rules-based system at handling these brand-new queries.
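One common way to decide that two phrases "carry much the same meaning" is to represent each phrase as a vector of numbers and compare the vectors. The Python sketch below illustrates that general idea with cosine similarity. It is not a description of how RankBrain works internally, which Google has not published in detail. The three-number vectors are hand-made toy values. In a real system, such vectors would be learned from large amounts of text. The names `VECTORS`, `cosine`, and `closest` are hypothetical.

```python
import math

# Toy, hand-made vectors (an assumption for illustration). Real systems
# learn such vectors from data rather than writing them by hand.
VECTORS = {
    "cheap flights": [0.9, 0.1, 0.0],
    "low-cost airfare": [0.8, 0.2, 0.1],
    "platypus habitat": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity: near 1.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def closest(query, vectors=VECTORS):
    """Return the known phrase whose vector is most similar to the query's."""
    others = (phrase for phrase in vectors if phrase != query)
    return max(others, key=lambda p: cosine(vectors[query], vectors[p]))


print(closest("cheap flights"))
```

Here "cheap flights" lands closest to "low-cost airfare" even though the two phrases share no words, which is the kind of match a rule that compares keywords would miss.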
But over time, systems like this will play an even greater role inside Internet services like Google Search. At one point, Google ran a test that pitted its search engineers against RankBrain. Both were asked to look at various web pages and predict which would rank highest on a Google search results page. RankBrain was right 80 percent of the time. The engineers were right 70 percent of the time.
This doesn’t detract from Singhal’s work. He joined Google in 2000, and a year later was named a Google Fellow, the highest honor Google bestows on its engineers. For most of Google’s history, he has ruled the company’s search engine, and that search engine pretty much ruled the Internet.
But machine learning is rapidly changing that landscape. “By building learning systems, we don’t have to write these rules anymore,” John Giannandrea told a room full of reporters inside Google headquarters this fall. “Increasingly, we’re discovering that if we can learn things rather than writing code, we can scale these things much better.”