At 34, Peter Mika is one of the youngest research scientists at Yahoo Inc. A specialist in Web mining and Internet search at Yahoo Labs in Barcelona, he spoke in an interview about the future of search, how new ideas are incubated at Yahoo Labs and what has changed since Marissa Mayer took over as chief executive officer (CEO). Edited excerpts:
As a researcher, how do you deal with failed projects at Yahoo Labs?
Failure is good, which is a strange thing to say in any company. If you do not fail, you’re not taking enough risks. In that sense, it’s a unique challenge in managing a large research lab such as Yahoo’s to encourage experimentation, and we do that often. We try things that are very “hacker-like”, and we try them very often.
We have our own research engineers working alongside us. They are engineers, but they specifically support researchers: they help us implement our ideas and get them to a point where we can show them and decide whether to pursue them or not.
And of course we need input from the other side, from the product managers, who think more about the product and the consumer and who have the insight into what will be successful as a product and in the market.
How would you define the future of Internet search?
I think the answer has been the same all along. We’re really at the beginning of search. Surprisingly, given the investment and effort that went into search, I think there is nobody in the industry who would tell you that search is finished, that we’re done doing search. Some things work very well: there are certain types of queries that you can answer perfectly, but there are many, many unresolved queries.
What are these unresolved queries?
I can give you a few examples of ambiguous queries. When you are searching for the term jaguar, it can be a car, it can be an animal, among many other things. Ambiguity also comes in because of subjectivity. For example, cheap digital camera: What does it mean for a digital camera to be cheap? Or Barcelona nightlife: Is that something to do at night in Barcelona?
Then there are queries where some information is missing. An example would be ‘Brad Pitt’ and ‘zombies’. There’s a movie that came out that features Brad Pitt and it’s about zombies. But maybe you don’t know the name, or you forgot it. If you knew the name, it would be easy because you would’ve typed World War Z, but you forgot and just have contextual clues. In this case, the search engine may return information about zombies, it may return information about Brad Pitt, but you’re really after this movie which connects these two concepts. So that information is hidden or missing from the query.
Another example would be if you want to search for me, but you forgot my name. So you type in 34-year-old computer scientist working for Yahoo in Barcelona. Currently this will not work in a search engine. You’ll get information about Yahoo, you’ll get information about Barcelona or 34-year-old people, but what you’re really looking for is something that connects all these: a search engine that understands that you’re looking for a person, that 34 is the age of the person, that Yahoo is the workplace of the person and that Barcelona is his location. So something that has very, very deep understanding, much deeper than the level of individual keywords.
I think it’s a big understanding problem—that’s the way I would phrase it. If you think of Wikipedia, it is a good chunk of human knowledge. It is not necessarily the largest website—it currently has something like 12 million articles, which is not even a large collection of text, but it includes some very fundamental knowledge that we as humans have.
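As an editorial illustration of the ‘Brad Pitt’ and ‘zombies’ example above, here is a minimal sketch of looking for the entity that connects several concepts rather than matching each keyword separately. The tiny knowledge base and the function name `connecting_entities` are assumptions made up for this illustration; they do not describe how Yahoo’s search works.

```python
# Toy knowledge base: each entity maps to the set of concepts it is linked to.
# All entries are illustrative assumptions, not real search-engine data.
KNOWLEDGE_BASE = {
    "World War Z": {"brad pitt", "zombies", "movie"},
    "Fight Club": {"brad pitt", "movie"},
    "Night of the Living Dead": {"zombies", "movie"},
}


def connecting_entities(concepts, kb=KNOWLEDGE_BASE):
    """Return the entities linked to every concept in the query, sorted by name."""
    wanted = {concept.lower() for concept in concepts}
    # Keep only entities whose links cover all of the requested concepts.
    return sorted(name for name, links in kb.items() if wanted <= links)


print(connecting_entities(["Brad Pitt", "zombies"]))
```

Keyword matching alone would surface all three films for one clue or the other; requiring an entity to be linked to both clues leaves only the film the searcher is after.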
How do you get better at Internet search?
As I already mentioned, you need some kind of representation of the world, what we call ‘ontologies’: conceptual structures that can tell you that in the world there are persons, that they have an age and age is a number, that they have an affiliation, that they work at some place, with a company. So if you see Yahoo, it’s probably the company Yahoo. If you see it in the context of a person, then the person is probably working at Yahoo. So you’re connecting these kinds of clues that are in the query to what you know in the background, the background knowledge that you have.
You need to do some fairly advanced analytics—you need to be able to recognize entities like Yahoo or Barcelona in a query and then you need to find the right interpretation.
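To make the idea concrete, the following editorial sketch maps a keyword query onto the properties of a person using a very small hand-written ontology. The entity table, the property names `works_at` and `located_in`, and the function `interpret_person_query` are all hypothetical; a real system would rely on far larger knowledge bases and statistical entity recognition.

```python
import re

# Hypothetical gazetteer: known entities and their types.
ENTITY_TYPES = {"yahoo": "Company", "barcelona": "City"}

# Hypothetical ontology rule: which Person property an entity type most plausibly fills.
PERSON_PROPERTY_FOR_TYPE = {"Company": "works_at", "City": "located_in"}


def interpret_person_query(query):
    """Map a keyword query onto properties of a Person, where the clues allow."""
    interpretation = {"type": "Person"}

    # The ontology says a person has an age and that age is a number.
    age = re.search(r"\b(\d{1,3})-year-old\b", query)
    if age:
        interpretation["age"] = int(age.group(1))

    # Recognize known entities and attach them to the matching property.
    for token in re.findall(r"[a-z]+", query.lower()):
        entity_type = ENTITY_TYPES.get(token)
        if entity_type:
            interpretation[PERSON_PROPERTY_FOR_TYPE[entity_type]] = token.capitalize()

    return interpretation


print(interpret_person_query("34-year-old computer scientist working for Yahoo in Barcelona"))
```

The result is a structured description (a person, aged 34, working at Yahoo, located in Barcelona) that could be matched against a knowledge base, instead of a bag of unrelated keywords.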
Searching on the Internet becomes tough when you do not remember exactly what you are looking for. Can search engines help in such instances?
We are getting closer. Some of these services are out there, and at Yahoo, for example, we provide related entities. So if you type in Obama, we try to figure out if it is Barack Obama, the politician, or a town in Japan called Obama. It could be that town as well. But if you type in Obama, it’s more likely that you’re thinking of the US President. So we take that interpretation, and then we try to give you entities related to it, such as other politicians who might be related to Barack Obama. In the case of celebrities, it would be the actors who might be related, and so on.
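The Obama example can be sketched, very roughly, as picking the most probable interpretation of an ambiguous mention and then returning the entities related to it. The probabilities and related-entity lists below are invented for illustration, and `most_likely_entity` is a hypothetical helper, not a description of Yahoo’s actual ranking.

```python
# Hypothetical candidates for an ambiguous mention, with made-up prior
# probabilities (how often searchers mean each one) and related entities.
CANDIDATES = {
    "obama": [
        {"entity": "Barack Obama", "type": "Politician", "prior": 0.95,
         "related": ["Joe Biden", "Hillary Clinton"]},
        {"entity": "Obama, Fukui", "type": "City", "prior": 0.05,
         "related": ["Fukui Prefecture"]},
    ],
}


def most_likely_entity(mention, candidates=CANDIDATES):
    """Return the candidate with the highest prior, or None if the mention is unknown."""
    options = candidates.get(mention.lower(), [])
    return max(options, key=lambda candidate: candidate["prior"], default=None)


best = most_likely_entity("Obama")
if best is not None:
    print(best["entity"], "->", best["related"])
```

A production system would also use the rest of the query and the user’s context to adjust these priors; this sketch only captures the “most likely interpretation first” step described above.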
What’s the road map for search at Yahoo?
It’s an area of active investment, so I can tell you that we will definitely be continuing our partnership with Bing, working with them in collaboration to improve the quality, improve the ranking, improve the ads, so that the ads are also working in collaboration. Mobile is a big emphasis at the moment. As you know, the world is moving to mobile—Marissa in particular is very keen on educating our engineering workforce to be mobile enabled. The goal is to get 50% of our engineers to be educated in mobile technologies and mobile products. And some of the other products that are coming out need to be very specific to mobile.
What’s changed at Yahoo in all the CEO transitions over the past few years? What’s changed with Mayer coming on board?
There is incredible energy and incredible motivation coming in. She is very much a technical person, so she is very product focused, and I think that makes people connect with her much more than with a…business-oriented person. She is very much aware of the work we do. And she brought a number of interesting new ideas that motivated people to come to work, to want to work, and to want to be the best, not the second best.