Mumbai: Last week, Google Inc. introduced a major improvement to its search algorithm to help it better determine when to give users more up-to-date and relevant results.

Delivering results: Google’s Amit Singhal (left) and Rajan Patel

An algorithm is a set of mathematical steps that solves a problem. Google Fellow is a designation the company reserves for its elite master engineers; Amit Singhal, who announced the change, holds the title for his work on the “ranking algorithm”.

Tweaking algorithms to yield better search results is not new to Google. Over the past decade, the owner of the world’s most popular search engine has introduced many innovations, among them PageRank. Named after Google co-founder and chief executive Larry Page, PageRank works by counting the number and quality of links to a page to arrive at a rough estimate of how important the website is.
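In rough terms, the computation looks like the Python sketch below, which uses an invented three-page link graph and the commonly cited damping factor of 0.85; it illustrates the published idea rather than Google’s production code.

    # Simplified PageRank sketch: each page's score is repeatedly
    # redistributed along its outgoing links until the scores stabilise.
    # A toy illustration only, not Google's production system.

    def pagerank(links, damping=0.85, iterations=50):
        """links maps each page to the list of pages it links to."""
        pages = list(links)
        n = len(pages)
        rank = {page: 1.0 / n for page in pages}
        for _ in range(iterations):
            new_rank = {page: (1 - damping) / n for page in pages}
            for page, outgoing in links.items():
                if outgoing:
                    # Distribute this page's rank across its links.
                    share = rank[page] / len(outgoing)
                    for target in outgoing:
                        new_rank[target] += damping * share
                else:
                    # A dangling page spreads its rank evenly.
                    for target in pages:
                        new_rank[target] += damping * rank[page] / n
            rank = new_rank
        return rank

    # A three-page toy web: A and C both link to B, so B scores highest.
    print(pagerank({"A": ["B"], "B": ["C"], "C": ["A", "B"]}))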

In February, it introduced another algorithmic improvement, christened “Panda”, to improve user experience by catching and demoting “low-quality” sites that did not provide useful original content or otherwise add much value, while giving better rankings to high-quality sites: “those with original content and information such as research, in-depth reports and insightful analysis”.

Singhal, along with numerous other Google scientists, analysts and engineers, continually works on refining search, and with good reason. The search firm answers more than one billion questions a day from people in 181 countries and 146 languages. It indexes millions of Web pages, but the challenge is to return only the most relevant and most recent results, depending on the context of the query. Moreover, in February the company had a nearly 90% share of the global search engine market, according to research firm StatCounter.

Microsoft Corp.’s Bing search engine garnered a mere 4.37% share and was marginally ahead of Yahoo at 3.93%.

To maintain its huge lead, Google has to ensure that its searches remain relevant. With this in mind, the company’s engineers have been making around 500 changes to its search algorithms every year, or at least one change a day, according to Singhal.

Google’s latest change builds on its “Caffeine” Web indexing system, introduced in 2010, which allows the company to crawl and index the Web for fresh content quickly. Different searches have different “freshness” needs: a query about a breaking news story calls for pages published hours ago, while a query about a historical event does not.
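One way to picture freshness-aware ranking is the hypothetical Python sketch below, which blends a page’s relevance score with a recency boost whose weight depends on how time-sensitive the query is; every number in it is invented for illustration.

    import math

    # Hypothetical sketch of freshness-aware scoring: a page's relevance
    # is blended with a recency boost that matters more for time-sensitive
    # queries. The weights and half-life below are invented for
    # illustration; they are not Google's actual parameters.

    def freshness_score(relevance, page_age_days, query_freshness_weight,
                        half_life_days=2.0):
        # Exponential decay: the recency boost halves every half-life.
        recency = math.exp(-math.log(2) * page_age_days / half_life_days)
        return ((1 - query_freshness_weight) * relevance
                + query_freshness_weight * recency)

    # A news-like query (weight 0.7) rewards the fresh page; an evergreen
    # query (weight 0.1) barely notices that a page is a year old.
    print(freshness_score(0.8, 0.5, 0.7))   # fresh page, news query: ~0.83
    print(freshness_score(0.9, 400, 0.7))   # stale page, news query: ~0.27
    print(freshness_score(0.9, 400, 0.1))   # stale page, evergreen: ~0.81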

Algorithmic change

Changes to algorithms at Google undergo extensive quality evaluation before being released in the public domain. A typical algorithmic change begins as an idea from an engineer. It is then implemented on a test version of Google. “Before and after” results pages are generated and presented to “raters”, people trained to evaluate search quality. If the feedback is positive, Google may run what it terms a “live experiment”, also called a “sandbox”, in which it tries out the updated algorithm on a very small percentage of Google users to see, for instance, whether searchers click the new top result more often.
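A live experiment of that kind can be pictured with the following Python sketch, in which users are deterministically hashed into buckets so that a stable 1% see the new ranking; the bucket counts and function names are assumptions made for illustration, not Google internals.

    import hashlib

    # Sketch of how a small-percentage "live experiment" can be wired up.
    # Bucket counts, percentages and function names are assumptions made
    # for illustration; the article does not describe Google's internals.

    NUM_BUCKETS = 1000
    EXPERIMENT_BUCKETS = 10  # route 1% of users to the new algorithm

    def in_experiment(user_id: str) -> bool:
        # Hashing keeps a user's assignment stable across visits.
        digest = hashlib.sha256(user_id.encode()).hexdigest()
        return int(digest, 16) % NUM_BUCKETS < EXPERIMENT_BUCKETS

    def serve_results(user_id, query, old_ranker, new_ranker):
        # Record which arm served the query so click-through on the top
        # result can later be compared between the two groups.
        arm = "experiment" if in_experiment(user_id) else "control"
        ranker = new_ranker if arm == "experiment" else old_ranker
        return arm, ranker(query)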

In 2010, Google ran 13,311 precision evaluations to test whether potential algorithm changes had a positive or negative impact on the precision of its results. Based on all this experimentation, evaluation and analysis, it introduced 516 improvements to search.
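Precision here means, roughly, the share of returned results that raters judge relevant. The Python sketch below shows a toy before/after comparison of that kind, with made-up rater verdicts.

    # Toy version of a before/after precision evaluation. Trained raters
    # mark each returned result relevant or not; precision@k is the share
    # of relevant results among the top k. The judgments are made up.

    def precision_at_k(judgments, k=10):
        top = judgments[:k]
        return sum(top) / len(top) if top else 0.0

    before = [True, True, False, True, False]  # verdicts on old ranking
    after = [True, True, True, True, False]    # verdicts on new ranking

    delta = precision_at_k(after, 5) - precision_at_k(before, 5)
    print(f"precision@5 change: {delta:+.2f}")  # +0.20, an improvement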

The process is highly automated. In a few cases, Google retains manual controls to address spam and security concerns such as malware and viruses. Google also manually intervenes in search results for legal reasons, for instance to remove child sexual abuse content (child pornography) or copyright-infringing material.

Despite these improvements, some researchers have raised questions over the efficacy of Google’s searches. According to an Experian Hitwise report released in August, more than 81.3% of searches on Yahoo Search resulted in a visit to a website, with Bing a close second at 80.6%. By contrast, Google’s success rate was significantly lower, at 67.6%.

The report partly attributed Google’s “not-so-accurate" performance to the massive number of library books in its database. To scan millions of older works, Google and its library partners used optical character recognition (OCR) programmes “that are not 100% foolproof—especially when processing old texts typeset in archaic fonts or with foreign-language characters".

Due to OCR errors, noted the report, Google Books contains a huge number of word misidentifications that can lead Internet users down false trails, especially when users conduct “one-word searches".
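The mechanism is easy to reproduce in miniature. In the Python sketch below, a single misread word sends a one-word search down exactly such a false trail; the documents and words are invented.

    # Toy inverted index showing the false trail the report describes: a
    # scanned book whose text was misread ("modern" -> "modem") is indexed
    # under the wrong word. Titles and words here are invented.

    index = {}  # word -> set of document ids

    def add_document(doc_id, text):
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)

    add_document("book-1843", "a treatise on modem agriculture")  # OCR slip
    add_document("manual-2004", "configuring a dial-up modem")

    print(index.get("modem"))   # both documents: the old book is a false hit
    print(index.get("modern"))  # None: the book's real content is unfindable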

Patel said he’s “not too sure how the report arrived at this conclusion".

leslie.d@livemint.com
