Yahoo open sources its Big Data search technology
Months after being acquired by Verizon, Yahoo has decided to open source its Big Data processing and serving engine called Vespa. The technology was used exclusively for search queries on key Yahoo products, including Yahoo News and Flickr, among others.
Verizon-owned Oath, Yahoo's parent company, claims that Vespa processes and serves content and ads almost 90,000 times every second, with latencies in the tens of milliseconds. It is also said to handle keyword and image searches at very large scale, serving a few hundred queries per second across tens of billions of images.
Developer teams can use Vespa to select content through SQL-like queries and text search, organise matches and generate data-driven pages, as well as write data in real time. The technology is capable of distributing data and computation over several machines at once.
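To give a feel for what an SQL-like query against Vespa might look like, here is a minimal Python sketch that only builds a search URL and sends no request. It assumes a Vespa instance listening on localhost port 8080, the /search/ endpoint with its yql and hits parameters, and a hypothetical document field named title; none of these details come from the announcement, and the helper name build_search_url is purely illustrative.

```python
from urllib.parse import urlencode


def build_search_url(base_url, field, term, hits=10):
    """Build a search URL carrying an SQL-like (YQL) query string.

    Assumptions for illustration: the endpoint path, the field name and
    the parameter names are not taken from the article.
    """
    # Escape double quotes so the term stays inside the quoted literal.
    safe_term = term.replace('"', '\\"')
    # SQL-like text match: every document whose field contains the term.
    yql = f'select * from sources * where {field} contains "{safe_term}";'
    # urlencode percent-encodes spaces, quotes and the asterisks.
    params = urlencode({"yql": yql, "hits": hits})
    return f"{base_url}/search/?{params}"


# Hypothetical usage against a local instance; nothing is sent here.
url = build_search_url("http://localhost:8080", "title", "flickr")
print(url)
```

Passing the resulting URL to any HTTP client would run the query, provided an application with a matching field has actually been deployed.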
“By releasing Vespa, we are making it easy for anyone to build applications that can compute responses to user requests, over large data sets, in real-time and at Internet scale — capabilities that up until now have been within the reach of only a few large companies,” Vespa’s distinguished architect Jon Bratseth wrote in a blog post. Vespa can be run on-premises or in the cloud and is distributed both as Docker images and as RPM packages. Its code is available in a GitHub repository along with detailed documentation.