BitFunnel is a library for high-performance full-text search at internet scale. It is based on a probabilistic algorithm that identifies and ranks documents according to queries involving keywords, phrases, and mathematical expressions. It powers much of the production traffic at Bing.
We are interested in advancing the academic discussion around how systems like BitFunnel are designed, implemented, and deployed across many thousands of nodes.
To this end, we have:
Released the complete BitFunnel library code, unabridged, as it exists in production today, under the MIT license. There is no expectation that this code snapshot will build or run outside of Bing. Its primary value is for reference – a historical document showing the actual code running in production at one point in time.
Started a new open source project, a first attempt at translating the core ideas of BitFunnel into a general-purpose library that can be used outside the Bing data centers.
Started a blog covering the theory behind BitFunnel, the rationale for important design decisions, and the lessons we have learned, both when we first created the system and today as we bring it to open source.
If this is your first time visiting, we recommend starting with the first couple of blog entries (On the Road to Open Source).