The post introducing the new search solution implemented at LinkedIn contains a pretty good list of the requirements for a good search tool, presented as the showstoppers hit with the previous solution:
- Rebuilding a complete index is extremely difficult
- Live updates are at an entity granularity
- Scoring is inflexible
- Too many small open source components
On top of these, add flexibility and extensibility, which are important for every critical component, but much more so for search, which depends so heavily on format, behavior, and fine-tuning.
The rest of the post dives into some details of the new solution, code-named Galene, which is a distributed layer of extensions on top of Lucene.
Original title and link: LinkedIn’s new search platform