eBay: All content tagged as eBay in NoSQL databases and polyglot persistence
Thursday, 28 March 2013
Graph Based Recommendation Systems at eBay
Slide deck from eBay explaining how they implemented a graph-based recommendation system on top of (surprise!) Cassandra rather than a graph database.
Original title and link: Graph Based Recommendation Systems at eBay (©myNoSQL)
Monday, 16 July 2012
eBay's Cassandra Data Modeling Best Practices
Jay Patel (architect at eBay):
Our Cassandra deployment is not huge, but it’s growing at a healthy pace. In the past couple of months, we’ve deployed dozens of nodes across several small clusters spanning multiple data centers. You may ask, why multiple clusters? We isolate clusters by functional area and criticality. Use cases with similar criticality from the same functional area share the same cluster, but reside in different keyspaces.
This first post is focused on two old techniques that have been applied even with relational databases:
- model data around query patterns
- de-normalize and duplicate for read performance.
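As a rough illustration of both techniques (not taken from Jay Patel's post), here is a minimal Python sketch in which plain dictionaries stand in for Cassandra column families. The names `likes_by_user`, `likes_by_item`, and `record_like` are hypothetical, and a real deployment would go through a Cassandra client rather than in-memory dicts.

```python
from collections import defaultdict

# Plain dicts stand in for two Cassandra column families (hypothetical names).
# Each one is keyed by the value that one specific query looks up.
likes_by_user = defaultdict(dict)   # query: "which items does this user like?"
likes_by_item = defaultdict(dict)   # query: "which users like this item?"


def record_like(user_id, user_name, item_id, item_title):
    """Write the same fact twice, once per query pattern."""
    # Denormalize: copy the item title next to the user, and the user name
    # next to the item, so each read is a single key lookup with no join.
    likes_by_user[user_id][item_id] = item_title
    likes_by_item[item_id][user_id] = user_name


def items_liked_by(user_id):
    """Answer the first query with one lookup."""
    return sorted(likes_by_user.get(user_id, {}).values())


def users_who_like(item_id):
    """Answer the second query with one lookup."""
    return sorted(likes_by_item.get(item_id, {}).values())


record_like("u1", "alice", "i1", "iPad")
record_like("u2", "bob", "i1", "iPad")
record_like("u1", "alice", "i2", "Kindle")
```

The trade-off is visible in `record_like`: every write touches two structures and duplicates names and titles, in exchange for reads that never need to combine data from more than one place.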
via: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Monday, 7 May 2012
eBay, Wal-Mart Search for Revved-Up Search Engines
Reuters reporting about eBay and Wal-Mart’s work to improve their search engines:
The search engine project takes time because eBay’s online marketplace has so much variable information from millions of listings that are described differently by each seller - something known as unstructured data in the tech world.
This is not much of a NoSQL story, but there’s something I’m reading between the lines: when talking about creating better search solutions, making search work at scale is not mentioned, which implies it is considered a solved problem. The focus is on handling unstructured data and creating better relevancy algorithms.
I have no details about the architecture of the new version of eBay search, but I have found this diagram of eBay’s Voyager in a slide deck by Dan Pritchett from around 2007:

via: http://www.reuters.com/assets/print?aid=USBRE84319420120504
Thursday, 6 October 2011
eBay Exec Urges Hadoop Community...
Darren Bruntz, senior director of e-commerce at eBay:
I think we will stay on our setup of the three platforms for a few more years, but Hadoop could be a more compelling offering if the open source community and its contributors got some more focus and energy, as you would have a whole community of people working on new tools and features.

Cumulative Lines of Code Contributed to Apache Hadoop Trunk Timeline through June 2011
Where is eBay in this list of Hadoop contributors?
via: http://www.computing.co.uk/ctg/news/2114546/ebay-exec-urges-hadoop-community-focused
Thursday, 22 September 2011
Big Data Is Going Mainstream: Facebook, Yahoo!, eBay, Quantcast, and Many Others
Shawn Rogers has a short but compelling list of Big Data deployments in his article Big Data is Scaling BI and Analytics. This list also shows that even if there are some common components like Hadoop, there are no blueprints yet for dealing with Big Data.
- Facebook: Hadoop analytic data warehouse, using HDFS to store more than 30 petabytes of data. Their Big Data stack is based only on open source solutions.
- Quantcast: 3,000 core, 3,500 terabyte Hadoop deployment that processes more than a petabyte of raw data each day
- University of Nebraska-Lincoln: 1.6 petabytes of physics data Hadoop cluster
- Yahoo!: 100,000 CPUs in 40,000 computers, all running Hadoop. Also running a 12 terabyte MOLAP cube based on Tableau Software
- eBay: has 3 separate analytics environments:
  - 6PB data warehouse for structured data and SQL access
  - 40PB deep analytics (Teradata)
  - 20PB Hadoop system to support advanced analytic workload on unstructured data
Tuesday, 2 August 2011
eBay Deploys 100TB of Flash Storage
eBay is a prime example of the benefits of flash. Nimbus Data CEO Thomas Isakovich told me that eBay had only 2.5TB of flash installed six months ago before recently upgrading to 100TB. Within the PayPal division, where Nimbus is deployed, Isakovich said eBay has cut power costs by 78 percent, cut its rack space by half and is able to better meet performance demand overall by spinning up virtual machines even faster.
This probably marks the start of a new trend where flash is used not only for storing hot data.
via: http://gigaom.com/cloud/ebay-deploys-100tb-of-flash-storage/
Saturday, 4 June 2011
Hadoop at eBay
Anil Madan[1] presenting on Hadoop at eBay:
The talk will illustrate how Hadoop has become a critical center piece of infrastructure for eBay, running on thousands of servers. I will also discuss how it fuels our derived data pipeline which in turn affects just about all our services. Attendees will understand how we have integrated Hadoop into our existing data warehouse and how we are leveraging components of the ecosystem like HBase, Pig, and Hive for different research and production use cases.
Saturday, 13 November 2010
Videos from Hadoop World
There was one NoSQL conference that I’ve missed and I was really pissed off: Hadoop World. Even if I’ve followed and curated the Twitter feed, resulting in Hadoop World in tweets, the feeling of not being there made me really sad. But now, thanks to Cloudera I’ll be able to watch most of the presentations. Many of them have already been published and the complete list can be found ☞ here.
Based on the Twitter activity on that day, I’ve selected below the ones that seemed to have generated the most buzz. The list contains names like Facebook, Twitter, eBay, Yahoo!, StumbleUpon, comScore, Mozilla, AOL. And there are quite a few more …
Monday, 8 November 2010
eBay, Hadoop, HBase
From ☞ DBMS2:
eBay sees Hadoop as an interesting tool for certain special purposes:
- eBay likes Hadoop for certain tasks such as image analysis.
- eBay doesn’t like Hadoop for anything that requires data movement, such as a join.
- Similarly, eBay doesn’t like HBase.
But based on reports from Hadoop World, it looks like eBay’s usage of Hadoop is quite wide:
- eBay had a 4-node cluster in 2007, a 28-node and a 10-node cluster in 2009, and a 500+ node cluster in 2010
- 4,200 processors and 4.3 PB of data on CentOS 1U datanodes with 48 GB of RAM
- the production cluster will have 8,500 processors and 16 PB
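On the point about joins and data movement: one common way to join in MapReduce is a reduce-side join, where every record from both inputs is shuffled so that rows sharing a key land on the same reducer. The Python sketch below simulates that on a single machine. It is an illustration only, not something from DBMS2 or eBay, and it assumes that this shuffle is the kind of data movement being referred to; `NUM_REDUCERS`, `partition`, and `reduce_side_join` are hypothetical names.

```python
from collections import defaultdict

NUM_REDUCERS = 3  # hypothetical number of reducers for the simulation


def partition(key):
    """Pick the reducer for a key (deterministic stand-in for a hash partitioner)."""
    return sum(key.encode()) % NUM_REDUCERS


def reduce_side_join(listings, sellers):
    """Join (seller_id, title) rows with (seller_id, name) rows.

    Returns the joined rows and the number of records that were shuffled.
    """
    # Map + shuffle: every record from both inputs is tagged with its side
    # and sent to the reducer that owns its key. This is the data movement.
    reducers = [defaultdict(lambda: {"L": [], "S": []}) for _ in range(NUM_REDUCERS)]
    shuffled = 0
    for seller_id, title in listings:
        reducers[partition(seller_id)][seller_id]["L"].append(title)
        shuffled += 1
    for seller_id, name in sellers:
        reducers[partition(seller_id)][seller_id]["S"].append(name)
        shuffled += 1

    # Reduce: each reducer pairs up the two sides for the keys it received.
    joined = []
    for reducer in reducers:
        for seller_id, sides in reducer.items():
            for name in sides["S"]:
                for title in sides["L"]:
                    joined.append((seller_id, name, title))
    return sorted(joined), shuffled


listings = [("s1", "camera"), ("s2", "guitar"), ("s1", "lens")]
sellers = [("s1", "alice"), ("s2", "bob")]
rows, moved = reduce_side_join(listings, sellers)
```

In this simulation `moved` equals the total number of input records, which is the point: nothing is joined until both inputs have been moved in full.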