


DataSift: All content tagged as DataSift in NoSQL databases and polyglot persistence

DataSift: The Story So Far and Focus for the Future

DataSift, the company with an impressive Big Data architecture, has raised another $7.2 million to accelerate its growth:

Maybe our biggest surprise to me is the breadth of use-cases that we’re seeing companies use our Social Data platform for. Social Media Monitoring and “Breaking News” are obvious applications that companies build.  But we’re seeing everything from Business Intelligence, Stock-Trading models, Public-health applications and social-TV guides being built with DataSift.  A big part of Nick’s vision has always been to democratize the social data market – opening it to both small and large companies. We are starting to see this play-out.

I don’t think anyone would contradict me when I say that these use cases are just the tip of the iceberg when it comes to Big Data.

Original title and link: DataSift :The Story So Far and Focus for the Future (NoSQL database©myNoSQL)


Big Data Investment Network Map

A very interesting visualization of some of the companies in the Big Data market, connected through their venture capital and investment firms, by Benedikt Koehler and Joerg Blumtritt over at the Beautiful Data blog:

Big Data Investment Network Map


There’s only one company I couldn’t find on this map: Hortonworks.

Original title and link: Big Data Investment Network Map (NoSQL database©myNoSQL)

DataSift Using MySQL, HBase, Memcached to Deal With Twitter Firehose

A great new article from Todd Hoff dissecting the DataSift architecture:

DataSift architecture


In terms of data stores, the DataSift architecture includes:

  • MySQL (Percona server) on SSD drives
  • HBase cluster (currently ~30 Hadoop nodes, 400TB of storage)
  • Memcached (cache)
  • Redis (still used for some internal queues, but probably going to be retired soon)

Leave whatever you were doing and go read it now.

Original title and link: DataSift Using MySQL, HBase, Memcached to Deal With Twitter Firehose (NoSQL database©myNoSQL)

DataSift PubSub: From Redis to Kafka and 0mq

DataSift moves from Redis PubSub to Kafka and 0mq:

Kafka is still a young project, but it’s maturing fast, and we’re confident enough to use it in production (as a matter of fact, we’ve been using it for months now) in front of our HBase cluster and to collect monitoring events sent from all our internal services. We chose Kafka especially for its persistent storage (which is essentially a partitioned binary log), but we plan to do some analytics via its support for Hadoop soon. And its distributed nature (coordination between consumers and brokers is done via Zookeeper) makes it very appealing too.
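
For readers who haven't met the "partitioned binary log" idea before, here is a minimal in-memory sketch of the concept in Python. It is not Kafka's API and not DataSift's code: the names PartitionedLog and Consumer, the CRC32-based partitioning, and the in-memory lists are all assumptions made for illustration. A real broker persists the log on disk and, as the quote says, coordinates consumers and brokers through Zookeeper.

```python
import zlib
from collections import defaultdict


class PartitionedLog:
    """Toy in-memory stand-in for a partitioned, append-only log (hypothetical)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, message):
        # The same key always maps to the same partition,
        # so ordering is preserved per key.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append(message)
        offset = len(self.partitions[index]) - 1
        return index, offset

    def read(self, partition, offset, max_messages=10):
        # Reading never deletes anything: messages stay in the log.
        return self.partitions[partition][offset:offset + max_messages]


class Consumer:
    """Keeps its own read position (offset) for each partition."""

    def __init__(self, log):
        self.log = log
        self.offsets = defaultdict(int)

    def poll(self, partition, max_messages=10):
        batch = self.log.read(partition, self.offsets[partition], max_messages)
        self.offsets[partition] += len(batch)
        return batch
```

Because messages are kept after they are read, two consumers can work through the same partition independently, and a consumer can rewind its offset to replay events. That is the main difference from a fire-and-forget PubSub channel, where a message that nobody was listening for is gone.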

Original title and link: DataSift PubSub: From Redis to Kafka and 0mq (NoSQL database©myNoSQL)