
Advantages of developing NoSQL applications on .NET platforms using FatDB [sponsor]

Words from this week’s sponsor, FatCloud:


FatDB is a full implementation of NoSQL databases for Windows .Net development, extending database functionality by integrating a Map Reduce work queue, a file management system, a high speed cache, and application services. FatDB is therefore uniquely suited as a platform for constructing applications that are scalable, reliable, responsive to market changes, and cost effective. FatDB enables powerful, scalable applications, providing the agility and performance required through:

  • Reduces complexity. Applications are developed faster.
  • Increases elasticity. Applications can quickly respond to shifts in demands.
  • Portability. Applications can move to the cloud and back.

From these operating factors, FatDB is ideally suited for:

  • Mobile. Great when trying to accommodate unpredictable usage, which requires applications to be elastic to cope with changes in demand.
  • Financial Services. Financial applications requiring real-time data access with extremely high availability.
  • E-Commerce. Provides flexible data structures to capitalize on new market opportunities.
  • Manufacturing. Systems must respond to peak production, providing insight into trends and feedback mechanics.

Simply, FatDB can help you develop NoSQL applications in .Net with less effort and significantly less cost, higher quality and performance, for demanding cloud-based applications. Download a free Developer’s edition at FatCloud.

Original title and link: Advantages of developing NoSQL applications on .NET platforms using FatDB [sponsor] (NoSQL database©myNoSQL)


Banks and the Eternal Consistency Example or What trumps consistency

Todd Hoff extracts and expands on some thoughts about BASE vs ACID from Eric Brewer’s NoSQL: Past, Present, Future published on InfoQ:

Consistency it turns out is not the Holy Grail. What trumps consistency is:

  • Auditing
  • Risk Management
  • Availability

But the cornerstone of the availability vs consistency conversation is:

Availability correlates with revenue and consistency generally does not.

✚ Over time Michael Stonebraker has been the most prominent supporter of exactly the opposite argument.

✚ Remember Emin Gün Sirer’s The NoSQL Partition Tolerance Myth? He used the bank example too.


via: http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on-why-banks-are-base-not-acid-availability.html


Wikipedia Adopts MariaDB

The technical details of Wikipedia’s migration from MySQL to MariaDB:

As a read-heavy site, Wikipedia aggressively uses edge caching. Approximately 90% of pageviews are served entirely from the edge while at the application layer, we utilize both memcached and redis in addition to MySQL. Despite that, the MySQL databases serving English Wikipedia alone reach a daily peak of ~50k queries/second. Most are read queries served by load-balanced slaves, depending on consistency requirements. 80% of the English Wikipedia query load (up to 40k qps) are typically handled by just two database servers at any given time. Our most common query type (40% of all) has a median execution time of ~0.2ms and a 95th percentile time of ~50ms. To successfully use MariaDB in production, we need it to keep up with the level of performance obtained from Facebook’s MySQL fork, and to behave consistently as traffic patterns change.

As you can see in this post, the only “political” point made is tucked in among the technical reasons:

Equally important, as supporters of the free culture movement, the Wikimedia Foundation strongly prefers free software projects; that includes a preference for projects without bifurcated code bases between differently licensed free and enterprise editions. We welcome and support the MariaDB Foundation as a not-for-profit steward of the free and open MySQL related database community.

A slightly different take from the one in Wikipedia Migrates to MariaDB.


via: https://blog.wikimedia.org/2013/04/22/wikipedia-adopts-mariadb/


MySQL in the Cloud: Discontinuing of Xeround Cloud Database Public Service

Cloud and MySQL related:

We are deeply sorry to announce that Xeround’s public cloud offering will be discontinued soon. All Xeround FREE database instances will be terminated on May 8th, and the paid plans terminated on May 15th.

This was announced on May 1st.

✚ This only means more for Amazon RDS.


via: http://xeround.com/blog/2013/05/discontinuing-of-xeround-cloud-database-public-service


Microsoft Azure Sales Top $1 Billion Challenging Amazon

Last week I saw some guesstimates of Amazon Web Services' revenue. Now Bloomberg has posted an article putting the revenue of Microsoft Azure and related programs (?) at $1 billion.

Interesting numbers:

  • market share: Amazon Web Services 71%, Microsoft Azure 20%
  • Azure grew 48% in the last 6 months
  • Gartner estimates the infrastructure segment of the cloud market at $6.17 billion in 2012, growing to $30.6 billion in 2017
  • Gartner estimates the total cloud market at $108.9 billion in 2012, growing to $237.2 billion in 2017. (nb: I find this one weird as it includes online advertising and other less-cloudy-services-imo).

Amazon hasn’t given many details about the AWS platform, except 3 numbers:

  1. number of objects stored in S3. This has been doubling every year for the last 4 years
    1. Q4 2012: 1.3 trillion
    2. Q3 2011: 566b
    3. Q4 2010: 262b
    4. Q4 2009: 102b
    5. Q4 2008: 40b
    6. Q4 2007: 14b
    7. Q4 2006: 2.9b
  2. number of requests per second handled by AWS
  3. number of EMR clusters (?) spun up
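The "doubling every year" claim can be checked against the listed S3 numbers with a few lines of Python. This is only a sketch: the counts are copied from the list above (in billions), while the variable and function names are made up for illustration. Note that the last interval runs from Q3 2011 to Q4 2012, so it covers five quarters rather than one year.

```python
# S3 object counts in billions, copied from the list above.
# The 2011 data point is for Q3, not Q4, so the last interval is 5 quarters.
s3_objects_billions = [
    ("Q4 2006", 2.9),
    ("Q4 2007", 14),
    ("Q4 2008", 40),
    ("Q4 2009", 102),
    ("Q4 2010", 262),
    ("Q3 2011", 566),
    ("Q4 2012", 1300),
]


def growth_factors(series):
    """Return (from_label, to_label, ratio) for each consecutive pair."""
    return [
        (a_label, b_label, b / a)
        for (a_label, a), (b_label, b) in zip(series, series[1:])
    ]


factors = growth_factors(s3_objects_billions)
for start, end, ratio in factors:
    print(f"{start} -> {end}: x{ratio:.2f}")
```

Every consecutive ratio comes out above 2, so the listed figures are consistent with "at least doubling" between data points.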

According to some slides from last October/November:

  1. S3 stored over 1.3 trillion objects
  2. AWS handles over 830k requests/s
  3. 3.7 million EMR clusters spun up since 2010

While I don’t have any data about RDS and Dynamo, it would be great if Microsoft would release any details about Azure.

✚ If AWS has a market share of 71% and Azure 20%, that leaves Google plus others with 9%. Makes me wonder how accurate this data is.


via: http://www.bloomberg.com/news/2013-04-29/microsoft-azure-sales-top-1-billion-challenging-amazon.html


Wikipedia Migrates to MariaDB... but facts are facts

Jon Buys:

There was, and continues to be, concern over Oracle’s treatment of the open source competitor to their own Oracle database. I personally have wondered what motivation, if any, Oracle has to maintain MySQL. They may simply be milking the revenue stream created by MySQL AB until the well goes dry. Since MariaDB is surpassing MySQL in performance and community goodwill, that day may come sooner rather than later.

A couple of little-known things:

  1. Oracle has been home to InnoDB since 2005. InnoDB was and continues to be the default, recommended engine for MySQL, both before and after Oracle acquired MySQL through Sun Microsystems.
  2. Oracle has been home to Sleepycat’s Berkeley DB since 2006. These products are definitely not dead. Community-wise, maybe Oracle hasn’t put much effort into extending them.

Facts are facts.


via: http://ostatic.com/blog/wikipedia-migrates-to-mariadb


US patent office embraces MarkLogic

It looks like the US Patent and Trademark Office will, at least, get some better search functionality across its database:

Now, MarkLogic, a NoSQL database provider specializing in XML data services, is working with the US Patent and Trademark Office (USPTO) in an effort to make applications easier to complete, and make the review process easier to do. XML is one way to render different types of information like word documents, PDFs, and images into a central database and present the information back to users. Previously, if an individual or company wanted to apply for a patent or trademark, they had to look through paper manuals and applications to understand how to present their unique inventions for review.

I just hope MarkLogic can implement some sort of triggers or rules that would reject completely unreasonable patents.


via: http://civsourceonline.com/2013/04/25/us-patent-office-embraces-big-data/


Hadoop Drives Down Costs

Darryl K. Taft reports on the experience of using Hadoop at UC Irvine Medical Center:

Because they were bleeding money, the team wanted a cost-effective solution. “Our target was $500 per terabyte. We were at $100,000 per terabyte with the old system,” Peterson said. “With our Hadoop cluster, we’re now at $900 per terabyte.”

How are these costs calculated?

  1. Fixed costs: hardware, any one-time licenses
  2. Recurring costs: hardware replacement, energy, HR

Is this all?
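To make the question concrete, here is a minimal sketch of how a per-terabyte figure shifts once recurring costs are included. Everything in it is hypothetical: the function name, the cost categories, and all the numbers are illustrative assumptions and do not come from the article or from UC Irvine.

```python
def cost_per_terabyte(fixed_costs, yearly_recurring_costs, years, usable_terabytes):
    """Total cost of ownership over `years`, divided by usable capacity."""
    if usable_terabytes <= 0:
        raise ValueError("usable_terabytes must be positive")
    total = sum(fixed_costs.values()) + years * sum(yearly_recurring_costs.values())
    return total / usable_terabytes


# Hypothetical inputs, for illustration only.
fixed = {"hardware": 150_000, "one_time_licenses": 0}
recurring = {"hardware_replacement": 15_000, "energy": 10_000, "staff": 60_000}

# Same cluster, costed with and without the recurring items.
hardware_only = cost_per_terabyte(fixed, {}, years=3, usable_terabytes=300)
full = cost_per_terabyte(fixed, recurring, years=3, usable_terabytes=300)

print(f"fixed costs only: ${hardware_only:,.0f}/TB")
print(f"with recurring costs over 3 years: ${full:,.0f}/TB")
```

With these made-up inputs the same cluster looks much cheaper per terabyte when only fixed costs are counted, which is why it matters what a quoted per-terabyte number actually includes.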


via: http://www.eweek.com/print/cloud/hadoop-drives-down-costs-drives-up-usability-with-sql-convergence/


Cloudera Impala 1.0 Release Notes and A Couple of Questions

This is what I’ve been looking for since posting about Impala 1.0: the release notes. From the new features list:

  • support for ALTER TABLE
  • REFRESH for a single table
  • Hints for specifying particular join strategies
  • Dynamic resource management, allowing high concurrency for Impala queries

Question: if I remember correctly Impala uses a single process on each machine to execute queries.

  1. is it multi-threaded?
  2. does it do any memory/CPU management so one query is not completely exhausting any of these resources?
  3. what happens to the queries that are executing when this process fails?



Cloudera Impala Brings SQL Querying To Hadoop

InformationWeek about today’s Impala 1.0 release:

Impala supports direct querying of data in the Hadoop Distributed File System (HDFS) and HBase (NoSQL database) indexes, and Cloudera claims it’s 3X to 30X faster than Hive. Beta customers report results that are falling into that range. Six3 Systems, for example, a systems integrator serving federal agencies, has seen at least 14X faster querying than Hive, according to analytics developer Wayne Wheeles.


via: http://www.informationweek.com/big-data/news/software/information-management/cloudera-impala-brings-sql-querying-to-h/240153861


Impala 1.0 - That was fast

Cloudera announces Impala 1.0 GA release.

That was fast. I guess this is one of the (small) advantages of having Hortonworks working on Stinger, Pivotal on HAWQ, and Qubole offering Hive, Pig, and Sqoop as a service.



Redis on Windows Stress Tests

Claudio Caldato [1] reports on the progress the Microsoft team is making towards releasing a stable, (very well) tested version of Redis for Windows:

In phase I of our stress testing, we put Redis on Windows through various tests with execution times ranging from 1 to 16 days, and configurations ranging from a simple single-master setup to more complex configurations such as the one shown below, with one master and four replicas.

The team also published the details of their stress tests.


  1. Claudio Caldato is Principal Program Manager Lead on the Microsoft Open Technologies team.


via: http://blogs.msdn.com/b/interoperability/archive/2013/04/22/redis-on-windows-stable-and-reliable.aspx