


SQL Server: All content tagged as SQL Server in NoSQL databases and polyglot persistence

Count Distinct Compared on Top 4 SQL Databases

Performance and query plans for count distinct:

Truly, the gauntlet had been thrown, and we are here to answer. We ran the queries on Postgres 9.3, MySQL 5.6, SQL Server 2012 SE 11.0, and Oracle SE1 11.2.

count distinct performance

Interestingly, though not unexpectedly, the query plans for SQL Server and Oracle were identical. What’s intriguing is that, with a more “naïve” query plan, both outperformed MySQL and PostgreSQL.
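The excerpt does not reproduce the benchmarked queries. As a rough sketch of the kind of query being compared, here is a count distinct written two ways: a direct COUNT(DISTINCT ...) and an equivalent that de-duplicates in a subquery first. It uses Python's built-in sqlite3 module as a stand-in engine. The table and column names (visits, dashboard_id, user_id) are hypothetical, and SQLite's planner says nothing about how the four benchmarked databases behave.

```python
import sqlite3


def count_distinct_demo():
    """Run the same distinct count two ways and return both result sets."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per user visit to a dashboard.
    conn.execute("CREATE TABLE visits (dashboard_id INTEGER, user_id INTEGER)")
    rows = [(1, 10), (1, 10), (1, 11), (2, 10), (2, 12), (2, 12), (2, 13)]
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)

    # Form 1: let the engine handle DISTINCT inside the aggregate.
    direct = conn.execute(
        "SELECT dashboard_id, COUNT(DISTINCT user_id) FROM visits "
        "GROUP BY dashboard_id ORDER BY dashboard_id"
    ).fetchall()

    # Form 2: remove duplicate (dashboard, user) pairs first, then count rows.
    dedup_first = conn.execute(
        "SELECT dashboard_id, COUNT(*) FROM ("
        "  SELECT DISTINCT dashboard_id, user_id FROM visits"
        ") AS d GROUP BY dashboard_id ORDER BY dashboard_id"
    ).fetchall()

    conn.close()
    return direct, dedup_first


if __name__ == "__main__":
    direct, dedup_first = count_distinct_demo()
    print(direct)
    print(dedup_first)
```

Both forms return the same counts. Which one runs faster depends on the plan each engine chooses, which is what the linked benchmark set out to measure.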

Original title and link: Count Distinct Compared on Top 4 SQL Databases (NoSQL database©myNoSQL)


SQL Server 2014 Backup Basics

A very detailed intro to backing up SQL Server 2014 by Grant Fritchey:

A backup is nothing more than a copy that is created as a type of insurance policy in the event that the original goes away. The same applies to SQL Server database backups to an extent, but database backups are not simply a file copy. They are a very specific type of copy that is aware of the transactional nature of SQL Server. This copy will be created in such a way as to deal with transactions that are ‘in flight,’ that have not yet been completed. Simply copying the files that define a database will not deal with transactions and can lead to serious data corruption. For this reason, you should, in most circumstances, use the native backup processes or, third party tools that work directly with the native processes such as Red Gate SQL Backup. There are some large scale systems that will need to work with non-standard backup mechanisms such as SAN snapshots. These are far outside the scope of this article.
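To make the point about ‘in flight’ transactions concrete, here is a small sketch that uses SQLite's online backup API (exposed in Python as sqlite3.Connection.backup) purely as an analogy. It is not SQL Server's BACKUP DATABASE and does not reflect how SQL Server implements backups. The accounts table and the function name are made up for illustration, and the sketch assumes SQLite's default rollback-journal mode, where a reader can still see the last committed state while a writer has an uncommitted change pending.

```python
import os
import sqlite3
import tempfile


def backup_with_in_flight_transaction():
    """Back up a database while another connection has an uncommitted update."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "live.db")

        writer = sqlite3.connect(db_path)
        writer.execute(
            "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)"
        )
        writer.execute("INSERT INTO accounts VALUES (1, 100)")
        writer.commit()

        # Start a transaction and leave it open: this change is "in flight".
        writer.execute("UPDATE accounts SET balance = 0 WHERE id = 1")

        # A second connection takes a transaction-aware copy of the database.
        reader = sqlite3.connect(db_path)
        copy = sqlite3.connect(":memory:")
        reader.backup(copy)

        balance_in_backup = copy.execute(
            "SELECT balance FROM accounts WHERE id = 1"
        ).fetchone()[0]

        # Abandon the in-flight change and release all handles.
        writer.rollback()
        for conn in (copy, reader, writer):
            conn.close()
        return balance_in_backup


if __name__ == "__main__":
    print(backup_with_in_flight_transaction())
```

Under the stated assumption, the copy should contain only the committed balance, not the pending update. A plain file copy taken at the wrong moment gives no such guarantee.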

I was looking for more details about the impact on the databases and how it works with clustered instances so I could compare it with the backup solutions in NoSQL databases.

Original title and link: SQL Server 2014 Backup Basics (NoSQL database©myNoSQL)


SQL Server's Future

Brent Ozar on the state and future of things in the SQL Server space:

In SQL Server 2012 and beyond, we’ve got:

  • AlwaysOn Availability Groups – high availability, disaster recovery, and scale-out reads
  • Hekaton - in-memory storage with optimized stored procedures and new data formats on disk
  • Column store indexes – faster data retrieval for certain kinds of queries

Call me maybe crazy, but I don’t see really widespread adoption for any of these.

Leaving craziness aside, I’m wondering: if these features are not of interest to SQL Server users, then what would SQL Server users want to see?

Hekaton is something new for me to read about.

✚ Here’s something interesting about Hekaton:

By late fall 2009, Larson and his colleagues had come up with a design and a simple prototype for an in-memory database engine that showed huge performance improvements. They had moved away from a partitioned approach, which essentially treated a multicore processor as a distributed system, to a latch-free, also called lock-free, design that focused on removing the barriers to scalability present in current systems.

✚ There’s a paper about the MVCC implementation in Hekaton: High-Performance Concurrency Control Mechanisms for Main-Memory Databases.

Original title and link: SQL Server’s Future (NoSQL database©myNoSQL)


Microsoft SQL Server 2012 High Availability Solutions

The recent announcement of the Microsoft SQL Server 2012 release emphasized the high availability features added to this version. Here is what I could find after some digging through the documentation:

  • AlwaysOn Failover Cluster Instances: As part of the SQL Server AlwaysOn offering, AlwaysOn Failover Cluster Instances leverages Windows Server Failover Clustering (WSFC) functionality to provide local high availability through redundancy at the server-instance level—a failover cluster instance (FCI). An FCI is a single instance of SQL Server that is installed across Windows Server Failover Clustering (WSFC) nodes and, possibly, across multiple subnets. On the network, an FCI appears to be an instance of SQL Server running on a single computer, but the FCI provides failover from one WSFC node to another if the current node becomes unavailable.

    This is explained in more detail on AlwaysOn Failover Cluster Instances (SQL Server).

  • AlwaysOn Availability Groups: The AlwaysOn Availability Groups feature is a high-availability and disaster-recovery solution that provides an enterprise-level alternative to database mirroring. Introduced in SQL Server 2012, AlwaysOn Availability Groups maximizes the availability of a set of user databases for an enterprise. An availability group supports a failover environment for a discrete set of user databases, known as availability databases, that fail over together. An availability group supports a set of read-write primary databases and one to four sets of corresponding secondary databases. Optionally, secondary databases can be made available for read-only access and/or some backup operations.

    More documentation about AlwaysOn Availability groups can be found here.

  • Database mirroring: This feature will be removed in a future version of Microsoft SQL Server.

  • Log shipping: SQL Server Log shipping allows you to automatically send transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances.

    This is the well-known master-slave setup. More details can be found here.
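The log shipping flow described in the last item (back up the transaction log on the primary, ship it, restore it in order on a secondary) can be sketched as a toy simulation. This is only an illustration of the idea, not SQL Server's implementation: the Primary, Secondary, and LogBackup classes are hypothetical, and the integer sequence numbers are a simplified stand-in for real log sequence numbers.

```python
from dataclasses import dataclass, field


@dataclass
class LogBackup:
    """A shipped chunk of the primary's transaction log (toy model)."""
    first_lsn: int
    last_lsn: int
    records: list  # (key, value) writes, in log order


@dataclass
class Primary:
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # (lsn, key, value)
    next_lsn: int = 1
    backed_up_through: int = 0

    def write(self, key, value):
        # Apply the write and append it to the log.
        self.data[key] = value
        self.log.append((self.next_lsn, key, value))
        self.next_lsn += 1

    def backup_log(self):
        # Package every log record not yet included in a previous backup.
        pending = [r for r in self.log if r[0] > self.backed_up_through]
        if not pending:
            return None
        backup = LogBackup(
            first_lsn=pending[0][0],
            last_lsn=pending[-1][0],
            records=[(key, value) for _, key, value in pending],
        )
        self.backed_up_through = backup.last_lsn
        return backup


@dataclass
class Secondary:
    data: dict = field(default_factory=dict)
    restored_through: int = 0

    def restore(self, backup):
        # Log backups must be restored in an unbroken sequence.
        if backup.first_lsn != self.restored_through + 1:
            raise ValueError("log backup is out of sequence")
        for key, value in backup.records:
            self.data[key] = value
        self.restored_through = backup.last_lsn


if __name__ == "__main__":
    primary, secondary = Primary(), Secondary()
    primary.write("a", 1)
    primary.write("b", 2)
    secondary.restore(primary.backup_log())
    primary.write("a", 3)
    secondary.restore(primary.backup_log())
    print(secondary.data == primary.data)
```

The secondary lags the primary by whatever has been written since the last shipped backup, which is the main trade-off of this approach compared with synchronous replication.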

It is also worth checking the availability of these features per SQL Server 2012 edition:

SQL Server 2012 High Availability

Original title and link: Microsoft SQL Server 2012 High Availability Solutions (NoSQL database©myNoSQL)

Hadoop Interoperability in Microsoft SQL Server and Parallel Data Warehouse

In the data deluge faced by businesses, there is also an increasing need to store and analyze vast amounts of unstructured data including data from sensors, devices, bots and crawlers. By many accounts, almost 80% of what businesses store is unstructured data — and this volume is predicted to grow exponentially over the next decade.  We have entered the age of Big Data. Our customers have been asking us to help store, manage, and analyze both structured and unstructured data — in particular, data stored in Hadoop environments.  As a first step, we will soon release a Community Technology Preview (CTP) of two new Hadoop connectors — one for SQL Server and one for PDW.  The connectors provide interoperability between SQL Server/PDW and Hadoop environments, enabling customers to transfer data between Hadoop and SQL Server/PDW.  With these connectors, customers can more easily integrate Hadoop with their Microsoft Enterprise Data Warehouses and Business Intelligence solutions to gain deeper business insights from both structured and unstructured data.

The time of data silos is long gone and the little giant is making the right moves.

Patrick Durusau

Original title and link: Hadoop Interoperability in Microsoft SQL Server and Parallel Data Warehouse (NoSQL database©myNoSQL)


SQL Server and SQL Azure Comparison

SQL Azure provides relational database functionality as a utility service. Cloud-based database solutions such as SQL Azure can provide many benefits, including rapid provisioning, cost-effective scalability, high availability, and reduced management overhead.

If you are ready for the cloud (keep in mind this is not an easy question, as shown by Netflix’s cloud migration and Reddit’s experience), going from on-premises SQL Server to SQL Azure doesn’t seem to involve drawbacks.

But what I’m really curious about is how SQL Azure compares to Amazon RDS.

Original title and link: SQL Server and SQL Azure Comparison (NoSQL databases © myNoSQL)


MongoDB and SQL Server Basic Speed Tests

Having used MongoDb almost exclusively with the NoRM C# driver for several months now, this is something that I have always wanted to do, just to satisfy my own curiosity.

Unfortunately “basic” is the wrong word. Just another useless benchmark.

Original title and link for this post: MongoDB and SQL Server Basic Speed Tests (published on the NoSQL blog: myNoSQL)