mysql: All content tagged as mysql in NoSQL databases and polyglot persistence
I’ve finally had the time to go through the release notes and documentation of the recent release of MySQL 5.6. My first thoughts when skimming over the announcement were:
- why is online DDL support so low on the list?
- why is so much of the announcement about performance?
- how is Oracle going to position the Memcached-based access to InnoDB, considering their other key-value database, Oracle NoSQL Database?
Here’s the opening part of the “DBA and Developer Guide to MySQL 5.6”:
At a glance, MySQL 5.6 is simply a better MySQL with improvements that enhance every functional area of the database kernel, including:
- Better Performance and Scalability
- Improved InnoDB storage engine for better transactional throughput
- Improved Optimizer for better query execution times and diagnostics
- Better Application Availability with Online DDL/Schema changes
- Better Developer Agility with NoSQL Access with Memcached API to InnoDB
- Improved Replication for high performance, self-healing distributed deployments
- Improved Performance Schema for better instrumentation
- Improved Security for worry-free application deployments
- And other Important Enhancements
Almost half of the document focuses on the performance improvements in InnoDB. If this is the part that interests you, I strongly encourage you to read the doc, as my notes about this part are very short:
- InnoDB made a lot of improvements in handling threads and locks
- this will allow MySQL 5.6 to work more efficiently on beefier machines with over 24 cores. The TPS vs. CPU threads curve looks almost linear.
- the transactional throughput graph shows improvements, but the shape suggests that MySQL 5.6 tops out at around 96 concurrent connections
- SSDs are mentioned but after digging a bit deeper, it’s difficult to say how much of a difference these changes make.
The next section covers online DDL/schema changes. To my surprise, it’s only a paragraph long, while I was expecting more details considering how many complaints I’ve heard about this in the past and how advanced PostgreSQL is in this area. There is, however, another document, “Overview of Online DDL”, that provides more details:
Basically, starting with this version, many DDL operations do allow concurrent data access, but many of the operations remain very expensive (some requiring copying all data row by row). Better, but not awesome.
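To make the online DDL point a bit more concrete, here is a minimal sketch of how one might ask MySQL 5.6 for a non-blocking schema change. The `ALGORITHM` and `LOCK` clauses of `ALTER TABLE` are part of MySQL 5.6; the helper name `online_alter` and the table and index names are my own, purely illustrative. The helper only builds the statement text and does not talk to a server.

```python
# Hypothetical helper (not part of any MySQL library): builds an ALTER TABLE
# statement that explicitly requests an in-place, non-locking change.
ALLOWED_ALGORITHMS = {"DEFAULT", "INPLACE", "COPY"}
ALLOWED_LOCKS = {"DEFAULT", "NONE", "SHARED", "EXCLUSIVE"}


def online_alter(table: str, action: str,
                 algorithm: str = "INPLACE", lock: str = "NONE") -> str:
    """Return an ALTER TABLE statement with explicit ALGORITHM/LOCK clauses."""
    algorithm = algorithm.upper()
    lock = lock.upper()
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unknown ALGORITHM: {algorithm}")
    if lock not in ALLOWED_LOCKS:
        raise ValueError(f"unknown LOCK: {lock}")
    return f"ALTER TABLE {table} {action}, ALGORITHM={algorithm}, LOCK={lock}"


if __name__ == "__main__":
    # Illustrative table and index names.
    print(online_alter("orders", "ADD INDEX idx_created (created_at)"))
```

Spelling out `ALGORITHM=INPLACE, LOCK=NONE` is useful because the server rejects the statement with an error if the operation cannot be done that way, instead of quietly falling back to a blocking table copy.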
The next section talks about the Memcached-based API for accessing InnoDB data, basically a mechanism offering key-value access that bypasses the SQL layer. I couldn’t find a direct answer to my question “how is Oracle positioning this solution compared to Oracle NoSQL Database”. Plus, the use of the term NoSQL feels weird: “NoSQL access to InnoDB”, “the new NoSQL API for InnoDB”, “NoSQL benchmarking”. I wouldn’t go as far as to say that Oracle’s marketing is trying to trivialize the term NoSQL, but it definitely feels like it was one of the top checkboxes that the department had to check.
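For readers who haven’t seen what “key-value access that skips SQL” looks like on the wire, here is a small sketch of the standard memcached text protocol that such an API speaks. The function names are my own and the key `user:1` is made up; the code only builds requests and parses a `get` reply, so it runs without any server. Sending these bytes to a MySQL instance would additionally assume the InnoDB memcached plugin is installed and listening (memcached’s conventional port is 11211).

```python
from typing import Optional


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a memcached text-protocol 'set' request."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a memcached text-protocol 'get' request for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> Optional[bytes]:
    """Return the value from a single-key 'get' reply, or None on a miss."""
    if raw == b"END\r\n":
        return None  # the key was not found
    header, rest = raw.split(b"\r\n", 1)
    parts = header.split()
    if len(parts) != 4 or parts[0] != b"VALUE":
        raise ValueError(f"unexpected reply header: {header!r}")
    size = int(parts[3])
    return rest[:size]


if __name__ == "__main__":
    print(build_set("user:1", b"alice"))
    print(build_get("user:1"))
    print(parse_get_response(b"VALUE user:1 0 5\r\nalice\r\nEND\r\n"))
```

No SQL parsing or query planning is involved on this path: the request is a key lookup and nothing else, which is where the performance claims come from.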
The last part I was interested in (based on my past experience of completely random and unexplained replication failures) was about replication improvements. I didn’t get much out of this document and I’ll have to read the “MySQL replication: High availability - building a self-healing replication topology” whitepaper:
- global transaction identifiers: “enable replication transactional integrity to be tracked through a replication master/slave topology”
- a new set of Python utilities to use global transaction identifiers
- schema level multi-threaded slave replication
- new row-based replication
- new crash-safe slaves: “stores Binlog positional data within tables so slaves can automatically roll back replication to the last committed event before failure, and resume replication without administrator intervention” (nb: this seems to be the issue I’ve seen before when being responsible for a production master-slave x 2 setup).
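As a rough illustration of why global transaction identifiers help here: a GTID has the form `source_uuid:transaction_id`, and a server describes what it has executed as a GTID set such as `uuid:1-5:11`. The sketch below parses such a set and checks whether a given transaction is covered. The function names are hypothetical and the UUID is just an example value; this is not the code of the Python utilities mentioned above.

```python
from typing import Dict, List, Tuple

GtidSet = Dict[str, List[Tuple[int, int]]]


def parse_gtid_set(text: str) -> GtidSet:
    """Parse 'uuid:1-5:11,uuid2:7' into {uuid: [(start, end), ...]}."""
    result: GtidSet = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single id like '11' is the inclusive interval (11, 11).
            result.setdefault(uuid.lower(), []).append((int(start), int(end or start)))
    return result


def contains(gtid_set: GtidSet, uuid: str, txn_id: int) -> bool:
    """True if the transaction txn_id from server uuid is in the set."""
    return any(lo <= txn_id <= hi for lo, hi in gtid_set.get(uuid.lower(), []))


if __name__ == "__main__":
    # Example UUID, for illustration only.
    executed = parse_gtid_set("3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5:11")
    print(contains(executed, "3E11FA47-71CA-11E1-9E33-C80AA9429562", 4))
    print(contains(executed, "3E11FA47-71CA-11E1-9E33-C80AA9429562", 7))
```

Because every transaction carries the identity of the server it originated on, a slave can work out what it is missing by comparing sets, rather than relying on binlog file names and offsets.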
Technically, MySQL 5.6 seems a solid improvement over the previous version. But Oracle also needs to address the lack of openness concerns raised by Fedora and OpenSUSE communities.
Original title and link: MySQL 5.6 - What’s New ( ©myNoSQL)
The season of predictions is here. Chris Kanaracus in an all-bold post, quoting analysts:
Jon Reed: “Expect SAP to purchase an up-and-coming “big data” product or vendor, and perhaps several, including at least one that specializes in integration with the Hadoop framework for large-scale data processing”.
I’m still scratching my head to come up with the long list of products or vendors specialized in Hadoop integration that SAP could acquire.
Curt Monash: “Expect plenty of additional adoption for Hadoop. Everybody has the ‘big bit bucket’ use case, largely because of machine-generated data. Even today’s technology is plenty good enough for that purpose, and hence justifies initial Hadoop adoption.”
What I hope to see happening is that besides the companies putting together the building blocks to make Hadoop friendly enough (real work) and the companies claiming integration with Hadoop (not that fantastic work), there’ll be some companies that take the Hadoop stack and build tools whose immediate impact on the business can be measured. Basically, vertical solutions applying the Hadoop stack to specific markets, segments, and scenarios.
The main challenge of “Big Data” these days is not that there isn’t value behind it. It’s the measurability of this value. What each company looking into Big Data tries to answer is: what value does big data carry for my case? This is a well-founded question, as not every company has an infinite budget, unlimited time, and a magic resource pool.
Curt Monash: “Usually when the topic of alternative databases comes up, the incumbent is often Oracle or IBM DB2. But in 2013, MySQL could be playing the latter role. NoSQL and NewSQL products often are developed as MySQL alternatives.”
Until now, NoSQL companies have understood that the competition is not with each other. The huge market that relational databases have covered has enough potential to welcome a few solid NoSQL solutions, and there’s no long-term need to fight over the few people who have already paid attention to them.
Make your bets.
Original title and link: Three Analyst Predictions for 2013: Hadoop, SAP, and MySQL vs NoSQL ( ©myNoSQL)
It’s unfortunate that the post focuses mostly on the usage of Spring and RabbitMQ and the slide deck doesn’t dive deeper into the architecture, data flows, and data stores, but the diagrams below should give you an idea of this truly polyglot persistence architecture:
The slide deck presenting architecture principles and numbers about the platform after the break.