mysql: All content tagged as mysql in NoSQL databases and polyglot persistence
Thursday, 27 September 2012
Test Driving Database Indexes
Myron Marston:
Database indexes are conceptually very simple, but in practice, I’ve found that it’s hard to predict when they’ll get used and what indexes a given table needs. On a project at work I came up with the idea to test-drive my database indexes, just like I test-drive the rest of my code. I’d like to share the approach I came up with.
A very interesting idea, at least for MySQL users.
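To make the idea a bit more concrete, here is a minimal sketch of what such a test could look like. It is not Myron Marston's implementation. It uses SQLite from the Python standard library as a stand-in for MySQL (against MySQL one would inspect the output of `EXPLAIN` instead), and the table, index, and helper names are all hypothetical.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable detail strings of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The textual description of each plan step is the last column of the row.
    return [row[-1] for row in rows]


def uses_index(conn, sql, index_name, params=()):
    """True if any step of the query plan mentions the given index."""
    return any(index_name in detail for detail in plan_details(conn, sql, params))


# Hypothetical schema, kept in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

query = "SELECT name FROM users WHERE email = ?"

# "Red": without an index, the lookup is a full table scan.
before = uses_index(conn, query, "idx_users_email", ("a@example.com",))

# "Green": add the index and check that the planner picks it up.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = uses_index(conn, query, "idx_users_email", ("a@example.com",))

print(before, after)
```

The same shape carries over to a regular test suite: one assertion per query that matters, failing until the index it needs exists.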
Original title and link: Test Driving Database Indexes (©myNoSQL)
via: http://myronmars.to/n/dev-blog/2012/09/test-driving-database-indexes
Tuesday, 25 September 2012
Provisioned IOPS for Amazon RDS
Werner Vogels:
Following the huge success of being able to provision a consistent, user-requested I/O rate for DynamoDB and Elastic Block Store (EBS), the AWS Database Services team has now released Provisioned IOPS, a new high performance storage option for the Amazon Relational Database Service (Amazon RDS). Customers can provision up to 10,000 IOPS (input/output operations per second) per database instance to help ensure that their databases can run the most stringent workloads with rock solid, consistent performance.
Amazon is the first company I know of championing guaranteed performance SLAs. Until recently, most SLAs covered availability, resilience, and redundancy. But soon performance-based SLAs will become the norm for other service providers. I’d also expect appliance vendors to be asked for similar guarantees sooner rather than later.
Original title and link: Provisioned IOPS for Amazon RDS (©myNoSQL)
via: http://www.allthingsdistributed.com/2012/09/provisioned-iops-rds.html
Monday, 6 August 2012
Big Data at Aadhaar With Hadoop, HBase, MongoDB, MySQL, and Solr
It’s unfortunate that the post focuses mostly on the usage of Spring and RabbitMQ and that the slide deck doesn’t dive deeper into the architecture, data flows, and data stores, but the diagrams below should give you an idea of this truly polyglot persistence architecture:
The slide deck, presenting architecture principles and numbers about the platform, follows after the break.
Sunday, 5 August 2012
A Quick Test of the New MySQL Memcached Plugin With (J)Ruby
Gabor Vitez:
With a new post hitting Hacker News again on MySQL’s memcached plugin, I really wanted to do a quick-and-dirty benchmark on it, just to see what good it is – does this interface offer any extra speed when compared to SQL+ActiveRecord? Does it have its place in the software stack? How much work is needed to get this combination off the ground?
When running into micro-benchmarks, I really try my best to figure out if they hide any value. But it’s hard to find any in one that uses different libraries, runs both the benchmark and the server on the same machine and uses no concurrency.
Original title and link: A Quick Test of the New MySQL Memcached Plugin With (J)Ruby (©myNoSQL)
via: http://elevat.eu/blog/2012/08/a-quick-test-of-the-new-mysql-memcached-plugin-with-ruby/
Tuesday, 17 July 2012
Klout Data Architecture: MySQL, HBase, Hive, Pig, Elastic Search, MongoDB, SSAS
Just found a slide deck (embedded below) describing the data workflow at Klout. Their architecture includes many interesting pieces, combining NoSQL and relational databases with Hadoop, Hive, Pig, and traditional BI. Even Excel gets a mention in the slides:
- Pig and Hive
- HBase
- Elastic Search
- MongoDB
- MySQL
Generating Meaningful Test Data Using a MySQL Function
Ronald Speelman:
You can use this MySQL function to generate names, (e-mail) addresses, phone numbers, urls, bit values, colors, IP addresses, etc. As usual, the code is provided in a zipfile and the code is fully documented.
For the last couple of days I’ve been looking for a way to generate good test data in JSON format [1], so if you are aware of something, please drop me a note.
[1] Right now I’m using as input a corpus of combined JSON files I’ve found online, but I’m not happy with the solution.
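For the JSON case, something along these lines would be a starting point. This is only a sketch using the Python standard library, not a port of Ronald Speelman's MySQL function; the field names, value pools, and helper names are made up for illustration.

```python
import json
import random

# Hypothetical value pools; a real generator would use much larger lists.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Edsger"]
LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Dijkstra"]
DOMAINS = ["example.com", "example.org"]


def fake_record(rng):
    """Build one plausible-looking record from the pools above."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        "email": f"{first}.{last}@{rng.choice(DOMAINS)}".lower(),
        "ip": ".".join(str(rng.randint(0, 255)) for _ in range(4)),
        "color": "#{:06x}".format(rng.randint(0, 0xFFFFFF)),
        "active": rng.random() < 0.5,
    }


def fake_records(count, seed=None):
    """Return `count` records; a fixed seed makes the data reproducible."""
    rng = random.Random(seed)
    return [fake_record(rng) for _ in range(count)]


records = fake_records(3, seed=42)
print(json.dumps(records, indent=2))
```

Seeding the generator keeps test runs repeatable, which matters more for test data than realism does.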
Original title and link: Generating Meaningful Test Data Using a MySQL Function (©myNoSQL)
via: http://moinne.com/blog/ronald/mysql/howto-generate-meaningful-test-data-using-a-mysql-function
Tuesday, 26 June 2012
MySQL Is Bazillion Times Faster Than MemSQL
Domas Mituzas about the MemSQL vs MySQL benchmark:
Though I usually understand that those claims don’t make any sense, I was wondering what did they do wrong. Apparently they got MySQL with default settings running and MemSQL with default settings running, then compared the two. They say it is a good benchmark, as it compares what users get just by installing standard packages.
That is already cheating, because systems are forced to work in completely different profiles.
The first paragraph of the post summarizes very well the general feeling about benchmarks:
I don’t like stupid benchmarks, as they waste my time.
I think that most generic benchmarks are stupid, even if some generic numbers are considered interesting by software engineers. Benchmarks designed around specific application scenarios will most of the time give more realistic results. But even those are difficult to design so that they account for all the configuration options, scaling, and changes in the use cases.
Original title and link: MySQL Is Bazillion Times Faster Than MemSQL (©myNoSQL)
Friday, 15 June 2012
A Tragically Comedic Security Flaw in MySQL
In short, if you try to authenticate to a MySQL server affected by this flaw, there is a chance it will accept your password even if the wrong one was supplied. The following one-liner in bash will provide access to an affected MySQL server as the root user account, without actually knowing the password.
$ for i in `seq 1 1000`; do mysql -u root --password=bad -h 127.0.0.1 2>/dev/null; done
Don’t try this at home. Or if you try it, don’t tell anyone the result.
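Why 1,000 attempts? The excerpt only says "there is a chance". Assuming the roughly 1-in-256 per-attempt odds reported in the public disclosure of the flaw (an assumption here, not something stated in the excerpt), a quick back-of-the-envelope calculation shows that the loop succeeds about 98% of the time on an affected server:

```python
def chance_of_success(attempts, per_attempt=1 / 256):
    """Probability that at least one of `attempts` independent tries succeeds.

    The 1/256 default is an assumption taken from the public disclosure,
    not from the excerpt quoted above.
    """
    return 1 - (1 - per_attempt) ** attempts


for attempts in (1, 100, 300, 1000):
    print(f"{attempts:>5} attempts: {chance_of_success(attempts):.1%}")
```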
Original title and link: A Tragically Comedic Security Flaw in MySQL (©myNoSQL)
Thursday, 7 June 2012
Tumblr Jetpants: A Toolkit for Huge MySQL Topologies
Today, we’re happy to announce the open source release of Jetpants, Tumblr’s in-house toolchain for managing huge MySQL database topologies. Jetpants offers a command suite for easily cloning replicas, rebalancing shards, and performing master promotions. It’s also a full Ruby library for use in developing custom billion-row migration scripts, automating database manipulations, and copying huge files quickly to multiple remote destinations.
This toolkit helps Tumblr manage its 60bn rows totaling 21TB of data in a MySQL cluster.
Original title and link: Tumblr Jetpants: A Toolkit for Huge MySQL Topologies (©myNoSQL)
via: http://engineering.tumblr.com/post/24612921290/jetpants-a-toolkit-for-huge-mysql-topologies
MySQL KILL Command: How It Works
Have you ever tried to kill a query, but rather than just go away, it remained among the running ones for an extended period of time? Or perhaps you have noticed some threads marked with killed showing up from time to time and not actually dying. What are these zombies? Why does MySQL sometimes seem to fail to terminate queries quickly? Is there any way to force the kill command to actually work instantaneously?
Bookmarked.
Original title and link: MySQL KILL Command: How It Works (©myNoSQL)
via: http://www.dbasquare.com/2012/05/15/why-do-threads-sometimes-stay-in-killed-state-in-mysql/
Thursday, 24 May 2012
MySQL Is Done. NoSQL Is Done. It's the Postgres Age
Jeff Dickey enumerates some of the new features available in PostgreSQL (schema-less data, array columns, queuing, full-text searching, geo-spatial indexing), concluding that PostgreSQL now has everything an application needs:
Postgres has taken the features out of all of these tools and integrate it right inside the platform. Now you don’t need to spin up a mongo cluster for non-rel data, rabbitmq cluster for queueing, solr box for searching. You can just have a single postgres server. That saves a huge ops headache since each of those clusters/boxes have to be durable, replicated, and scalable.
Sounds a bit too optimistic? As we’ve learned from the NoSQL space there are no silver bullets:
Now obviously, there’s a glaring downside with this approach: you get one box. Maybe a read slave or something, but really, you can’t scale it.
As you can imagine, I disagree with most of the points. The only exception is that it is great to see so many useful features packaged with PostgreSQL; these are definitely going to make life easier for some developers.
But when talking about MySQL and NoSQL being done:
- MySQL is done, except it has a huge community, there are tons of developers very familiar with it, and last but not least MySQL powers massive deployments. This last part matters a lot.
- NoSQL is done, except many NoSQL solutions tackle different problem spaces, providing optimal solutions for these by staying focused. Neither Oracle, nor MongoDB, nor PostgreSQL will be able to solve all problems. The wider the range of problems they cover, the less optimal their solutions are for corner-case or extreme scenarios.
Original title and link: MySQL Is Done. NoSQL Is Done. It’s the Postgres Age (©myNoSQL)
Pinterest Architecture Numbers
Todd Hoff caught some new numbers about Pinterest’s architecture. The ones that are interesting from a data point of view:
- 125 EC2 memcached instances, of which 90 are for production and 35 for internal usage:
Another 90 EC2 instances are dedicated towards caching, through memcache. “This allows us to keep a lot of data in memory that is accessed very often, so we can keep load off of our database system,” Park said. Another 35 instances are used for internal purposes.
- 70 master MySQL databases on EC2
  - sharded at 50% capacity
  - backup databases in different regions
Behind the application, Pinterest runs about 70 master databases on EC2, as well as another set of backup databases located in different regions around the world for redundancy.
In order to serve its users in a timely fashion, Pinterest sharded its database tables across multiple servers. When a database server gets more than 50% filled, Pinterest engineers move half its contents to another server, a process called sharding. Last November, the company had eight master-slave database pairs. Now it has 64 pairs of databases. “The sharded architecture has let us grow and get the I/O capacity we need,” Park said.
- 80 million objects (410TB) stored in S3
- no details about Redis
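The "split at 50%" rule from the quote can be sketched in a few lines. This is only an illustration of the rule as described, not Pinterest's tooling: the function names, the row counts, and the idea that every pair gets split in each round are assumptions.

```python
def split_overfull(shards, capacity, threshold=0.5):
    """One pass: a shard above threshold * capacity moves half its rows to a new shard."""
    result = []
    for rows in shards:
        if rows > capacity * threshold:
            moved = rows // 2
            result.extend([rows - moved, moved])
        else:
            result.append(rows)
    return result


def doublings(start, end):
    """How many times a shard count must double to grow from start to at least end."""
    count = 0
    while start < end:
        start *= 2
        count += 1
    return count


shards = [60, 30, 80]  # hypothetical rows per server, capacity 100 each
print(split_overfull(shards, capacity=100))  # [30, 30, 30, 40, 40]
print(doublings(8, 64))  # 3
```

Under that assumption, going from eight master-slave pairs to 64 corresponds to three rounds of splitting every pair.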
Original title and link: Pinterest Architecture Numbers (©myNoSQL)