While the offer is clear and valuable in itself:
- 99.99% uptime
- 99.999999999% (eleven nines) durability
- read-only asynchronous replicas
- database cloning
I’ve been reading all posts about the announcement looking for the answer to the most obvious question: why would you use Heroku’s Postgres service from outside the Heroku platform?
As far as I can tell:
- the network latency will be significant
- network partitions will occur (more often than if both your application and data were in the same DC)
- transfer costs will be significant
So what is the answer?
Media coverage:
- DatabaseJournal: Salesforce Heroku Offers Standalone Cloud-Based PostgreSQL Database
- InfoQ: Heroku Launches Postgres as a Standalone Service
- ZDNet: Heroku launches cloud Postgres database
- eWeek: Heroku Launches PostgreSQL Database-as-a-Service
- PCWorld: Salesforce.com’s Heroku Launches Stand-alone Database Service
- Tools Journal: Heroku Launches PostgreSQL Database As A Service
- CloudBeat: Heroku debuts SQL database-as-a-service for developers
- SiliconANGLE: Heroku Launches Standalone Postgres Database-as-a-Service
- ReadWriteWeb: Heroku Launches PostgreSQL Standalone Service
- GigaOm: Heroku launches SQL Database-as-a-Service
- ITProPortal: Heroku Announces PostgreSQL Database-as-a-Service for Developers
Original title and link: Standalone Heroku Postgres’ Unanswered Question ( ©myNoSQL)
Not much information is available yet on the project page, but it looks like a bidirectional integration of PostgreSQL and Hadoop.
The Postgres Plus Connector for Hadoop provides developers easy access to massive amounts of SQL data for integration with or analysis in Hadoop processing clusters. Now large amounts of data managed by PostgreSQL or Postgres Plus Advanced Server can be accessed by Hadoop for analysis and manipulation using Map-Reduce constructs.
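There are no API details on the project page yet, so what follows is only a rough sketch, not the connector's API, of the kind of access it promises: streaming rows out of PostgreSQL with a server-side cursor via psycopg2 and emitting tab-separated records that could then be loaded into HDFS for MapReduce jobs. The connection settings, table, and columns are made up for illustration.

```python
import csv
import sys

import psycopg2  # standard PostgreSQL driver; the actual connector API is not public yet

# Hypothetical connection settings and table; adjust for your environment.
conn = psycopg2.connect(host="pg.example.com", dbname="metrics",
                        user="hadoop", password="secret")

# A named (server-side) cursor streams rows instead of loading the whole table into memory.
cur = conn.cursor(name="export_cursor")
cur.itersize = 10_000
cur.execute("SELECT device_id, recorded_at, value FROM measurements")

# Emit tab-separated records on stdout, e.g. piped into `hadoop fs -put - /data/measurements.tsv`.
writer = csv.writer(sys.stdout, delimiter="\t")
for row in cur:
    writer.writerow(row)

cur.close()
conn.close()
```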
When speaking about PostgreSQL and Hadoop, the first thing that comes to mind is Daniel Abadi’s HadoopDB, which not long ago became the technology behind his startup, one that has already raised $9.5 million.
Original title and link: Postgres Plus Connector for Hadoop in Private Beta ( ©myNoSQL)
- part 1: goals and building blocks
- part 2: geo data, PostGIS, and TileStache
- part 3: client side and MongoDB
Original title and link: Tutorial: Building Interactive Maps With Polymaps, TileStache, and MongoDB ( ©myNoSQL)
Three articles about dbShards:
highscalability.com: Product: DbShards - Share Nothing. Shard Everything
What Kind Of Customer Are You Targeting With DbShards? Who Ends Up Using Your Product And Why?
The primary customers for dbShards fit into two categories:
- fast-growing Web or online applications (e.g., Gaming, Facebook apps, social network sites, analytics)
- any application involved in high volume data collection and analysis (e.g., Device Measurement). Any application that requires high rates of read/write transaction volumes with a growing data set is a good candidate for the technology.
I’ve checked the customers page and I don’t see any company listed there that fits the first category above. As for the second, read on.
insert performance with dbShards + MySQL + InnoDB is 1500-3000 inserts per shard per second, scaling almost linearly with the number of shards. I forgot to ask how many shards this had been tested for.
I assume you are aware of some of the numbers posted for NoSQL databases, not to mention the 750k qps NoSQLized MySQL.
dbShards has good join performance when – you guessed it! – everything being joined is co-located shard-by-shard, because the tables were distributed on the same shard key and/or replicated across each shard. Cory can’t imagine why you’d want to do an inner join under any other circumstances.
While there’s no surprise in the above quote, I’m not sure how to reconcile it with the fact that dbShards targets data analysis clients.
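To make the co-location point concrete, here is a toy sketch, not dbShards code, of why such joins stay cheap: when both tables are distributed on the same shard key, matching rows hash to the same shard, so the join never has to cross servers. The routing function and table data below are invented for illustration.

```python
NUM_SHARDS = 4

def shard_for(key: int) -> int:
    """Toy routing function: hash the shard key to pick a shard."""
    return hash(key) % NUM_SHARDS

# Both tables are distributed on customer_id, so every pair of rows joined on
# customer_id lives on the same shard and the join is local to that server.
orders = [{"order_id": 1, "customer_id": 42}, {"order_id": 2, "customer_id": 7}]
invoices = [{"invoice_id": 9, "customer_id": 42}, {"invoice_id": 10, "customer_id": 7}]

for o in orders:
    for i in invoices:
        if o["customer_id"] == i["customer_id"]:
            same_shard = shard_for(o["customer_id"]) == shard_for(i["customer_id"])
            print(o["order_id"], i["invoice_id"], "co-located:", same_shard)

# Had invoices been distributed on invoice_id instead, matching rows could land on
# different shards and the join would require shipping data between servers.
```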
dbms2.com: dbShards update
dbShards’ replication scheme works like this:
- A write initially goes to two places at once — to the DBMS and a dbShards agent, both running on the same server.
- The dbShards agent streams to the dbShards agent on the replica server, and receipt of the streamed write is acknowledged.
- At that point the commits start. (Cory seemed to say that the commit on the primary server happens first, but I’m not sure why.)
In essence, two-phase database commit is replaced by two-phase log synchronization.
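My reading of those three steps, sketched below with hypothetical names (this is an illustration of the described flow, not dbShards code): the primary’s agent streams the log record to the replica’s agent and waits for its acknowledgement before the database commit proceeds, so durability on two machines is established at the log level rather than by coordinating two database commits.

```python
class ShardAgent:
    """Toy model of the described write path: log locally, stream to the
    replica's agent, wait for its ack, and only then let the DBMS commit."""

    def __init__(self, replica_agent=None):
        self.log = []                  # write-ahead log kept by the agent
        self.replica_agent = replica_agent

    def receive(self, record):
        # Replica side: persist the streamed record and acknowledge receipt.
        self.log.append(record)
        return "ack"

    def write(self, record, dbms_commit):
        # Step 1: the write goes to the DBMS and the local agent at once.
        self.log.append(record)
        # Step 2: stream the record to the replica's agent and wait for the ack.
        assert self.replica_agent.receive(record) == "ack"
        # Step 3: only now do the commits proceed (primary first, per the article).
        dbms_commit(record)


replica = ShardAgent()
primary = ShardAgent(replica_agent=replica)
primary.write({"sql": "INSERT INTO events VALUES (1)"},
              dbms_commit=lambda r: print("committed on primary:", r["sql"]))
```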
Can anyone explain how these are different?
I know all this may come across as too negative. But while I think dbShards has a decent set of features, some of the statements out there are not doing it any favors.