Datomic: All content tagged as Datomic in NoSQL databases and polyglot persistence
Last week I spent some time implementing the Blueprints interface on top of Datomic. The RDF and SPARQL feel of the Datomic data model and query approach makes it a good target for implementing a property graph. I finished the implementation and all unit tests are passing. Now, what makes it really cool is that it is the only distributed “temporal” graph database that I’m aware of. It allows queries to be performed against a version of the graph in the past.
This is the first solution I’ve read about that addresses the time dimension in a graph model.
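To make the "version of the graph in the past" idea concrete, here is a minimal Python sketch. It is not the Blueprints or Datomic API: `TemporalGraph`, `Fact`, `transact`, and `edges_as_of` are hypothetical names, and the only assumption is that edges are stored as append-only assertions and retractions tagged with a transaction number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    tx: int        # transaction number at which the fact was recorded
    added: bool    # True = edge asserted, False = edge retracted
    source: str
    label: str
    target: str


class TemporalGraph:
    """Toy append-only edge store; facts are never updated in place."""

    def __init__(self):
        self._log = []
        self._tx = 0

    def transact(self, added, source, label, target):
        # Every change gets the next transaction number and is appended.
        self._tx += 1
        self._log.append(Fact(self._tx, added, source, label, target))
        return self._tx

    def edges_as_of(self, tx):
        """Replay the log up to and including transaction `tx`."""
        edges = set()
        for fact in self._log:
            if fact.tx > tx:
                break
            edge = (fact.source, fact.label, fact.target)
            if fact.added:
                edges.add(edge)
            else:
                edges.discard(edge)
        return edges


g = TemporalGraph()
t1 = g.transact(True, "alice", "knows", "bob")
t2 = g.transact(True, "bob", "knows", "carol")
t3 = g.transact(False, "alice", "knows", "bob")

past = g.edges_as_of(t2)      # both edges are still present
present = g.edges_as_of(t3)   # the alice -> bob edge has been retracted
```

Because nothing is overwritten, asking for the graph at `t2` still returns the edge that was retracted at `t3`. A real system would index the facts rather than replay the whole log on every query.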
Original title and link: Distributed Temporal Graph Database Using Datomic ( ©myNoSQL)
I waited for the Datomic announcement with great excitement, and I’d now like to share some thoughts, hoping they will be food for more comments or blog posts.
Datomic certainly provides interesting features, most notably:
- Clojure-style data immutability, separating entity values in time.
- Declarative query language with powerful aggregation capabilities.
But unfortunately, my list of concerns is way longer, maybe because some lower level aspects weren’t addressed in the whitepaper, or maybe because my expectations were really too high. Let’s try to briefly enumerate the most relevant ones:
Datomic provides powerful aggregation/processing capabilities, but violates one of the most important rules in distributed systems: collocating processing with data, as data must be moved from storage to peers’ working set in order to be aggregated/processed. In my experience, this is a huge penalty when dealing with even medium-sized datasets, and just answering that “we expect it to work for most common use cases” isn’t enough.
My comment: The answer to similar comments pointed to the local caches. But I think it is still a very valid observation.
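The data-movement concern can be illustrated with a small Python sketch. It is a toy model, not a measurement of Datomic: `STORAGE`, `aggregate_on_peer`, and `aggregate_at_storage` are hypothetical names, the "transfer" is just a list copy, and local caches are deliberately ignored.

```python
# Hypothetical storage holding rows as (region, amount) pairs.
STORAGE = [("eu", 10), ("us", 25), ("eu", 5), ("us", 40), ("asia", 7)]


def aggregate_on_peer(storage):
    """Pull every row to the peer, then aggregate locally."""
    fetched = list(storage)  # stands in for moving all rows over the network
    totals = {}
    for region, amount in fetched:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(fetched)  # result and number of items moved


def aggregate_at_storage(storage):
    """Aggregate next to the data and ship only the per-region totals."""
    totals = {}
    for region, amount in storage:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(totals)  # only the aggregated rows cross the network


peer_totals, peer_moved = aggregate_on_peer(STORAGE)
storage_totals, storage_moved = aggregate_at_storage(STORAGE)
```

Both functions compute the same totals, but the first moves every row while the second moves one row per group. A warm local cache narrows that gap for repeated queries, which is the counter-argument mentioned above; it does not help the first time a large dataset has to be scanned.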
In my experience, in-process caching of working sets usually ends up compromising overall application reliability: the application spends lots of time dealing with the working set cache, either faulting/flushing objects or gc’ing them, rather than doing its own business.
Transactors are both a Single Point Of Bottleneck and a Single Point Of Failure: you may not care about the former (though I would, btw), but you have to care about the latter.
My comment: The Datomic paper contains an interesting formulation about the job of transactors for reads and writes:
When reads are separated from writes, writes are never held up by queries. In the Datomic architecture, the transactor is dedicated to transactions, and need not service reads at all!
In an ACID system, both reads and writes represent transactions though.
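A rough Python sketch of that separation, under my own assumptions: `Transactor` and `Peer` are hypothetical classes, not Datomic's API, and the whole database is modeled as one immutable mapping that is replaced on every write.

```python
from types import MappingProxyType


class Transactor:
    """Hypothetical single writer: every write goes through this one object."""

    def __init__(self):
        self._snapshot = MappingProxyType({})
        self.writes_handled = 0

    def transact(self, key, value):
        # Copy the current state, apply the write, publish a new read-only view.
        new_state = dict(self._snapshot)
        new_state[key] = value
        self._snapshot = MappingProxyType(new_state)
        self.writes_handled += 1

    def current_snapshot(self):
        return self._snapshot


class Peer:
    """Hypothetical reader: queries a snapshot it already holds."""

    def __init__(self, snapshot):
        self._snapshot = snapshot

    def read(self, key):
        # No call to the transactor is needed to answer a read.
        return self._snapshot.get(key)


transactor = Transactor()
transactor.transact("user/1", "alice")
old_peer = Peer(transactor.current_snapshot())

transactor.transact("user/1", "bob")
new_peer = Peer(transactor.current_snapshot())
```

Here `old_peer` keeps seeing a consistent older value while the write proceeds, so reads never block writes. The sketch also makes the concern visible: every write, from every client, still funnels through the single `Transactor`.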
You say you avoid sharding, but since transactors are a single point of bottleneck, the time will come when you have too much data for a single transactor system. Then you’ll have to, guess what, shard, and Datomic apparently has no support for this.
There’s no mention of how Datomic deals with network partitions.
I think that’s enough. I’ll be happy to read any feedback about my points.
Like Sergio Bossa, I’d really love to hear some answers from the Datomic team.
Original title and link: Thoughts About Datomic ( ©myNoSQL)
If you are curious about Datomic, below are two videos from its creators, Rich Hickey and Stuart Halloway, providing an intro to Datomic and its query language Datalog. About 30 minutes in total.
Datomic: Distributed Database Designed to Enable Scalable, Flexible and Intelligent Applications, Running on Next-Generation Cloud Architectures
I’m just starting to read about Datomic, a distributed database designed to enable scalable, flexible and intelligent applications, running on next-generation cloud architectures. Skimming through the whitepaper:
- it goes after relational databases calling out the same problems as those formulated by Michael Stonebraker
- it also goes after NoSQL databases (for lacking ACID transactions, joins, and a logical query language)
- my first reaction is that it’s pretty similar to Gigaspaces, Terracotta, etc.
- it uses an interesting immutable timestamped data model (CQRS?)
- it’s completely unclear to me why the production service must be AWS
- the team behind Datomic: Rich Hickey (Clojure), Stuart Halloway (Clojure), Justin Gehtland