The new architecture of Evident ClearStone APM uses both Cassandra and Neo4j:
Cassandra serves as the time-series data store for all real-time and historical data. Our implementation uses Apache Cassandra 0.7 with the Hector client APIs. With Cassandra 0.7, we can dynamically create and evolve column families for storing the performance data. The performance data is normalized by metric, and we have partitioned our column families based on the granularity of the data sets.
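To make the layout concrete, here is a minimal in-memory sketch of the idea, not Evident's actual schema and not the Hector API. It assumes one "column family" per granularity, one row per resource and metric, and columns keyed by timestamp. The granularity names, the `TimeSeriesStore` class, and the row-key format are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical granularities (bucket width in seconds). The source only says
# the column families are partitioned by granularity, not which ones exist.
GRANULARITIES = {"raw": 1, "minute": 60, "hour": 3600}


class TimeSeriesStore:
    """In-memory stand-in for a wide-row layout: one 'column family' per
    granularity, one row per metric, columns keyed by timestamp."""

    def __init__(self):
        # column family -> row key -> sorted list of (timestamp, value)
        self._families = {name: defaultdict(list) for name in GRANULARITIES}

    @staticmethod
    def row_key(resource, metric):
        # Assumed row-key format: data is normalized by metric per resource.
        return f"{resource}:{metric}"

    def write(self, granularity, resource, metric, timestamp, value):
        width = GRANULARITIES[granularity]
        bucket = timestamp - (timestamp % width)  # align to the bucket start
        row = self._families[granularity][self.row_key(resource, metric)]
        idx = bisect_left(row, (bucket,))
        if idx < len(row) and row[idx][0] == bucket:
            row[idx] = (bucket, value)  # same column name: overwrite
        else:
            row.insert(idx, (bucket, value))  # keep columns sorted by time

    def read_range(self, granularity, resource, metric, start, end):
        # Inclusive time-range slice over one row, like a column slice query.
        row = self._families[granularity].get(self.row_key(resource, metric), [])
        lo = bisect_left(row, (start,))
        hi = bisect_right(row, (end, float("inf")))
        return row[lo:hi]
```

Keeping each granularity in its own family means a dashboard asking for hourly data never has to scan raw samples, which is one plausible reason for the partitioning described above.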
Neo4j serves as the inventory database for all the managed resources (e.g. processes, hosts, clusters) of the application environment. It stores the current state of all the resources, the relationships among them, and the events correlated to them. Whenever events are associated with a resource, we keep a timeline of those events, each tied to a snapshot of the associated resource(s) in the inventory at the time the event occurred. We felt a graph database like Neo4j was ideal for storing resource metadata and mapping relationships and correlated events.
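The inventory model can be sketched in a few lines of plain Python. This is an illustrative toy, not the Neo4j API and not Evident's data model: the `Inventory`, `Resource`, and `Event` names, the relation labels, and the state fields are all assumptions. It shows resources as nodes, typed relationships as edges, and a per-resource event timeline where each event carries a snapshot of the resource state at the time it occurred.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Resource:
    id: str
    kind: str  # e.g. "host", "process", "cluster"
    state: dict = field(default_factory=dict)


@dataclass
class Event:
    timestamp: int
    message: str
    snapshot: dict  # copy of the resource state when the event occurred


class Inventory:
    """Toy graph: resources are nodes, relationships are typed edges."""

    def __init__(self):
        self.resources = {}   # resource id -> Resource
        self.edges = set()    # (from_id, relation, to_id)
        self.timelines = {}   # resource id -> list of Event

    def add_resource(self, id, kind, **state):
        self.resources[id] = Resource(id, kind, dict(state))
        self.timelines[id] = []

    def relate(self, from_id, relation, to_id):
        self.edges.add((from_id, relation, to_id))

    def neighbors(self, from_id, relation):
        return sorted(t for f, r, t in self.edges
                      if f == from_id and r == relation)

    def record_event(self, resource_id, timestamp, message):
        # Deep-copy the state so later updates do not alter the snapshot.
        snapshot = copy.deepcopy(self.resources[resource_id].state)
        self.timelines[resource_id].append(Event(timestamp, message, snapshot))
```

For example, after recording an event on a host and then changing that host's state, the event's `snapshot` still shows the state as it was when the event fired, while `resources` holds only the current state.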
Bill Nigh writes about some of the challenges of moving from an RDBMS to NoSQL technologies:
- lack of queries and triggers for data changes
- too much schema-less freedom in the Neo4j graph database, which makes data modeling a challenge
So Evident not only offers tools for monitoring NoSQL solutions, it also uses those technologies internally in its own product. Practice what you preach.
Original title and link: Cassandra and Neo4j Used by Evident ClearStone APM (NoSQL databases © myNoSQL)