GoldenOrb: All content tagged as GoldenOrb in NoSQL databases and polyglot persistence
Thursday, 8 March 2012
Big Graph-Processing Library From Twitter: Cassovary
Cassovary is designed from the ground up to efficiently handle graphs with billions of edges. It comes with some common node and graph data structures and traversal algorithms. A typical usage is to do large-scale graph mining and analysis.
If you are reading this, you’ve most probably heard of Pregel. If you haven’t, check out the paper “Pregel: a system for large-scale graph processing”, then how Pregel and MapReduce compare, and also the 6 Pregel-inspired frameworks.
The Cassovary project page introduces it as:
Cassovary is a simple “big graph” processing library for the JVM. Most JVM-hosted graph libraries are flexible but not space efficient. Cassovary is designed from the ground up to first be able to efficiently handle graphs with billions of nodes and edges. A typical example usage is to do large scale graph mining and analysis of a big network. Cassovary is written in Scala and can be used with any JVM-hosted language. It comes with some common data structures and algorithms.
I’m not sure yet if:
- Cassovary works with any graph data source or requires FlockDB (which is more of a persisted graph than a graph database)
- Cassovary is inspired by Pregel in any way or if it’s addressing a limited problem space (as FlockDB does)
Update: Pankaj Gupta helped clarify the first question (and probably part of the second too):
At Twitter we use flockdb as our real-time graphdb, and export daily for use in cassovary, but any store could be used.
Original title and link: Big Graph-Processing Library From Twitter: Cassovary (©myNoSQL)
via: http://engineering.twitter.com/2012/03/cassovary-big-graph-processing-library.html
Thursday, 3 November 2011
6 Pregel-Inspired Frameworks
A quick overview of 6 Pregel-inspired frameworks (Apache Hama, GoldenOrb, Apache Giraph, Phoebus, Signal/Collect, and HipG):
So, to summarize, what Hama, GoldenOrb and Giraph have in common is: Java platform, Apache License (and incubation), BSP computation. What they differ for: Hama offers BSP primitives not graph processing API (so it sits at a lower level), GoldenOrb provides Pregel’s API but requires the deployment of additional software to your existing Hadoop infrastructure, Giraph provides Pregel’s API (and is kind of complete at the current state) and doesn’t require additional infrastructure.
via: http://blog.acaro.org/entry/google-pregel-the-rise-of-the-clones
Thursday, 25 August 2011
Paper: Graph Based Statistical Analysis of Network Traffic
Published by a group from Los Alamos National Lab (Hristo Djidjev, Gary Sandine, Curtis Storlie, Scott Vander Wiel):
We propose a method for analyzing traffic data in large computer networks such as big enterprise networks or the Internet. Our approach combines graph theoretical representation of the data and graph analysis with novel statistical methods for discovering pattern and time-related anomalies. We model the traffic as a graph and use temporal characteristics of the data in order to decompose it into subgraphs corresponding to individual sessions, whose characteristics are then analyzed using statistical methods. The goal of that analysis is to discover patterns in the network traffic data that might indicate intrusion activity or other malicious behavior.
Friday, 1 July 2011
GoldenOrb: Ravel Google Pregel Implementation Released

Announced back in March, GoldenOrb has finally been released by Ravel. It is an implementation of the Google Pregel paper; if you are not familiar with Google Pregel, check “Pregel: Graph Processing at Large-Scale” and Ricky Ho’s comparison of Pregel and MapReduce.
Until Ravel’s GoldenOrb, the only experimental implementation of Pregel was the Erlang-based Phoebus. GoldenOrb was released under the Apache License v2.0 and is available on GitHub.
GoldenOrb is a cloud-based open source project for massive-scale graph analysis, built upon best-of-breed software from the Apache Hadoop project modeled after Google’s Pregel architecture.
Tuesday, 19 April 2011
Graph Databases: Distributed Traversal Engines
Marko A. Rodriguez:
In the distributed traversal engine model, a traversal is represented as a flow of messages between elements of the graph. Generally, each element (e.g. vertex) is operating independently of the other elements. Each element is seen as its own processor with its own (usually homogenous) program to execute. Elements communicate with each other via message passing. When no more messages have been passed, the traversal is complete and the results of the traversal are typically represented as a distributed data structure over the elements. Graph databases of this nature tend to use the Bulk Synchronous Parallel model of distributed computing. Each step is synchronized in a manner analogous to a clock cycle in hardware. Instances of this model include Agrapa, Pregel, Trinity, GoldenOrb, and others.
None of these graph databases offers distributed traversal engines.
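To make the model Rodriguez describes a bit more concrete, here is a minimal single-process sketch of a Bulk Synchronous Parallel traversal. It is only an illustration under stated assumptions: the names `bsp_traverse`, `adjacency`, `inbox`, and `outbox` are made up for this example and are not the API of Pregel, GoldenOrb, or any other system mentioned above, and a real engine would spread the vertices across many machines. Each vertex runs the same small program on its incoming messages, messages sent in one superstep are only delivered in the next one, and the traversal ends when no messages are left.

```python
from collections import defaultdict


def bsp_traverse(adjacency, start):
    """Simulate a BSP traversal computing hop distances from `start`.

    `adjacency` maps each vertex to a list of its out-neighbors.
    Returns (per-vertex distances, number of supersteps executed).
    """
    state = {}             # result stored per vertex (the "distributed" structure)
    inbox = {start: [0]}   # messages to deliver in the upcoming superstep
    supersteps = 0

    while inbox:           # no messages passed means the traversal is complete
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            # Every vertex executes the same program on its own messages.
            best = min(messages)
            if vertex not in state or best < state[vertex]:
                state[vertex] = best
                for neighbor in adjacency.get(vertex, ()):
                    outbox[neighbor].append(best + 1)
        # Synchronization barrier: messages become visible only in the next step.
        inbox = outbox
        supersteps += 1

    return state, supersteps


# Hypothetical toy graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
distances, steps = bsp_traverse(graph, "a")
print(distances, steps)
```

On the toy graph, vertex `d` receives two messages in the same superstep, which is why each vertex reduces its inbox (here with `min`) before updating its own state.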
via: http://markorodriguez.com/2011/04/19/local-and-distributed-traversal-engines/
Tuesday, 29 March 2011
Ravel Hopes to Open-Source Graph Databases
Ravel, an Austin, Texas-based company, wants to provide a supported, open-source version of Google’s Pregel software called GoldenOrb to handle large-scale graph analytics.
Is it a new graph database or a Pregel implementation? Watch the interview for yourself and tell me what you think it is.
via: http://gigaom.com/cloud/ravel-hopes-to-open-source-graph-databases/