NoSQL libraries: All content tagged as NoSQL libraries in NoSQL databases and polyglot persistence
Monday, 7 May 2012
ZkFarmer: Tools for Managing Distributed Server Farms Using Apache ZooKeeper
With ZkFarmer, each server registers itself in one or several farms. Through this registration, hosts can expose arbitrary information about their status.
On the other end, ZkFarmer helps consumers of the farm to maintain a configuration file in sync with the list of hosts registered in the farm with their respective configuration.
In the middle, ZkFarmer helps monitoring and administrative services to easily read and change the configuration of each host.
Currently ZkFarmer provides the following functionality:
- Registering ZooKeeper services to one or several farms (joining a farm)
- Listing farms and hosts
- Reading and writing farm content
- Farm monitoring
- Syncing farm configuration
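The consumer side of this (keeping a configuration file in sync with the registered hosts) can be sketched in a few lines of Python. This is only an illustration of the idea, not ZkFarmer's code: the `hosts` dictionary stands in for what would be read from ZooKeeper, and the names `render_farm_config` and `sync_config` are made up for this example.

```python
import json
import os
import tempfile


def render_farm_config(hosts):
    """Serialize the farm as JSON with a stable key order."""
    return json.dumps(hosts, indent=2, sort_keys=True) + "\n"


def sync_config(path, hosts):
    """Rewrite `path` only if the farm changed; return True when rewritten."""
    new_content = render_farm_config(hosts)
    try:
        with open(path) as current:
            if current.read() == new_content:
                return False  # already in sync, leave the file alone
    except FileNotFoundError:
        pass
    # Write to a temporary file in the same directory, then rename it over
    # the target so consumers never see a half-written configuration.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(new_content)
    os.replace(tmp_path, path)
    return True
```

In a real setup the function would presumably be called from a ZooKeeper watch callback each time the farm membership changes.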
Original title and link: ZkFarmer: Tools for Managing Distributed Server Farms Using Apache ZooKeeper (©myNoSQL)
Monday, 2 April 2012
Automatic Async and Sync Pipelining of Redis Commands
The nuts and bolts of implementing synchronous and asynchronous Redis clients supporting pipelining:
In this post I describe different approaches for client-libraries to implement Redis protocol pipelining. I will cover synchronous as well as asynchronous (event-driven) techniques and discuss their respective pros and cons: Synchronous client APIs require the library user to explicitly pipeline commands, potentially yielding optimal protocol performance, but at the cost of additional bookkeeping when handling replies. Asynchronous client libraries, on the other hand, allow automatic pipelining, while being less efficient in their pipelining behavior.
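To make the "explicit pipelining plus bookkeeping" point concrete, here is a minimal Python sketch of the synchronous approach: commands are queued, written in a single batch, and the replies are matched back by position. It is not taken from the linked post. The `Pipeline` class and the `send`/`stream` parameters are assumptions for illustration, and only status, error, integer, and bulk replies are parsed.

```python
import io


def encode_command(*args):
    """Encode one command as a Redis request: an array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def read_reply(stream):
    """Read one reply: status (+), error (-), integer (:) or bulk ($)."""
    line = stream.readline()
    kind, payload = line[:1], line[1:-2]  # strip type byte and CRLF
    if kind == b"+":
        return payload.decode()
    if kind == b"-":
        return RuntimeError(payload.decode())  # returned, not raised
    if kind == b":":
        return int(payload)
    if kind == b"$":
        length = int(payload)
        if length == -1:
            return None
        return stream.read(length + 2)[:-2]  # payload plus trailing CRLF
    raise ValueError("unsupported reply type: %r" % kind)


class Pipeline:
    """Explicit pipelining: queue commands, flush once, read N replies."""

    def __init__(self, send, stream):
        self._send = send      # callable that writes bytes to the server
        self._stream = stream  # file-like object to read replies from
        self._pending = []

    def queue(self, *args):
        self._pending.append(encode_command(*args))
        return self

    def execute(self):
        count = len(self._pending)
        self._send(b"".join(self._pending))  # one write for all commands
        self._pending = []
        # Redis replies in request order, so reply i belongs to command i.
        return [read_reply(self._stream) for _ in range(count)]


# Usage with a canned server response instead of a real socket.
sent = []
canned = io.BytesIO(b"+OK\r\n:2\r\n$3\r\nbar\r\n")
pipe = Pipeline(sent.append, canned)
replies = pipe.queue("SET", "foo", "bar").queue("INCR", "n").queue("GET", "foo").execute()
```

The bookkeeping the post mentions is the `count` and the positional matching in `execute`: the caller has to remember what was queued in order to interpret what comes back.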
Wednesday, 7 March 2012
Asyncdynamo: Amazon DynamoDB Async Python Library by Bitly
Bitly’s new asynchronous Amazon DynamoDB Python client:
Asyncdynamo requires Boto and Tornado to be installed, and must be run with Python 2.7. It replaces Boto’s synchronous calls to Dynamo and to Amazon STS (to retrieve session tokens) with non-blocking Tornado calls. For the end user its interface seeks to mimic that of Boto Layer1, with each method now requiring an additional callback parameter.
Available on GitHub.
via: http://word.bitly.com/post/18861837158/introducing-asyncdynamo
Monday, 5 March 2012
Neo4j and JRuby: Expressive Graph Traversals With Jogger
Jogger gives you named traversals and is a little bit like named scopes. Jogger groups multiple pacer traversals together and gives them a name. Pacer traversals are like pipes. What are pipes? Pipes are great!!
The most important conceptual difference is that the order in which named traversals are called matters, while it usually doesn’t matter in which order you call named scopes.
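Why the order matters is easy to see with a toy example. The following Python sketch is only an analogy and has nothing to do with Jogger's or Pacer's real API: the `Traversal` class and its steps are invented here, and each step simply consumes the result of the previous one, the way a pipe stage does.

```python
class Traversal:
    """A chain of named steps; each step consumes the previous result."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def active(self):
        return Traversal(n for n in self.nodes if n["active"])

    def first(self, count):
        return Traversal(self.nodes[:count])

    def names(self):
        return [n["name"] for n in self.nodes]


users = [
    {"name": "ann", "active": False},
    {"name": "bob", "active": True},
    {"name": "cy", "active": True},
]

# Same two steps, different order, different result.
print(Traversal(users).active().first(1).names())  # prints ['bob']
print(Traversal(users).first(1).active().names())  # prints []
```

Named scopes, by contrast, usually just add conditions to one query, so swapping them leaves the result unchanged.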
Knowing how Gremlin and Cypher compare, the question is: how does Jogger compare to Cypher?
Wednesday, 22 February 2012
Automating Cassandra Operations and Management With Netflix's Priam Tool
Priam is a new open source tool from Netflix (back in November, Netflix released Curator, a ZooKeeper library) used to simplify and automate the operations and management of a Cassandra cluster:
Priam is a co-process that runs alongside Cassandra on every node to provide the following functionality:
- Backup and recovery:
  - snapshot and incremental backups
  - compression and multipart off-site uploading
  - data recovery and data testing
- Bootstrapping and automated token assignment: Priam automates the assignment of tokens to Cassandra nodes as they are added, removed or replaced in the ring. Priam relies on centralized external storage (SimpleDB/Cassandra) for storing token and membership information, which is used to bootstrap nodes into the cluster. It allows us to automate replacing nodes without any manual intervention, since we assume failure of nodes, and create failures using Chaos Monkey. The external Priam storage also provides us valuable information for the backup and recovery process.
- Centralized configuration management: All our clusters are centrally configured via properties stored in SimpleDB, which includes setup of critical JVM settings and Cassandra YAML properties.
- RESTful monitoring and metrics: provides hooks that support external monitoring and automation scripts. They provide the ability to backup, restore a set of nodes manually and provide insights into Cassandra’s ring information. They also expose key Cassandra JMX commands such as repair and refresh.
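For readers who have never assigned Cassandra tokens by hand, the following Python sketch shows the calculation that a tool like this takes off your plate. It is the generic "evenly spaced tokens" formula for the RandomPartitioner, not Priam's actual algorithm, and the function names are made up for this example.

```python
RING_SIZE = 2 ** 127  # token space of Cassandra's RandomPartitioner


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens for `node_count` nodes."""
    if node_count < 1:
        raise ValueError("node_count must be positive")
    return [i * RING_SIZE // node_count for i in range(node_count)]


def replacement_token(dead_node_index, node_count):
    """Assumption: a replacement node takes over the token of the node it replaces."""
    return balanced_tokens(node_count)[dead_node_index]
```

Storing the resulting token-to-node mapping in external storage is what makes it possible to hand a replacement node its token without manual intervention.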
via: http://techblog.netflix.com/2012/02/announcing-priam.html
Quick Guide to MongoDB and Python With PyMongo
A tutorial on PyMongo from Rick Copeland covering:
- configuration options for MongoDB
- documents structure, inserts and batch inserts
- querying and indexing
- deleting
- updating
One thing that’s nice about the pymongo connection is that it’s automatically pooled. What this means is that pymongo maintains a pool of connections to the mongodb server that it reuses over the lifetime of your application. This is good for performance since it means pymongo doesn’t need to go through the overhead of establishing a connection each time it does an operation. Mostly, this happens automatically. You do need to be aware of the connection pooling, however, since you need to manually notify pymongo that you’re “done” with a connection in the pool so it can be reused.
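The check-out and "done" cycle described in the quote can be sketched with nothing but the standard library. This is not how pymongo implements its pool: `ConnectionPool`, `checkout`, and `done` are names invented for this illustration, and the sketch opens all connections up front for simplicity.

```python
import queue


class ConnectionPool:
    """Hands out reusable connections; callers must return them when done."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # pay the connection cost once

    def checkout(self, timeout=None):
        # Blocks while every connection is in use; raises queue.Empty
        # if `timeout` seconds pass without one being returned.
        return self._idle.get(timeout=timeout)

    def done(self, conn):
        self._idle.put(conn)  # make the connection available for reuse
```

Forgetting to call `done` is the failure mode the quote warns about: the pool slowly runs dry and later callers block waiting for a connection.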
via: http://blog.pythonisito.com/2012/01/moving-along-with-pymongo.html
Tuesday, 21 February 2012
An Introduction to Scalding, the Scala and Cascading MapReduce Framework From Twitter
A fantastic guide to Twitter’s Scala and Cascading MapReduce framework Scalding from Edwin Chen [1]:
In 140 characters: instead of forcing you to write raw map and reduce functions, Scalding allows you to write natural code like
    // Create a histogram of tweet lengths.
    tweets.map('tweet -> 'length) { tweet : String => tweet.size }.groupBy('length) { _.size }
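If the Scala syntax is unfamiliar, the same computation on a single machine looks roughly like this in Python. This is only an equivalent of the logic (map each tweet to its length, then count per length), with no MapReduce involved; `length_histogram` is a name chosen for this example.

```python
from collections import Counter


def length_histogram(tweets):
    """Map each tweet to its length, then count tweets per length."""
    return Counter(len(tweet) for tweet in tweets)
```

The point of Scalding is that the equally short Scala version runs as a distributed Hadoop job.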
Looking at the code samples, this looks a lot like Apache Pig. But the Scalding documentation compares it to Scrunch/Scoobi and points to the answers in this Quora thread:
The main difference between Scalding (and Cascading) and Scrunch/Scoobi is that Cascading has a record model where each element in your distributed list/table is a tuple with some named fields. This is nice because most common cases are to have a few primitive columns (ints, strings, etc…).
[1] Edwin Chen is a data scientist at Twitter.
via: http://blog.echen.me/2012/02/09/movie-recommendations-and-more-via-mapreduce-and-scalding/
Thursday, 16 February 2012
Storing Django Sessions in DynamoDB with django-dynamodb-sessions
Pros:
- reduces read/write access to your main database
- all DynamoDB benefits:
  - fully managed solution
  - scalable
  - fast and predictable performance
Cons (or rather, when not to use it):
- if your application is not running in the AWS cloud
- if the size of your sessions is bigger than 64KB
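A quick way to find out whether the 64KB limit is a concern is to measure your sessions. The Python sketch below is a rough approximation only: it is not how django-dynamodb-sessions serializes data, it ignores the attribute names that DynamoDB also counts, and `session_fits` is a name invented for this example.

```python
import json

MAX_ITEM_BYTES = 64 * 1024  # the 64KB limit mentioned above


def session_fits(session_key, session_data):
    """Approximate check: serialized session plus its key stays under the limit."""
    payload = json.dumps(session_data, separators=(",", ":")).encode("utf-8")
    return len(session_key.encode("utf-8")) + len(payload) <= MAX_ITEM_BYTES
```

Typical sessions holding a user ID and a few flags are nowhere near the limit; sessions used as a general-purpose cache are the ones to watch.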
via: http://gc-taylor.com/blog/2012/2/15/django-dynamodb-sessions-ready/
Wednesday, 8 February 2012
Connection Management in MongoDB and CongoMongo
Are connections pooled or not? Konrad Garus digs to find the answer:
Easy. Too easy and comfortable. Coming from the good old and heavy JDBC/SQL I felt uneasy with the connection management. How does it work? Does it just open a connection and leave it dangling in the air the whole time? Might be good for a quick spike in REPL, but not for a real application which needs concurrency, is supposed to be running for days and weeks, and so on. How do you maintain it properly?
via: http://squirrel.pl/blog/2012/02/04/connection-management-in-mongodb-and-congomongo/
Thursday, 5 January 2012
Getting Started With Ruby and Neo4j Using Neography
Getting started with Ruby and Neo4j is very easy. Follow these steps and you’ll be up and running in no time. First we install the neography […]
The traversal API looks really nice and comes in two flavors: the Neo4j REST API and a Ruby-esque one.
via: http://maxdemarzi.com/2012/01/04/getting-started-with-ruby-and-neo4j/