python: All content tagged as python in NoSQL databases and polyglot persistence
Monday, 15 August 2011
The Stories of the Revamped Riak Java Client and Improvements in Python Client
If you read the story of the MongoDB Erlang driver, you’ll probably enjoy reading about Riak’s revamped Java client or the improvements in Riak’s Python client.
Original title and link: The Stories of the Revamped Riak Java Client and Improvements in Python Client (©myNoSQL)
Monday, 27 June 2011
Building an Ad Network Ready for Failure
The architecture of a fault-tolerant ad network built on top of HAProxy, Apache with mod_wsgi and Python, Redis, a bit of PostgreSQL and ActiveMQ deployed on AWS:
The real workhorse of our ad targeting platform was Redis. Each box slaved from a master Redis, and on failure of the master (which happened once), a couple “slaveof” calls got us back on track after the creation of a new master. A combination of set unions/intersections with algorithmically updated targeting parameters (this is where experimentation in our setup was useful) gave us a 1 round-trip ad targeting call for arbitrary targeting parameters. The 1 round-trip thing may not seem important, but our internal latency was dominated by network round-trips in EC2. The targeting was similar in concept to the search engine example I described last year, but had quite a bit more thought regarding ad targeting. It relied on the fact that you can write to Redis slaves without affecting the master or other slaves. Cute and effective. On the Python side of things, I optimized the redis-py client we were using for a 2-3x speedup in network IO for the ad targeting results.
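The set-based targeting described above can be illustrated with a minimal sketch. It uses plain Python sets as a stand-in for Redis sets (where SUNION and SINTER would do the same work). The key layout, the ad IDs, and the `target` helper are all hypothetical, since the original post does not describe its data model, and the sketch does not attempt to reproduce the single round-trip.

```python
# Hypothetical index: "dimension:value" -> set of ad IDs eligible for that value.
# In Redis each entry would be a set stored under that key.
index = {
    "country:us": {"ad1", "ad2", "ad3"},
    "country:ca": {"ad3", "ad4"},
    "device:mobile": {"ad2", "ad3", "ad4"},
    "device:desktop": {"ad1"},
}


def target(index, request):
    """Return the ads matching a request.

    `request` maps a dimension to the list of acceptable values.
    Values within one dimension are OR-ed (set union), and the
    dimensions are AND-ed together (set intersection).
    """
    candidates = None
    for dimension, values in request.items():
        matched = set()
        for value in values:
            matched |= index.get(f"{dimension}:{value}", set())
        candidates = matched if candidates is None else candidates & matched
    return candidates or set()


north_america_mobile = target(
    index, {"country": ["us", "ca"], "device": ["mobile"]}
)
us_desktop = target(index, {"country": ["us"], "device": ["desktop"]})
```

Here `north_america_mobile` holds the ads eligible for either country and for mobile devices, while `us_desktop` narrows down to a single ad.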
Original title and link: Building an Ad Network Ready for Failure (©myNoSQL)
via: http://dr-josiah.blogspot.com/2011/06/building-ad-network-ready-for-failure.html
Thursday, 26 May 2011
Watch System Logs in Real Time with Redis Pub/Sub
A little trick to get access to your logs:
This is a little python log handler that lets you stream your log data over a redis pub/sub channel, so you can monitor your system in real time from any redis client.[…] I am also interested in mapping the python logging names (foo.bar.baz) to redis channels that would enable you to listen to arbitrary applications, or subscribe to a collection using the redis wildcard subscriptions.
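A handler along these lines could look roughly like the sketch below. This is not the code from the linked project: the `PubSubHandler` name, the `logs` channel prefix, and the `FakeClient` stand-in are assumptions made for illustration. The only thing the handler needs from its client is a `publish(channel, message)` method, which a redis-py `redis.Redis` instance provides.

```python
import logging


class PubSubHandler(logging.Handler):
    """Publish each formatted log record to a pub/sub channel.

    `client` is any object with a publish(channel, message) method,
    for example a redis-py `redis.Redis` instance (an assumption here).
    """

    def __init__(self, client, prefix="logs"):
        super().__init__()
        self.client = client
        self.prefix = prefix

    def emit(self, record):
        try:
            # Map the logger name (foo.bar.baz) to a channel (logs.foo.bar.baz).
            channel = f"{self.prefix}.{record.name}"
            self.client.publish(channel, self.format(record))
        except Exception:
            # Never let a logging failure crash the application.
            self.handleError(record)


class FakeClient:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self.published = []

    def publish(self, channel, message):
        self.published.append((channel, message))


client = FakeClient()
logger = logging.getLogger("foo.bar.baz")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example quiet on stderr
logger.addHandler(PubSubHandler(client))
logger.info("disk usage at %d%%", 91)
```

With a real Redis client in place of `FakeClient`, any other client could follow a whole family of loggers with a pattern subscription such as `PSUBSCRIBE logs.foo.*`.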
On GitHub.
Original title and link: Watch System Logs in Real Time with Redis Pub/Sub (NoSQL databases © myNoSQL)
via: http://jedp.posterous.com/a-python-logging-handler-for-redis-pubsub
Friday, 28 January 2011
CouchDB for Timely and Budget Limited Projects
CouchOne has published another CouchDB success story:
The team felt a relational database and full web application framework would be overkill and inflexible. With these factors in mind they chose to use Flask and CouchDB. An additional factor that appealed to the developers was CouchDB’s native storage of JSON documents. They communicate and collect various data with the Facebook Graph API, which speaks JSON — so it just made sense to dump that data straight into CouchDB.
The final application was delivered in approximately one week, though the article forgets to mention the size of the team that worked on the project.
Original title and link: CouchDB for Timely and Budget Limited Projects (NoSQL databases © myNoSQL)
Thursday, 2 December 2010
redis_graph: Redis-based Graph Database for Python
redis_graph is a graph database implemented in Python. It shows how awesome Redis is as the implementation is under 40 lines of code.
The performance should be excellent, though scaling might be an issue. I would not recommend using it if you are storing millions of nodes.
Apart from being fun (as fun as storing a social graph in Redis or implementing a social graph using Redis), I’d say using a real graph database is probably a better approach.
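To give an idea of why so little code is needed, here is a minimal sketch of a directed graph kept in Redis-style sets. It is not the redis_graph source: the key names (`graph:out:…`, `graph:in:…`), the function names, and the `FakeSets` stand-in are assumptions. The functions only rely on `sadd`, `srem`, and `smembers`, which redis-py exposes under the same names.

```python
from collections import defaultdict


class FakeSets:
    """In-memory stand-in for the Redis set commands used below."""

    def __init__(self):
        self.data = defaultdict(set)

    def sadd(self, key, *members):
        self.data[key].update(members)

    def srem(self, key, *members):
        self.data[key].difference_update(members)

    def smembers(self, key):
        return set(self.data[key])


def add_edge(client, src, dst):
    # Store the edge twice so both directions can be read with one lookup.
    client.sadd(f"graph:out:{src}", dst)
    client.sadd(f"graph:in:{dst}", src)


def remove_edge(client, src, dst):
    client.srem(f"graph:out:{src}", dst)
    client.srem(f"graph:in:{dst}", src)


def successors(client, node):
    return client.smembers(f"graph:out:{node}")


def predecessors(client, node):
    return client.smembers(f"graph:in:{node}")


graph = FakeSets()
add_edge(graph, "a", "b")
add_edge(graph, "a", "c")
add_edge(graph, "b", "c")
```

Every node costs two sets, all of them held in memory, which is one way to see why millions of nodes could become a problem.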
Original title and link: redis_graph: Redis-based Graph Database for Python (NoSQL databases © myNoSQL)
Thursday, 4 November 2010
Pylons & MongoDB: User Registration & Login
For Pylons users:
About a year ago, Chris Moos came up with a nice tutorial on how to integrate Pylons with CouchDB. Well, times have changed: Pylons is now 1.0 (thus, syntactical differences), and I’m into MongoDB for my datastore. This document is revised/edited/rewritten/and updated for MongoDB.
Django users seem to be a bit more active in the NoSQL space though.
Original title and link: Pylons & MongoDB: User Registration & Login (NoSQL databases © myNoSQL)
via: http://devinfee.com/blog/2010/11/04/pylons-mongodb-user-registration-login/
Tuesday, 2 November 2010
Hadoop and Elastic MapReduce at Yelp
A story of using Hadoop at Yelp and migrating it to Amazon Elastic MapReduce:
We used to do what a lot of companies do, which is run a Hadoop cluster. We had a dozen or so machines that we otherwise would have gotten rid of, and whenever we pushed our code to our webservers, we’d push it to the Hadoop machines.
It was also not so cool. You couldn’t really tell if a job was going to work at all until you pushed it to production. But the worst part was, most of the time our cluster would sit idle, and then every once in a while, a really beefy job would come along and tie up all of our nodes, and all the other jobs would have to wait.
Yelp has released mrjob, their Python library for running MapReduce jobs on Hadoop or Amazon Elastic MapReduce, on ☞ GitHub.
Original title and link: Hadoop and Elastic MapReduce at Yelp (NoSQL databases © myNoSQL)
via: http://engineeringblog.yelp.com/2010/10/mrjob-distributed-computing-for-everybody.html
Friday, 15 October 2010
MongoDB: Designing Trees using mongodm
The first reason that drew me away from the great mongoengine API is that there’s no way to easily manage recursive trees.
Sounds like someone agrees with me.
Original title and link: MongoDB: Designing Trees using mongodm (NoSQL databases © myNoSQL)
via: http://dev-solutions.fr/post/designing-trees-in-mongodb-using-mongodm
Monday, 11 October 2010
MongoDB: Best Python Mapper
I have found on ☞ Quora a detailed comparison of two of the most popular MongoDB Python ODMs: ☞ MongoKit and ☞ MongoEngine (nb the post is using the term ORM, but I guess that’s just out of habit):
I prefer the manner of declaring field types in MongoEngine to MongoKit, but that’s just me. If you’re coming from something like the Django ORM, MongoEngine is very similar.
If you need to make on-the-fly modifications to document schemas at runtime, MongoKit is the way to go. MongoKit also allows you to bypass validation.
Finally, compare the MongoKit and MongoEngine documentation. I find the MongoEngine documentation a much more useful reference (it’s easier to navigate and read — very much my opinion though):
While I’m not a very experienced Pythonista, nor have I used any of these libraries, I must confess that I find both too heavily inspired by ORMs. There is structure in document databases, but enforcing all the rules and strictness of a relational model seems a bit too restrictive. Plus, it is unclear how you can actually take advantage of the schemaless nature of document databases when using these libraries.
Original title and link: MongoDB: Best Python Mapper (NoSQL databases © myNoSQL)
Wednesday, 15 September 2010
Redis: Implementing Auto Complete or How to build Trie on Redis
At a time when the news is all about instant search and autocomplete, Salvatore Sanfilippo (@antirez) shows how to use Redis sorted sets and the corresponding commands (ZRANGE, ZRANK) to implement autocompletion.
The initial code was written in Ruby and has already been ported to Python.
As Ilya Grigorik (@igrigorik) commented, this is building a ☞ Trie with Redis.
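The sorted-set technique can be sketched in Python roughly as follows. This is neither the original Ruby code nor its Python port, and the function names, the `autocomplete` key, and the `FakeSortedSets` stand-in are assumptions. Every prefix of a word is stored with score 0, and complete words carry a trailing `*` marker. Since equal scores are ordered lexicographically, ZRANK finds where a prefix sits and ZRANGE walks forward from there. The functions call `zadd(key, mapping)`, `zrank`, and `zrange` the way redis-py 3.x and later expose them. With a real client, `decode_responses=True` is assumed so that members come back as `str`.

```python
import bisect


class FakeSortedSets:
    """In-memory stand-in for ZADD/ZRANK/ZRANGE when all scores are equal."""

    def __init__(self):
        self.sets = {}

    def zadd(self, key, mapping):
        members = self.sets.setdefault(key, [])
        for member in mapping:  # scores ignored: every score is 0 here
            i = bisect.bisect_left(members, member)
            if i == len(members) or members[i] != member:
                members.insert(i, member)

    def zrank(self, key, member):
        members = self.sets.get(key, [])
        i = bisect.bisect_left(members, member)
        return i if i < len(members) and members[i] == member else None

    def zrange(self, key, start, stop):
        return self.sets.get(key, [])[start:stop + 1]  # stop is inclusive


def index_word(client, key, word):
    # Add every prefix, plus the full word marked with a trailing "*".
    entries = {word[:i]: 0 for i in range(1, len(word) + 1)}
    entries[word + "*"] = 0
    client.zadd(key, entries)


def complete(client, key, prefix, count=10, batch=50):
    results = []
    start = client.zrank(key, prefix)
    if start is None:
        return results  # nothing indexed starts with this prefix
    while len(results) < count:
        chunk = client.zrange(key, start, start + batch - 1)
        if not chunk:
            break
        start += len(chunk)
        for entry in chunk:
            if not entry.startswith(prefix):
                return results  # walked past the last candidate
            if entry.endswith("*") and len(results) < count:
                results.append(entry[:-1])
    return results


store = FakeSortedSets()
for word in ["foo", "foobar", "food", "bar"]:
    index_word(store, "autocomplete", word)
matches = complete(store, "autocomplete", "fo")
```

The price of this layout is space: a word of length n adds up to n + 1 members to the sorted set, although common prefixes are shared between words.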
Original title and link: Redis: Implementing Auto Complete or How to build Trie on Redis (NoSQL databases © myNoSQL)
Monday, 30 August 2010
Tornado Sees Some NoSQL Activity
Tornado, the non-blocking web server and tools open sourced by FriendFeed before its acquisition, seems to be seeing some NoSQL activity. Django is leading the way in the Python world, but judging by the NoSQL projects happening around Node.js, one could say that Tornado, with its non-blocking architecture, may be an interesting alternative.
Thomas Pelletier has ☞ a blog post about a simple websocket + Tornado + Redis Pub/Sub protocol integration:
The principle is very simple: when your user loads the page, she is automatically added to a list of “listeners”. An independent thread is running: it listens for messages from Redis with the subscribe command, and sends a message through Websocket to every registered “listener”. In this example, the user can send a message to herself with a simple AJAX-powered form, which calls a view with a payload (the message), and the view publishes it via the publish command of Redis.
This is basically a web chat! If you want to have fun, you can then add a roster, with a presence system, authentication etc…
There’s also a ☞ GitHub project called Trombi:
Trombi is an asynchronous CouchDB client for Tornado.
And I’m pretty sure there are other projects I’ve missed (but you can leave a comment to add them to the list).
Original title and link for this post: Tornado Sees Some NoSQL Activity (published on the NoSQL blog: myNoSQL)