NoSQL Ecosystem News: All content tagged as NoSQL Ecosystem News in NoSQL databases and polyglot persistence
A simple in-process key/value document store for node.js. nStore uses a safe append-only data format for quick inserts, updates, and deletes. An index of all documents and their exact location on disk is also kept in memory for fast reads of any document. This append-only file format means that you can do online backups of the datastore using simple tools like rsync. The file is always in a consistent state.
nStore can be found on ☞ GitHub
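The append-only idea behind nStore is easy to sketch. The Python below is only an illustration of the general technique, not nStore's actual code or file format: the class name `AppendOnlyStore`, the one-JSON-record-per-line layout, and the `deleted` flag are all assumptions made for this example.

```python
import json
import os


class AppendOnlyStore:
    """Toy append-only key/value store (hypothetical, not nStore's format)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (byte offset, byte length) of the latest record
        open(path, "ab").close()  # create the file if it does not exist yet
        self._rebuild_index()

    def _rebuild_index(self):
        # Replay the log from the start; later records override earlier ones.
        offset = 0
        with open(self.path, "rb") as f:
            for line in f:
                record = json.loads(line)
                if record["deleted"]:
                    self.index.pop(record["key"], None)
                else:
                    self.index[record["key"]] = (offset, len(line))
                offset += len(line)

    def _append(self, record):
        # Existing bytes are never rewritten; every change is a new line.
        line = (json.dumps(record) + "\n").encode("utf-8")
        offset = os.path.getsize(self.path)
        with open(self.path, "ab") as f:
            f.write(line)
        return offset, len(line)

    def put(self, key, value):
        # Inserts and updates are the same operation: append, then repoint.
        self.index[key] = self._append(
            {"key": key, "value": value, "deleted": False}
        )

    def delete(self, key):
        # A delete is a tombstone record plus dropping the index entry.
        self._append({"key": key, "value": None, "deleted": True})
        self.index.pop(key, None)

    def get(self, key):
        # One seek and one read, thanks to the in-memory index.
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return json.loads(f.read(length))["value"]
```

Reopening the same file with `AppendOnlyStore(path)` replays the log and arrives at the same index, which is why a plain file copy can serve as a backup. A real implementation would also need to tolerate a truncated final record and to compact old records, neither of which this sketch attempts.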
- Richard Boulton: ☞ Using Redis as a backend for Xapian. An interesting analysis of how a dedicated search engine would work with a Redis backend. Meanwhile, others simply try to store the inverted index in Redis.
- Paul Rosania: ☞ Point-and-Click install of MongoDB on OS X 10.5+. Not that it was difficult before, but nice to have!
- Doug Judd: ☞ Why We Started Hypertable, Inc. … or welcome to the Hypertable Inc. blog.
- Surya Surabarapu: ☞ Terrastore Scala Client. First Terrastore library in our NoSQL libraries list
- Firstly a very nice trick from Christian: ☞ Couch Potato Bookmarklet - Lazy Features for CouchDB’s Futon: a bookmarklet for bulk deleting CouchDB documents
So, I created the CouchDB Potato Bookmarklet (because I’m lazy). The bookmarklet creates a new delete column and provides a “Delete Documents” link to delete all the checked documents. I also added a “Select All Documents” which only selects non-design documents (so that I don’t accidentally delete a CouchDB view). These links can be found in the right navigation column under the “Recent Databases” section.
- Embedded below is a video of J. Chris Anderson talking about CouchDB @ E-VAN
- Last, (my) internet has been “spammed” by tons of redirects to the article ☞ CouchDB Moves to the Cloud with Couchio. According to the article:
“We’ll be including Apache Lucene full-text indexing,” Katz said. “That’s an add-on for CouchDB that people usually have to download and build themselves.”
- Jake Luciani: ☞ I killed Thrudb for the love of Cassandra. Healthy reaction for the NoSQL community!
So what’s the message here? In 2007 there were very few nosql dbs. Today there are way too many of them. It’s time to consolidate around the best of breed. I can do my part by killing off Thrudb.
- awksedgreep: ☞ NoSQL, a DBAs Perspective. Another healthy reaction, this time from a DBA appreciating NoSQL solutions.
In order to summarize this post I’ll just start with this. We will shortly have Redis deployed in our production environment. I can’t really tell you how happy this makes me.
As a DBA, and more importantly an “Open Source DBA”, I find many of the new NoSQL options intriguing.
- Redis 2.0RC1 is out. You can download it from ☞ here and review the very long list of changes ☞ here.
- And because it is Friday we can have some fun with the CouchDB team’s rap song:
Now, if you think like @jpmens:
CouchDB needs very serious work on advertising; the sequel video is almost worse than the song. #fail
myNoSQL can help you with both the advertising and community ;-)!
- ☞ Hummingbird: a MongoDB-based real time web traffic visualization tool available on ☞ GitHub.
And a presentation of Hummingbird
- ☞ Smart Notes 2 Couch: a free, open source tool to migrate your Lotus Domino data to CouchDB.
For those of us who didn’t make it to the ☞ nosql:eu conference, I’ve extracted below some (hopefully most) of the most interesting tweets from the conference. I’ll also post slides of the presentations as they become available.
Check also the best tweets from the 1st day @ nosql:eu
- kevinweil: Modifying my talk in realtime for #nosqleu. Adding Cassandra, HBase, FlockDB to already existing discussion of Scribe, Hadoop, Pig.
- emileifrem: Cassandra is simply the best in its category. Check out @spyced’s latest deck: http://bit.ly/8ZgaDh #nosql #nosqleu
- maslett: RT @natishalom: @maslett with the planned support for memcache - gigaspaces turns memcache to a real NoSQL alternative IMO #nosqleu
Note: personally, I find that quite confusing. If Gigaspaces is no longer an elastic cache, then what is it?
- danharvey: #nosqleu question for today: how do you backup casandra / HBase for user/dev errors? The failure back up is built in.
- Werner: Arrived at #nosqleu for the first presentation of the day.
- awhitehouse: @werner We should challenge assumptions that DB partitioning papers make; sometimes smallest possibilities are treated as reality. #nosqleu
- AndySeaborne: #nosqleu ☞ http://www.mcjones.org/System_R/SQL_Reunion_95/sqlr95.html
- awhitehouse: @werner: We should all read “The 1995 SQL Reunion: People, Project, and Politics” ☞ http://www.mcjones.org/System_R/SQL_Reunion_95/index.html #nosqleu
- AndySeaborne: #nosqleu ☞ http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-TN-1997-018.pdf
- tlossen: “in real systems, there are no corners to cut” — werner vogels about the importance of occam’s razor in systems design #nosqleu
- hungryblank: RT @tlossen: “in real systems, there are no corners to cut” — werner vogels about the importance of occam’s razor in systems design #nosqleu
- matwall: @Werner #nosqleu nosql is about choice, not a fight between SQL and new tech.
- tlossen: “you should all read the multics book” — werner vogels #nosqleu
- martinbtt: #nosqleu @Werner “on the birth of dynamo”
- tlossen: “real systems are pretty nasty things” — werner vogels #nosqleu
- tlossen: “scaling amazon was all about the database, every year scaling out, scaling out ….” — werner vogels #nosqleu
- tlossen: “scalability, availability, performance, cost-effectiveness are all in the end dominated by data management” — werner vogels #nosqleu
- martinbtt: “The Amazon homepage is constructed by 200-300 different web services”. #Werner #soa #nosqleu
- maslett: Amazon CTO @werner: “It all comes down to data management… that’s where the scalability is… that’s where most of the costs are” #nosqleu
- tlossen: “i HATE eventual consistenty” — werner vogels #nosqleu
- maslett: .@werner: “What we all want is strongly consistent systems - this eventual consistency stuff is a compromise.” #nosqleu
- tlossen: “your customers will ALWAYS use your system in a way you did not expect” — werner vogels #nosqleu #dynamo
- mfiguiere: #nosqleu Werner Vogels: “Customer put something in the shopping cart, they are about to give you money, that should ALWAYS works !”
- monkchips: “In 2004 we felt we could no longer rely on commercial [relational] systems to operate at Amazon scale”. @werner vogels Amazon CTO, #nosqleu
- buzzkills: “there were no comercial systems that could support amaon’s scale” [for many of their use cases] @Werner #nosqleu
- tlossen: “at scale, ALL of this shit happens” — werner vogels on datacenter SNAFUs like flooding from the roof down etc. #nosqleu
- tlossen: “scaling amazon = upgrading cessna to 747 in mid-flight” — werner vogels #nosqleu
- tlossen: “object storage is FOREVER” — werner vogels on data outliving software #nosqleu
- tlossen: “don’t forget, hardward LIES to you!” — werner vogels #nosqleu
- awhitehouse: @werner: “Economies of scale are mostly about people” (and the knowledge they need to run your system) #nosqleu
- tlossen: “we really have to dive deep and understand all the problems from top to bottom” - werner vogels on INTELLECTUAL economies of scale #nosqleu
- beobal: “economies of scale are not just about technologies, it has a lot to do with people” @werner #nosqleu
- tlossen: “transparency is EVIL” — werner vogels about NFS etc. #nosqleu
- tlossen: “remember that storage is a very long-lasting relationship” — werner vogels #nosqleu
- maslett: .@werner: “We shouldn’t all be doing this.” #nosqleu Companies should be focused on their business, not their databases.
- simonw: Werner Vogels: “S3 is a better key/value store than Dynamo” (due to list/prefix operators) #nosqleu
- tlossen: “if you keep your system simple, it drives simplicity at the customer side as well” — werner vogels on importance of occam’s razor #nosqleu
- awhitehouse: “Simplicity needs to happen at the interface” … the API to your system drives the architecture @werner at #nosqleu
- seanparsons : @Werner’s talk at #nosqleu was illuminating about the focus on managing interaction between systems.
- CooperDino : WernerVogels at #NoSQLeu: When u do trillions of ops per day even the slightest probability becomes reality
- martinbtt : Fantastic talk by @Werner at #nosqleu - loads of useful tech nuggets to take away. Great start to the day so far.
- CooperDino : WernerVogels at #NoSQLeu: Bruce Lindsay & Jim Gray are our heroes, we should all read about their data sys work in the 70s
- CooperDino : @Werner at #NoSQLeu: Last time Amazon was down was 2004 & it was related to an RDB crashing
- CooperDino : @Werner at #NoSQLeu: 70% of storage operations in Amazon are key/value
- CooperDino : WernerVogels at #NoSQLeu: If u have2 jump thru lots of hoops 2use any DB then it prob wrong choice. #JOOB is fresh choice4 #dotNet
- CooperDino : @Werner at #NoSQLeu: Customers will not look at a DB in isolation, they will always look at where it sits in big picture
- matwall : Head buzzing from inspiring talk from @werner at #nosqleu
- monkchips : now @kevinweil (twitter’s analytics lead) presents via skype video… just showed us some very dark twitter offices ;-) #nosqleu
- awhitehouse : Big hand to @kevinweil for giving his talk from Twitter HQ at 3am local time. #nosqleu
- matwall : @kevinweil say twitter increase userbase by 300K per day, generate 7Tb of data *per day* #nosqleu
- buzzkills : Twitter gave up on syslog because it didn’t scale #nosqleu
- thobe : This is me contributing to the 300GB of twitter data generated while @kevinweil talk about it on #nosqleu
- tlossen : “you write log lines — scribe does the rest” — kevin weill about logging at scale #nosqleu
- buzzkills : @buzzkills apparently faceyb wrote scribe, Twitter are big contris #nosqleu (thx to @ianmeyers for correction)
- matwall : @kevinweil from Twitter describing their Scribe -> Hadoop -> Pig pipeline for data alanysis at #nosqleu Very interesting, I want one.
- tlossen : “want less java in your life? use pig!” — kevin weill, giving advice on hadoop #nosqleu
- matwall : @kevinweill on datamining user data: It’s easy to answer questions, it’s hard to ask the right questions. #nosqleu
- wwwicked : Loving the simplicity of a Pig script versus the equivalent Hadoop/Java code #nosqleu
- beobal : “value the system that promotes innovation, iteration” @kevinweil #nosqleu
- monkchips : facebook’s scribe at master - GitHub ☞ http://github.com/facebook/scribe a logging system for client performance data, also used by twitter. #nosqleu
- awhitehouse : @kevinweil: Twitter does most of its data analysis in Pig - scripts can call user-defined functions coded in Java (v. powerful) #nosqleu
- matwall : Twitter using Apache Mahout coupled with Pig for machine learning when examining user behaviour #nosqleu
- andrewgarner : Totally sold on Pig #nosqleu
- wwwicked : A friend of mine said “NoSQL is retarded”. The more I’ve heard over the past 2 days, more more I realise he’s wildly wrong #nosqleu
- emileifrem : @wwwicked Term is retarded. Notion all RDBMSes will be replaced is retarded. That we’re heading to a polyglot persistence era isnt. #nosqleu
- monkchips : “we’re trying to move all tweets to Cassandra”. @kevinweil Twitter #nosqleu
Note: You can read the whole story in the myNoSQL exclusive Cassandra @ Twitter: An interview with Ryan King
- tlossen : “better eventual consistency than POTENTIAL consistency” — kevin weil on reasons to use cassandra at twitter #nosqleu
- maslett : Twitter is working with Digg to create real-time analytics for Cassandra. Plans to open source. #nosqleu
- msk_y : RT @buzzkills: Twitter store their log files in Lzo compressed, protocol buffers format on hdfs #nosqleu
- kingsleydavies : #nosqleu CouchDB used at BBC - typically used as a KVS and is used in iPlayer and parts of the homepage…
- tlossen : “you can throw rocks and stones at it, and it just keeps going” — enda farrell (bbc) about robustness of couchdb #nosqleu
- matwall : @endafarrell CouchDb restarts in < 1sec. Occasionally restart in production as restarts are far less than TCP timeout! #nosqleu
- tlossen : enda farrell shared a neat idea: “pre-sharding” — running 4 instances of couchdb on every node [couchdb @ bbc talk] #nosqleu
- matwall : @endafarrell “Having things that just work and are simple from the users perspective is brilliant” #nosqleu
- CooperDino : #NoSQLeu: BBC web site handles 200m requests per day on 1.5TB of data using 8 servers & #CouchDB
- monkchips : exciting! presentation at #nosqleu from Comcast chief engineer @jon_moore : Why Big Enterprises Are Interested in NoSQL
- matwall : Agree with @jon_moore at #nosqleu : storage is a means to a business end, nosql contains intrinsic risk
- benoitc : idealized api of comcast looks like the #couchdb one get,post, get _views #nosqleu
- matwall : @jon_moore at #nosqleu Can I add more capacity without adding too many more sysadmins? Can my admins work 9-5?
- matwall : @jon_moore at #nosqleu Is there a company behind product to provide operational support? Important for commoditization
- monkchips : surprising requirement of the #nosqleu conference. NoSQL providers take note: Enterprises expect JMX support. java ain’t dead. devops?
- wwwicked : #nosqleu @jon_moore made a fair point re: my comment about analytics on KV stores; may not be best idea but “they” will want to do it anyway
- timanglade : Totally awesome break-down of the CAP theorem (in the context of Multiple Datacenters) by the amazing @jon_moore. Refreshingly enlightening.
- kingsleydavies : loving the name *Tokyo Tyrant* and a great, upbeat start to @makoto_inoue preso… #nosqleu
- matwall : @makato_inoue Says that @al3xandu’s site myNoSL is like “Hello magazine for nosql” :) #nosqleu
- matwall : Can we have a 3 hour workshop with @makato_inoue please? He’s great! #nosqleu
- kingsleydavies : +1 yeah… I fear we wont have enough time :-( RT @matwall: Can we have a 3 hour workshop with @makato_inoue please? He’s great! #nosqleu
- maslett : great presentation on the highly random world of Tokyo Cabinet/Tyrant by @makoto_inoue #nosqleu
- maslett : Quote of the day: “myNoSQL is the Hello magazine of NoSQL" #nosqleu
- michaeltiberg : #nosqleu conference is to an end - attendees seems to be satisfied and that makes my day
Check also the nosql:eu presentations from 1st day
On the Birth of Dynamo - Werner Vogels
Nothing here yet :-(.
Twitter’s use of Cassandra, Pig and HBase - Kevin Weil
Slides from Kevin Weil’s (@kevinweil) presentation on Twitter’s use of Cassandra, Pig and HBase
CouchDB at the BBC - Enda Farrell
Nothing here yet :(
Why Big Enterprises are Interested in NoSQL - Jon Moore
Slides from Jon Moore’s (@jon_moore) presentation: Why Big Enterprises are Interested in NoSQL
Memory as the New Disk: Why Redis Rocks - Tim Lossen
Slides from Tim Lossen’s (@tlossen) presentation: Memory as the New Disk: Why Redis Rocks
Tokyo Cabinet, Tokyo Tyrant and Kyoto Cabinet - Makoto Inoue
Nothing here yet :(
Notes from the field: NoSQL tools in Production - Matthew Ford
Slides from Matthew Ford’s (@matthewcford) presentation: Notes from the field: NoSQL tools in Production
nosql:eu live twitter stream
According to the latest news from ☞ nosql:eu (in case you didn’t make it, you can still follow the nosql:eu live twitter feed), Jonathan Ellis (@spyced), project chair for Cassandra and ex-Rackspace, has started a company named ☞ Riptano to focus on Cassandra. While I don’t yet have details of Riptano’s plans (services, development, Cassandra-as-a-Service, etc.), I’m wishing Jonathan and Matt good luck!
Update: According to Matt Pfeil, Riptano will initially focus on 3 products: technical support, professional services, and training.
CNET recently published ☞ an interview with Matt.