NoSQL event: All content tagged as NoSQL event in NoSQL databases and polyglot persistence
Wednesday, 7 March 2012
The Continuing Story of Hadoop: Summarizing the Strata Conference
Jeff Kelly summarizing the Strata conference:
You know a technology is headed to the mainstream when the two “Elite” sponsors of the premier event designed to showcase that technology are Microsoft and EMC. Neither company is known for adopting and promoting emerging open source technologies, to put it mildly. But there they both were at Strata Conference, the event dedicated to open source Big Data approaches like Hadoop and NoSQL, topping the list of event sponsors. They were followed not far behind by fellow IT giants and Strata “Impact” sponsors IBM and Oracle.
Filing this under great events I’m missing while being 10,000 miles away.
Original title and link: The Continuing Story of Hadoop: Summarizing the Strata Conference (©myNoSQL)
via: http://wikibon.org/blog/strata-conference-the-continuing-story-of-hadoop/
Friday, 1 July 2011
Hadoop Summit 2011 in Review
For those of us who weren’t at Hadoop Summit 2011:
Ryan Rosario
The main takeaway from Hadoop Summit 2010 was Cascalog. I predict the main takeaway from Hadoop Summit 2011 is Spark.
Anant Jhingran
My essential points are that the “birthers” (where hadoop has been born) and “adopters” (where hadoop will be used in enterprises) have a strong intersection today, modulo some extras on both sides…
However, at t = 3 years from now, we can either go separate ways because of different demands… or come together […]
Dave Cahill
[Hadoop] No longer a West Coast early adopter phenomenon. Hadoop isn’t quite mainstream, but almost, not quite at enterprise level purchasing but getting close.
Barton George interviewing Eric Baldeschwieler
A four-minute interview with Eric Baldeschwieler, CEO of Hortonworks, the Yahoo! Hadoop spin-off:
Announcements
- Cloudera Enterprise 3.5: full lifecycle management of Apache Hadoop deployments, featuring the Service and Configuration Manager, Activity Monitor, and enhancements to Resource Manager and Authorization Manager
- Karmasphere Studio Community Hadoop Virtual Appliance for developers: a free virtual machine image including Apache Hadoop, Ubuntu Linux, the Eclipse IDE and Karmasphere Studio Community.
Last but not least, you can read Derrick Harris’ overview post.
Sunday, 19 June 2011
Data Scientist Summit Videos
After seeing the excerpt from Jonathan Harris’ talk at Data Scientist Summit, I really wanted to post a link to some of the videos. But they are all behind a registration gateway. Just in case you want to watch them (there are indeed some interesting titles), you’ll find them here.
Monday, 16 May 2011
Turning BigData into Stories
Ryan Rosario summarizing a panel from Data Scientist Summit, featuring Pete Skomoroch (LinkedIn), Sharon Franks Chiarella (Amazon Mechanical Turk), Gil Elbaz (Factual) and Toby Segaran (Google):
you can’t turn data into a story without joining the data with, well, other data.
Sunday, 27 March 2011
The Many Faces Of MapReduce - Hadoop and Beyond
The best panel from Structure Big Data 2011. Featuring Amr Awadallah[1], Mike Hoskins[2], Dwight Merriman[3], Todd Papaioannou[4], Ben Werther[5], the DataStax Brisk official announcement, and a cool parallel between Hadoop processing and cooking approaches from Amr. A must-see.
Videos from MongoUK Event Thanks to SkillsMatter
10gen continued its MongoDB popularization tour around the world with three events in Europe: London, Paris, and Berlin. SkillsMatter, the organizers of MongoUK, have recorded all the sessions and made them available here.
Here is the list of the talks:
- Welcome by Eliot Horowitz
- Nosh Petigara: Building your 1st MongoDB application
- Richard Kreuter: Mastering the MongoDB shell
- Meghan Gill: MongoDB community resources
- Richard Kreuter: Schema design: data as documents
- Mathias Stearn: MongoDB Internals: Storage Engine
- Graham Tackley: MongoDB at the Guardian
- Russell Smith: Geo & Capped collections with MongoDB
- Richard Kreuter: Indexing and Query Optimizer
- Geoff Watts: BSON and ZMQ
- Mathias Stearn: Administration
- Eliot Horowitz: Open Q&A with Eliot Horowitz
- Ashok Subramanian & Stephen Rose: Project Phoenix
- Philipp Krenn: Morphia: MongoDB for Java Developers
- Eliot Horowitz: Scaling with MongoDB
- Neil Bartlett: MongoDB as a backing store of Eclipse MF
- Nosh Petigara: Deployment strategies
- David Mytton: Monitoring MongoDB
- Eliot Horowitz: MongoDB Project Roadmap
Friday, 25 March 2011
Does Big Data Need Big Budgets?
If you asked me this question, I’m sure my initial answer would be: “absolutely”. And I guess I would not be alone. But is that the right answer?
While watching GigaOm’s Structure Big Data event, there were two talks that gave me a different perspective on this question.
The first was the interview with Kevin Krim, the Global Head of Bloomberg Digital, who told the story of adopting, mining, and materializing Big Data inside a corporation that neither believed in it nor allocated large budgets to it. The result: collecting more than a terabyte of data every day from 100 data points for every pageview and running 15 different parallel algorithms to make recommendations that sometimes led to 10x clickthrough rates. The interview is embedded at the end of this post.
The second story, coming from Pete Warden, founder of OpenHeatMap, is even more exciting. Pete used a combination of the right tools deployed on the cloud to mine Facebook data: 500 million pages for $100 (that was the cost before being sued by Facebook).
Pete Warden distilled his experience with these tools and has made available at datasciencetoolkit.org a collection of data tools and open APIs, both as an Amazon AMI to run on the cloud and as a VMware image to run locally. I highly recommend watching Pete’s talk, which I’ve embedded below.
While it depends on which definition of Big Data we use, both talks lead to a simple conclusion:
- you need imagination to get started with Big Data
- you need to use the right tools for getting good results
Is this going to work at the scale of Twitter, LinkedIn, Facebook, Google? Probably not. But before getting to that size, you need to start somewhere. And both these talks suggest a clear answer to the question “does big data need big budgets?”: not always.
Monday, 14 March 2011
Hadoop and NoSQL Databases at Twitter
Three presentations covering the various NoSQL usages at Twitter:
- Kevin Weil talking about data analysis using Scribe for logging, base analysis with Pig/Hadoop, and specialized data analysis with HBase, Cassandra, and FlockDB on InfoQ
- Ryan King’s presentation from last year’s QCon SF NoSQL track on Gizzard, Cassandra, Hadoop, and Redis on InfoQ
- Dmitriy Ryaboy on Hadoop from Devoxx 2010:
Judging by the Powered by NoSQL page and my records, Twitter seems to be the largest adopter of NoSQL solutions. Here is an updated list of who is using Cassandra and HBase:
- Twitter: Cassandra, HBase, Hadoop, Scribe, FlockDB, Redis
- Facebook: Cassandra, HBase, Hadoop, Scribe, Hive
- Netflix: Amazon SimpleDB, Cassandra
- Digg: Cassandra
- SimpleGeo: Cassandra
- StumbleUpon: HBase, OpenTSDB
- Yahoo!: Hadoop, HBase, PNUTS
- Rackspace: Cassandra
And probably many more are missing from the list, but that could change if you leave a comment.
Thursday, 17 February 2011
Facebook Messages: FOSDEM NoSQL Event
From this year’s FOSDEM, Facebook talking about the technology behind the messaging platform:
Tuesday, 8 February 2011
Reconstructing Linked Data and Graph Databases
ReadWriteWeb has published a very interesting story about a project presented at last week’s Strata conference that aims to reconstruct linked data from public data sources like Flickr and OpenStreetMap using a somewhat classical “fuzzy matching” approach.
build a detailed database of information about places in Afghanistan, using only public sources on the Web. The goal is to describe in detail the towns and cities including everything from names, locations and populations, as well as lists and coordinates for schools, mosques, banks and hotels.
My gut feeling is that mixing in a graph database would not necessarily make this problem easier to address, but it would bring in a different angle from which to tackle it. Fuzzy matching is a search-based approach with an inductive flavor, while using a graph database could bring in a deductive approach.
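To make the contrast a bit more concrete, here is a minimal Python sketch. It is not the method used by the project described above; the place names, the 0.8 threshold, and the "located in" relation are all assumptions made up for illustration. The first half scores name similarity with the standard library's `difflib` (the fuzzy, search-like side). The second half follows explicit edges in a tiny in-memory dictionary, standing in for a graph database, to derive facts that were never stored directly (the deductive side).

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(name, candidates, threshold=0.8):
    """Return the most similar candidate at or above the threshold, else None.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    best = max(candidates, key=lambda c: similarity(name, c), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return best
    return None


# Hypothetical graph: each entry reads "key is located in value".
LOCATED_IN = {
    "Example School": "Example District",
    "Example District": "Example City",
    "Example City": "Example Province",
}


def containers(place, edges):
    """Follow 'located in' edges to list every region that contains `place`."""
    found = []
    seen = {place}  # guards against cycles in the edge data
    while place in edges and edges[place] not in seen:
        place = edges[place]
        seen.add(place)
        found.append(place)
    return found
```

With these definitions, `fuzzy_match("Kandahar", ["Qandahar", "Kabul"])` returns `"Qandahar"` because the two spellings are similar enough, while `containers("Example School", LOCATED_IN)` walks the stored edges to conclude that the school is in the district, the city, and the province, even though only the first of those facts was recorded for it.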
Friday, 31 December 2010
Happy New Year SQLers and NoSQLers
Just want to wish all readers and friends I’ve made over here a great and exciting 2011. Happy New Year!
Now let’s get the party started!
Sunday, 5 December 2010
To SQL or not to SQL Panel at CODEBITS IV
A panel discussion on NoSQL, NoSQL databases, and relational databases, featuring Salvatore Sanfilippo[1], Lenz Grimmer[2], Filipe David Borba Manana[3], and a fourth person from SAPO whose name I couldn’t spell: