Twitter: All content tagged as Twitter in NoSQL databases and polyglot persistence
“There’s no better way to write general-purpose Hadoop MapReduce programs when specialized tools like Hive and Pig aren’t quite what you need.”
Watch the video and slides below.
✚ At Twitter, the creators of Scalding, different teams use different libraries for dealing with different scenarios.
✚ Dean Wampler is the co-author of the Programming Scala book, so his preference for Scala is understandable.
✚ Do you know any other teams or companies using Scalding instead of Cascading or Cascalog?
Original title and link: An Overview of Scalding ( ©myNoSQL)
Google’s paper about Dapper, their large-scale distributed systems tracing solution, which inspired Twitter’s Zipkin:
Here we introduce the design of Dapper, Google’s production distributed systems tracing infrastructure, and describe how our design goals of low overhead, application-level transparency, and ubiquitous deployment on a very large scale system were met. Dapper shares conceptual similarities with other tracing systems, particularly Magpie and X-Trace, but certain design choices were made that have been key to its success in our environment, such as the use of sampling and restricting the instrumentation to a rather small number of common libraries.
Download or read the paper after the break.
Interesting pull request submitted to Redis by Pierre-Yves Ritschard:
This is similar to what snowflake and the recent boundary solution do, but it makes sense to use redis for that type of use cases for people wanting a simple way to get incremental ids in distributed systems without an additional daemon requirement.
Snowflake is the network service for generating unique ID numbers at high scale used (and open sourced) by Twitter. Flake is an Erlang decentralized, k-ordered id generation service open sourced by Boundary.
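To make "k-ordered" concrete, here is a minimal Python sketch of a 128-bit ID that sorts roughly by creation time. The class name `KOrderedIdGenerator`, the injectable `clock` parameter, and the bit layout (64-bit millisecond timestamp, 48-bit worker id, 16-bit sequence) are assumptions chosen for illustration. They are not taken from the Redis pull request, and the real services may split the bits differently.

```python
import threading
import time


class KOrderedIdGenerator:
    """Illustrative 128-bit ID: timestamp | worker id | sequence (assumed layout)."""

    WORKER_BITS = 48    # assumption: e.g. a MAC address or configured node id
    SEQUENCE_BITS = 16  # assumption: per-millisecond counter

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id must fit in 48 bits")
        self.worker_id = worker_id
        # The clock returns milliseconds since the epoch; injectable for testing.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to hand out ids that would sort before earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence += 1
                if self._sequence >= (1 << self.SEQUENCE_BITS):
                    raise RuntimeError("sequence exhausted for this millisecond")
            else:
                self._last_ms = now
                self._sequence = 0
            # Timestamp in the high bits makes ids sort by time across workers,
            # up to clock skew; worker id and sequence break ties.
            shift = self.WORKER_BITS + self.SEQUENCE_BITS
            return (now << shift) | (self.worker_id << self.SEQUENCE_BITS) | self._sequence
```

Because the timestamp occupies the most significant bits, ids from different workers interleave in approximately chronological order without any coordination between nodes, which is the property that lets this kind of generator run without a central daemon.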
Original title and link: Redis: Adding a 128-Bit K-Ordered Unique Id Generator ( ©myNoSQL)