


It Is About Apache Hive, but What Is a SerDe?

The original title of the article is “How-to: Use a SerDe in Apache Hive”, so I knew it had something to do with Hive, but I still had no idea what a SerDe is:

The SerDe interface allows you to instruct Hive as to how a record should be processed. A SerDe is a combination of a Serializer and a Deserializer (hence, Ser-De). The Deserializer interface takes a string or binary representation of a record, and translates it into a Java object that Hive can manipulate. The Serializer, however, will take a Java object that Hive has been working with, and turn it into something that Hive can write to HDFS or another supported system. Commonly, Deserializers are used at query time to execute SELECT statements, and Serializers are used when writing data, such as through an INSERT-SELECT statement.
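To make the quote a bit less abstract, here is a toy sketch of the two directions it describes. Everything in it is an assumption made for illustration: the `ToySerDe` interface, the `TabSeparatedSerDe` class, and the two-column record layout (an integer id and a name, separated by a tab) are hypothetical and are not Hive's actual SerDe API, which works with Hadoop `Writable` values and `ObjectInspector`s.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: these types are hypothetical and are NOT Hive's SerDe API.
class SerDeSketch {

    // A SerDe pairs the two directions described in the quote.
    interface ToySerDe {
        // Read path (think SELECT): raw stored record -> row object.
        List<Object> deserialize(String rawRecord);

        // Write path (think INSERT-SELECT): row object -> storable record.
        String serialize(List<Object> row);
    }

    // Handles records of the assumed form "<int id>\t<name>".
    static final class TabSeparatedSerDe implements ToySerDe {
        @Override
        public List<Object> deserialize(String rawRecord) {
            // limit = -1 keeps a trailing empty field instead of dropping it
            String[] fields = rawRecord.split("\t", -1);
            List<Object> row = new ArrayList<>();
            row.add(Integer.parseInt(fields[0])); // first column as an int
            row.add(fields[1]);                   // second column as a string
            return row;
        }

        @Override
        public String serialize(List<Object> row) {
            return row.get(0) + "\t" + row.get(1);
        }
    }

    public static void main(String[] args) {
        ToySerDe serDe = new TabSeparatedSerDe();
        List<Object> row = serDe.deserialize("42\thive");
        System.out.println(row);                  // prints [42, hive]
        System.out.println(serDe.serialize(row)); // prints the original tab-separated record
    }
}
```

The point is only the symmetry: the deserializer turns stored bytes or text into something the engine can work with, and the serializer goes the other way when results are written out.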

On one side we have the Spring framework with names like PreAuthenticatedGrantedAuthoritiesWebAuthenticationDetails[1], then we have YouAreDeadException, and we end with SerDe. No middle ground in the Java world.

  1. Jacek found this Spring class name, which has 59 characters. His post is from 2011, so who knows whether a longer one has appeared since then.

Original title and link: It Is About Apache Hive, but What Is a SerDe? (NoSQL database©myNoSQL)