Data Science & The Role of the Data Scientist

From the Wikibon blog infographic about data science and the data scientist:

Data science can be broken down into four essential parts:

  • mining data: collecting and formatting the information
  • statistics: information analysis
  • interpret: representation or visualization
  • leverage: implications of the data, application of the data, interaction using the data and predictions formed from studying it
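To make the four parts concrete, here is a minimal sketch in Python. It is not taken from the infographic: the sample records, the function names (`mine`, `analyze`, `interpret`, `leverage`), and the choice of a simple least-squares line are all assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical raw records in "day,page_views" form; one entry is malformed.
RAW = ["1,120", "2,135", "3,n/a", "4,160", "5,178"]


def mine(raw):
    """Mining data: collect the records and format the usable ones."""
    rows = []
    for line in raw:
        day, views = line.split(",")
        if views.isdigit():  # drop records that cannot be parsed
            rows.append((int(day), int(views)))
    return rows


def analyze(rows):
    """Statistics: fit a least-squares line views = slope * day + intercept."""
    xs = [d for d, _ in rows]
    ys = [v for _, v in rows]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def interpret(rows):
    """Interpret: represent the data as a crude text bar chart."""
    return [f"day {d}: {'#' * (v // 20)}" for d, v in rows]


def leverage(model, day):
    """Leverage: use the fitted line to predict a future value."""
    slope, intercept = model
    return slope * day + intercept


if __name__ == "__main__":
    rows = mine(RAW)
    model = analyze(rows)
    print("\n".join(interpret(rows)))
    print("predicted views on day 6:", round(leverage(model, 6), 2))
```

Each function maps to one bullet above. A real pipeline would swap in proper collection, modeling, and visualization tools at every step.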

The skills of a data scientist:

  • Hacking and Computer Science: knowing how to take advantage of computers and the internet to create data-mining formulas
  • Expertise in Mathematics, Statistics, Data Mining: pulling important statistics and coherently organizing them using mathematical prowess and computer formulas
  • Creativity and Insight: knowing which statistics are important and how to leverage them

In a recent post titled "Data beats math", Jeff Jonas[1] wrote:

Over the years, folks have often asked me what kind of math am I using to create large scale, real-time, context accumulating systems (e.g., NORA).  Some fond of Bayesian speculate I am using Bayesian techniques.  Some ask if I am using neural networks or heuristics.  A math professor said I was doing advanced work in the field of Set Theory.

My answer is always, “I don’t know any math.  I didn’t finish high school.  But I can explain how it works, step-by-step, and it is really quite simple.”

So data science starts with a passionate interest in the data. Then you add tools, processes, algorithms, and science to discover the secrets hidden inside it.

[Infographic: The role of the data scientist]


  1. Jeff Jonas: Chief Scientist, IBM Entity Analytics Group and an IBM Distinguished Engineer  

Original title and link: Data Science & The Role of the Data Scientist (NoSQL databases © myNoSQL)