Data science: All content tagged as Data science in NoSQL databases and polyglot persistence
A great decision flowchart created by Aaron Cordova to help answer the question: what tools should I use to process my data?
Original title and link: SQL or Hadoop: What Tools Should I Use to Process My Data? (©myNoSQL)
Watch this interview with DJ Patil, formerly LinkedIn chief scientist and now data scientist in residence at Greylock Partners, to find the answer.
Teaser: a passion for really getting to an answer.
After seeing the excerpt from Jonathan Harris’ talk at the Data Scientist Summit, I really wanted to post a link to some of the videos, but they are all behind a registration gateway. Just in case you want to watch them (there are indeed some interesting titles), you’ll find them here.
From the Wikibon blog infographic about data science and the data scientist:
Data science can be broken down into four essential parts:
- mining data: collecting and formatting the information
- statistics: information analysis
- interpret: representation or visualization
- leverage: implications of the data, application of the data, interaction using the data and predictions formed from studying it
The skills of a data scientist:
- Hacking and Computer Science: knowing how to take advantage of computers and the internet to create data-mining formulas
- Expertise in Mathematics, Statistics, Data Mining: Pulling important statistics and coherently organizing them using mathematic prowess and computer formulas
- Creativity and Insight: Knowing what statistics are important and how to leverage them
Over the years, folks have often asked me what kind of math am I using to create large scale, real-time, context accumulating systems (e.g., NORA). Some fond of Bayesian speculate I am using Bayesian techniques. Some ask if I am using neural networks or heuristics. A math professor said I was doing advanced work in the field of Set Theory.
My answer is always, “I don’t know any math. I didn’t finish high school. But I can explain how it works, step-by-step, and it is really quite simple.”
So data science starts with a passionate interest in the data. Then you add tools, processes, algorithms, and science to discover the secrets hidden inside the data.