Very often I jump to using Python for any sort of data processing. And I totally forget about the powerful tools available on pretty much every Linux/Mac box [1].
Jeroen Janssens’s 7 command-line tools for data science presents 6 command line tools for fetching, filtering and transforming data: jq, json2csv, csvkit, scrape, xml2json, sample.
Then Leonardo Trabuco’s Working with data on the command line gives a quick roundup of the standard Linux tools.
If you understand the philosophy of Linux tools and get familiar with some of the tools listed above (I’ve never gotten too deep into some of them, and sed almost always tricks me), you’ll be able to do some nice data processing experimentation directly from the command line.
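As a rough sketch of the kind of experimentation I have in mind, here is a small pipeline that prints the most frequent values in one column of a CSV file. The choice of tools (tail, cut, sort, uniq, head) and the helper name `top_values` are my own assumptions for illustration, not something taken from either article, and the sketch assumes a simple CSV with a header row and no quoted commas.

```shell
#!/bin/sh
# Hypothetical helper: print the N most frequent values in one CSV column.
# Usage: top_values FILE COLUMN [N]
# Assumes a header row and fields that contain no embedded commas.
top_values() {
    file=$1
    column=$2
    n=${3:-3}                      # default to the top 3 values

    tail -n +2 "$file" |           # skip the header row
        cut -d, -f"$column" |      # keep only the requested column
        sort |                     # group identical values together
        uniq -c |                  # collapse each group into "count value"
        sort -rn |                 # highest counts first
        head -n "$n"               # keep the top N lines
}
```

With a hypothetical file `access.csv` whose second column is an HTTP method, `top_values access.csv 2 5` would list the five most common methods with their counts. Each stage can be run on its own to see what it emits, which is about as close to step-through debugging as a pipeline gets.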
[1] The one excuse I usually find for myself when doing this is that debugging the behavior of command-line tools is not as pleasant as debugging a Python script. _Sort of an OK argument, but still an excuse._
Original title and link: Data processing command line-style (©myNoSQL)