Derrick Harris, writing about 28msec's still-in-stealth-mode generic query language:
Their solution was to create a platform able to extract data from any of
these sources, transform it into a standard format, and then let users
analyze it using a single query language that looks a lot like the SQL they
already know. 28msec is based on the open source JSONiq and Zorba query
languages and will be available as a cloud service.
This sounds like a variant of an ETL process: Extract-Transform-Query. But it got me thinking of what Daniel Abadi wrote about the difference between Hadapt and products such as PolyBase and HAWQ (just replace Hadoop with another source of data and SQL with JSONiq):
[…] they all can access data in Hadoop, but there needs to be some sort of structured schema defined in order for the database to understand how to access it via SQL. So, bottom line, Polybase/SQL-H/Hawq let you dynamically get at data in Hadoop/HDFS that could theoretically have been stored in the DBMS all along, but for some reason is being stored in Hadoop instead of the DBMS.
The question is not whether this process will work (ETL processes have been around for quite a while), but what you can do to optimize this extract-transform-query process.
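To make the extract-transform-query idea concrete, here is a minimal Python sketch. It is not 28msec's implementation and it does not use JSONiq; the two inline sources, the record shape, and the `extract_csv`, `extract_json`, `transform`, and `query` helpers are all hypothetical names made up for illustration.

```python
import csv
import io
import json

# Hypothetical inline sources standing in for "any of these sources".
CSV_SOURCE = "name,city\nada,London\ngrace,New York\n"
JSON_SOURCE = '[{"name": "linus", "city": "Helsinki"}, {"name": "alan", "city": "London"}]'


def extract_csv(text):
    """Extract: read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: parse a JSON array of objects."""
    return json.loads(text)


def transform(records, source):
    """Transform: normalize every record into one standard shape."""
    return [
        {"name": r["name"], "city": r["city"], "source": source}
        for r in records
    ]


def query(records, **conditions):
    """Query: a single entry point over normalized records (equality filters only)."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in conditions.items())
    ]


records = (
    transform(extract_csv(CSV_SOURCE), "csv")
    + transform(extract_json(JSON_SOURCE), "json")
)
londoners = query(records, city="London")
```

In this naive form every query pays the full extract and transform cost up front, which is one place where the optimization question above would apply, for example by caching normalized records or by pushing filters down to the source.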
Original title and link: 28msec - query data from any source in real time (©myNoSQL)