Shep is what will enable seamless two-way data-flow across the systems (nb: Hadoop and Splunk), as well as opening up two-way compute operations across data residing in both systems.
- Query both Splunk and Hadoop data, using Splunk as a “single-pane-of-glass”
- Data transformation utilizing Splunk search commands
- Real-time analytics of data streams going to multiple destinations
- Splunk as data warehouse/marts for targeted exploration of HDFS data
- Data acquisition from logs and APIs via Splunk Universal Forwarder
And in case you don’t know much about Splunk, here’s a short interview with Erik Swan, CTO and co-founder of Splunk, recorded at Hadoop World 2011 by Barton George of Dell:
What I don’t understand, though, is why anyone would announce an open source project but keep it behind a private beta.
Original title and link: Combining Splunk and Hadoop: Introducing Shep (©myNoSQL)