Splunk: All content tagged as Splunk in NoSQL databases and polyglot persistence
Monday, 29 April 2013
Boundary for Splunk app for correlating alerts
Alex Williams for TechCrunch:
Boundary’s application performance monitoring technology is now integrated into Splunk’s enterprise platform, providing a window into apps that increasingly are distributed across cloud and on-premise virtualized environments.
At first I thought this meant Boundary would use Splunk as the backend for its data. But Boundary is a service, so that’s not the case. Plus, Splunk can already be used for network management and monitoring.
According to the post, “Splunk real-time alerts are tagged as annotations in Boundary’s time-series graphs. Customers can then correlate alerts against application flow and performance data.” So basically this is monitoring your monitoring system, right?
Original title and link: Boundary for Splunk app for correlating alerts (©myNoSQL)
Thursday, 14 March 2013
Hadoop and Splunk Use Cases
Good post from Splunk about the use cases where Hadoop and Splunk coexist and cooperate:
The Splunk and Hadoop communities can benefit from each other’s strengths. Below are several examples of customers that use both environments.
- Splunk then Hadoop
  - Splunk: collects, visualizes and analyzes the data
  - Hadoop: ETL and other batch processing
- Hadoop then Splunk
  - Hadoop: collects the data
  - Splunk: visualization
- Bi-directional: Splunk and Hadoop collect different artifacts and share the data that Hadoop needs for ETL or batch analytics and Splunk needs for real-time analysis and visualization
- Splunk monitors Hadoop
Original title and link: Hadoop and Splunk Use Cases (©myNoSQL)
via: http://blogs.splunk.com/2012/11/28/hadoop-and-splunk-use-cases/
Tuesday, 24 April 2012
Splunk Surges After Pricing Above Range in Web Data IPO
The stock, listed on the Nasdaq Stock Market under the symbol SPLK, climbed to $35.48 at the close in New York. The San Francisco-based company raised $229.5 million in its IPO, selling 13.5 million shares at $17 apiece, it said in a statement yesterday. Splunk’s market value of $1.57 billion at the time of the sale jumped to $3.28 billion.
This is a good answer to my question of why Splunk filed for an IPO. Plus, it’s a solid sign that investors see the potential of the Big Data market.
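The figures in the quote can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes the share count did not change between pricing and the close, so that market value scales linearly with the share price.

```python
# Figures quoted in the excerpt above.
shares_sold = 13.5e6        # shares sold in the IPO
ipo_price = 17.00           # USD per share
close_price = 35.48         # USD per share at the first-day close
ipo_market_value = 1.57e9   # USD, market value at the IPO price

proceeds = shares_sold * ipo_price
first_day_gain = close_price / ipo_price - 1

# Assumption: the number of shares outstanding is unchanged between
# pricing and the close, so market value scales with the share price.
implied_shares = ipo_market_value / ipo_price
closing_market_value = implied_shares * close_price

print(f"IPO proceeds:         ${proceeds / 1e6:.1f} million")
print(f"First-day gain:       {first_day_gain:.1%}")
print(f"Closing market value: ${closing_market_value / 1e9:.2f} billion")
```

Under that assumption the proceeds and the closing market value come out at the quoted $229.5 million and $3.28 billion, and the first-day gain works out to a bit more than a doubling of the IPO price.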
Original title and link: Splunk Surges After Pricing Above Range in Web Data IPO (©myNoSQL)
Monday, 19 March 2012
Big Data Market Analysis: Vendors Revenue and Forecasts
I think this is the first extensive Big Data report I’ve read that includes relevant and fairly exhaustive data about the majority of players in the Big Data market, plus some captivating forecasts.
As of early 2012, the Big Data market stands at just over $5 billion based on related software, hardware, and services revenue. Increased interest in and awareness of the power of Big Data and related analytic capabilities to gain competitive advantage and to improve operational efficiencies, coupled with developments in the technologies and services that make Big Data a practical reality, will result in a super-charged CAGR of 58% between now and 2017.
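To get a feel for what a 58% CAGR implies, here is a minimal sketch of the compounding. It makes two assumptions that are not in the quote: the base is the $5.1 billion total mentioned in the notes on this page (the quote only says "just over $5 billion"), and 2012 to 2017 is treated as five full compounding periods. The `project` helper is a name made up for this illustration.

```python
def project(base, cagr, years):
    """Compound `base` at an annual growth rate `cagr` for `years` years."""
    return base * (1 + cagr) ** years

base_2012 = 5.1   # USD billions; assumed base, see the lead-in
cagr = 0.58       # growth rate quoted in the report excerpt

# Assumption: one compounding period per calendar year from 2012 to 2017.
for year in range(2012, 2018):
    value = project(base_2012, cagr, year - 2012)
    print(f"{year}: ${value:.1f}B")
```

With these assumptions the 2017 figure lands at roughly $50 billion, about ten times the starting point.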

While there are many stories behind these numbers and many things to think about, here is what I’ve jotted down while studying the report:
- it’s no surprise that “megavendors” (IBM, HP, etc.) account for the largest part of today’s Big Data market revenue
- still, the revenue ratio of pure-players vs megavendors feels quite unbalanced: $311mil out of $5.1bil
- the pure-player category includes: Vertica, Aster Data, Splunk, Greenplum, 1010data, Cloudera, Think Big Analytics, MapR, Digital Reasoning, Datameer, Hortonworks, DataStax, HPCC Systems, Karmasphere
- there are a couple of names that position themselves in the Big Data market but do not show up anywhere (e.g. 10gen, Couchbase)
- this could lead to the conclusion that the companies that include hardware in their offering benefit from larger revenues
- I’m wondering, though, what the margins are in the hardware market segment. While I don’t have any data at hand, I think I’ve read reports about HP and Dell not doing so well precisely because of lower margins
- see bullet point further down about revenue by hardware, software, and services
- this could explain why so many companies are trying their hand at appliances
- by looking at the various numbers you can see that those selling appliances usually have a large corporation behind them, covering the production costs for hardware and probably the cost of the sales force
- in the Big Data revenue by vendor you can find quite a few well-known names from the consulting segment
- the revenue by type pie lists services as accounting for 44%, hardware for 31%, and software for 13% which might give an idea of what makes up the megavendors’ sales packages
- most of the NoSQL database companies and Hadoop companies are in the software and services segments
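The ratios in the notes above can be turned into concrete numbers with a short sketch. It uses only the figures quoted in these notes; the variable names are illustrative. The three revenue-by-type shares as quoted add up to 88%, so the sketch reports the remainder instead of assigning it to a category.

```python
total_revenue = 5.1e9       # USD, total Big Data market revenue from the notes
pure_play_revenue = 311e6   # USD, pure-player revenue from the notes

pure_play_share = pure_play_revenue / total_revenue

# Revenue-by-type shares exactly as quoted in the notes.
shares_by_type = {"services": 0.44, "hardware": 0.31, "software": 0.13}
revenue_by_type = {name: share * total_revenue
                   for name, share in shares_by_type.items()}
unaccounted = 1 - sum(shares_by_type.values())

print(f"Pure-player share of revenue: {pure_play_share:.1%}")
for name, revenue in revenue_by_type.items():
    print(f"{name:>8}: ${revenue / 1e9:.2f}B")
print(f"Share not covered by the quoted figures: {unaccounted:.0%}")
```

The pure players come out at about 6% of the total, which puts a number on how unbalanced the split is.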
Great job done by the Wikibon team.
Original title and link: Big Data Market Analysis: Vendors Revenue and Forecasts (©myNoSQL)
via: http://wikibon.org/wiki/v/Big_Data_Market_Size_and_Vendor_Revenues
Friday, 16 March 2012
Polyglot Persistence Architecture at Socialize: Splunk for MapReduce & Big Data Analysis
Very informative post on the Socialize blog about their data flow and the data analysis stack used to process it. The post is missing the architecture diagram, so I took the time to reconstruct it based on the details in the article:

[Diagram: Socialize architecture, reconstructed from the article]
The traditional solution is to use aggregate functions in the RDBMS such as count() to get the aggregate results but this presents a few problems at a large scale:
- Aggregating rows in a database creates unneeded load on the server
- Data could be stored in multiple sharded databases and the aggregated results would be inaccurate.
- Data could be stored in other datastore like a NoSQL datastore or even flat log files.
- Data is stored in an uncommon format across many sources.
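The problems listed above can be illustrated with a small map/reduce-style sketch in plain Python. This is not Socialize's pipeline and does not use any Splunk API: the sample data, the record formats, and the helper names (`from_row`, `from_log`, `map_source`, `merge`) are all hypothetical. Each source is first normalized to a common `(user, action)` shape, then partial results are computed per source and merged.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample data: two database shards and a flat log file.
shard_a = [{"user": "u1", "action": "like"}, {"user": "u2", "action": "share"}]
shard_b = [{"user": "u1", "action": "comment"}, {"user": "u3", "action": "like"}]
log_lines = ["u2 like", "u4 share"]

def from_row(row):
    """Normalize a database row to a (user, action) pair."""
    return row["user"], row["action"]

def from_log(line):
    """Normalize a 'user action' log line to a (user, action) pair."""
    user, action = line.split()
    return user, action

def map_source(records, normalize):
    """Map phase: compute a partial result for one source."""
    events = [normalize(record) for record in records]
    return {
        "actions": Counter(action for _, action in events),
        "users": {user for user, _ in events},
    }

def merge(left, right):
    """Reduce phase: combine two partial results into one."""
    return {
        "actions": left["actions"] + right["actions"],
        "users": left["users"] | right["users"],
    }

partials = [
    map_source(shard_a, from_row),
    map_source(shard_b, from_row),
    map_source(log_lines, from_log),
]
total = reduce(merge, partials)

# Summing per-source distinct counts double-counts users seen in several sources.
naive_user_count = sum(len(partial["users"]) for partial in partials)
print(dict(total["actions"]))
print(len(total["users"]), "distinct users; naive per-source sum:", naive_user_count)
```

In this toy data set, adding up the per-source distinct-user counts gives 6 while the merged result contains 4 users, which is the kind of inaccuracy the second bullet warns about when aggregates are computed shard by shard.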
Original title and link: Polyglot Persistence Architecture at Socialize: Splunk for MapReduce & Big Data Analysis (©myNoSQL)
Monday, 16 January 2012
Splunk, the Search Engine for Machine Data Company, Files for IPO. Why?
Splunk, the company that recently announced Shep, a solution combining Splunk’s tool for collecting, monitoring, analyzing, searching, and reporting on massive streams of real-time and historical machine data with Hadoop, has filed for an IPO.
Given the following facts:
- you are in a market (Big Data, Web of Things) that is confirmed to see tremendous growth
- you have over 3300 customers, including a majority of the Fortune 100
- your revenues almost doubled year-over-year
the real question to be answered is: why file for an IPO?
None of the posts I’ve read (TechCrunch, GigaOM, CTO Vision) gives any answers.
The very next question is who is going to be next in rushing to capitalize on the growing trends of Big Data. Many names spring to mind, but first, what are your bets?
Original title and link: Splunk, the Search Engine for Machine Data Company, Files for IPO. Why? (©myNoSQL)
Tuesday, 6 December 2011
Combining Splunk and Hadoop: Introducing Shep
Shep is what will enable seamless two-way data-flow across the systems (nb: Hadoop and Splunk), as well as opening up two-way compute operations across data residing in both systems.
- Query both Splunk and Hadoop data, using Splunk as a “single-pane-of-glass”
- Data transformation utilizing Splunk search commands
- Real-time analytics of data streams going to multiple destinations
- Splunk as data warehouse/marts for targeted exploration of HDFS data
- Data acquisition from logs and APIs via Splunk Universal Forwarder
And in case you don’t know much about Splunk, here’s a short interview with Erik Swan, CTO and co-founder of Splunk, recorded at Hadoop World 2011 by Barton George of Dell:
What I don’t understand, though, is why you would announce an open source project but keep it behind a private beta.
Original title and link: Combining Splunk and Hadoop: Introducing Shep (©myNoSQL)
Wednesday, 30 November 2011
Explaining Hadoop to Your CEO
Dan Woods (Forbes):
The answer is, yes, Hadoop could be helpful, but there are other technologies as well. For example, technologies such as Splunk allow you to explore big data sets in a way that’s more interactive than most Hadoop implementations. Splunk not only lets you play with big data; you can also distill it and visualize it. Pervasive’s DataRush allows you to write parallel programs using a simplified programming model, and then process lots of data at scale. 1010data allows you to look at a spreadsheet that has a trillion rows, as well as handle time series data. EMC Greenplum and Teradata Aster Data and SAP HANA will also want a crack at your business. If you take any of these technologies and combine them with QlikView, Tableau, or TIBCO Spotfire, you can figure out what a big data set means to your business very quickly. So if your job is understanding the business value of the data, Hadoop is one of many things that you should analyze.
Translation:
Blah blah blah Big Data, blah blah blah list of vendors, blah blah blah Big Data
It might even work for a dummy CEO.
Original title and link: Explaining Hadoop to Your CEO (©myNoSQL)
via: http://www.forbes.com/sites/danwoods/2011/11/03/explaining-hadoop-to-your-ceo/
