This is the second in a short series around Strata San Jose 2016. In part one, Hadoop Is Finally Over, we talked about new tween Hadoop, which celebrates its 10th birthday this year, and looked at the death of Hadoop, MapReduce, and visualization. If you haven’t read it yet, go check it out!
Not to get too George RR Martin on you, but there’s another death coming. It’s not who you think.
SPLUNK IS OVER…
[Disclosure: I work with Splunk in my day job, and have been a fan since the first t-shirt came out. These thoughts do not purport to represent Splunk or my day job.]
[Further disclosure: I’ll mention another company in this section, and I was a customer of the cofounder’s previous employer for many years. I worked with him in that role, and I think he’s cool. He did not pay me to say that. Yet.]
Splunk is also a tween now. They’ve been selling software for just over ten years, and I’d guess a fair number of their 11,000+ customers don’t even think of them as big data. But they are.
Every so often I’ve heard that Splunk is dead. The stock drops a few bucks, a new company comes out doing log/machine data analysis, someone finds an incriminating photo of a pony and thinks it’s Buttercup… But somehow they’re still growing, and are on track to become one of only two dozen or so software companies, in the entire history of software, to reach a $5 billion revenue level in the coming years.
That doesn’t mean Splunk is the only solution out there, or the best for every use case. If any one tool solved every problem at every scale for every customer, there would be little use for software innovation or startups. So when a new contender shows up, it’s worth a look, especially when its founders’ pedigree is impressive and pertinent.
Rocana, co-founded by Eric Sammer of Cloudera fame, is just about two years old now, and they’re looking to take a nibble out of Splunk’s market share. They’re focusing on Splunk’s original turf, operational analytics, and building their platform on top of Hadoop. Splunk has some capability there with Hunk, but Rocana does it natively, which wasn’t really an option ten years ago when Splunk released their first version.
As hinted several times earlier, not every solution fits every problem, no matter what kind of hammer you use. Splunk is making a lot of noise and a lot of headway in the security analytics space, with their security add-ons, the award-winning Enterprise Security premium offering, and the 2015 Caspida acquisition that became their User Behavioral Analytics (UBA) offering. Rocana isn’t looking at that space. They’re just looking at operational analytics at this point, building a customer base, adding and enhancing features, and getting settled into the market.
The other non-elephant in the room of course is Elastic, the commercial offering around the ELK (Elasticsearch, Logstash, and Kibana) stack. The components of ELK and Elastic are a bit more mature than the application stack for Rocana, but not nearly as mature as Splunk. There are also performance considerations around Elastic, although I’ve heard that an upcoming release will start to take a bite out of those issues.
Which should you use? Whatever fits your use cases of course. Yep, my favorite weaselly answer to this sort of question.
- If you’re looking for an established offering with pre-built integrations for hundreds of technologies and use cases, and with an established (and rabid in the best way) user base, Splunk should be first on your list.
- If you’re looking for a security platform based on your machine data and log aggregation, and don’t mind one that’s won a few awards, Splunk is also the place to start.
- If you’re focusing on operational analytics and aren’t swayed by SplunkBase, try Splunk and Rocana.
- If you’re focusing on operational analytics and have cost concerns around Splunk, either poke more enthusiastically at your Splunk sales rep or test-drive Rocana.
- If you’re a bit more on the DIY side, or are in an organization that strongly advocates for open source software, you will probably want to look at Elastic (and keep an eye on release notes to see how the hardware efficiency is going).
All three offerings have free trials and free entry-level tiers, so you can test them yourself.
And don’t be afraid to consider more than one tool. As I’ve explained at a couple of Splunk Live events (and .conf last year), lots of companies will use multiple tools for different use cases, and even stack them. Put Platfora or a custom dashboard in front of Splunk? Use Scribe or Flume or a Syslog replacement to dump everything into HDFS first? Sure, if it meets your needs, go for it. If nothing else, use your free tier license for each of the products, feed the same test data in, and see which gives you the best results, latency, performance, and cost efficiency.
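If the Flume-into-HDFS route sounds abstract, here’s a minimal sketch of what that plumbing can look like as a Flume agent config. The agent name (`agent`), the syslog UDP port (5140), and the `namenode` hostname and path are all assumptions for illustration; swap in your own.

```properties
# Minimal Flume agent: receive syslog over UDP, buffer in memory, land in HDFS.
agent.sources = syslog-in
agent.channels = mem
agent.sinks = hdfs-out

# Syslog UDP source listening on an assumed port 5140
agent.sources.syslog-in.type = syslogudp
agent.sources.syslog-in.host = 0.0.0.0
agent.sources.syslog-in.port = 5140
agent.sources.syslog-in.channels = mem

# In-memory channel (fine for a test drive; use a file channel for durability)
agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

# HDFS sink writing plain text files, partitioned by date
agent.sinks.hdfs-out.type = hdfs
agent.sinks.hdfs-out.channel = mem
agent.sinks.hdfs-out.hdfs.path = hdfs://namenode:8020/logs/%Y-%m-%d
agent.sinks.hdfs-out.hdfs.fileType = DataStream
agent.sinks.hdfs-out.hdfs.useLocalTimeStamp = true
```

Once the data is sitting in HDFS like this, any of the Hadoop-native tools (or Hunk) can query the same files, which is exactly the kind of stacking described above.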
WHERE DO WE GO FROM HERE?
Do a data audit in your environment, finding the data you have access to and can work with. Think about what business value you can derive from that data, and who can use it to your business’s benefit. Check out the trials and free tier offerings from Elastic, Rocana, and Splunk, to see which will move you forward down your data path(s).
Oh, and start thinking about whether .conf or Strata NYC is more your style this year. Strata+Hadoop World NYC is the last week of September at the Javits Center in Manhattan as usual, but for a change, .conf2016, Splunk’s worldwide user conference, is the same week at the Walt Disney World Swan & Dolphin facility.
[Spoiler: Since my teenager hasn’t been to Disney World since she was Splunk’s age, I will probably need to go to .conf this year.]