Just a few months ago, I was seated next to Jon Gray, CEO of Cask, at a dinner I attended thanks to a very gracious invitation from Mike Olson at Cloudera. In my final two years at Intel, I helped create the Datacenter Software Division. That group was a startup inside Intel, and by the spring of 2014, after 24 years at a great company, I’d been bitten by the desire to try my entrepreneurial skills outside.
In almost a decade in the datacenter business at Intel, I saw the transformation of enterprise technology take root. Innovation in the datacenter is concentrated in two segments: high-performance computing, where performance demands are insatiable, and Internet services, where the demands for scale and efficiency are unprecedented.
Both of these segments are, by heritage, built on open source software, industry standards, and commodity hardware. I watched open source software evolve from offering cheaper versions of established packaged applications to driving innovation beyond anything else available. I believed then, as I do now, that we’re in the midst of what one of my heroes, Andy Grove, termed an inflection point: a shift away from proprietary enterprise technologies.
More recently at Intel, I was exposed to the impact and potential of big data solutions, particularly those based on Hadoop and related projects. My team launched Intel’s distribution of Hadoop and helped drive the strategy that led to Intel’s partnership with Cloudera. I think it’s virtually impossible to overestimate the impact big data solutions will have. We’re still in the infancy of harnessing the value of data, yet organizations are already transforming business models and delivering solutions and services that were impossible to imagine just a few years ago.
Yes, big data can be abused, and there are too many examples of questionable or even creepy uses of it. But I look at the power of big data to make our world safer by finding threats more quickly, to make our society healthier as it is applied to medicine, and even to make our planet more sustainable as we use data to find and use energy more effectively. You can call me an unqualified advocate for the power of big data technology.
In my time building a business around Hadoop, I observed two key challenges limiting organizations’ ability to tap the potential of their data with available technology. First, organizations need compelling use cases to drive business value, and batch processing can only take them so far. Second is the skills gap: there are plenty of analysts and programmers with skills in Java and SQL, but too few who can manage the complexity of Hadoop.
Back to my dinner with Jon. Not only did we hit it off personally, as I subsequently did with the rest of the Cask team, but I saw immediately the value the team had created to address both of these big challenges to Hadoop adoption. The Cask Data Application Platform (CDAP) uses data and application virtualization on Hadoop to enable a much broader range of use cases, including those requiring both batch and real-time processing. Just as importantly, CDAP abstracts many of the low-level details of the various Hadoop projects, making it possible for SQL and Java programmers to become Hadoop developers more quickly. When Jon shared with me the plan to release all of CDAP to open source, I was hooked.
Talented team, compelling technology, world-class investors, a huge market opportunity, and a group of people I’m thrilled to be working side by side with – what a hell of a combination! I wanted to find a great startup to follow my career at a great company, and just a few weeks in, I’m certain I found it.