Strata Data NYC 2017 Recap: Multi-Cloud, AI and More

Max Herrmann

Max Herrmann is the Chief Marketing Officer at Cask, where he leads the company's marketing efforts across product, communications and demand generation. Prior to Cask, Max marketed data and cloud infrastructure at GridGain and Microsoft.


Last month in New York City, it was goodbye Strata+Hadoop World, welcome Strata Data Conference! While the name of one of the most important big data industry events has changed, its importance has not. Clearly, Hadoop was no longer the only (or even predominant) elephant in the room at this year’s conference, and even Apache Spark, with its potential to disrupt older data architectures, was less of a topic than the actual business problems that Spark, Hadoop and emerging cloud technologies can help solve in concert. Here is a quick summary of a few things that stood out for me at this year’s Strata Conference:

Hybrid and Multi-Cloud Environments are the New Normal

Data exists and is usually available for analytics wherever it is born or lands first – in data warehouses, operational databases, at rest or in flight, on-premises or in the cloud. Clearly, there is no one-size-fits-all solution (just as there hasn’t been for on-premises Hadoop or Spark clusters). Rather, IT teams in companies large and small are diligently evaluating their use cases and requirements, as well as different infrastructures and vendors, before deciding which technologies to choose and how to integrate them. In many cases, this will mean a hybrid, federated environment. As Jon Gray, Cask founder and CEO, pointed out during his Strata talk this year, such an environment will allow for optimized architectures intelligently mashed up from complementary data and cloud technologies, support increased workload and deployment flexibility (including ‘lift & shift’), and accelerate business agility. In fact, this year at Strata, it felt like the virtual, multi-cloud data lake has finally become a real “thing”. As one of its key benefits, it will allow IT architects to move data and workloads seamlessly between different types of environments, as long as ‘portability’ is designed in right from the start.

The Emergence of Unified Platforms

While I didn’t notice any new product categories introduced at this year’s Strata Conference, it struck me that a number of exhibiting vendors started to advertise “unified” capabilities in their offerings, perhaps as a way to tackle head-on any concerns around complexity that customers might associate with new multi-stack, multi-cloud big data environments. For example, Databricks promoted the performance, productivity and cost benefits of their “Unified Analytics Platform”, while iguazio touted their new “Unified Data Platform”. (Recognizing early on the importance of helping customers overcome the complexities of traditional point products and data silos, Cask introduced its own “Unified Integration Platform” for big data at last year’s Strata+Hadoop World in New York.)

Artificial Intelligence is Getting Real

Much has been written in the past couple of years about the rebirth of artificial intelligence, machine learning and deep learning, powered by the explosive growth of data and virtually unlimited compute cycles available in the cloud, and there was certainly no shortage of this type of content at this year’s Strata Conference either (more than half of all the keynote and breakout sessions at Strata made some reference to AI/ML/DL). In fact, often motivated by the desire to simplify the lives of data scientists and increase the level of automation, advanced analytics seems to be getting even more attention as a way to derive insights from data more efficiently and quickly, before the data becomes stale. At Strata, many exhibiting vendors appeared ready to provide AI/ML/DL extensions along with their traditional big data offerings.

Whether people refer to their big data initiatives as Hadoop-based, Spark-driven or just data and cloud environments, it definitely became clear at this year’s Strata Conference that a single technology stack will not meet their future infrastructure needs. If you are already thinking about the best way to tackle the integration challenges posed by multiple stacks, virtual data lakes and hybrid environments, take the CDAP Cloud Sandbox for a spin and send us your feedback and any questions via chat or our website.



