Cask Blog



Combining Hadoop and Spark in a Data Processing Pipeline

Tony Duarte

CDAP includes an Application Development Framework so that developers can build entire applications with existing Big Data technologies such as Apache Hadoop, Apache Spark, Apache HBase, Apache Hive, and more. CDAP has been used by Fortune 50 customers to help them do data ingestion and data egress from their data lakes and to help them …


CDAP 4.1 – More Enterprise-Grade Hardening, Pre-Built Solutions and Enhanced UX

Nishith Nand

We are happy to announce the release of Cask Data Application Platform (CDAP) version 4.1. This new release brings major enhancements and significant new capabilities in the platform, as well as new, ready-to-use solutions offered via Cask Market. CDAP 4.1 improves security by allowing fine-grained secure impersonation. It introduces replication so …



Monitoring Key Hadoop Operational Statistics using CDAP

Bhooshan Mogal

The Cask Data Application Platform (CDAP) is the first Unified Integration Platform for Big Data. It provides users with higher-level abstractions and APIs over complex, low-level systems for building Big Data applications. It does the heavy lifting involved in integrating various platforms in the Apache Hadoop ecosystem to provide a single end-to-end platform. To …


CDAP 4 – Introducing Cask’s Big Data App Store, Cask Market, plus Cask Wrangler, a new UI and more

Vinisha Vyasa

We are very happy to introduce the general availability of the 4th generation of Cask’s flagship product – CDAP 4. This release builds on what we learned over the past few years from our users and the community. This post summarizes the major enhancements in CDAP 4, namely, New & Revamped User Experience, Cask’s “Big …


Integrating CDAP with Microsoft Azure HDInsight

We recently announced the integration of CDAP with the Microsoft Azure HDInsight platform. This post will give a behind-the-scenes look at this integration. First, a bit about the integration itself. Azure HDInsight is an Apache Hadoop and Spark distribution powered by the cloud. This means that it handles any amount of data, scaling from terabytes …


Actions in Cask Hydrator

Chris Lu

This summer, as an intern at Cask, I had the opportunity to work on Cask Hydrator. Since its launch in 2015, Cask Hydrator has been a widely used and important application on CDAP that helps users easily build and run big data pipelines. I helped evolve Hydrator further by adding the Action function to it. …