Cask Blog

Bhooshan Mogal

Bhooshan Mogal is a Software Engineer at Cask, where he is working on making data application development fun and simple. Before Cask, he worked on a unified storage abstraction for Hadoop at Pivotal and personalization systems at Yahoo.

Event Based Triggers for CDAP Pipelines

Bhooshan Mogal

Data Engineering groups in large enterprises are typically decentralized. Teams develop specialized skill sets in particular areas of data processing, and have specific charters. For example, a team may be responsible for data acquisition. Another may be responsible for cleansing, transforming, normalizing and analyzing data. Another team of data scientists may be responsible for consuming … Read more


Monitoring Key Hadoop Operational Statistics using CDAP

Bhooshan Mogal

The Cask Data Application Platform (CDAP) is the first Unified Integration Platform for Big Data. It provides users with higher level abstractions and APIs over complex, low-level systems for building  Big Data applications. It does the heavy lifting involved in integrating various platforms in the Apache Hadoop ecosystem, to provide a single end-to-end platform. To … Read more


Powering BI with ODBC Connectors for CDAP

Bhooshan Mogal

Open Database Connectivity (ODBC) is the de-facto standard API for accessing data stored in relational databases. ODBC drivers allow applications across a variety of platforms (especially non-Java) to access relational databases in a manner independent from the implementation and the operating system. In this blog we will discuss the integration between CDAP Datasets and Tableau … Read more



Announcing CDAP 3.2 – Hydrator and much more!

Bhooshan Mogal

We are excited to announce the Cask Data Application Platform (CDAP) 3.2 release. This release brings many enhancements to existing CDAP features as well as lays the foundation for upcoming, advanced features—all designed to further simplify data application development. Cask Hydrator CDAP 3.2 introduces Cask Hydrator—a highly functional framework and UI to support self-service batch … Read more


CDAP Workflows: In Comparison with Apache Oozie

Bhooshan Mogal

Apache Oozie is a workflow scheduler system to manage Apache Hadoop™ jobs. It is one of the most popular open-source workflow scheduler systems for Hadoop. Cask Data Application Platform (CDAP) is an open-source platform to build and deploy data applications on Hadoop. CDAP provides abstractions on top of Hadoop that enable developers to rapidly build, … Read more


Multitenancy for Hadoop: Namespaces – Part II

Bhooshan Mogal

We introduced the concept of namespaces and how it helps to bring multitenancy to Apache Hadoop in a previous blog. We also briefly introduced the use of namespaces in CDAP,  leaving out the implementation details. In this blog we’ll discuss some of the requirements that influenced the design of namespaces in CDAP, as well as … Read more


Multitenancy for Hadoop: Namespaces

Bhooshan Mogal

As a data processing platform, Hadoop‘s popularity today is often attributed to its cost-effectiveness, derived equally from the usage of commodity hardware and from the ability to co-locate work on shared compute and storage resources. Sharing resources allows organizations to maximize the throughput and utilization of a small number of large clusters instead of managing a large … Read more