Cask Blog

Combining Hadoop and Spark in a Data Processing Pipeline

Tony Duarte

CDAP includes an application development framework that lets developers build entire applications with existing big data technologies such as Apache Hadoop, Apache Spark, Apache HBase, Apache Hive, and more. Fortune 50 customers have used CDAP to handle data ingestion into and data egress from their data lakes, and to help them …


A Hydrator Python Transform for Python nerds like you and me!

John Jackson

Before every CDAP release, we at Cask hold an internal hackathon to use CDAP and work on interesting features. A few Cask engineers got together and, wanting to open up Cask Hydrator to developers beyond the Java community, decided to build a transform that runs user-written Python. Beginning with CDAP release 3.2, the CDAP UI …