Gurdit Singh

Technical Blog

© 2026.

Part-3 'Transaction Log' : Delta Lake

13 Jun 2021
The Transaction Log is an ordered collection of JSON files that provides the latest version of a Delta table's state.

Part-2 'First Delta Table' : Delta Lake

05 Jun 2021
Let’s create our first Delta table! As in a database, we define the table and its schema, then store the data in Delta format.

Part-1 'Overview' : Delta Lake

30 May 2021
In this first part, we will look at the kinds of problems that arise in a typical data lake implementation.

Episode-5 'Spark-submit vs Apache Livy' : Spark Performance Tuning

22 May 2021
Once a user application is bundled, it can be launched with the spark-submit script or through the Apache Livy REST API.

Episode-4 'File Formats' : Spark Performance Tuning

09 May 2021
Apache Spark supports many file formats. Common ones are CSV and JSON; others, used mainly for big data analysis, are Apache ORC, Apache Parquet, and Apache Avro.