13 Jun 2021
The Transaction Log is an ordered collection of JSON files that provides the latest version of a Delta table's state.
More …
05 Jun 2021
Let’s create our first Delta table! As in a database, we define the table and its schema, then store the data in Delta format.
More …
30 May 2021
In this first part, we will understand what kind of problems it causes for a typical data lake implementation.
More …
22 May 2021
Once a user application is bundled, it can be launched using the spark-submit script or via the Apache Livy REST API.
More …
09 May 2021
Apache Spark supports many different file formats. Common ones are CSV and JSON; others, used mainly for big data analysis, are Apache ORC, Apache Parquet, and Apache Avro.
More …