16 Jan 2021
A Kafka producer is an application that can act as a source of data in a Kafka cluster. A producer can publish messages to one or more Kafka topics.
09 Jan 2021
In Kafka, partitions are the units of storage for messages, and a topic can be thought of as a container in which these partitions lie.
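To make the container picture concrete, here is a minimal in-memory sketch, not Kafka itself. The `Topic` class and its methods are hypothetical names invented for illustration, and CRC32 is only a stand-in for the hash a real Kafka client would use when it maps a message key to a partition.

```python
import zlib


class Topic:
    """Toy stand-in for a Kafka topic: a named container of partitions."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        # Each partition is modelled as an append-only list of (key, value) pairs.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: CRC32 replaces the client's real partitioner hash.
        return zlib.crc32(key) % len(self.partitions)

    def append(self, key: bytes, value: bytes) -> tuple:
        """Store a message and return (partition index, offset in that partition)."""
        index = self.partition_for(key)
        log = self.partitions[index]
        log.append((key, value))
        return index, len(log) - 1
```

In this sketch, messages that share a key always land in the same partition, and the offset is simply the position of the message inside that partition's list.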
01 Nov 2020
You can use Apache Spark to generate a SURROGATE_KEY column that automatically assigns numerical IDs to rows as you enter data into a table.
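One way such IDs can be made unique without coordination between partitions is to pack a partition ID and a per-partition row counter into a single integer. The plain-Python sketch below mirrors the bit layout that Spark documents for `monotonically_increasing_id` (partition ID in the upper 31 bits, record number in the lower 33 bits). It is an assumption that the post relies on this scheme, and the function names `surrogate_key` and `assign_keys` are hypothetical.

```python
ROW_BITS = 33  # the lower 33 bits hold the row number within a partition


def surrogate_key(partition_id: int, row_in_partition: int) -> int:
    """Combine a partition ID and a row counter into one 64-bit style ID."""
    if row_in_partition >= (1 << ROW_BITS):
        raise ValueError("row counter does not fit in 33 bits")
    return (partition_id << ROW_BITS) | row_in_partition


def assign_keys(partitions: list) -> list:
    """Attach a surrogate key to every row of every partition."""
    return [
        [(surrogate_key(pid, i), row) for i, row in enumerate(rows)]
        for pid, rows in enumerate(partitions)
    ]
```

Keys built this way are unique and increasing within a partition, but they are not consecutive across partitions, so a dense sequence would need a different approach.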
28 Oct 2020
You can use Apache Spark to generate a SURROGATE_KEY column that automatically assigns numerical IDs to rows as you enter data into a table.
25 Oct 2020
Apache Spark already performs data processing in parallel: it runs multiple tasks on each executor. That parallelism does not extend to the job level, however.
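Spark's scheduler can run several jobs at once when they are submitted from separate threads, so a common pattern is to drive the actions from a thread pool. The sketch below shows only that driver-side pattern in plain Python. `run_job` is a hypothetical stand-in for a blocking Spark action (for example a `count()` or a write), and the job names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(name: str) -> str:
    # Hypothetical stand-in for a blocking Spark action; the sleep simulates work.
    time.sleep(0.1)
    return f"{name} done"


def run_jobs_in_parallel(names: list, max_workers: int = 4) -> list:
    """Submit each job from its own thread and collect results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, names))
```

Called from a single thread in a loop, the same jobs would run one after another, because each action blocks until it finishes.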