HiveMQ supports two ingestion paths into Azure: the Datalake Extension for batch-based storage and the Kafka Extension for real-time streaming.
An ETL pipeline using Delta Live Tables transforms raw "bronze" data into structured silver- or gold-level datasets ready for analytics.
Databricks’ SQL Warehouse interface exposes curated Delta Live Tables to Power BI, building a bridge between raw IoT data and business analytics.
Turning MQTT messages into business-ready insights requires more than just a broker. It needs an end-to-end pipeline, from ingestion to transformation to visualization. At our company, we explored two main data flows to integrate HiveMQ with Azure and ultimately Power BI: a batch-oriented path using HiveMQ’s Datalake Extension, and a real-time streaming path using the Kafka Extension together with Azure Event Hub.
Both routes land data into Databricks for structured transformation before reaching dashboards in Power BI. The best choice depends on your use case—but also on cost, latency, and flexibility. An overview of both methods and their dataflow is shown below.
HiveMQ’s Datalake Extension makes it easy to capture MQTT traffic and store it directly as Parquet files in Azure. This setup is well-suited for historical analytics and periodic data exploration. However, there are two important caveats: delivery is batch-based rather than real-time, and the extension requires a paid HiveMQ license.
This makes it ideal for scenarios where latency isn’t critical, but structured archival and downstream transformation are required.
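To get a feel for what lands in the lake, the Parquet files can be inspected directly from a Databricks notebook. The snippet below is a minimal sketch only: the storage account, container, and folder layout are placeholder assumptions, and the actual columns depend on how the extension is configured.

```python
# Minimal sketch: inspecting the Parquet files the Datalake Extension has landed.
# The storage account, container, and path are placeholders for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already available as `spark` in Databricks notebooks

bronze_path = "abfss://bronze@<storage-account>.dfs.core.windows.net/hivemq/"

bronze_df = spark.read.parquet(bronze_path)
bronze_df.printSchema()              # which MQTT fields (topic, payload, timestamp, ...) were captured
bronze_df.show(5, truncate=False)    # peek at a handful of raw events
```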
The Kafka Extension enables HiveMQ to forward MQTT messages in real time using the Kafka protocol. When connected to Azure Event Hub (on the Standard tier), this unlocks real-time streaming into the Azure ecosystem.
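Because Event Hub exposes a Kafka-compatible endpoint, any Kafka client can read the forwarded MQTT messages. The sketch below uses the confluent-kafka Python client; the namespace, event hub (topic) name, and connection string are placeholders, not values from our setup.

```python
# Minimal sketch: consuming HiveMQ-forwarded MQTT messages from the Event Hub Kafka endpoint.
# Namespace, event hub name, and connection string are placeholders.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "<namespace>.servicebus.windows.net:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "$ConnectionString",              # literal string required by Event Hub
    "sasl.password": "<event-hub-connection-string>",
    "group.id": "hivemq-debug",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["<event-hub-name>"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # Key and payload as forwarded by the HiveMQ Kafka Extension
        print(msg.key(), msg.value())
finally:
    consumer.close()
```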
With Azure Stream Analytics, incoming MQTT events can be processed as they arrive and written to Parquet files.
Compared to the Datalake Extension, this route delivers data with far lower latency and provides a more flexible streaming pipeline, without requiring an expensive extension license.
This architecture favors use cases that demand low-latency insights, operational monitoring, or alerting.
Whether data is ingested via the Datalake Extension or streamed through Event Hub, it ends up as raw Parquet files in Azure storage. These files are considered bronze-level data: they contain every event, but are still unrefined and unfiltered.
In Databricks, we built an ETL pipeline using Delta Live Tables (DLTs) to process this data. This transformation step includes filtering and structuring the raw events and reshaping them from narrow to wide format (described in more detail below).
The output of this ETL is considered silver or gold-level data, ready for use in decision-making, analytics, and visualization tools like Power BI.
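As a rough illustration of what such a pipeline can look like, the sketch below defines a bronze table over the landed Parquet files and a cleaned silver table on top of it, using the Delta Live Tables Python API. The path, column names, and filtering rule are illustrative assumptions, not our production code.

```python
# Minimal DLT sketch: bronze (raw Parquet) -> silver (typed, filtered).
# Path and column names are illustrative assumptions.
import dlt
from pyspark.sql import functions as F

BRONZE_PATH = "abfss://bronze@<storage-account>.dfs.core.windows.net/hivemq/"

@dlt.table(comment="Raw MQTT events as landed in the lake (bronze).")
def mqtt_bronze():
    return spark.read.parquet(BRONZE_PATH)

@dlt.table(comment="Cleaned and typed MQTT events (silver).")
def mqtt_silver():
    return (
        dlt.read("mqtt_bronze")
        .withColumn("value", F.col("payload").cast("double"))   # assumes numeric payloads
        .withColumn("event_time", F.to_timestamp("timestamp"))
        .filter(F.col("value").isNotNull())                     # drop malformed events
    )
```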
Once Delta Live Tables are created in Databricks, they can be exposed to Power BI through Databricks’ SQL Warehouse interface. This allows users to build interactive dashboards directly on curated datasets, without additional pipelines or exports.
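Before pointing Power BI at the warehouse, the curated tables can be sanity-checked over the same SQL Warehouse endpoint, for example with the databricks-sql-connector package. The hostname, HTTP path, token, and table name below are placeholders.

```python
# Minimal sketch: querying a curated table through the SQL Warehouse endpoint,
# the same endpoint Power BI connects to. All connection details are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="<workspace>.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM mqtt_silver LIMIT 10")
        for row in cursor.fetchall():
            print(row)
```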
The result is a seamless bridge between raw IoT data and business analytics: MQTT data flows from HiveMQ to Azure, gets enriched and structured in Databricks, and is visualized in Power BI. Whether you're monitoring factory conditions or generating reports on energy usage, this flow gives you the full picture, from edge to insight.
A key transformation that happens in Databricks is the pivot from narrow format to wide format: in the narrow form, each row holds a single reading (timestamp, sensor, value), while in the wide form each sensor gets its own column.
Most ingestion pipelines (Kafka, Datalake) deliver narrow-format data. But BI tools like Power BI and Grafana work best with wide-format tables. The ETL step handles this reshape, making the data more intuitive and performant for end-users.
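A sketch of this reshaping in PySpark is shown below: the narrow table has one row per reading, and the pivot turns each sensor into its own column. Column and sensor names are illustrative.

```python
# Minimal sketch: pivoting narrow sensor readings into a wide table.
# Column names and sensor IDs are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

narrow = spark.createDataFrame(
    [
        ("2024-01-01 10:00:00", "temperature", 21.5),
        ("2024-01-01 10:00:00", "humidity", 48.0),
        ("2024-01-01 10:05:00", "temperature", 21.7),
        ("2024-01-01 10:05:00", "humidity", 47.2),
    ],
    ["event_time", "sensor", "value"],
)

wide = (
    narrow.groupBy("event_time")
          .pivot("sensor", ["temperature", "humidity"])  # one column per sensor
          .agg(F.first("value"))
)
wide.show()  # one row per event_time, with 'temperature' and 'humidity' columns
```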
Both architectures are powerful. The Datalake Extension excels in simplicity and archiving. The Kafka Extension, on the other hand, brings real-time capabilities and a more flexible streaming pipeline—without requiring an expensive license.
For businesses looking to bridge industrial IoT with enterprise reporting, HiveMQ, Azure, and Databricks offer a robust and scalable stack.