From MQTT to Power BI: Streaming IoT Data with HiveMQ, Azure & Databricks

How to move data from the edge to the data lake using data ingestion architectures for HiveMQ

September 8, 2025 · 3 min read

  • HiveMQ supports two ingestion paths into Azure: the Datalake Extension for batch-based storage and the Kafka Extension for real-time streaming  
  • An ETL pipeline using Delta Live Tables transforms raw "bronze" data into structured silver- or gold-level datasets ready for analytics  
  • Databricks’ SQL Warehouse interface exposes curated Delta Live Tables to Power BI, building a bridge between raw IoT data and business analytics  

Data Ingestion Architectures for HiveMQ

Turning MQTT messages into business-ready insights requires more than just a broker. It needs an end-to-end pipeline, from ingestion to transformation to visualization. At our company, we explored two main data flows to integrate HiveMQ with Azure and ultimately Power BI:

  • A Datalake Extension approach, which writes Parquet files directly to Azure Blob Storage  
  • A Kafka Extension approach, streaming messages into Azure Event Hub, processed via Stream Analytics  

Both routes land data in Databricks for structured transformation before reaching dashboards in Power BI. The best choice depends on your use case, but also on cost, latency, and flexibility. An overview of both methods and their data flow is shown below.

Diagram: the two ingestion routes of the HiveMQ architecture.

Using the Datalake Extension

HiveMQ’s Datalake Extension makes it easy to capture MQTT traffic and store it directly as Parquet files in Azure. This setup is well-suited for historical analytics and periodic data exploration. However, there are two important caveats:

  • It writes files in batch mode, meaning data is buffered before being written to storage, which introduces a delay and makes it less suitable for real-time analytics.  
  • It requires a Professional HiveMQ license, which comes with a higher cost compared to the Kafka Extension.  

This makes it ideal for scenarios where latency isn’t critical, but structured archival and downstream transformation are required.
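
To make that concrete, the snippet below is a minimal sketch of reading the batch-written Parquet files back out of Blob Storage for exploration. The container name, path prefix, and credentials are placeholders, not values from our setup.

```python
# A minimal sketch, assuming the Datalake Extension writes Parquet files into a
# Blob Storage container named "mqtt-bronze" under the prefix "hivemq/".
# Account, container, and credentials below are placeholders.
# Requires: pip install pandas pyarrow adlfs
import pandas as pd

bronze_df = pd.read_parquet(
    "abfs://mqtt-bronze/hivemq/",            # container/prefix written by the extension
    storage_options={
        "account_name": "mystorageaccount",  # placeholder storage account
        "account_key": "<access-key>",       # or a SAS token / Azure AD credential
    },
)

print(bronze_df.head())  # quick look at the raw "bronze" events
```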

Using the Kafka Extension with Azure Event Hub

The Kafka Extension enables HiveMQ to forward MQTT messages in real time using the Kafka protocol. When connected to Azure Event Hub (on the Standard tier), this unlocks real-time streaming into the Azure ecosystem.
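
To give a feel for what “speaking Kafka” to Event Hub looks like, here is a minimal consumer sketch against the Event Hubs Kafka endpoint; the namespace, event hub name, and connection string are placeholders.

```python
# A minimal sketch of a consumer reading the forwarded MQTT messages straight from
# the Event Hubs Kafka endpoint (Standard tier). Namespace, event hub name, and
# connection string are placeholders. Requires: pip install kafka-python
import ssl
from kafka import KafkaConsumer

EVENT_HUBS_CONN_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."

consumer = KafkaConsumer(
    "mqtt-events",                                             # the event hub acts as a Kafka topic
    bootstrap_servers="<namespace>.servicebus.windows.net:9093",
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username="$ConnectionString",                   # literal value required by Event Hubs
    sasl_plain_password=EVENT_HUBS_CONN_STR,
    ssl_context=ssl.create_default_context(),
    auto_offset_reset="earliest",
)

for record in consumer:
    # Each record value is an MQTT payload forwarded by HiveMQ's Kafka Extension
    print(record.topic, record.value)
```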

With Stream Analytics, incoming MQTT events can be instantly processed and written to Parquet files.

Compared to the Datalake Extension:

  • The Kafka Extension supports lower latency and parallel streaming  
  • It only requires a Starter HiveMQ license, making it more accessible in terms of licensing cost  
  • It separates concerns: HiveMQ sends raw messages, Azure handles transformation  
  • Azure costs, however, are higher, since Event Hub (Standard tier) and Stream Analytics add to the bill  

This architecture favors use cases that demand low-latency insights, operational monitoring, or alerting.

Transforming Bronze Data into Delta Live Tables

Whether data is ingested via the Datalake Extension or streamed through Event Hub, it ends up as raw Parquet files in Azure storage. These files are considered bronze-level data: they contain all events but are still unstructured and unfiltered.

In Databricks, we built an ETL pipeline using Delta Live Tables (DLTs) to process this data. This transformation step includes:

  • Filtering unnecessary records  
  • Adding contextual metadata  
  • Structuring the data into business-friendly formats  
  • Converting narrow-format event logs into wide-format tables suitable for BI tools  

The output of this ETL is considered silver or gold-level data, ready for use in decision-making, analytics, and visualization tools like Power BI.
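
As an illustration, the sketch below shows what such a pipeline can look like with the Python DLT API, covering only the bronze-to-silver step; the landing path, table names, and column names are assumptions for the example, not our exact pipeline.

```python
# A minimal Delta Live Tables sketch of the bronze-to-silver step; the landing
# path, table names, and columns are illustrative assumptions, and `spark` is
# provided automatically by the Databricks runtime.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw MQTT events landed as Parquet (bronze)")
def bronze_mqtt_events():
    # Auto Loader incrementally picks up new Parquet files from the landing zone
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .load("abfss://bronze@<storageaccount>.dfs.core.windows.net/hivemq/")
    )

@dlt.table(comment="Filtered and enriched MQTT events (silver)")
def silver_mqtt_events():
    return (
        dlt.read_stream("bronze_mqtt_events")
        .filter(F.col("value").isNotNull())                # drop unusable records
        .withColumn("ingested_at", F.current_timestamp())  # add contextual metadata
    )
```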

Power BI Integration

Once Delta Live Tables are created in Databricks, they can be exposed to Power BI through Databricks’ SQL Warehouse interface. This allows users to build interactive dashboards directly on curated datasets, without additional pipelines or exports.
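
For a programmatic check of the same curated tables, the sketch below queries the SQL Warehouse with the Databricks SQL connector for Python; hostname, HTTP path, token, and table name are placeholders. Power BI itself connects to the identical endpoint through its built-in Azure Databricks connector, using the same hostname and HTTP path.

```python
# A minimal sketch of querying a curated (gold) table through the same SQL Warehouse
# endpoint that Power BI points at. Hostname, HTTP path, token, and table name are
# placeholders. Requires: pip install databricks-sql-connector
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder workspace
    http_path="/sql/1.0/warehouses/<warehouse-id>",                # from the SQL Warehouse details
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM gold.sensor_readings_wide LIMIT 10")
        for row in cursor.fetchall():
            print(row)
```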

The result is a seamless bridge between raw IoT data and business analytics: MQTT data flows from HiveMQ to Azure, gets enriched and structured in Databricks, and is visualized in Power BI. Whether you're monitoring factory conditions or generating reports on energy usage, this flow gives you the full picture, from edge to insight.

Narrow vs Wide Format for BI

A key transformation that happens in Databricks is the pivot from narrow format to wide format:

  • In narrow format, each row represents a single measurement: timestamp, sensor ID, metric type, and value.  
  • In wide format, each row aggregates all metrics from a single sensor at a single timestamp—ideal for dashboards.  

Most ingestion pipelines (Kafka, Datalake) deliver narrow-format data. But BI tools like Power BI and Grafana work best with wide-format tables. The ETL step handles this reshape, making the data more intuitive and performant for end-users.
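
As a sketch of that reshape, here is a minimal PySpark pivot from narrow to wide format; the columns and metric names are illustrative rather than the actual schema.

```python
# A minimal PySpark sketch of the narrow-to-wide pivot; column and metric names
# are illustrative assumptions, not the actual schema.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

narrow = spark.createDataFrame(
    [
        ("2025-09-08T10:00:00", "sensor-1", "temperature", 21.5),
        ("2025-09-08T10:00:00", "sensor-1", "humidity", 48.0),
        ("2025-09-08T10:00:00", "sensor-2", "temperature", 19.8),
    ],
    ["timestamp", "sensor_id", "metric", "value"],
)

# One row per (timestamp, sensor_id); each distinct metric becomes its own column
wide = narrow.groupBy("timestamp", "sensor_id").pivot("metric").agg(F.first("value"))

wide.show()  # columns: timestamp, sensor_id, humidity, temperature
```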

Conclusion: Choosing the Right Flow

  Feature         | Datalake Extension          | Kafka + Event Hub
  ----------------|-----------------------------|-----------------------------------
  Latency         | Seconds to minutes          | Near real-time
  Licensing       | Requires Professional       | Works with Starter
  Complexity      | Lower                       | Higher (but scalable)
  Output Format   | Parquet                     | Parquet via Stream Analytics
  Transformation  | Databricks DLT              | Stream Analytics + Databricks DLT
  Best for        | Historical batch processing | Real-time dashboards & alerting

Both architectures are powerful. The Datalake Extension excels in simplicity and archiving. The Kafka Extension, on the other hand, brings real-time capabilities and a more flexible streaming pipeline—without requiring an expensive license.

For businesses looking to bridge industrial IoT with enterprise reporting, HiveMQ, Azure, and Databricks offer a robust and scalable stack.
