Building a Scalable Data Lakehouse Architecture with Azure and Databricks - Brillio

About the Client:

The client is one of the EU’s largest debt management companies, on a mission to help its customers take control of their debt. The company was founded through the merger of the UK, Nordic, and German market leaders.

Customer Challenge

The client was struggling with disparate data sources, which led to multiple versions of the truth across business groups, reporting teams, and data science teams. Consequently, reports had to be generated manually, and accuracy validations required significant time and effort.

The existing on-premises applications were not designed to handle the size and complexity of the data: queries took weeks to complete and required repeated manual intervention to recover from errors. As a result, model and report deployment pipelines were delayed by years.

Additionally, acquisitions and organic growth introduced new source systems and datasets that could not be integrated into the existing data and analytics landscape.

Brillio’s Solution:

Following a comprehensive analysis of the client’s environment and requirements, we designed and built a scalable Data Lakehouse architecture using Azure Data Lake and Databricks Delta Lake to support complex data science products such as feature stores, sample selectors, and reporting datasets.
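
For illustration, here is a minimal PySpark sketch of the core lakehouse pattern described above: raw files landed in Azure Data Lake Storage are curated into a Delta table that business, reporting, and data science teams all query, giving one version of the truth. The storage account, container, paths, and table names are hypothetical, not the client’s.

```python
# Minimal lakehouse ingestion sketch (hypothetical names throughout).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

raw_path = "abfss://raw@examplelake.dfs.core.windows.net/debt_accounts/"               # hypothetical
curated_path = "abfss://curated@examplelake.dfs.core.windows.net/delta/debt_accounts"  # hypothetical

# Read raw CSV files and rewrite them as a Delta table. Delta Lake adds
# ACID transactions, schema enforcement, and time travel on top of the lake.
df = spark.read.option("header", "true").csv(raw_path)
df.write.format("delta").mode("overwrite").save(curated_path)

# Register the table (assumes a `curated` schema already exists) so SQL users,
# reporting tools, and data scientists all hit the same governed copy.
spark.sql(
    f"CREATE TABLE IF NOT EXISTS curated.debt_accounts USING DELTA LOCATION '{curated_path}'"
)
```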

Brillio developed and deployed end-to-end models using Databricks, MLOps, and DevOps practices, leveraging AKS to serve scoring models and Synapse to store batch scores.
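
As a hedged sketch of the batch-scoring step (the model name, JDBC URL, paths, and table names are placeholders, not the client’s), a registered model can be loaded from the MLflow Model Registry, applied to a Delta feature table as a Spark UDF, and the resulting scores written to Synapse through the Databricks Synapse connector:

```python
# Hypothetical batch-scoring job: MLflow model -> Delta features -> Synapse table.
import mlflow.pyfunc
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Wrap the registered model as a Spark UDF so scoring scales across the cluster.
score_udf = mlflow.pyfunc.spark_udf(spark, "models:/collection_propensity/Production")

features = spark.read.format("delta").load(
    "abfss://curated@examplelake.dfs.core.windows.net/delta/features"  # hypothetical
)
scored = features.withColumn("score", score_udf(F.struct(*features.columns)))

# Persist batch scores to Synapse; the connector stages data in ADLS (tempDir)
# and bulk-loads it into the warehouse.
(scored.write
    .format("com.databricks.spark.sqldw")
    .option("url", "jdbc:sqlserver://example.sql.azuresynapse.net:1433;database=edw")  # hypothetical
    .option("tempDir", "abfss://tmp@examplelake.dfs.core.windows.net/synapse-staging")
    .option("forwardSparkAzureStorageCredentials", "true")
    .option("dbTable", "dbo.batch_scores")
    .mode("overwrite")
    .save())
```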

  • End-to-end ML enablement using feature stores on the Delta Lakehouse and MLOps.
  • Enabled DataOps and MLOps with Azure Databricks, DevOps, and Azure Automation.
  • Used Synapse as the enterprise data warehouse to support Qlik Sense and Power BI reporting.
  • Automated support and maintenance using DevOps, Azure Logic Apps, and Azure Automation.
  • Built multi-nation GDPR compliance and PII management into the solution (a sketch of the erasure pattern follows this list).
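
One common way to implement the PII-management piece on Delta Lake is transactional right-to-erasure: GDPR deletion requests are applied as an ACID DELETE, after which VACUUM physically purges the old data files that still contain the erased rows. The table, column, and ID values below are illustrative, not the client’s.

```python
# Hypothetical GDPR right-to-erasure job on a Delta table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

erasure_requests = ["CUST-1001", "CUST-2042"]  # IDs pulled from a request queue (illustrative)
id_list = ", ".join(f"'{cid}'" for cid in erasure_requests)

# The DELETE is atomic: readers never observe a half-deleted state.
spark.sql(f"DELETE FROM curated.debt_accounts WHERE customer_id IN ({id_list})")

# Purge data files older than the retention window so deleted PII is physically
# removed, not just logically hidden (default retention is 7 days; shortening it
# trades away time travel / recovery).
spark.sql("VACUUM curated.debt_accounts RETAIN 168 HOURS")
```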

Business Impact: 

Following our implementation, the client achieved:

  • Improved Trust in Data – Cleansing, validation, automation, and reporting secured users’ trust in the data.
  • Data Lakehouse Architecture – ACID transactions at big-data scale, with cost-efficient, high-performance processing.
  • High-Performance Engineering – DevOps and automation partnership ensuring faster development-to-release cycles.
  • Improved User Experience
    • Model deployment time reduced from 8-12 months to 2-4 months.
    • 90% faster report development time.
    • 70% quicker new data source onboarding.
    • DataOps and MLOps enablement.

Let’s create something brilliant together!
