Data Lakehouse Implementation & Regression Tracking

Problem

The client lacked visibility into device runtime and startup behavior and performance, which made data-driven decisions difficult. They needed a solution to process and analyze data from multiple sources and to monitor field reports and the feature lifecycle, as a step toward becoming a data-driven company.

Solution

Our team addressed the client’s problem by implementing a Data Lakehouse System. The system ingests, processes, stores, analyzes, and visualizes data from multiple sources, and it serves as a base for advanced analytics and Machine Learning. It lets the client track all important metrics and KPIs. Because all important data is centralized in one place, the client can monitor partner reports and the integrated feature lifecycle. After the Data Lakehouse System was in place, and with some extra automation work on our end, the client requested a weekly reporting dashboard tracking 100+ KPIs and metrics, with a Machine Learning algorithm predicting performance from historical data. The client was able to predict trends in KPI changes and end results in production, and we were proud to support another client on the journey to becoming a data-driven company.
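
As a rough illustration of the kind of trend prediction described above, the following minimal sketch fits a straight line to a KPI's weekly history and flags a likely regression when the forecast for the next week crosses a threshold. It is not the model used in the project: the function name, the threshold, the startup-time figures, and the choice of a simple least-squares line are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TrendForecast:
    slope: float            # change in the KPI per week
    predicted_next: float   # forecast for the week after the last sample
    is_regression: bool     # True if the forecast breaches the threshold


def forecast_weekly_kpi(
    values: Sequence[float],
    threshold: float,
    higher_is_worse: bool = True,
) -> TrendForecast:
    """Fit an ordinary least-squares line to weekly KPI values (hypothetical helper).

    Weeks are indexed 0..n-1; the forecast is evaluated at index n.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two weekly samples to fit a trend")

    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted_next = intercept + slope * n

    if higher_is_worse:
        breached = predicted_next > threshold
    else:
        breached = predicted_next < threshold
    return TrendForecast(slope, predicted_next, breached)


# Illustrative data only: four weeks of a made-up startup-time KPI in seconds.
startup_seconds = [2.0, 2.2, 2.4, 2.6]
forecast = forecast_weekly_kpi(startup_seconds, threshold=2.7)
print(forecast)
```

With these made-up numbers the fitted trend rises by about 0.2 s per week, so the forecast for the following week exceeds the assumed 2.7 s threshold and is flagged. In a setup like the one described, a check of this kind could run once per KPI as part of the weekly reporting job.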

Key Metrics/Technologies

  • AWS: a cloud computing platform for building, deploying, and scaling applications.
  • Apache Spark: an open-source distributed computing system for big data processing.
  • Trino: an open-source distributed SQL query engine for big data analytics.
  • Apache Airflow: a platform to programmatically author, schedule, and monitor workflows.
  • Delta Lake: an open-source storage layer that brings reliability to data lakes.
  • Terraform: a tool for building, changing, and versioning infrastructure safely and efficiently.
  • Python: a programming language used for data analysis and data science.
  • Looker: a business intelligence and data visualization tool.

Client

The client is a German automotive Tier 1 supplier delivering ECUs (electronic control units) to multiple OEMs.