
AI-Driven Retail Execution Platform

Lead Data & ML Engineer · 2023 - 2024

Dany D.

Lead Data & ML Engineer

Data Engineer & Big Data

Key Expertise

Declarative Data Engineering, Advanced Stream Processing, Real-time CDC Pipelines, Medallion Lakehouse Design, MLOps, Agentic AI Architecture

Experience

8+ years

Timezone

CET (UTC +1)

Skills

AI / ML

LightGBM, statsmodels, LangGraph, LangChain, MLflow

Languages

Python

Databases

Delta Lake, PostgreSQL, Unity Catalog, Auto Loader, Databricks

Infrastructure

Kafka, Terraform, Kubernetes, AWS, Azure, Azure DevOps Pipelines, GitLab CI, Datadog, centralized logging, ruff, mypy, bandit

Frameworks

Scikit-learn, PySpark, Pydantic, typed configuration framework, Databricks Asset Bundles, Databricks Workflows, declarative streaming pipelines, pytest

Integrations & Protocols

Model Context Protocol, log-based CDC connectors, Power BI

Overview

The project involved delivering an enterprise data and AI platform for a multinational consumer-goods company to orchestrate daily sales-execution planning for its field teams across several major retail channels and international markets. The platform combines a medallion-architecture lakehouse on Databricks with a portfolio of production ML models that translate raw retailer feeds, inventory signals, compliance data, and third-party audits into a ranked set of outlet-level tasks delivered to field reps each morning. The system is built as a single multi-tenant codebase in which each retailer channel is onboarded as a configurable tenant rather than a fork.

Achievements

Brought a full production ML portfolio (demand forecasting at multiple time horizons, behavioural segmentation, compliance scoring, stock-availability risk, pricing anomaly detection, and a final task-ranking model) online and into daily operation. Reduced onboarding time for new retail channels from a multi-month custom build to a configuration exercise. Established a fully typed, validated configuration stack that catches misconfigurations before pipeline execution, eliminating an entire class of runtime failures.
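
To make the last point concrete, here is a minimal sketch of what a fail-fast, typed configuration layer can look like with Pydantic. The channels, field names, and model identifiers below are hypothetical stand-ins, not the project's actual schema:

```python
from enum import Enum
from pydantic import BaseModel, Field, field_validator


class Channel(str, Enum):
    GROCERY = "grocery"
    CONVENIENCE = "convenience"
    PHARMACY = "pharmacy"


class TenantConfig(BaseModel):
    """Per-channel tenant definition, validated before any pipeline runs."""

    tenant_id: str = Field(min_length=3)
    channel: Channel
    source_tables: list[str]
    enabled_models: set[str] = {"demand_forecast", "task_ranking"}
    forecast_horizons_days: list[int] = [7, 28]

    @field_validator("forecast_horizons_days")
    @classmethod
    def horizons_positive(cls, v: list[int]) -> list[int]:
        if any(h <= 0 for h in v):
            raise ValueError("forecast horizons must be positive day counts")
        return v


# A well-formed tenant loads cleanly; a typo'd channel or negative
# horizon raises a ValidationError at load time, not mid-pipeline.
cfg = TenantConfig(
    tenant_id="de-grocery",
    channel="grocery",
    source_tables=["bronze.pos_sales", "bronze.inventory"],
)
```

Because every tenant file is parsed through a model like this at startup, misconfigurations surface before any cluster time is spent.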

Responsibilities

  • Architected the bronze/silver/gold lakehouse on Databricks with parallel bronze ingestion, dozens of silver transformation tables, and a downstream gold layer consumed by the ML pipeline (a minimal ingestion sketch follows this list).
  • Designed and implemented the ML inference DAG with explicit task dependencies, combining gradient-boosted forecasting, unsupervised segmentation, rules-driven compliance flagging, and a final prioritisation step that blends model-impact scoring with recency/cooldown constraints (a simplified blend is sketched after the list).
  • Built a schema-governance framework using typed column and table definitions for consistent DDL management and evolution across bronze, silver, and gold layers (illustrated below).
  • Implemented tenant-specific variation points (data sources, engineered features, enabled/disabled model outputs, output schema) so that a single codebase serves all downstream channels without branching.
  • Stood up the CI/CD pipeline with lint, strict type checking, security scanning, asset-bundle deployment to layered target environments, and automated semantic versioning.
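
For illustration, bronze ingestion with Databricks Auto Loader typically reduces to a short declarative stream. The paths and table names here are invented, and `spark` is assumed to be provided by the Databricks runtime:

```python
# Sketch of an Auto Loader bronze stream; all paths/table names are illustrative.
raw = (
    spark.readStream.format("cloudFiles")  # "cloudFiles" = Auto Loader source
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/landing/_schemas/pos_sales")
    .load("/mnt/landing/pos_sales/")
)

(
    raw.writeStream
    .option("checkpointLocation", "/mnt/bronze/_chk/pos_sales")
    .trigger(availableNow=True)  # incremental batch: ingest new files, then stop
    .toTable("bronze.pos_sales")
)
```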
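
The prioritisation step can be pictured as a weighted score blend gated by a cooldown filter. The weights, column names, and 14-day window below are hypothetical placeholders for the tuned values:

```python
import pandas as pd

COOLDOWN_DAYS = 14  # hypothetical: don't resurface a task served recently


def rank_tasks(candidates: pd.DataFrame, today: pd.Timestamp) -> pd.DataFrame:
    """Blend model-impact scores, then drop tasks still inside their cooldown."""
    eligible = candidates[
        (today - candidates["last_served"]).dt.days >= COOLDOWN_DAYS
    ].copy()
    # Illustrative linear blend of per-model outputs into one priority score.
    eligible["priority"] = (
        0.5 * eligible["demand_uplift"]
        + 0.3 * eligible["stock_risk"]
        + 0.2 * eligible["compliance_gap"]
    )
    return eligible.sort_values("priority", ascending=False)
```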
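
And one way typed column/table definitions can drive consistent DDL, hand-rolled here purely as an illustration rather than a reproduction of the project's framework:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str          # Spark SQL type, e.g. "STRING", "DECIMAL(10,2)"
    nullable: bool = True
    comment: str = ""


@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple[Column, ...]

    def ddl(self) -> str:
        cols = ",\n  ".join(
            f"{c.name} {c.dtype}"
            + ("" if c.nullable else " NOT NULL")
            + (f" COMMENT '{c.comment}'" if c.comment else "")
            for c in self.columns
        )
        return f"CREATE TABLE IF NOT EXISTS {self.name} (\n  {cols}\n) USING DELTA"


silver_sales = Table(
    "silver.pos_sales",
    (
        Column("outlet_id", "STRING", nullable=False),
        Column("sale_date", "DATE", nullable=False),
        Column("net_amount", "DECIMAL(10,2)"),
    ),
)
# spark.sql(silver_sales.ddl()) keeps DDL identical across layers and tenants.
```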

Technologies Used

Databricks, PySpark, Delta Lake, Python, Scikit-learn, LightGBM, statsmodels, typed configuration framework, Pydantic, Databricks Asset Bundles, Azure DevOps Pipelines, Power BI, ruff, mypy, bandit

This project was delivered by Dany D.

More Projects by Dany D.

2025-2026

Agentic Automation Platform for Document-Intensive Workflows

AI Architect & Tech Lead Data Engineer

The project involved architecting a greenfield agentic AI platform that automates the end-to-end processing of high-volume, document-heavy business cases for a regulated enterprise. A supervisor-style agent graph routes each case through a set of specialist agents that handle ingestion, enrichment, validation, coordination, and resolution, replacing manual review queues while keeping a human-in-the-loop checkpoint on high-stakes transitions. The agent layer sits on top of a cloud-native Databricks data platform with Unity Catalog governance, declarative streaming ingestion from an object-store landing zone, and a multi-region, multi-tenant infrastructure baseline.
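
As a rough sketch of the supervisor-routing pattern, LangGraph's StateGraph expresses it in a few lines. The node names, state fields, and stub logic below are invented for illustration and do not reproduce the delivered graph:

```python
from typing import TypedDict

from langgraph.graph import END, StateGraph


class CaseState(TypedDict):
    stage: str
    needs_human: bool


def supervisor(state: CaseState) -> CaseState:
    # In the real system an LLM-backed supervisor decides; this stub passes through.
    return state


def route(state: CaseState) -> str:
    if state["needs_human"]:
        return "human_review"
    return {"new": "ingest", "ingested": "validate"}.get(state["stage"], END)


builder = StateGraph(CaseState)
builder.add_node("supervisor", supervisor)
builder.add_node("ingest", lambda s: {**s, "stage": "ingested"})
builder.add_node("validate", lambda s: {**s, "stage": "validated", "needs_human": True})
builder.add_node("human_review", lambda s: {**s, "needs_human": False})
builder.add_edge("ingest", "supervisor")
builder.add_edge("validate", "supervisor")
builder.add_edge("human_review", "supervisor")
builder.add_conditional_edges("supervisor", route)
builder.set_entry_point("supervisor")
# A production human-in-the-loop checkpoint would pause the graph (e.g. via
# interrupts with a checkpointer) rather than auto-resolving as this stub does.
graph = builder.compile()

result = graph.invoke({"stage": "new", "needs_human": False})
```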

LangGraph, LangChain, Python, Pydantic, PySpark +9
2022-2023

Cloud Lakehouse with Change-Data-Capture Ingestion

Senior Data Engineer & Architect

The project involved designing and delivering a cloud-native data platform for a financial-services institution moving off a fragmented legacy ETL stack. The platform is built around a medallion lakehouse on Databricks, declarative streaming transformations for the silver layer, and log-based change-data-capture from operational relational sources via a managed Kafka service. A config-driven pipeline layer decouples table onboarding from code changes, and a data-quality engine splits each stream into a clean sink and a quarantine sink for audit and remediation.
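
The clean/quarantine split is the kind of thing Structured Streaming's foreachBatch handles naturally. The quality rule and table names below are illustrative only, and `spark` is assumed to be provided by the Databricks runtime:

```python
import pyspark.sql.functions as F
from pyspark.sql import DataFrame

# Hypothetical quality rule for a CDC-fed table; real rules are config-driven.
VALID = F.col("account_id").isNotNull() & (F.col("amount") >= 0)


def split_and_write(batch: DataFrame, batch_id: int) -> None:
    """foreachBatch handler: route each micro-batch to a clean or quarantine sink."""
    flagged = batch.withColumn("is_valid", VALID)
    (flagged.filter("is_valid").drop("is_valid")
        .write.mode("append").saveAsTable("silver.transactions"))
    (flagged.filter(~F.col("is_valid"))
        .withColumn("quarantined_at", F.current_timestamp())
        .write.mode("append").saveAsTable("quarantine.transactions"))


(
    spark.readStream.table("bronze.transactions")
    .writeStream.foreachBatch(split_and_write)
    .option("checkpointLocation", "/mnt/chk/silver_transactions")
    .start()
)
```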

Databricks, PySpark, Delta Lake, declarative streaming pipelines, Auto Loader +7
