Automating Global Data Platform Workflows with Azure & Databricks

Challenge

The company was migrating from a heterogeneous AWS-based data platform (SageMaker, Lambda, Redshift) to a unified Azure architecture centered on Databricks. The cloud team provisioned the base Azure resources and workspaces, but the Databricks setup itself had to support a multi-region, multi-environment deployment model, with separate sandbox, non-production, and production environments in each geographic region.

They needed a robust, repeatable way to automate the creation and configuration of each workspace, with consistent access controls, data structures, and secrets across dozens of isolated environments.

Solution

Nubosas designed and implemented a GitOps-driven automation layer that provisioned and configured Databricks environments from the ground up, including:

  • Workspace setup across all combinations of region and environment (e.g. EU-prod, US-sandbox)
  • User group and access role configuration for each environment
  • Catalog and schema creation aligned with a Medallion Architecture model (Bronze, Silver, Gold)
  • Role-based permission templates tailored to data engineering, analytics, and operations teams
  • Secure secret management via Azure Key Vault and Databricks secret scopes
  • Azure Storage endpoints for SFTP-based ingestion
  • Azure Data Factory pipelines for scheduled data ingestion from enterprise sources
  • Azure DevOps (ADO) pipelines to manage deployments, Git repositories, and CI/CD operations for infrastructure
  • Terraform state split by Databricks scope (account, metastore, and workspace)

All configurations were codified, version-controlled, and deployable through pipelines, ensuring consistency and auditability.
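
To make the region and environment matrix more concrete, here is a minimal Python sketch of how such combinations and the matching Medallion layout could be enumerated. It is an illustration only, not the tooling used in the project: the region and environment names, the helper functions `environment_matrix` and `medallion_statements`, and the one-catalog-per-environment layout are all assumptions made for this example.

```python
from itertools import product

# Hypothetical values for illustration; the case study names only
# EU-prod and US-sandbox as example combinations.
REGIONS = ["eu", "us"]
ENVIRONMENTS = ["sandbox", "nonprod", "prod"]
LAYERS = ["bronze", "silver", "gold"]  # Medallion Architecture layers


def environment_matrix(regions, environments):
    """Return one identifier per region/environment combination."""
    return [f"{region}-{env}" for region, env in product(regions, environments)]


def medallion_statements(environment_id):
    """Build SQL statements creating a catalog and one schema per layer.

    Assumes one catalog per environment, which is a simplification
    chosen for this sketch.
    """
    catalog = environment_id.replace("-", "_")
    statements = [f"CREATE CATALOG IF NOT EXISTS {catalog}"]
    statements += [
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{layer}" for layer in LAYERS
    ]
    return statements


if __name__ == "__main__":
    for environment_id in environment_matrix(REGIONS, ENVIRONMENTS):
        for statement in medallion_statements(environment_id):
            print(statement)
```

Generating every combination from two short lists keeps the definition of an environment in one place, so adding a region means changing a single value rather than copying configuration by hand.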

Results

  • A fully reproducible and scalable setup supporting multiple regions and environments
  • Strong separation of concerns and clear environment boundaries
  • Faster onboarding of internal teams and accelerated migration from AWS
  • Improved security and compliance posture via centralized secrets management
  • Minimal manual intervention needed during provisioning or updates

Technologies

Azure, Databricks, Azure Key Vault, Azure DevOps (ADO), Azure Storage, Azure Data Factory, Terraform