HRmango is hiring a Databricks Architect for an international consulting firm client. The Databricks Architect brings strong experience in enterprise data platform architecture and governance and will lead client-facing data platform implementations. This role blends high-impact architectural responsibilities (reference architectures, security, scalability, cost management, operational model) with technical leadership in designing, building, deploying, and optimizing data pipelines and data products on Lakehouse/EDW platforms (with an emphasis on Databricks). You will own the full lifecycle of data products and standardize patterns, tools, and practices across large-scale, vertical-specific implementations for our clients.
For faster consideration apply at: https://app.hrmango.com/careers/jobDetail/10707
Role & Responsibilities:
- Define the overall data platform architecture (Lakehouse/EDW), including reference patterns (Medallion, Lambda, Kappa), technology selection, and integration blueprint.
- Design conceptual, logical, and physical data models to support multi-tenant and vertical-specific data products; standardize logical layers (ingest/raw, staged/curated, serving).
- Establish data governance, metadata, cataloging (e.g., Unity Catalog), lineage, data contracts, and classification practices to support analytics and ML use cases.
- Define security and compliance controls: access management (RBAC/IAM), data masking, encryption (in transit/at rest), network segmentation, and audit policies.
- Architect scalability, high availability, disaster recovery (RPO/RTO), and capacity & cost management strategies for cloud and hybrid deployments.
- Lead selection and integration of platform components (Databricks, Delta Lake, Delta Live Tables, Fivetran, Azure Data Factory / Data Fabric, orchestration, monitoring/observability).
- Design and enforce CI/CD patterns for data artifacts (notebooks, packages, infra-as-code), including testing, automated deployments and rollback strategies.
- Define ingestion patterns (batch & streaming), file compaction strategies, partitioning schemes, and storage layout to optimize IO and costs.
- Specify observability practices: metrics, SLAs, health dashboards, structured logging, tracing, and alerting for pipelines and jobs.
- Act as technical authority and mentor for Data Engineering teams; perform architecture and code reviews for critical components.
- Collaborate with stakeholders (Data Product Owners, Security, Infrastructure, BI, ML) to translate business requirements into technical solutions and roadmap.
- Design, develop, test, and deploy processing modules using Spark (PySpark/Scala), Spark SQL, and database stored procedures where applicable.
- Build and optimize data pipelines on Databricks and complementary engines (SQL Server, Azure SQL, AWS RDS/Aurora, PostgreSQL, Oracle).
- Implement DevOps practices: infra-as-code, CI/CD pipelines (ingestion, transformation, tests, deployment), automated testing and version control.
- Troubleshoot and resolve complex data quality, performance, and availability issues; recommend and implement continuous improvements.
Hard Skills - Must have:
- Previous experience as an architect or in a lead technical role on enterprise data platforms.
- Hands-on experience with Databricks technologies (Delta Lake, Unity Catalog, Delta Live Tables, Auto Loader, Structured Streaming).
- Strong expertise in Spark (PySpark and/or Scala), Spark SQL and distributed job optimization.
- Solid background in data warehouse and lakehouse design; practical familiarity with Medallion/Lambda/Kappa patterns.
- Experience integrating SaaS/ETL/connectors (e.g., Fivetran), orchestration platforms (Airflow, Azure Data Factory, Data Fabric) and ELT/ETL tooling.
- Experience with relational and hybrid databases: MS SQL Server, PostgreSQL, Oracle, Azure SQL, AWS RDS/Aurora or equivalents.
- Proficiency in CI/CD for data pipelines (Azure DevOps, GitHub Actions, Jenkins, or similar) and packaging/deployment of artifacts (.whl, containers).
- Experience with batch and streaming processing, file compaction, partitioning strategies and storage tuning.
- Good understanding of cloud security, IAM/RBAC, encryption, VPC/VNet concepts, and cloud networking.
- Familiarity with observability and monitoring tools (Prometheus, Grafana, Datadog, native cloud monitoring, or equivalent).
Hard Skills - Nice to have:
- Experience automating CI/CD pipelines to support deployment and integration workflows, including trunk-based development, using services such as Azure DevOps, Jenkins, or Octopus Deploy.
- Advanced proficiency in PySpark for complex data processing tasks.
- Advanced proficiency in Spark workflow optimization and orchestration using tools such as Databricks Asset Bundles or DAG (directed acyclic graph) orchestration.
- Certifications: Databricks Certified Data Engineer / Databricks Certified Professional Architect, cloud architect/data certifications (AWS/Azure/GCP).
Soft Skills / Business Specific Skills:
- Ability to identify, troubleshoot, and resolve complex data issues effectively.
- Strong teamwork, communication skills and intellectual curiosity to work collaboratively and effectively with cross-functional teams.
- Commitment to delivering high-quality, accurate, and reliable data product solutions.
- Willingness to embrace new tools, technologies, and methodologies.
- Innovative thinker with a proactive approach to overcoming challenges.
For faster consideration apply at: https://app.hrmango.com/careers/jobDetail/10707
*HRmango does not discriminate in any aspect of employment based on race, color, religion, national origin, ancestry, gender, sexual orientation, gender identity and/or expression, age, veteran status, disability, or any other characteristic protected by federal, state, or local employment discrimination laws where HRmango does business.
Job Type: Temp-to-hire
Pay: $125,000.00 - $150,000.00 per year
Experience:
- Databricks: 5 years (Required)
Work Location: Remote