Knowledge of distributed systems (e.g., Apache Spark or Databricks)
Knowledge of Delta Lake and Lakehouse architecture
Experience with multiple data programming languages and libraries (Spark SQL, PySpark, pandas)
Experience with on-premises databases such as SQL Server and Oracle
Experience with version control (e.g., Git), DevOps, CI/CD
Design and build data pipelines using Spark SQL and PySpark in Azure Databricks.
Build and maintain Lakehouse architecture in ADLS / Databricks.
Perform data preparation tasks such as data cleaning, normalization, deduplication, and type conversion.
Work with DevOps team to deploy solutions in production environments.
Collaborate with Data Science and Business Intelligence teams to share key learnings, reuse ideas and solutions, and promote best practices.
Apply change management practices, including training, communication, and documentation, to manage upgrades, changes, and data migrations.
Base Salary Range: $120,000 - $140,000 per annum