Data Engineer
Remote
7 months
Inside IR35 - Umbrella only
Required skills
- Strong understanding of data concepts: data types, data structures, schemas (both JSON and Spark), schema management, etc.
- Strong understanding of complex JSON manipulation
- Experience working with Data Pipelines using custom Python/PySpark frameworks
- Strong understanding of the 4 core Data categories (Reference, Master, Transactional, Freeform) and the implications of each, particularly managing/handling Reference Data.
- Strong understanding of Data Security principles (data owners, row- and column-level access controls, GDPR, etc.), including experience of handling sensitive datasets
- Strong problem-solving and analytical skills, particularly the ability to demonstrate these intuitively (able to work a problem out rather than follow a work instruction to resolve it)
- Experience working in a support role would be beneficial, particularly the ability to demonstrate incident triage and handling skills/knowledge (SLAs, etc.)
- Fundamental Linux system administration knowledge: SSH keys and config, Bash CLI and scripting, environment variables, etc.
- Experience using browser-based IDEs (Jupyter Notebooks, RStudio, etc.)
- Experience working in a dynamic Agile environment (SAFe, Scrum, sprints, JIRA, etc.)
Languages / Frameworks
- JSON
- YAML
- Python (as a full programming language, not just basic scripting; Pydantic experience would be a bonus)
- SQL
- PySpark
- Delta Lake
- Bash (both CLI usage and scripting)
- Git
- Markdown
- Scala (bonus, not compulsory)
- Azure SQL Server as a Hive Metastore (bonus)
Technologies
- Azure Databricks
- Apache Spark
- Delta Tables
- Data processing with Python
- Power BI (Integration / Data Ingestion)
- JIRA
If this is the role for you, please submit your CV at your earliest convenience. If you have not had a response within 2 weeks, please assume that you have not been successful on this occasion.
