Data Engineer / ETL Developer responsible for implementing data loading and harmonization within the Data Pool.
Responsibilities
- Implement complex data ingestion and transformation pipelines for various data domains
- Write technical documentation and apply best practices
- Develop new functionality for the data pipeline (data ingestion, mapping and harmonization of data across multiple sources, calculation of KPIs and standard dimensions)
Required skills
- Experience with AWS services including Lambda, SQS, SNS, Redshift, RDS, S3, and EMR, or similar solutions built around Hive/Spark
- Experience with AWS Glue for ETL, AWS Athena for querying data in place, and QuickSight/Power BI/Tableau/Looker for analytics and BI dashboards
- Hands-on experience manually migrating data between Amazon Redshift clusters and Amazon S3
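For context, a Redshift-to-S3 migration of this kind is typically driven by UNLOAD and COPY statements. The sketch below builds such statements as strings; the table, bucket, and IAM role names are placeholders, not values from this posting.

```python
# Illustrative sketch: build Redshift UNLOAD/COPY SQL for moving a table
# between a cluster and S3. All identifiers below are placeholder examples.

def build_unload_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """SQL that exports a Redshift table to S3 as Parquet."""
    return (
        f"UNLOAD ('SELECT * FROM {table}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS PARQUET"
    )

def build_copy_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """SQL that loads Parquet files from S3 into a Redshift table."""
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS PARQUET"
    )

# Placeholder identifiers for illustration only.
role = "arn:aws:iam::123456789012:role/RedshiftCopyRole"
unload_sql = build_unload_sql("sales.orders", "s3://my-bucket/exports/orders/", role)
copy_sql = build_copy_sql("sales.orders", "s3://my-bucket/exports/orders/", role)
```

In practice these statements would be executed against the source and target clusters (for example via the Redshift Data API or a SQL client), with the S3 prefix acting as the hand-off point between them.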
- A basic understanding of AWS VPC networking concepts is nice to have for reasoning about scalability
- Proficiency in Python and SQL, including advanced SQL for working with large data sets
- Experience with big data technologies (Hadoop, Hive, HBase, Spark, etc.)
- Knowledge of batch and streaming data architectures
- Deep understanding of data warehousing; experience in data modeling and ETL development
- Ability to query data across multiple tables in a data warehouse or data lake
- Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
- Knowledge of data management fundamentals and data storage principles
- Knowledge of advanced statistics and experience implementing ML models (a plus, but not a must)
- Fluent in English
What they offer
- Annual Bonus based on Performance
- Flexible Working Hours & Ability to Work From Home
- Learning & Career Development Opportunities
- Supportive Work Culture & Regular Team Events
- Employee Discounts & Company Offerings
- Discount on All You Can Move Sport Pass