We are looking for a senior Data Engineer with Java, Python, and SQL knowledge for a 6-month project with the possibility of extension.
- Participate in the design and implementation of a scalable computation and data distribution platform.
- Create and operationalize data pipelines that enable the squad to deliver high-quality, data-driven products.
- Act as lead to identify, design, and implement internal process improvements, and relay them to the relevant technology organization.
- Develop and mentor other team members in design and development, and provide development estimates for projects and tasks.
- Identify, investigate, and resolve data discrepancies by finding the root cause of issues.
- Understand existing systems and resolve operations issues.
- Automate manual ingest processes and optimize data delivery subject to service level agreements.
- Design, maintain, and own the data infrastructure.
- Stay up to date with the latest technology trends in the big data space and recommend them as needed.
- B.S. / M.S. degree in Computer Science, Engineering or a related discipline.
- 5+ years of overall hands-on experience in computer engineering/software development.
- 3-5+ years of hands-on experience with:
  - Java/J2EE/Spring architecture design and development.
  - Python for data transformation and server-side implementation (Core Python, Pandas, and PySpark).
- 3-5+ years of experience using SQL (e.g., MS SQL Server, MySQL), stored procedures and complex queries.
- 3+ years of experience using Hive (on Spark). Proficiency in bucketing, partitioning, tuning, and different file formats (ORC, Parquet, and Avro).
- Strong analytical, architectural and programming skills.
- Working experience with any NoSQL database (e.g., Cassandra, MongoDB).
- Committed to code quality, software testing, and CI/CD.
- Experience with Hadoop or any other distributed system.
- Experience with stream-processing systems: Storm, Spark Streaming.
- Experience with building and optimizing ‘big data’ pipelines, architectures, and data sets.
- Familiarity with data pipeline and workflow management tools (e.g., Luigi, Airflow, NiFi, Kylo).
- Experience with containerization architecture: Docker and Kubernetes.
- Knowledge of any graph database.
- Any experience with a cloud platform is a huge plus.