About the Role
A seasoned technical contributor with experience proposing and implementing Data Engineering projects and bringing them to production to solve complex engineering and business problems. Passionate about the storage, retrieval, and analysis of data to build data-driven solutions and support decision making.
Role and Responsibilities
- Lead the effort to set up a robust framework for the collection, storage, and analysis of data at scale.
- Own and develop the data pipelines that form the foundation of upGrad's applications and services, leveraging cutting-edge techniques for both real-time and batch ETL pipelines.
- Provide timelines and own end-to-end delivery of data engineering projects.
- Prioritize and manage ad-hoc requests in parallel with ongoing projects.
- Make decisions on real-time and batch collection and analysis strategies that suit the needs of the engineering and product teams.
- Standardize the tools used by backend and frontend developers to collect event data. Collaborate with multiple teams to identify problem statements, then propose and implement solutions.
- Find ways to optimize the large volumes of data stored and queried across different tools, and follow best practices for building and delivering data-driven solutions and APIs.
Skills/Experience
- A highly talented technical developer with 8+ years of hands-on experience building and deploying data engineering solutions, including experience building APIs and dashboards on top of data.
- 6+ years of industry experience, with at least 4 years of experience with ETL, data modeling, and large-scale datasets.
- 8+ years of experience with an object-oriented programming language such as Python, Scala, or Java.
- Extremely proficient in writing performant SQL against large data volumes.
- Experience designing, scaling, and optimizing cloud-based data warehouses (such as AWS Redshift) and data lakes.
- Strong knowledge of and hands-on experience with one or more tools/frameworks from each category:
- Query/Data Processing engines: Spark, Hive, Athena/Presto
- Data Warehouses: Redshift, Druid, Snowflake
- AWS: S3, Glue, EMR, RDS
- Stream processing: Spark Streaming, Faust, Flink, etc.
- Message queuing: Apache Kafka, AWS Kinesis
- Orchestration: Luigi, Airflow, Azkaban, etc.
- Basic knowledge of software architecture design, Docker, and microservices concepts. Kubernetes knowledge is a bonus.
- Strong knowledge of cloud computing and AWS services.