Data Engineer II
Data Engineer - Specialized in Data Quality Validation
We Are EA
We’re EA—the world’s largest video game publisher. You’re probably familiar with many of our titles—Madden, FIFA, Apex Legends, The Sims, Need for Speed, Dead Space, Battlefield and Star Wars, to name a few. But maybe you don’t know how we’re committed to creating games for every platform—from social to mobile to console—to give our players that anytime, anywhere access they demand. What does that mean for you? It means more opportunities to unleash your computing genius.
Player Insight Network (PIN) is the central telemetry solution that helps EA make great games across all platforms and titles. We, the Player Insight Network Quality Verification Team, are looking for a data engineer responsible for validating Extract, Transform, Load (ETL) logic through test automation and dbt. You will work with our partners to ensure quality data in player engagement and gameplay telemetry, from source to dashboard reporting. Besides applying software engineering best practices to data engineering, SQL and scripting, you will work with an amazing team of data craft practitioners to continuously drive high quality across our data product portfolio.
EA Hyderabad is seeking a Data Engineer with a love for data. You will be responsible for planning, building and deploying data quality solutions for EA, and will be a subject matter expert in building data quality solutions using modern ETL technologies. You will leverage cutting-edge tools and technologies to store and analyse petabytes of data and solve complex business problems. You will collaborate with product, business and engineering teams to build and deliver optimal solutions.
- Create test plans by reviewing data pipeline feature design specs
- Create test cases using dbt to validate ELT pipeline quality
- Create test cases using SQL and Python in the internal test automation framework to ensure data quality
- Work with the data analytics solutions team to understand data requirements, Entity Relationship Diagrams (ERDs) and the implementation of ETL logic
- Design effective test strategy to validate end-to-end data quality based on data requirements and business needs
- Perform root cause analysis on failed test cases and communicate findings to stakeholders
- Understand the design and architecture of data pipelines and come up with test solutions
- Identify and communicate test coverage gaps on raw data to the stakeholders
- Define data quality KPIs and metrics to provide executive summaries
- Participate in scrum ceremonies (sprint planning, daily stand-up, review, retrospective)
- Analyze and improve test coverage based on feedback provided by downstream stakeholders
- Identify new, creative methods to increase data pipelines testing efficiency
- Enhance existing data pipelines quality processes
- Work with the product, business and engineering teams to identify new opportunities for data acquisition
- Build analytical tools and programs that empower business teams to make data-driven decisions
- Collaborate with data scientists and architects across teams to serve their data needs
- Evangelise and influence to make data a first-class citizen with every person, team and group you work with
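In practice, the dbt and SQL/Python test cases described above come down to assertions over pipeline output. As a rough, hedged sketch only (the internal framework is not public; the table names, column names and rules below are hypothetical), a uniqueness, completeness and source-to-target reconciliation check might look like this:

```python
# Hypothetical sketch of ETL output validation using stdlib sqlite3;
# the tables (source_events, target_events) and rules are illustrative only.
import sqlite3


def run_quality_checks(conn):
    """Run simple data quality checks; return a dict of rule name -> pass/fail."""
    cur = conn.cursor()
    results = {}
    # Uniqueness: event_id must be unique in the target table.
    cur.execute("SELECT COUNT(*) - COUNT(DISTINCT event_id) FROM target_events")
    results["event_id_unique"] = cur.fetchone()[0] == 0
    # Completeness: player_id must never be NULL.
    cur.execute("SELECT COUNT(*) FROM target_events WHERE player_id IS NULL")
    results["player_id_not_null"] = cur.fetchone()[0] == 0
    # Reconciliation: no rows dropped between source and target.
    cur.execute(
        "SELECT (SELECT COUNT(*) FROM source_events)"
        " - (SELECT COUNT(*) FROM target_events)"
    )
    results["row_count_matches"] = cur.fetchone()[0] == 0
    return results


if __name__ == "__main__":
    # Tiny in-memory fixture standing in for a real source/target pair.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_events (event_id INTEGER, player_id TEXT);
        CREATE TABLE target_events (event_id INTEGER, player_id TEXT);
        INSERT INTO source_events VALUES (1, 'p1'), (2, 'p2');
        INSERT INTO target_events VALUES (1, 'p1'), (2, 'p2');
        """
    )
    print(run_quality_checks(conn))
```

In a dbt project, the same three rules would typically be declared as generic `unique`, `not_null` and row-count tests in a model's YAML rather than hand-written SQL.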
Must Have Skills
- Excellent command over querying and analysing data using various query languages native to the storage system (e.g., SQL, CQL, MQL, HiveQL, etc.)
- Good understanding of SQL and NoSQL database concepts & applications, data modelling techniques (3NF, Dimensional)
- Proven experience in tuning databases and queries for optimal performance
- Experience in programming languages used in data systems, such as Python, R or Java
- Experience in ingesting data from various sources in formats such as JSON, XML, CSV, etc.
- Experience in data visualisation tools such as Tableau, Power BI, etc.
- Experience working in a fast-paced environment using agile methodologies; prior experience with tools such as GitLab, JIRA, Confluence, etc.
- Ability to debug and test complex data pipelines
- Experience with identifying data quality rules and applying data validation techniques
- Good understanding of data lifecycle & quality applications
- Good understanding of software quality assurance concepts, debugging processes and practices
- Ability to communicate ideas clearly and effectively and influence others
- 3+ years of relevant industry experience in a data engineering role, with a graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field
- Experience building statistical models and machine learning algorithms
- Experience with real-time data acquisition and processing frameworks (e.g., Apache Storm, Apache NiFi, Apache Spark)