Data Engineer

  • Jiangsu Province - Wuxi, China

Join our Operations IT team, a global IT capability supporting the Global Operations organization. We partner with various Operations capability areas such as Pharmaceutical Technology Development, Manufacturing & Global Engineering, Quality Control, Sustainability, Supply Chain, Logistics & Global External Sourcing & Procurement. Our work directly impacts patients by redefining our ability to develop life-changing medicines. We are passionate about impacting lives through data, analytics, AI, machine learning, and more.

As part of the Data Analytics and AI (DA&AI) organization within Operations IT, you will deliver brand-new Data Analytics and AI solutions for various Operations capability areas. Our work transforms our ability to develop life-changing medicines, empowering the business to perform at its peak. We combine powerful science with leading digital technology platforms and data.

As a Data Engineer, you will be responsible for building and maintaining scalable, high-performance data solutions that enable data-driven decision-making. Your primary focus will be on data ingestion, transformation, and orchestration across cloud-based platforms. In addition to core data engineering responsibilities, you will also collaborate on cloud architecture and contribute to software engineering efforts to support the end-to-end delivery of data products. You’ll work closely with cross-functional teams—including Data Engineering, Cloud Engineering, and Software Development—to ensure seamless integration, performance, and reliability of data pipelines and services.
 

Accountabilities

  • Build and optimize scalable data pipelines for ingestion, transformation, and analysis to ensure performance and reliability.

  • Develop and maintain efficient ETL/ELT processes using clean, testable, and production-ready code.

  • Design and manage data models that align with business requirements and support analytical use cases.

  • Collaborate with cross-functional teams (software engineers, data scientists, product managers) to deliver integrated data solutions.

  • Implement and manage cloud-based data infrastructure, leveraging services like AWS for storage, processing, and orchestration.

  • Create APIs and microservices to enable seamless data integration across platforms and systems.

  • Optimize databases and queries for performance and maintain high-quality, reliable data workflows.

  • Ensure data quality, validation, and compliance with security and privacy standards.

  • Document data pipelines, models, and processes for maintainability and knowledge sharing.

  • Stay up to date with modern data engineering tools and practices, contributing to ongoing improvements in architecture and delivery.

Essential Skills/Experience

  • A minimum of 6 years of experience in delivering data and software engineering solutions.

  • Hands-on experience with ETL/ELT tools such as SnapLogic, Fivetran, or similar.

  • Strong expertise in Snowflake, dbt, and modern data warehouse technologies.

  • Skilled in data modelling, data transformation, and handling large-scale data systems.

  • Good understanding of data pipeline design, including dimensional modelling and schema development.

  • Familiar with Data Mesh and Data Product concepts, with experience in managing reusable data assets.

  • Experience with data orchestration tools like Apache Airflow or MWAA.

  • Proficient in Power BI or similar tools for building dashboards and data visualizations.

  • Knowledge of DevOps/DataOps, including CI/CD pipelines and GitHub for version control.

  • Familiar with Docker and Kubernetes for containerization and deployment.

  • Experience with automated testing for data (unit, integration, regression testing).

  • Strong programming skills in Python or similar languages.

  • Experience with SQL and both relational (MySQL, PostgreSQL) and NoSQL databases.

  • Exposure to full-stack development using frameworks such as Node.js, React, and Python FastAPI.

  • Solid experience working with AWS services (S3, EC2, RDS, Lambda, EKS, ECS).

  • Strong communication skills and ability to work closely with stakeholders to translate requirements into solutions.

  • Comfortable working in Agile environments, with the ability to adapt quickly to change.

  • Excellent problem-solving and analytical thinking, with a proactive approach to identifying and resolving issues.

Desirable Skills/Experience

  • Bachelor's or Master's degree in a relevant field such as Health Sciences, Life Sciences, Data Management, or Information Technology, or equivalent experience.

  • Experience working in the pharmaceutical industry.

  • AWS Cloud certification, or other data engineering or software engineering-related certifications.

  • Awareness of use case specific GenAI tools available in the market and their application in day-to-day work scenarios.

  • Possess working knowledge of basic prompting techniques and continuously improve these skills.

  • Stay up to date with developments in AI and GenAI, applying new insights to work-related situations.