Key Responsibilities
Cloud Operations & Reliability
• Operate and support multi-tenant SaaS workloads across multiple AWS accounts.
• Ensure high availability and resilience through proactive monitoring, troubleshooting, and incident response.
• Manage lifecycle tasks (patching, scaling, upgrades, backups, and DR exercises).
CI/CD & Automation
• Own and enhance CI/CD pipelines (ArgoCD, GitHub Actions) for reliable, repeatable deployments.
• Automate operational workflows (infrastructure, application releases, reporting).
• Support engineering teams with smooth delivery pipelines and self-service tooling.
FinOps & Cost Optimization
• Monitor, analyze, and optimize AWS usage across accounts.
• Drive cost savings through compute optimization (Graviton/AMD migrations, RDS tuning), storage tiering, and right-sizing.
• Partner with finance and engineering stakeholders to align cost efficiency with performance.
Database & Data Operations
• Operate, scale, and optimize MongoDB and RDS clusters in production.
• Monitor database performance, indexing, replication, and backup/restore processes.
• Collaborate with data engineering to ensure stable data pipelines and integrations.
Observability & Incident Response
• Implement and manage monitoring and logging with Splunk, Grafana, OpenTelemetry, and Amazon CloudWatch.
• Define SLIs and SLOs, and drive continuous improvements in availability and performance.
• Lead incident response (P0/P1/P2), root cause analysis, and postmortems.
Security & Compliance
• Apply least-privilege IAM practices, patching, and hardening.
• Ensure compliance with healthcare and industry standards (GxP, GDPR, HIPAA, NIST).
• Support audit readiness (SOC 2, ISO 27001).
Required Experience & Qualifications
• Experience: 7+ years in DevOps or cloud infrastructure roles, including significant work on SaaS and multi-tenant platforms. Proven track record of mentoring team members on cloud infrastructure projects.
• Cloud Expertise: Expert knowledge of AWS services, including VPC, IAM, EC2, S3, RDS, Lambda, EKS, AWS WAF, Amazon EventBridge, and AWS CloudTrail.
• Containerization & Orchestration: Deep proficiency in Docker, Kubernetes, Helm, and associated ecosystem tools.
• CI/CD Proficiency: Expertise in CI/CD tools such as ArgoCD and GitHub Actions.
• Infrastructure as Code (IaC): Advanced experience with AWS CDK (TypeScript preferred) and CloudFormation.
• Networking: Strong understanding of AWS networking services such as VPCs, Transit Gateway, ALB, and Security Groups.
• Security: In-depth knowledge of IAM, AWS KMS, encryption standards, AWS WAF, and security compliance frameworks including NIST.
• Monitoring & Alerting: Extensive experience with OpenTelemetry, Splunk, Grafana, Amazon CloudWatch, and AWS CloudTrail for monitoring and incident response.
• Data & ETL Pipelines: Familiarity with AWS Glue and managed Kafka (for example, Amazon MSK) for real-time and batch data processing.
• Programming & Automation: Strong scripting and automation skills using TypeScript and Bash.
• Multi-Account AWS Management: Experience managing multiple AWS accounts with AWS Control Tower.
• Communication & Collaboration: Exceptional verbal and written communication skills, with the ability to explain complex technical concepts to diverse stakeholders.
Desired Experience & Qualifications
• Advanced expertise in AWS CDK, including building complex, reusable constructs and pipelines.
• Familiarity with Projen for automating CDK project configuration and management.
• Hands-on experience with Helm charts and Kubernetes manifests.
• Experience with monitoring and logging tools such as Splunk, Grafana, and Amazon CloudWatch.
• Exposure to multi-tenant SaaS platforms and best practices.
• Experience working with AI tools and frameworks.
Personal Attributes
• Mentor & Leader: Enjoys mentoring team members and fostering a collaborative, innovation-driven team culture.
• Organized & Adaptable: Able to manage multiple priorities and thrive in a fast-paced environment.
• Innovative: Passionate about leveraging technology to solve complex problems and drive efficiency.
• Customer-Focused: Dedicated to building infrastructure that delivers measurable business and customer value.
Work Arrangement
This is an in-office role based in Shanghai, China, requiring a minimum of three days per week on-site. Remote work and travel flexibility are not available.
Join Evinova and redefine healthcare with us. Apply now to be part of a team that’s transforming life sciences with technology, data, and innovation.