About the Role:
We are looking for a highly skilled AWS Cloud & DevOps Engineer to lead the infrastructure design, provisioning, CI/CD pipelines, and cloud operations for our modular, multi-tenant AI platform. This role demands hands-on expertise in cloud architecture, DevOps automation, and secure, scalable deployments across SaaS and enterprise environments.
If you thrive in a fast-paced startup, enjoy building infrastructure from scratch, and are passionate about supporting intelligent, scalable systems, this role is for you.
What You’ll Do:
Infrastructure Setup & Management
- Design and provision scalable, secure, and cost-optimized AWS infrastructure for a modular agentic AI platform.
- Manage AWS services including VPCs, subnets, IAM roles, networking, EC2/ECS/EKS, S3, RDS, Secrets Manager, and more.
- Build and maintain multi-tenant architectures and support environment isolation across dev, staging, and production.
CI/CD & Deployment Architecture
- Architect and implement robust CI/CD pipelines using tools like GitHub Actions, CodePipeline, or ArgoCD.
- Use Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, or Pulumi to automate provisioning.
- Design deployment workflows for SaaS and enterprise self-hosted deployments.
- Collaborate with engineering on blue/green deployments, canary testing, and rollback strategies.
Platform Observability & Operations
- Set up monitoring, alerting, and logging using CloudWatch, Prometheus, Grafana, or ELK/EFK stacks.
- Build observability flows to monitor runtime logs, agent performance, and API health.
- Define and maintain incident response processes and backup/recovery plans.
Security & Governance
- Implement best practices around IAM, secrets management, encryption, and cloud security policies.
- Ensure readiness for standards such as SOC 2, ISO 27001, and GDPR, in collaboration with compliance teams.
Collaboration & Documentation
- Work closely with developers, testers, and product managers to seamlessly integrate infrastructure with platform development.
- Maintain thorough documentation, runbooks, and infrastructure blueprints.
What We’re Looking For:
- 4–7 years of hands-on experience in cloud infrastructure and DevOps, with deep AWS expertise.
- Proficiency in Terraform, Docker, CI/CD pipelines, and Kubernetes (especially AWS EKS).
- Strong knowledge of VPC design, IAM policies, cloud networking, and security best practices.
- Familiarity with microservices architecture and multi-tenant SaaS models.
- Experience with monitoring and logging systems like CloudWatch, ELK, or Prometheus/Grafana.
- Scripting proficiency in Bash, Python, or similar for infrastructure automation.
- Clear understanding of deployment strategies (blue/green, canary, rolling updates).
Good to Have:
- Experience deploying AI platforms, agents, or model pipelines.
- Knowledge of hybrid-cloud setups or edge deployments.
- Familiarity with multi-region/global scaling.
- Exposure to AWS cost optimization tools and practices.
- Experience with security tools like Amazon GuardDuty, Amazon Inspector, or AWS WAF.
- Understanding of serverless architecture, including Lambda, API Gateway, and Step Functions.
Why Join Us?
- Be a Pioneer: Join a fast-moving startup building foundational AI infrastructure from scratch.
- Massive Impact: Play a critical role in shaping how businesses interact with intelligent agents.
- Ownership & Autonomy: Lead key architectural and infrastructure decisions with real responsibility.
- Top-Tier Talent: Collaborate with visionary founders, engineers, and a passionate, innovation-driven team.
- Future-Focused: Build modern systems supporting the next wave of AI-native software.
- Flexibility & Growth: Experience a flexible working culture, flat hierarchy, and long-term career growth.