Location: Hyderabad, India
Work type: On-site / Work from Office
About Us
We’re building a conversational voice assistant that transforms how people interact with businesses. Join early and help shape our core services, data models, and AI/LLM platform.
The road ahead is ambitious: we’re scaling to thousands of concurrent calls and maturing an AWS-native platform that is secure, observable, and cost-aware. You’ll play a critical role in evolving our LLM/RAG stack, deploying reliable pipelines, and setting the engineering standards that will define the platform’s future.
What You’ll Do
- Design, maintain, and optimize CI/CD pipelines to enable fast, reliable deployments.
- Manage and scale AWS ECS/EKS clusters for containerized workloads.
- Implement strong monitoring, logging, and alerting systems for observability.
- Ensure platform security and compliance with HIPAA and SOC2.
- Automate infrastructure using Infrastructure as Code (Terraform/CloudFormation).
- Continuously tune AWS usage for performance and cost.
- Collaborate with developers to improve velocity and build self-service tooling.
- Support incident response, troubleshooting, and postmortems.
Must-Haves
- 5+ years in a DevOps, SRE, or Cloud Engineering role.
- Strong experience with AWS services (ECS, EKS, EC2, S3, RDS, IAM, etc.).
- Proficiency with Docker and container orchestration.
- Hands-on experience with CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI).
- Solid understanding of monitoring and observability (CloudWatch, Prometheus, ELK, etc.).
- Hands-on experience with Terraform or CloudFormation.
- Excellent problem-solving and collaboration skills.
Nice-to-Haves
- AWS Certifications (Solutions Architect, DevOps Engineer, SysOps Administrator, or similar).
- Experience running Jenkins pipelines at scale.
- Background in security automation and compliance (SOC2, HIPAA).
- Cloud cost optimization experience.
- Knowledge of networking, secrets management, and incident response best practices.
- Strong scripting/automation skills (Python, Bash, or similar).
Why Join
- Real impact, real scale: Your work will power thousands of concurrent real-time calls and mission-critical workflows.
- Ownership from day 1: Architect → build → deploy → monitor; no heavy bureaucracy, just impact.
- Deep technical growth: Work across AWS (ECS/EKS, Lambda, S3, SQS/SNS, CloudWatch), CI/CD, Docker/Kubernetes, and security/compliance.
- Hard, meaningful problems: Low-latency real-time pipelines (WebSockets/audio), reliability at scale, cost/latency tuning, robust observability, and secure, cost-efficient infrastructure.
- Fast feedback loop: Automated CI/CD, observability, frequent releases, and clear success metrics (latency, reliability, cost).
- Engineering culture: Code reviews, runbooks, postmortems, and a collaborative team that values clear communication.
- Mission + rigor: Build HIPAA/SOC2-aligned systems that balance speed, security, and reliability.
- Remote-friendly & collaborative: Pragmatic processes, respectful schedules, and a focus on outcomes over hours.
We’re an equal-opportunity employer and value an inclusive, diverse team.
Let’s talk at careers@interactly.ai