[Remote] Sr. Platform Ops Engineer
Note: This is a remote position open to candidates in the USA. Versant Media is an industry-changing media and entertainment business, home to trusted brands that shape culture and inform audiences. They are seeking a Senior Platform Engineer to build and maintain the core platform infrastructure that supports their engineering teams, drive platform enhancements, and enable developer productivity.
Responsibilities
- Build and maintain core platform infrastructure that serves as the foundation for engineering teams across the organization
- Drive and participate in solution architecture and delivery across diverse domains, offering expertise in AWS cloud infrastructure and resources, containers (ECS/Kubernetes) and cloud-native architectures
- Develop, enhance, and operate platform capabilities including CI/CD pipelines, infrastructure as code, reusable patterns, and automation solutions that enable rapid, reliable, and secure deployments
- Design and implement standardized, reusable cloud infrastructure patterns and templates that help teams adopt best practices while maintaining security and compliance requirements
- Demonstrate, enable, and promote consistent adoption of software delivery practices and solutions that optimize for fast feedback, observability, and operational excellence
- Proactively collaborate with product and engineering teams, taking technical leadership in areas of cross-functional projects while sharing knowledge and best practices
- Evaluate existing standards and practices, identify gaps, and implement improvements to strengthen Development, Security, and Operational practices across the organization
- Demonstrate excellent communication skills through proactive status updates, clear documentation, knowledge sharing, and effective collaboration when facing technical challenges
- Partner closely with engineering teams to understand their needs, refine requirements, and deliver solutions that enhance developer productivity while meeting project objectives
- Work with emerging technologies around AI, MCP, and RAG
- Cost optimization - Monitor and optimize cloud spend, implementing cost governance strategies and right-sizing recommendations
- Service reliability - Maintain SLOs/SLIs/SLAs, improve uptime targets, and help implement reliability engineering practices
- Mentorship - Mentor junior and mid-level engineers, conduct design reviews, and raise the technical bar across the platform team
Skills
- AWS Expertise: 6+ years of hands-on experience with AWS, with a focus on Infrastructure as Code
- Linux Proficiency: Minimum of 4 years of experience managing and maintaining Linux systems
- Automation/Scripting: 3+ years of experience with Python for automation and scripting
- Git and GitOps: Practical experience with, and comfort using, Git and automated workflows
- AWS Security Knowledge: Familiarity with AWS security best practices, including DNS, secure VPCs, and database security (e.g., PostgreSQL, MySQL)
- Communication & Collaboration: Strong written and verbal communication skills with the ability to explain complex technical concepts to technical and non-technical audiences
- Problem-Solving Mindset: Eagerness to learn, grow, and tackle challenges in a fast-paced environment
- Troubleshooting & debugging: Proven ability to troubleshoot complex distributed systems issues across networking, compute, storage, and application layers. Comfortable with packet captures, log analysis, performance profiling, and diagnosing intermittent failures in production
- Terraform / Infrastructure as Code: 3+ years of hands-on experience with Terraform, including module development, state management, and multi-environment workflows. Comfortable writing, reviewing, and maintaining IaC that provisions and manages cloud infrastructure at scale
- Networking fundamentals: VPCs, load balancers, DNS, CDNs, service mesh, network troubleshooting
- Monitoring/Observability: Datadog, Prometheus, Grafana, CloudWatch, ELK stacks
- Incident management experience: PagerDuty, on-call rotations, post-mortem culture
- AWS Certifications: Relevant certifications such as AWS Certified Cloud Practitioner, AWS Certified Solutions Architect – Associate
- Background in building or maintaining systems that support scalable application environments and event-driven architectures
- Understanding of container security and orchestration (Docker, Kubernetes)
- Fundamentals of running AI-related services
- Experience contributing to open-source projects or internal platform tools
- Multi-account AWS strategy (Organizations, Control Tower, Landing Zones)
- Chaos engineering / game days
- Platform-as-a-product thinking (internal developer portals, Backstage)
Benefits
- Health insurance
- Retirement plans
- Paid time off