[Remote] Staff Platform Engineer

Remote, USA | Full-time | Posted 2026-06-16

Note: This is a remote role open to candidates in the USA. Rezdy is hiring a Staff DevOps Engineer to work on its new product, Manifest. The role involves owning critical infrastructure, improving developer experience, and collaborating closely with product engineers and DevOps leadership.


Responsibilities

  • Work on a team with two other platform engineers
  • Own and evolve the infrastructure that supports Manifest, including AWS environments, networking, compute, data services, observability, CI/CD, and operational tooling
  • Work with Pulumi and TypeScript to define, maintain, and improve infrastructure as code across the platform
  • Support and improve our containerized application platform, including deployment pipelines, rollback mechanisms, and runtime configuration
  • Help operate and harden our data infrastructure, including connection pooling, backups, disaster recovery, replication, and safe schema-change practices
  • Partner with engineers to improve the reliability and safety of releases, including database migrations, deployment workflows, environment management, and production readiness checks
  • Improve CI/CD workflows so that builds, tests, infrastructure changes, and deployments are fast, reliable, and easy for engineers to understand
  • Lead observability and incident readiness work, including alerting, dashboards, SLOs, runbooks, incident response practices, and post-incident follow-up
  • Help ensure the platform is secure, cost-conscious, and maintainable as the product scales
  • Mentor engineers on infrastructure, operations, reliability, and production ownership

Skills

  • Deep production experience with AWS, especially services such as ECS/Fargate, RDS/Aurora PostgreSQL, VPC networking, load balancing, IAM, KMS, Secrets Manager, CloudFront, WAF, and related managed services
  • Experience designing and operating systems that serve a global user base, with seamless multi-region availability and disaster recovery procedures
  • Treats reliability, scalability, performance, and observability as first-class design constraints, building them into designs from the start rather than bolting them on later
  • Strong infrastructure-as-code experience. Pulumi with TypeScript is ideal, but deep experience with Terraform or another mature IaC approach is also valuable
  • Strong operational knowledge of PostgreSQL, including performance investigation, connection pooling, backups, replication, locking, migrations, and safe schema-change patterns
  • Experience designing and maintaining CI/CD systems, ideally with GitHub Actions, OIDC-based cloud authentication, container builds, environment promotion, required checks, and deployment gates
  • Experience supporting containerized production workloads and improving deployment safety, rollback strategies, and runtime reliability
  • Strong observability and incident response experience, including metrics, logs, traces, alerting, dashboards, runbooks, and post-incident learning
  • The ability to work effectively in ambiguity, make pragmatic tradeoffs, and communicate clearly with both infrastructure specialists and product engineers
  • A track record of raising the engineering bar through reusable patterns, documentation, automation, mentoring, and thoughtful technical leadership

Company Overview

  • Rezdy is the world’s leading online booking and distribution platform powering the experiences industry. It is a sub-organization of Checkfront, was founded in 2011, and is headquartered in Sydney, New South Wales, Australia, with a workforce of 51-200 employees. Its website is http://rezdy.com.
