[Remote] Senior Site Reliability Engineer

Remote, USA | Full-time | Posted 2026-06-16

Note: This is a remote position open to candidates in the USA. Lean Tech is a rapidly expanding technology services organization seeking a highly experienced Senior Site Reliability Engineer. The role focuses on evolving the reliability, security, observability, and operational maturity of the company's cloud platform, using AI tools and practices to improve operational efficiency.

Responsibilities

Own and evolve the reliability, security, observability, and operational maturity of our cloud platform
Use AI tools and agentic workflows to automate infrastructure and SRE tasks
Manage production infrastructure for SaaS platforms, including senior AWS ownership
Lead production incidents and drive root-cause analysis, creating remediation plans
Ensure compliance with security best practices and maintain compliance controls

Skills

Expert use of AI tools and agentic workflows to automate infrastructure and SRE tasks
Hands-on experience using AI for Terraform development, incident triage, log analysis, runbook creation, postmortems, operational automation, CI/CD pipeline generation, and reducing repetitive operational work
Strong understanding of AI capabilities, limitations, and necessary validation processes
Ability to clearly articulate AI workflows, tooling choices, operational safeguards, and production outcomes
10+ years managing production infrastructure for SaaS platforms, including 5+ years of senior AWS ownership
Deep expertise with AWS services such as ECS, VPC, IAM, RDS, S3, CloudFront, Route53, Lambda, API Gateway, CloudWatch, Secrets Manager, and related security and governance services
Advanced Terraform experience managing multi-account, multi-workspace infrastructure
Strong understanding of Terraform provider versioning, state management, drift detection and remediation, dependency management, and infrastructure blast radius analysis
Proven experience resolving production infrastructure drift safely
Significant experience leading production incidents as the accountable owner
Ability to operate calmly and effectively during high-severity outages
Proven experience authoring detailed postmortems and operational remediation plans
Strong understanding of operational risk management and production recovery procedures
Proven experience leading production incidents, driving root-cause analysis, and creating remediation plans
Strong background in observability, monitoring, logging, distributed tracing, and alerting using tools such as Grafana
Experience owning CI/CD pipelines, deployment strategies, infrastructure automation, and operational workflows
Strong Linux administration, containerization (Docker), networking, and scripting skills
Experience with security best practices, identity management (SAML, OIDC, SCIM), and compliance frameworks such as SOC 2, ISO 27001, HIPAA, or PCI
Comfortable working directly with auditors and maintaining compliance controls
Experience supporting Spring Boot or JVM-based systems in production
Experience with runtime security or EDR tooling such as Falco
Experience automating joiner/mover/leaver identity workflows using SCIM and IdP tooling
AWS certifications including: AWS Solutions Architect Professional, AWS DevOps Engineer Professional, AWS Security Specialty
Ability to read and debug Kotlin or Java backend services from an SRE perspective
React, Node.js, or Backstage developer experience
MuleSoft API Management experience

Benefits

Professional development opportunities with international customers
Collaborative work environment
Career path and mentorship programs that support advancement to new levels

Company Overview

Global Technology Services (GTS) is the technology solution of Lean Solutions Group, helping companies scale faster through AI-driven automation, software development, and tech-powered talent. It was founded in 2019 and is headquartered in Medellín, Antioquia, Colombia, with a workforce of 1001-5000 employees. Its website is https://www.lean-tech.io/.

Company H1B Sponsorship

Lean Tech has a track record of offering H1B sponsorships, with 1 in 2023 and 1 in 2022. Please note that this does not guarantee sponsorship for this specific role.
