[Remote] Sr. Site Reliability Engineer (Storage Platform)
Note: This is a remote position open to candidates in the USA. Dice is looking for a Senior Site Reliability Engineer specializing in storage platforms to join its team. The role involves managing enterprise storage and Kubernetes platforms and ensuring the reliability and efficiency of mission-critical production environments.
Skills
- 6+ years of experience managing enterprise storage and Kubernetes platforms on Linux
- Strong hands-on experience with SDS solutions (Ceph, Longhorn) and storage migrations from legacy systems
- Experience with block, file, and object storage, including Fibre Channel and IP-based protocols
- Experience with NVMe-oF or iSCSI fabrics
- Expert knowledge of Kubernetes and Linux systems (Ubuntu, RHEL/CentOS)
- Proficiency with Infrastructure-as-Code (IaC) (Ansible, Terraform)
- Strong scripting skills in Python and Bash (Go is a plus)
- Strong working knowledge of Enterprise DNS and integrations with Kubernetes
- Experience operating 24x7 mission-critical production environments
- Hands-on experience with KVM hypervisors (SUSE Harvester, OpenStack)
- Strong written and verbal communication skills
- Proficiency with Git, CI/CD pipelines, and automated testing frameworks
- Experience with OpenStack Cinder multi-backend administration
- Experience with backup platforms (Rubrik)
- Understanding of CIS/NIST security and infrastructure lifecycle management
- ITIL Foundation/advanced certifications in support of ITSM standard methodology
- CNCF Certified Kubernetes Administrator (CKA), Certified Kubernetes Security Specialist (CKS), or Red Hat Certified Specialist in Ceph Storage Administration (EX125) certification