[Remote] Sr Platform Engineer-1
Note: This is a remote position open to candidates in the USA. Flexential is a company focused on IT platforms and services and is seeking a Senior Platform Engineer to join its team. This hands-on engineering role involves building and operating IT platforms, ensuring high availability, security, and scalability while using native-AI technologies.
Responsibilities
- Design, develop, and operationally manage automated, resilient, highly available, self-healing, secure platforms with native-AI capabilities for IT needs, serving both internal and customer business capabilities
- Develop and manage the Observability OpenTelemetry Central Backend Stack: Grafana Enterprise, Mimir, Loki, Tempo, and Alertmanager on Kubernetes/RKE2 via Helm and GitLab CI/CD
- Build and manage IaC and CI/CD for automated provisioning and deployment, including Terraform modules for infrastructure/VM/storage provisioning, Ansible AWX playbooks for OS/app bootstrap, and ArgoCD and Helm for Kubernetes configuration
- Develop and manage the OpenTelemetry/Prometheus scrape profile library, including SNMP exporters, REST API exporters, and cloud provider exporters (CloudWatch, Azure Monitor, GCP) for multiple device classes
- Develop AIOps capabilities on platforms, for example for observability use cases: anomaly detection integrations, event correlation rules in Alertmanager, and synthetic monitoring patterns to reduce alert noise
- Configure and maintain Zabbix auto-discovery: network range scanning, device classification, and Prometheus service discovery integration
- Build and harden Edge Stack deployments (Prometheus + OTel collector) per data center site using GitOps templates
- Integrate Alertmanager with ServiceNow: webhook routing, ticket enrichment, auto-close logic, and escalation policy configuration
- Maintain platform security: Conjur/CyberArk secret injection at runtime, mTLS between stack components, RBAC in Grafana Enterprise
- Author and maintain Grafana dashboards in JSON/GitLab: facility overview, network health, RED metrics, application telemetry
- Mentor mid-level engineers, lead code reviews, and establish engineering standards for the team. Represent platform engineering in cross-functional architecture reviews and executive-level program updates
- Perform other duties as required and assigned
Skills
- DevOps / Automation - 5+ years in a production environment: Kubernetes (RKE2/k3s), Helm chart deployment, system services, Docker/containers
- LGTM Stack Development and Configuration - 4+ years: Grafana, Mimir, Loki, Tempo configuration, tuning, dashboarding, and production operations; Prometheus required
- Senior-level Python / Scripting frameworks - 5+ years, automation scripts, exporter development, GitLab pipeline scripting, REST API integrations
- GitOps / CI/CD - 5+ years, GitLab CI/CD pipeline authoring; Terraform and Ansible as primary IaC tools; ArgoCD or Flux preferred
- AIOps / Observability Engineering - 2+ years, Alertmanager rule authoring, anomaly detection integration, event correlation, noise reduction techniques
- Working infrastructure (Linux/VM) management knowledge - 5+ years, Linux administration, VMware vCenter/VCF experience, NetApp storage management, network fundamentals (SNMP, TCP/IP)
- Secrets Management - 2+ years, CyberArk/Conjur, HashiCorp Vault, or equivalent; runtime secret injection patterns
- Minimal travel may be required
- Experience with and/or knowledge of ITSM processes and workflow automation, e.g. Incident & Response Management (IRM), release management, ServiceNow ITSM integration, alert routing, escalation policy design, SLA-driven on-call workflows
- Hands-on experience or working knowledge of Boomi integration platform as a service (iPaaS) technologies
- Experience working with BAS/BMS systems in a data center/OT environment
- Hands-on experience working with AWS products under the Well-Architected Framework and a multi-account model to develop various compute, storage, and network IaaS and PaaS services for IT applications
Benefits
- Benefits of working at Flexential: Benefits are subject to change at the Company's discretion.
- Flexential participates in the E-Verify program.