[Remote] Infrastructure Deployment Engineer
Note: This is a remote position open to candidates in the USA. Verigent's client is looking for a Deployment Engineer to join their engineering team. In this role, you will own the design and engineering of multi-site WAN backbones and high-density GPU compute fabrics for AI data center operators, taking projects from concept and high-level design through detailed design, build, and validation.
Responsibilities
- Design multi-site network backbones including BGP routing, transit and peering, internet edge, and data center interconnect (DCI), and document them as defensible high- and low-level designs
- Engineer high-density GPU compute fabrics (leaf-spine, rail-optimized fat-tree) using InfiniBand (NDR/XDR) and RoCEv2 / Spectrum-X Ethernet
- Develop port-to-port cable maps and fabric plane architectures (single, dual, and quad plane) for NVL72-class GPU clusters, preserving NCCL collective performance
- Plan and execute staged fabric bring-up including Subnet Manager / UFM configuration, firmware and pkey alignment, and controlled port enablement to avoid trap storms and reconvergence events
- Produce HLD and LLD documentation, IP and ASN schemes, and cable schedules, and drive them through an issued-for-construction freeze and commissioning process
- Apply NetDevOps practices (source-of-truth databases, infrastructure-as-code, and automated validation) to the logical fabric
- Coordinate carrier and transport services (transit, wavelengths, point-to-point links) and plan capacity across sites
- Support network operations and managed services (monitoring, change control, root-cause analysis, and preventive maintenance) through deployment and hyper-care
Skills
- 10+ years of experience in network engineering, data center fabric design, or large-scale WAN/backbone design
- Deep command of BGP (eBGP/iBGP, multi-homing, transit and peering) and enterprise or service-provider routing and switching
- Hands-on experience designing and deploying leaf-spine data center fabrics
- Working knowledge of InfiniBand and/or RoCE GPU back-end networks (fat-tree topology, rails, and collective traffic patterns)
- Ability to produce HLD/LLD documentation and structured cable schedules to a build-ready standard
- Strong problem-solving and technical communication skills, and the ability to collaborate across engineering, deployment, and customer teams
- Willingness to travel to customer and project sites (estimated 25–40%)