[Remote] Customer Support Engineer (GPU Cluster)
Note: This is a remote position open to candidates in the USA. Together AI is a research-driven artificial intelligence company focused on creating innovative AI systems. As a Customer Support Engineer, you will support customers in building training and inference solutions, tackle complex technical challenges, and collaborate with various teams to enhance customer satisfaction.
Responsibilities
- Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge Kubernetes GPU clusters; ensure swift and effective solutions every time
- Become a product expert in our GPU Cluster service, serving as the last line of technical defense before issues are escalated to Engineering and Product teams
- Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; work with senior leaders both internally and externally to ensure the highest levels of customer satisfaction
- Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together’s roadmap (e.g., future models to support)
- Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with the team and customers
- Be flexible in providing support coverage during holidays, nights and weekends as required by business needs to ensure consistent and reliable service for our customers
Skills
- 3+ years of experience in a customer-facing technical role with at least 1 year in a support function in AI or supporting a mission-critical API in SaaS
- Strong technical background, with knowledge of AI, ML, GPU technologies and their integration into high-performance computing (HPC) environments
- Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure-as-code solutions (e.g., Ansible), high-performance network fabrics, NFS-based storage management, container infrastructure, and scripting and programming languages
- Foundational understanding of the installation, configuration, administration, troubleshooting, and securing of compute clusters
- Complex technical problem solving and troubleshooting, with a proactive approach to issue resolution
- Ability to work cross-functionally with teams such as Sales, Engineering, Support, Product and Research to drive customer success
- Strong sense of ownership and willingness to learn new skills to ensure both team and customer success
- Excellent communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders
- Ability to operate in dynamic environments, adept at managing multiple projects, and comfortable with frequent context switching and prioritization
Benefits
- Startup equity
- Health insurance
- Other benefits
- Flexibility in terms of remote work