[Remote] Big Data Engineer
Note: This is a remote position open to candidates in the USA. AdvanSix plays a critical role in global supply chains, innovating and delivering essential products for various end markets. The company is seeking a Big Data Engineer to build and operate its enterprise Unified Data Layer, delivering data products that support multiple corporate functions and ensure trusted data governance.
Responsibilities
- Build ingestion pipelines (batch, CDC, streaming) from S/4HANA/DataSphere, PHD/historian, LIMS, TMS, HSE, and other sources into landing → curated → semantic layers
- Implement data contracts, schema/versioning, SCD handling, partitioning, and performance tuning (file formats, clustering, caching)
- Develop dimensional/semantic models that back certified Power BI datasets and APIs for apps/agents
- Integrate OT data via OPC UA/MQTT, broker/DMZ patterns, read-only historian feeds, and event/batch frames, with no control-net reads
- Collaborate with plant controls on change control, signal quality, and downtime windows
- Embed data quality rules, unit/integration tests, and validation checks (freshness, completeness, drift/PSI)
- Instrument lineage and end-to-end monitoring; build alerting and on-call runbooks to minimize MTTR
- Enforce RBAC, secrets management, PII/HSE classifications, and retention aligned to Governance/MDM policies
- Automate build/test/deploy with Git-based CI/CD (environments, approvals, blue/green)
- Track and optimize cost/performance (cluster sizing, autoscaling, cache strategy); contribute to FinOps reviews
- Partner with Reporting & BI on semantic model contracts, RLS, and performance SLAs; avoid direct system scraping
- Produce “readme” docs, data dictionaries, runbooks, and post-incident reviews; support knowledge transfer with vendors
Skills
- Minimum 5 years of experience in data engineering, building production pipelines at scale (batch/CDC/streaming)
- Hands-on with Azure data stack: Databricks or Fabric/Synapse, ADF/Pipelines, ADLS/OneLake, Azure SQL/SQL MI, Key Vault
- Strong SQL and Python/PySpark; comfort with Spark Structured Streaming and performance tuning
- Experience implementing tests/observability (freshness, schema, expectations), and Git-based CI/CD
- Familiarity with SAP S/4HANA structures and SAP DataSphere semantic modeling
- OT concepts: historians (PHD/PI), OPC UA/MQTT, event/batch frames, ISA-95/99 basics
- Understanding of Power BI consumption (semantic models, RLS) and APIs for downstream AI/ML apps/agents
- Time-series/data-quality tooling (e.g., Great Expectations or equivalent patterns), feature/metric stores
- MDM concepts (keys, survivorship), lineage/catalog tooling
- TMS/WMS, LIMS, Historian, HSE domain exposure; Lean/Six Sigma mindset; FinOps awareness
Benefits
- Paid holidays
- Paid time off including vacation
- Eligibility to purchase company stock
- Tuition reimbursement
- 401(k) with a competitive company match
- Discretionary financial benefits such as incentive pay, equity awards, and participation in a deferred compensation plan
- Medical, dental and vision insurance
- Flexible spending and health savings account eligibility
- Employer-provided short-term disability benefits
- Eligibility to purchase long-term disability benefits
- Employer-provided basic life insurance
- Eligibility to purchase voluntary life coverages