[Remote] Senior Data Engineer
Note: This is a remote position open to candidates in the USA. Keyrock is a leading change-maker in the digital asset space, known for its innovative approach and diverse team. The company is seeking a Senior Data Engineer to help build the Keyrock Data Platform, which lets teams across the business access and use data efficiently for trading and asset management.
Responsibilities
- Build streaming and batch pipelines that ingest, normalise, and distribute market, trading, and portfolio data, resilient to feed and exchange failures
- Build the self-serve tooling (SDKs, patterns, templates, AI agents) so other teams publish, consume, and build on data products without waiting on us
- Own data contracts and schema evolution. Keep schema changes from turning into multi-team coordination events
- Design the lakehouse and time-series layer around consumer query patterns
- Build and evolve the Data Governance and Data Quality Framework: stale-feed detection, schema validation, range checks, idempotent writes, lineage, ownership, self-healing
- Build the derived analytics the business runs on: cross-exchange spreads, VWAP at depth, order book microstructure for the desks; portfolio views, exposure, performance for wealth and asset management
- Make observability, cost, and performance first-class from day one
- Treat infrastructure as code (Docker, Terraform, CI/CD) alongside our Central Infrastructure Team
- Work in the open: write things down, partner closely with Architecture, Infrastructure, Platform, and the rest of the teams
Skills
- 8+ years of building production data systems that other people rely on
- Strong proficiency in Python and SQL: not just being able to write a query, but being able to reason about what the engine is doing with it
- You write code that's easy for someone else to read, test, and delete later
- Strong understanding of data modelling for both streaming and analytical workloads
- You take efficiency, quality, idempotency, and observability seriously by default
- You've designed and operated streaming systems on Kafka, Redpanda, MSK, or Kinesis, and you have opinions about partitioning, consumer groups, offsets, and schema registries
- You've used a time-series store in production (ClickHouse ideally; TimescaleDB, QuestDB, or similar are fine too) and can talk about table design as a function of query patterns
- You've worked with a lakehouse architecture and reason about table layout, partitioning, and compaction as design choices that shape query performance and storage cost
- You build for self-healing and idempotency. Reprocessing is safe, retries don't double-write, and the system recovers without a human in the loop
- Docker, Terraform, and CI/CD are how you work, not a separate 'DevOps' thing
- You think about cost and performance early
- You instrument as you build: logs, metrics, and traces are part of the system from day one
- You design for data quality and governance up front, covering contracts, validation, lineage, and ownership
- You reason from first principles when a problem is new, stay pragmatic when it isn't, and update your view when you learn more
- You treat the trading desks, wealth and asset management, product, risk, finance, compliance, and research as customers of what you build, and communicate with them that way
- You optimise for outcomes over output. A smaller, simpler thing that ships and works beats a bigger thing that doesn't
- You take ownership end-to-end: design, ship, operate, improve
- You say what you think, including when it's an unpopular take. You change your mind when the argument is better
- You make the people around you better. Reviews are real, juniors grow from working with you, and peers want to work with you again
- You're curious about how markets work. Data engineering on its own won't keep you interested here
- You're honest about what you know and what you don't, and quick to close the gap
- You understand financial market data: order books, trades, reference data, portfolios, exposures. Crypto, TradFi, or both are a strong plus
- Lakehouse experience with Apache Iceberg or Delta Lake
- Familiarity with DataHub or similar metadata/lineage platforms
- Rust. Some of our performance-critical services are written in it. Interest is welcome; fluency isn't required
Benefits
- A competitive salary package, with various benefits.
- Flexible hours, remote-first, business-hours on-call shared across the team.
- Regular online get-togethers and a yearly onsite where everyone's in the same room.