Platform Engineer (Data & DevOps Focus)

MERCURIA ENERGY TRADING SA

Design and operate robust, scalable data and model pipelines.

Details

Company Description

Established in 2004, Mercuria is one of the world's leading integrated energy and commodity trading companies. We bring energy markets together to support the needs of today by trading, structuring financing, and investing in strategic assets, generating more than $110 billion in turnover.

Our operations span over 50 countries on 5 continents, including all the major energy hubs. We trade physical oil, energy products, environmental products and other commodities from Geneva, London, Singapore, Shanghai, Beijing, Dubai, Houston, Calgary and Greenwich (CT).

We are committed to advancing the transition to a more sustainable, affordable and reliable energy system for tomorrow. Over 50% of our assets are in low-carbon and energy-transition sectors, providing a strong platform to trade these new markets and support decarbonization.

In 2023, we established Silvania, a $500 million fund investing in the restoration and protection of nature and biodiversity globally, in support of the Paris Agreement goals and the UN 30x30 biodiversity initiative.

The Role

We are looking for a Platform Engineer with a strong focus on Data Engineering and DevOps to serve as the backbone of our AI infrastructure. In this role, you will surface high-quality enterprise knowledge from SQL and analytics sources, convert it into embeddings, and keep our LLM platform performant, secure, and cost-efficient. You will act as the critical bridge between raw datasets, model lifecycles, and scalable AWS infrastructure, empowering our AI team to deliver impactful, production-ready solutions.

Key Responsibilities

  • SQL-Centric Data Pipelines: Architect and build robust ETL/ELT pipelines from sources such as MSSQL, Snowflake, Databricks, and S3 using Python, PySpark, and dbt or Liquibase. Implement Change Data Capture (CDC) so each run moves only changed rows (see the first sketch after this list).
  • Embeddings & Vector Store Management: Build and optimize chunking and embedding workflows. Manage provisioning and tuning of vector databases (e.g., Pinecone, Weaviate), ensuring nightly incremental refreshes that re-embed only modified data (second sketch below).
  • MLOps Automation: Orchestrate model training, evaluation, and drift detection using SageMaker Pipelines or Bedrock workflows. Register models with MLflow or the SageMaker Model Registry (third sketch below).
  • Infrastructure & DevOps: Own and evolve Terraform-based infrastructure for data/ML workloads, including GPU fleet autoscaling, spot-instance optimization, and Bedrock usage quota tracking (fourth sketch below).
  • Governance & Compliance: Implement lineage tracking, encryption standards, and PII masking (fifth sketch below). Prepare audit-ready artefacts for SOC 2, MiFID compliance, and model risk management.
  • Monitoring & Reporting: Surface metrics on data quality, pipeline health, and operational costs using Grafana and Prometheus, with alerting that enables proactive issue resolution (final sketch below).
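
To make the CDC responsibility concrete, here is a minimal watermark-based incremental extract in Python. The connection string, table, and column names are hypothetical; a production pipeline would persist the watermark in a control table or S3 rather than hard-coding it.

```python
# Watermark-based incremental extract: a minimal CDC-style sketch.
# Table name, watermark column, and connection details are placeholders.
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("mssql+pyodbc://user:pass@tradingdb_dsn")  # placeholder DSN

def load_watermark() -> str:
    # In practice this comes from a state store (control table, S3 object, ...).
    return "2025-06-18T00:00:00"

def extract_changed_rows(watermark: str) -> pd.DataFrame:
    # Pull only rows modified since the last run instead of a full-table scan.
    query = text("SELECT * FROM trades WHERE updated_at > :wm")
    return pd.read_sql(query, engine, params={"wm": watermark})

changed = extract_changed_rows(load_watermark())
changed.to_parquet("s3://raw-zone/trades/incremental.parquet")  # hand off to the lake
```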
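
The incremental embedding refresh can be sketched as a content-hash diff: only documents whose text changed since the last run are re-chunked and re-embedded. The embed() and upsert() functions below are stubs standing in for a real embedding model (e.g., a Bedrock embedding endpoint) and a vector store client such as Pinecone or Weaviate.

```python
# Nightly embedding refresh that re-embeds only documents whose content changed.
import hashlib

def embed(piece: str) -> list[float]:
    # Stub: replace with a real embedding call (e.g., a Bedrock embedding model).
    return [0.0] * 1536

def upsert(vector_id: str, values: list[float], metadata: dict) -> None:
    # Stub: replace with the vector database client's upsert (Pinecone, Weaviate, ...).
    print(f"upserting {vector_id}")

def content_hash(body: str) -> str:
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def chunk(body: str, size: int = 800) -> list[str]:
    # Naive fixed-size chunking; production code would split on semantic boundaries.
    return [body[i:i + size] for i in range(0, len(body), size)]

def refresh(docs: dict[str, str], seen_hashes: dict[str, str]) -> None:
    for doc_id, body in docs.items():
        digest = content_hash(body)
        if seen_hashes.get(doc_id) == digest:
            continue  # unchanged since the last run: skip re-embedding
        for i, piece in enumerate(chunk(body)):
            upsert(f"{doc_id}-{i}", embed(piece), {"doc_id": doc_id})
        seen_hashes[doc_id] = digest

state: dict[str, str] = {}
refresh({"policy-42": "Updated counterparty risk policy text..."}, state)
```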
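
A minimal MLflow registration flow, gating promotion on an evaluation metric. The dataset is synthetic and the registered-model name is a placeholder; the same pattern applies when registering into the SageMaker Model Registry instead.

```python
# Train, evaluate, and conditionally register a model with MLflow.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", acc)
    if acc >= 0.9:  # gate registration on an evaluation threshold
        mlflow.sklearn.log_model(
            model,
            artifact_path="model",
            registered_model_name="knowledge-ranker",  # hypothetical name
        )
```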
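
One way to track Bedrock usage against quotas is to pull token-count metrics from CloudWatch. The namespace and metric name below follow AWS's published Bedrock metrics, but treat them as assumptions and verify them in your account before relying on this sketch.

```python
# Pull daily Bedrock token usage from CloudWatch to track spend against quotas.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/Bedrock",          # assumed Bedrock metrics namespace
    MetricName="InputTokenCount",     # assumed metric name; verify in-account
    StartTime=now - timedelta(days=1),
    EndTime=now,
    Period=3600,                      # hourly buckets
    Statistics=["Sum"],
)
total = sum(point["Sum"] for point in resp["Datapoints"])
print(f"Input tokens over the last 24h: {total:,.0f}")
```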
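
An illustrative PII-masking pass over free-text fields before they are indexed. The two regexes are deliberately simple; a regulated deployment would use a dedicated detection library or service rather than hand-rolled patterns.

```python
# Illustrative regex-based PII masking for free-text fields before indexing.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
IBAN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")

def mask_pii(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)   # redact email addresses
    return IBAN.sub("[IBAN]", text)     # redact IBAN-shaped account numbers

print(mask_pii("Contact jane.doe@example.com, account CH9300762011623852957"))
```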
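
Finally, a small sketch of exposing pipeline health metrics for Prometheus to scrape, with Grafana dashboards and alert rules layered on top. The metric names and values are illustrative.

```python
# Expose pipeline health metrics on :8000/metrics for Prometheus to scrape.
import random
import time

from prometheus_client import Counter, Gauge, start_http_server

ROWS_PROCESSED = Counter("pipeline_rows_processed_total",
                         "Rows processed by the ETL run")
FRESHNESS = Gauge("pipeline_data_freshness_seconds",
                  "Age of the newest loaded record")

if __name__ == "__main__":
    start_http_server(8000)
    while True:
        ROWS_PROCESSED.inc(random.randint(100, 500))  # simulated work
        FRESHNESS.set(random.uniform(0, 300))         # simulated freshness
        time.sleep(15)
```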

Technical Expertise

  • Programming & Data: Python, SQL, PySpark, pandas, dbt, Liquibase
  • Pipelines & Orchestration: Airflow, Dagster, Debezium, Kafka Connect
  • Cloud & ML Services: AWS Glue, Step Functions, SageMaker, Bedrock, S3
  • IaC & Observability: Terraform (multi-account patterns), Prometheus, Grafana
  • Vector Databases: Weaviate, Pinecone, pgvector, OpenSearch
  • Fluent English

Non-Technical Skills

  • 4+ years of experience in data or ML engineering, with at least 3 years running production-grade MLOps pipelines.
  • Deep understanding of data operations best practices, with a track record of compliance in regulated industries (e.g., finance, healthcare).
  • Experience collaborating with cross-functional stakeholders including AI researchers, SREs, and data stewards.
  • Comfortable operating in fast-paced environments and taking ownership of mission-critical infrastructure.

Posted on 19/06/2025

How to apply

Send your application to [email protected]