*Location:* UK (Manchester)
*Remote:* Yes (preferred)
*Willing to relocate:* UK/EU considered for the right role
*Resume:* josh-gree.github.io/cv
*Email:* [email protected]
*Technologies:* Python, SQL, R; Airflow, Prefect, Dagster; Kafka; Docker/Kubernetes; Terraform; GCP/AWS; Postgres, PostGIS, Snowflake, Redshift; Zarr/Parquet; ML/Deep Learning; HPC; React/Flask.
*Summary:* Senior Software/Data Engineer with a strong background in mathematics and computational modelling. I build high-reliability data systems, complex ETL/ELT pipelines, and ML-ready data platforms, especially where datasets are large, irregular, hierarchical, or scientifically complex.
Most recently, I’ve been designing and operating large-scale data infrastructure for high-dimensional biological datasets (100k+ samples), unifying heterogeneous storage formats into lineage-aware catalogues, creating ontologies for hierarchical labels, building QC pipelines in Dagster, developing synthetic single-cell data generators, and working closely with domain scientists to formalise and scale experimental and computational workflows.
Previously: large-scale mobile-network analytics for humanitarian agencies; climate and energy data engineering; ad-tech pipelines; and HPC-driven modelling from my computational research background.
I’m looking for roles where difficult data problems, scientific or ML-adjacent pipelines, or complex modelling workflows need to be made robust, reproducible, and scalable. I prefer small teams, high ownership, and work with real impact.
*What I offer:*
– Architecture & implementation of reliable data/ML platforms
– Workflow orchestration, data governance, and reproducibility
– Scientific/ML pipeline design (Bayesian modelling, synthetic data, QC/validation)
– Cloud infra/IaC and cost-efficient storage design
– Ability to collaborate deeply with domain experts and formalise messy processes