What you will need to succeed
A degree in Software Engineering, Computer Science or an associated quantitative discipline (Math, Physics, Computational Statistics, Machine Learning), or demonstrable work experience.
Python 3, proficient.
Linux OS and its scripting methods, knowledgeable.
Development and deployment of ETL pipelines and ML algorithms in training and production environments, experienced in a commercial or research environment.
Unit and integration testing, knowledgeable.
Continuous Integration and Agile development, familiar.
Familiarity with financial data of considerable size (terabyte-sized).
Knowledge of SQL (PostgreSQL or MySQL).
Knowledge of developing, deploying and maintaining microservices and APIs using Flask or Node.js.
Familiarity with Google Cloud Platform or Microsoft Azure and their data processing and ML stacks.
Knowledge of C++.
Familiarity with modern ETL systems, e.g. Apache Kafka/Confluent.
What’s in it for you
You will work alongside a highly talented team, with leading names in the quantum computing industry. We offer a highly competitive package including equity participation, 28 days' paid annual leave, a workplace pension and a positive approach to flexible working.
The big picture
As part of our mission to expand our world-class team and our pioneering work in the quantum industry, Cambridge Quantum Computing’s Machine Learning (ML) team is searching for a Data Engineer.
In this role you will help to streamline data preparation for the training and deployment of ML models for real-time financial forecasting. You will develop and maintain a bespoke data acquisition and processing pipeline that serves both ML development and the live production system, and you will work on the data preparation stage for crafting new ML trading strategies.
The successful candidate will join the London Victoria office and work in a highly dynamic, development-focused group, alongside software engineers and machine learning scientists.
Key responsibilities in this role include:
- Integrating new data sources into live and back testing systems.
- Maintaining and replenishing the central repository of tick data.
- Maintaining and verifying both back testing and live Extract, Transform, Load (ETL) pipelines.
- Preparing datasets according to the scientists’ specs to train ML models.
- Maintaining custom scientific data processing software, ensuring scalability and best practices.