Scala Big Data Engineer (Senior)
If you are a Scala developer who would like to get into Big Data, or a Data Engineer who wants to learn more about Scala (or both!), then we are looking for someone just like you.
Our main goal is to develop a generic Big Data ingestion and transformation framework for a new, highly funded, UK-based insurance company. We work across two teams: one responsible for developing the framework and extending the capabilities of the platform, and another focused on using the framework to ingest hundreds of datasets from various data sources.
There are lots of greenfield areas and possibilities to impact the whole company.
Tech stack: Scala, Spark (mostly Core, some SQL), AWS (EMR, EC2, S3, CodeBuild)
- Creating a generic Big Data framework on top of Spark, able to ingest and transform hundreds of datasets from various sources.
- Making Apache Spark jobs reliable and conformant to a functional programming style.
- Automating any possible part of the development process - the less manual work the better.
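To give a flavour of what "a generic framework with a functional programming style" means in practice, here is a minimal sketch in plain Scala: each transformation step is a pure function, and a pipeline is just function composition. All names are illustrative, not the framework's actual API, and in the real framework the records would be Spark Datasets rather than Scala collections.

```scala
// Illustrative sketch only: pure, composable transformation steps.
final case class RawRecord(source: String, payload: String)
final case class CleanRecord(source: String, payload: String)

object IngestionSketch {
  // A "step" is just a function, so steps compose with ordinary FP tools.
  type Step[A, B] = A => B

  val trimPayload: Step[RawRecord, RawRecord] =
    r => r.copy(payload = r.payload.trim)

  val dropEmpty: Step[List[RawRecord], List[RawRecord]] =
    _.filter(_.payload.nonEmpty)

  val toClean: Step[RawRecord, CleanRecord] =
    r => CleanRecord(r.source, r.payload.toLowerCase)

  // The full pipeline is ordinary function application and composition,
  // which keeps every step independently testable.
  val pipeline: Step[List[RawRecord], List[CleanRecord]] =
    rs => dropEmpty(rs.map(trimPayload)).map(toClean)

  def main(args: Array[String]): Unit = {
    val raw = List(RawRecord("crm", "  Alice  "), RawRecord("crm", "   "))
    // Only the non-empty record survives the pipeline.
    println(pipeline(raw))
  }
}
```

Because the steps are pure functions over data, the same design carries over to Spark transformations on Datasets, where reliability comes from keeping side effects out of the transformation logic.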
There are around 20 engineers (both Scala/Spark and Snowflake) in the team, of which about half come from well-known consultancies.
Most of the engineers have a Scala background and put a lot of focus on functional programming; some of the consultants have been part of the Scala community for more than 10 years.
We are developing an analytical platform (e-commerce industry), operating on a cluster of hundreds of machines, with hundreds of terabytes of RAM and thousands of cores. We integrate, process, and analyze data. All of this is needed to run and test the applications we develop, which use state-of-the-art technologies from the Big Data world.
We also optimize complex machine learning applications (ML pipelines) and improve the generation of huge analytical views (joining tables with millions of records in a few minutes is our speciality!). We have an enormous cluster available for testing all the apps we develop.
Apache Spark (Core, SQL, PySpark, Streaming, MLlib), Scala, Kafka, Hive, HBase, Hadoop, Teradata, Azure, Jenkins, Ansible, sbt, Git.
Over thirty developers with experience in building Big Data solutions, divided into teams of 4-6 people. We have a real influence on the choice of tools and on architectural decisions.
We do not expect you to qualify for all of the points above. A good understanding of some of these areas and a willingness to develop expertise in others may be sufficient. We are not concerned with your education or any other formalities. What we care about is your passion, knowledge, and experience.