Senior Data Engineer
The current team works in a microservices-oriented architecture using modern technologies such as Docker, GitLab CI/CD, Kafka, gRPC, and Spark. We pride ourselves on an “Infrastructure-as-Code” approach that keeps our code and infrastructure secure and easy to deploy in the cloud. Engineers are empowered to try out new technologies and contribute to streamlining our stack while they evolve the platform to the next level.
Roles and Responsibilities
- Work on business-critical data pipelines that increase the security of our Fortune 500 clients by operationalizing data ingested from their tools and premium intelligence feeds.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of sources and tools using SQL, Spark, Kafka, and AWS.
- Design, architect, and support new and existing data and ETL pipelines, and recommend improvements and modifications with a focus on maintainability, scalability, and performance.
- Analyze, debug and correct issues with data pipelines.
- Work closely with Data Scientists to make the process of training, testing, and deployment of Machine Learning models seamless.
- Communicate strategies and processes around data modeling and architecture to cross-functional groups and senior-level management.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Qualifications
- At least 3 years of experience building complex ETL pipelines using Spark.
- Proficiency in Scala / Java.
- Experience writing complex SQL and ETL processes.
- BS in Computer Science / Software Engineering or equivalent experience.
- Strong knowledge of Apache Spark, Spark Streaming, Apache Kafka, and similar technology stacks.
- Strong understanding and practical use of algorithms and data structures.
- Background working with cybersecurity data.
- Experience working with Python.
- Experience with cloud service providers such as AWS / Databricks.