My client is searching for a Data Engineer to leverage key technologies to develop, construct, test and maintain large-scale data processing systems. You will play a key role in supporting their next generation of products and data initiatives, including building a new machine learning platform!
- Create and maintain optimal data pipeline architecture.
- Optimize existing data systems and build new ones from the ground up to improve real-time streaming solutions.
- Support software developers, database architects, data analysts and data scientists on data initiatives, ensuring a consistent data delivery architecture across ongoing projects, such as integrating new source systems (e.g. Google Analytics) into a data lake.
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Create data tools that help the analytics and data science teams build and optimize the product into an industry leader.
- Demonstrated experience across the full software development lifecycle (SDLC).
- Competent with stream-processing systems such as Storm and Spark Streaming.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift, Kinesis, DynamoDB, Lambda, Step Functions, etc. AWS certifications will be highly regarded.
- Experience with object-oriented and functional programming languages: Python, Java, C++, Scala, etc.
- Strong analytical skills for working with unstructured datasets.
- Proactive, with a desire to take ownership.
- Experience working on a data lake project or building a machine learning platform would be highly regarded, but is not essential.
If interested, please apply online.