Overview of Job Function:
The Data Engineer (the “Engineer”) will design and build scalable data-processing and AI pipelines for our Big Data and AI Platform. The Engineer will be responsible for processing billions of events from marquee customers in real time to generate customer-satisfaction analytics, AI-based predictions, recommendations, and engagement. Our platform is built with cutting-edge technologies, and the Engineer will work with Kafka, Spark, Vert.x, and Mesos, among others.
Responsibilities:
Build and manage Kafka clusters to handle high-volume real-time and batch processing pipelines for analytics and AI;
Performance-tune and scale Kafka clusters across various products and systems;
Develop software and products that analyze large volumes of data in big data systems and apply deep learning using various AI/ML technologies;
Evaluate new technologies for big data pipeline processing.
Requirements:
Bachelor’s degree in Computer Science or equivalent experience;
At least 3 years of relevant experience managing large Kafka clusters in a 24x7 production environment;
At least 3 years of relevant experience in development, asynchronous programming, and data streaming in Java;
Highly skilled in Kafka cluster upgrades, monitoring, and troubleshooting, with a deep understanding of the impact of partition reassignment, Kafka consumer pause/resume, broker network-latency optimization, etc.;
A passion for innovation and thinking “out of the box,” combined with a focus on delivering product on time;
Knowledge of Python or Spark for real-time data processing is a big plus;
Knowledge of Vert.x is a plus.