Lead Machine Learning Engineer

Bengaluru, Karnataka, India | Data Science | Full-time | COVID-19 remote


Machine Learning Engineer -- ML Platform

The ShareChat ML platform team builds scalable shared infrastructure to accelerate the pace of Machine Learning development and magnify both the impact and the quality of ML use cases deployed across the company.

We're looking for deeply passionate machine learning engineers who have demonstrated the ability to design, build and scale distributed backend services.

This role on the platform team will give you the opportunity to work on:

  • Building out feature computation, storage, monitoring, analysis and serving systems for billions of features required across ShareChat applications every day
  • Developing distributed real-time training & experiment infrastructure over terabytes of data
  • Developing a highly scalable, high-QPS, low-latency inference service that uses a mix of CPU and GPU hardware to utilise resources efficiently
  • Scaling distributed backend services to reliably support high-QPS, low-latency use cases
  • Building core data and model metadata systems powering the end-to-end ML lifecycle
  • Advancing the usage of ML monitoring, observability and explainability across the company

To accomplish this, you will use cutting-edge technologies, most of them open source, such as Kubernetes, Docker, Kafka, Spark, Snowflake, Metabase, TensorFlow and PyTorch, among others.

You should have:

  • 4+ years of industry experience with a solid understanding of engineering and infrastructure best practices
  • Strong coding skills with Go, Java or Scala. Familiarity with Python is a plus.
  • Hands-on experience using data processing tools like Beam, Spark or Flink in a cloud environment like GCP or AWS, and first-hand knowledge of data management concepts
  • Experience writing libraries or tools used by other engineers or researchers
  • A keen drive for code quality, reliability, continuous improvement and monitoring
  • A commitment to team success over personal success

In addition:

  • Experience developing and productionising machine learning models is a plus
  • Industry experience building end-to-end Machine Learning infrastructure is a big plus
  • Experience with Docker, Kubernetes or Flink is a plus
  • A preference for working across the end-to-end product development flow, from user research, discovery, coding and testing to deployment, monitoring and gathering feedback, is a plus