Member of Technical Staff, Inference & Model Serving

Cohere

13d ago

  • Job
    Full-time
    Mid Level
  • Software Engineering
  • San Francisco, +3

AI generated summary

  • This Member of Technical Staff position at Cohere calls for experience with ML model serving, deep learning architectures, accelerators, distributed systems, performance optimization, cloud infrastructure, and Golang.
  • You will build high-performance, scalable machine learning systems, deploy NLP models with low latency and high availability, interface with customers, and create customized deployments.

Requirements

  • Experience with serving ML models.
  • Experience designing, implementing, and maintaining a production service at scale.
  • Familiarity with the inference characteristics of deep learning models, specifically Transformer-based architectures.
  • Familiarity with computational characteristics of accelerators (GPUs, TPUs, and/or Inferentia), especially how they influence latency and throughput of inference.
  • Strong understanding or working experience with distributed systems.
  • Experience in performance benchmarking, profiling, and optimization.
  • Experience with cloud infrastructure (e.g. AWS, GCP).
  • Experience in Golang (or other languages designed for high-performance, scalable servers).

Responsibilities

  • Are you energized by building high-performance, scalable, and reliable machine learning systems? Do you want to help define and build the next generation of AI platforms powering advanced NLP applications? We are looking for Members of Technical Staff to join the Model Serving team at Cohere. The team is responsible for developing, deploying, and operating the AI platform that delivers Cohere's large language models through easy-to-use API endpoints. In this role, you will work closely with many teams to deploy optimized NLP models to production in low-latency, high-throughput, and high-availability environments. You will also have the opportunity to interface with customers and create customized deployments that meet their specific needs.

FAQs

What is the main responsibility of a Member of Technical Staff in the Inference & Model Serving team at Cohere?

The main responsibility of a Member of Technical Staff in the Inference & Model Serving team at Cohere is to develop, deploy, and operate the AI platform that delivers Cohere's large language models through easy-to-use API endpoints, and to deploy optimized NLP models to production in low-latency, high-throughput, and high-availability environments.

What skills and experience are required for this role?

To be a good fit for this role, candidates should have experience serving ML models and designing, implementing, and maintaining production services at scale. They should be familiar with the inference characteristics of deep learning models (specifically Transformer-based architectures) and with the computational characteristics of accelerators, and have a strong understanding of distributed systems. Experience with performance benchmarking, profiling, and optimization, with cloud infrastructure, and with Golang or other languages designed for high-performance, scalable servers is also required.

What is the work environment like at Cohere for members of the Inference & Model Serving team?

At Cohere, members of the Inference & Model Serving team work closely with various teams to deploy optimized NLP models to production environments. The team is passionate about building high-performance, scalable, and reliable machine learning systems and is dedicated to providing customers with the best AI platforms for advanced NLP applications. Team members are encouraged to work hard, move fast, and collaborate to drive value for customers.

At Cohere, our mission is to build machines that understand the world, and to make them safely accessible to all.

Industry: Technology
Employees: 51-200
Founded Year: 2019

Mission & Purpose

Cohere provides unprecedented access to affordable, easy-to-deploy large language models. Our platform gives computers the ability to read and write. Whether you want to better understand what your customers are saying or write compelling copy that speaks to your target audience, Cohere can help.