FAQs
What is the focus of the Model Efficiency team?
The Model Efficiency team is focused on increasing the inference efficiency of large language models by improving model architecture and optimizing ML frameworks.
Where are the offices located for this position?
Our offices are located in Toronto, San Francisco, New York, and London. We also embrace a remote-friendly environment, distributing teams across time zones according to interests and expertise to support both collaboration and flexibility.
What qualifications are needed to be a good fit for the Model Efficiency team?
A good fit for the Model Efficiency team has significant experience developing high-performance machine learning algorithms or infrastructure, hands-on experience with large language models, a bias for action and results, and an appetite for solving challenging machine learning research problems.
What areas of expertise are considered a big plus for this position?
Considerable experience with model compression techniques, GPU/accelerator programming, LLM inference performance modeling, and machine learning framework internals is considered a big plus for this position.