Research Scientist Intern, GenAI - Multimodal Audio (Speech, Sound and Music)

Meta

2mo ago

  • Internship
    Full-time
    Off-cycle Internship
  • Research & Development
  • Menlo Park

AI-generated summary

  • You must have or be pursuing a Ph.D. in a relevant field, with research experience in ML and deep learning and experience with Python/C++ and PyTorch/TensorFlow. Audio dataset curation experience, publications at leading conferences, the ability to work in a team environment, and analytical problem-solving skills are preferred.
  • You will conduct research on multimodal audio models, develop ML algorithms, analyze and improve efficiency, collaborate with researchers, publish results, and contribute to Meta product development.

Requirements

  • Currently has or is in the process of obtaining a Ph.D. degree in Computer Science, Machine Learning, Artificial Intelligence, Robotics, Algorithms, Computational Mathematics, or relevant technical field.
  • Must obtain work authorization in country of employment at the time of hire and maintain ongoing work authorization during employment.
  • Research experience in machine learning, deep learning, computer vision and/or natural language processing.
  • Experience with Python, C++, C, Lua or other related language.
  • Experience with deep learning frameworks such as PyTorch or TensorFlow.

Preferred Qualifications

  • Intent to return to degree program after the completion of the internship/co-op
  • Experience in either audio dataset curation or audio generation model evaluation
  • Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at leading workshops or conferences such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL or similar
  • Experience working and communicating cross-functionally in a team environment.
  • Publications or experience in audio (speech, sound, or music) or vision (image or video) generative models.
  • Experience solving analytical problems using quantitative approaches.
  • Experience setting up ML experiments and analyzing their results.
  • Experience manipulating and analyzing complex, large-scale, high-dimensionality data from varying sources.
  • Experience in utilizing theoretical and empirical research to solve problems.

Responsibilities

  • Full-life-cycle research on multimodal generative foundation models with a focus on the audio modality, including proposing ideas; designing and implementing models and algorithms; curating training data; training, tuning, and scaling the models; evaluating performance; and open-sourcing and publishing the results
  • Develop novel state-of-the-art machine learning algorithms and corresponding systems, leveraging various deep learning techniques
  • Analyze and improve efficiency, scalability, and stability of corresponding deployed algorithms
  • Perform research to advance the science and technology of intelligent machines
  • Collaborate with researchers and cross-functional partners including communicating research plans, progress, and results
  • Publish research results and contribute to research that can be applied to Meta product development

FAQs

What are the main areas of focus for the Research Scientist Intern position at GenAI - Multimodal Audio?

Main areas of focus for the Research Scientist Intern position at GenAI - Multimodal Audio include deep learning, computer vision, audio and speech processing, natural language processing, machine learning, reinforcement learning, computational statistics, and applied mathematics.

What kind of projects will interns be working on at GenAI - Multimodal Audio?

Interns at GenAI - Multimodal Audio will have the opportunity to make core algorithmic advances and apply their ideas at an unprecedented scale. Projects may involve advancing technologies to help interact with and understand the world, with a focus on multimodal audio (speech, sound, and music) processing.

What qualifications are required for the Research Scientist Intern position at GenAI - Multimodal Audio?

Qualifications for the Research Scientist Intern position at GenAI - Multimodal Audio include a passion for artificial intelligence, as well as expertise in areas such as deep learning, computer vision, audio and speech processing, natural language processing, machine learning, reinforcement learning, computational statistics, and applied mathematics.

What is the scope of impact for projects that interns will be working on at GenAI - Multimodal Audio?

Projects that interns work on at GenAI - Multimodal Audio have the potential for significant impact in advancing the field of artificial intelligence and technologies to interact with and understand the world. Interns will have the opportunity to make core algorithmic advances and apply their ideas at a large scale.

Industry: Technology
Employees: 10,001+
Founded Year: 2004

Mission & Purpose

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology.