

Research Intern - Systems Reliability

Microsoft


2mo ago

🚀 Off-cycle Internship


AI generated summary

  • You must be enrolled in a PhD program in Computer Science or a related STEM field. Familiarity with cloud system failures, debugging, testing, distributed systems, and machine learning techniques is preferred. Be prepared to be physically present at a Microsoft worksite.
  • You will research cloud system failures, develop bug detection techniques, diagnose failures, and troubleshoot production incidents for Microsoft.

Off-cycle Internship



  • Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.
  • The Systems Reliability Research group in Microsoft Research is looking to hire Research Interns. The group aims to develop practical tools and techniques that help cloud developers adequately debug, test, monitor, and troubleshoot their systems. The research combines Distributed Systems, Programming Languages (PL), Software Engineering, and Machine Learning techniques and spans all aspects of improving the reliability and availability of large-scale cloud systems.


Required Qualifications:

  • Currently enrolled in a PhD program in Computer Science or a related STEM field.

Other Requirements:

  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
  • In addition to the qualifications above, you’ll need to submit a minimum of two reference letters for this position. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and furthermore, that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance, so they will be ready to submit your letter.

Preferred Qualifications:

  • Familiarity with common failures of cloud systems and techniques for debugging, testing, configuring, and monitoring large-scale cloud systems.
  • Familiarity with existing distributed systems, software engineering, and machine learning techniques to improve software reliability.

Education requirements

Currently Studying

Areas of Responsibility



  • Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.
  • Analyzing production data to understand how real cloud systems fail and what can be done to prevent those failures.
  • Developing practical static and dynamic analysis techniques to uncover hard-to-find bugs before production. The techniques can be evaluated on and used with thousands of Microsoft software projects.
  • Developing practical and novel techniques for failure diagnosis, runtime monitoring, logging, and failure prevention.
  • Developing solutions that help troubleshoot production incidents quickly.


Work type

Full time

Work mode