Pan-Canadian Artificial Intelligence Compute Environment - Senior Systems Administrator

Applications are closed

  • Job type: Full-time
  • Level: Mid Level
  • Category: Engineering
  • Salary: $75.9K - $107.4K
  • Location: Edmonton

Requirements

  • Bachelor’s degree, preferably in a science, engineering, or biomedical area.
  • Three years’ experience in maintaining and using high-performance computing (HPC) systems and research software commonly installed on such systems is required.
  • Knowledge of how to use, install, and maintain HPC schedulers such as Slurm.
  • Knowledge of programming languages and scripting languages.
  • Experience with Linux/Unix systems.

Preferred Qualifications

  • Experience with AI/ML is valuable.
  • Computer and network security knowledge is expected, although experience may be gained on the job with guidance from security experts in IST and the Alliance.

Responsibilities

  • Maintain operations of high-performance digital research equipment to ensure security, effective capability for research, and efficient operation.
  • Automate standard procedures to minimize human configuration errors.
  • Document the cluster and storage configurations, as well as overall security; this documentation is a critical part of the role.
  • Monitor and report on system use and security through logs and configuration controls. Troubleshoot aberrations and conduct root cause analyses.
  • Train researchers on how to use this environment and on best practices to ensure they use the system effectively and efficiently.
  • Collaborate with other trainers in the development and delivery of this training.
  • Maintain necessary vendor relationships to ensure appropriate vendor support of warrantied computational and storage equipment.

FAQs

What is the primary responsibility of the Senior Systems Administrator for the Pan-Canadian AI Compute Environment - West (PAICE-W)?

The primary responsibility is to lead the technical operation of PAICE-W, a digital research platform designed to facilitate AI and machine learning (ML) research, and to ensure that the system is secure, robust, and efficient.

Who will the Senior Systems Administrator collaborate with in this role?

The Senior Systems Administrator will work with colleagues from University of Alberta Research Computing, Information Services & Technology (IST), the Alberta Machine Intelligence Institute (Amii), the Digital Research Alliance of Canada (the Alliance), and other research computing sites.

What kind of equipment and resources will the Senior Systems Administrator be responsible for maintaining?

The Senior Systems Administrator will maintain high-performance digital research equipment that includes GPU-rich cluster computing and high-speed storage systems necessary for AI/ML research.

What specific tasks are expected to be performed in terms of security and documentation?

The Senior Systems Administrator is expected to implement security standards, document configurations of the cluster and storage, monitor and report on system use and security, and conduct troubleshooting and root cause analyses.

What are the educational qualifications required for the Senior Systems Administrator position?

A Bachelor’s degree, preferably in a science, engineering, or biomedical area, is required for this position.

How much experience in high-performance computing systems is necessary for this role?

A minimum of three years’ experience in maintaining and using high-performance computing (HPC) systems and research software is required.

Is knowledge of specific software or programming languages necessary for this job?

Yes, knowledge of installing and maintaining HPC schedulers such as Slurm, as well as familiarity with programming and scripting languages, is necessary.

Are there any preferred qualifications for this role?

Yes, experience with AI/ML is valuable, and knowledge of computer and network security is expected, although some experience may be gained on the job with guidance from security experts.

Will there be a training component involved in this role?

Yes, the Senior Systems Administrator will train researchers on how to effectively and efficiently use the PAICE-W environment and collaborate with other trainers in the development and delivery of this training.

What is the importance of vendor relationships in this position?

Maintaining necessary vendor relationships is crucial to ensure appropriate vendor support for warrantied computational and storage equipment used in the PAICE-W environment.

We are #UAlberta! Always seeking, always challenging and, most of all, always leading with purpose.

Industry: Education
Employees: 10,001+
Founded: 1908

Mission & Purpose

The University of Alberta is one of Canada’s top teaching and research universities, with an international reputation for excellence across the humanities, sciences, creative arts, business, engineering, and health sciences. Home to more than 39,000 students and 15,000 faculty and staff, the university has an annual budget of $1.7 billion and attracts nearly $450 million in sponsored research revenue. The U of A offers close to 400 rigorous undergraduate, graduate, and professional programs in 18 faculties on five campuses. The university has more than 250,000 alumni worldwide. The university and its people remain dedicated to the promise made in 1908 by founding president Henry Marshall Tory that knowledge shall be used for “uplifting the whole people.”
