FAQs
What is the primary responsibility of the Senior Systems Administrator for the Pan-Canadian AI Compute Environment - West (PAICE-W)?
The primary responsibility is to lead the technical operation of PAICE-W, a digital research platform designed to facilitate AI and Machine Learning (ML) research, ensuring the system is secure, robust, and efficient.
Who will the Senior Systems Administrator collaborate with in this role?
The Senior Systems Administrator will work with colleagues from the University of Alberta Research Computing, Information Services & Technology (IST), the Alberta Machine Intelligence Institute (Amii), the Digital Research Alliance of Canada (Alliance), and other research computing sites.
What kind of equipment and resources will the Senior Systems Administrator be responsible for maintaining?
The Senior Systems Administrator will maintain high-performance digital research equipment, including GPU-rich computing clusters and the high-speed storage systems necessary for AI/ML research.
What specific security and documentation tasks is the Senior Systems Administrator expected to perform?
The Senior Systems Administrator is expected to implement security standards, document configurations of the cluster and storage, monitor and report on system use and security, and conduct troubleshooting and root cause analyses.
What are the educational qualifications required for the Senior Systems Administrator position?
A Bachelor’s degree, preferably in a science, engineering, or biomedical area, is required for this position.
How much experience in high-performance computing systems is necessary for this role?
A minimum of three years’ experience in maintaining and using high-performance computing (HPC) systems and research software is required.
Is knowledge of specific software or programming languages necessary for this job?
Yes, knowledge of installing and maintaining HPC schedulers such as Slurm, as well as familiarity with programming and scripting languages, is necessary.
Are there any preferred qualifications for this role?
Yes, experience with AI/ML is valuable, and knowledge of computer and network security is expected, although some experience may be gained on the job with guidance from security experts.
Will there be a training component involved in this role?
Yes, the Senior Systems Administrator will train researchers on how to effectively and efficiently use the PAICE-W environment and collaborate with other trainers in the development and delivery of this training.
What is the importance of vendor relationships in this position?
Maintaining necessary vendor relationships is crucial to ensure appropriate vendor support for warrantied computational and storage equipment used in the PAICE-W environment.