Chan Zuckerberg Biohub - San Francisco is hiring a Remote AI ML HPC Principal Engineer
The Opportunity

The Chan Zuckerberg Biohub Network has an immediate opening for an AI/ML High Performance Computing (HPC) Principal Engineer. The CZ Biohub Network is composed of several new institutes that the Chan Zuckerberg Initiative created to do great science that cannot be done in conventional environments. The CZ Biohub Network brings together researchers from across disciplines to pursue audacious, important scientific challenges. The Network consists of four institutes throughout the country: San Francisco, Silicon Valley, Chicago and New York City. Each institute closely collaborates with the major universities in its local area. Along with the world-class engineering team at the Chan Zuckerberg Initiative, the CZ Biohub supports several hundred of the brightest, boldest engineers, data scientists, and biomedical researchers in the country, with the mission of understanding the mysteries of the cell and how cells interact within systems.

The Biohub is expanding its global scientific leadership, particularly in the area of AI/ML, with the acquisition of the largest GPU cluster dedicated to AI for biology. The AI/ML HPC Principal Engineer will be tasked with helping to realize the full potential of this capability in addition to providing advanced computing capabilities and consulting support to science and technical programs.
This position will work closely with many different science teams simultaneously to translate experimental descriptions into software and hardware requirements across all phases of the scientific lifecycle, including data ingest, analysis, management and storage, computation, authentication, tool development and many other computing needs expressed by scientific projects.

This position reports to the Director for Scientific Computing and will be hired at a level commensurate with the skills, knowledge, and abilities of the successful candidate.

What You'll Do

* Work with a wide community of scientific disciplinary experts to identify emerging and essential information technology needs and translate those needs into information technology requirements
* Build an on-prem HPC infrastructure supplemented with cloud computing to support the expanding IT needs of the Biohub
* Support the efficiency and effectiveness of capabilities for data ingest, data analysis, data management, data storage, computation, identity management, and many other IT needs expressed by scientific projects
* Plan, organize, track and execute projects
* Foster cross-domain community and knowledge-sharing between science teams with similar IT challenges
* Research, evaluate and implement new technologies on a wide range of scientific compute, storage, networking, and data analytics capabilities
* Promote and assist researchers with the use of Cloud Compute Services (AWS, GCP primarily), containerization tools, etc.
to scientific clients and research groups
* Work on problems of diverse scope where analysis of data requires evaluation of identifiable factors
* Assist in cost & schedule estimation for the IT needs of scientists, as part of supporting architecture development and scientific program execution
* Support Machine Learning capability growth at the CZ Biohub
* Provide scientist support in deployment and maintenance of developed tools
* Plan and execute all above responsibilities independently with minimal intervention

What You'll Bring

Essential -

* Bachelor's Degree in Biology or Life Sciences is preferred. Degrees in Computer Science, Mathematics, Systems Engineering or a related field or equivalent training/experience also acceptable.
* A minimum of 8 years of experience designing and building web-based working projects using modern languages, tools, and frameworks
* Experience building on-prem HPC infrastructure and capacity planning
* Experience and expertise working on complex issues where analysis of situations or data requires an in-depth evaluation of variable factors
* Experience supporting scientific facilities, and prior knowledge of scientific user needs, program management, data management planning or lab-bench IT needs
* Experience with HPC and cloud computing environments
* Ability to interact with a variety of technical and scientific personnel with varied academic backgrounds
* Strong written and verbal communication skills to present and disseminate scientific software developments at group meetings
* Demonstrated ability to reason clearly about load, latency, bandwidth, performance, reliability, and cost and make sound engineering decisions balancing them
* Demonstrated ability to quickly and creatively implement novel solutions and ideas

Technical experience includes -

* Proven ability to analyze, troubleshoot, and resolve complex problems that arise in the HPC production
compute, interconnect, storage hardware, software systems, storage subsystems
* Configuring and administering parallel, network attached storage (Lustre, GPFS on ESS, NFS, Ceph) and storage subsystems (e.g. IBM, NetApp, DataDirect Network, LSI, VAST, etc.)
* Installing, configuring, and maintaining job management tools (such as SLURM, Moab, TORQUE, PBS, etc.) and implementing fairshare, node sharing, backfill, etc. for compute and GPUs
* Red Hat Enterprise Linux, CentOS, or derivatives and Linux services and technologies like dnsmasq, systemd, LDAP, PAM, sssd, OpenSSH, cgroups
* Scripting languages (including Bash, Python, or Perl)
* OpenACC, nvhpc, understanding of CUDA driver compatibility issues
* Virtualization (ESXi or KVM/libvirt), containerization (Docker or Singularity), configuration management and automation (tools like xCAT, Puppet, kickstart) and orchestration (Kubernetes, docker-compose, CloudFormation, Terraform)
* High performance networking technologies (Ethernet and InfiniBand) and hardware (Mellanox and Juniper)
* Configuring, installing, tuning and maintaining scientific application software (Modules, Spack)
* Familiarity with source control tools (Git or SVN)
* Experience with supporting use of popular ML frameworks such as PyTorch, TensorFlow
* Familiarity with cybersecurity tools, methodologies, and best practices for protecting systems used for science
* Experience with movement, storage, backup and archive of large scale data

Nice to have -

* An advanced degree is strongly desired

The Chan Zuckerberg Biohub requires all employees, contractors, and interns, regardless of work location or type of role, to provide proof of full COVID-19 vaccination, including a booster vaccine dose, if eligible, by their start date.
Those who are unable to get vaccinated or obtain a booster dose because of a disability, or who choose not to be vaccinated due to a sincerely held religious belief, practice, or observance must have an approved exception prior to their start date.

Compensation

* $212,000 - $291,500

New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. To determine starting pay, we consider multiple job-related factors including a candidate's skills, education and experience, market demand, business needs, and internal parity. We may also adjust this range in the future based on market data. Your recruiter can share more about the specific pay range during the hiring process.

#Salary and compensation
No salary data was published by the company, so we estimated the salary based on similar jobs related to Consulting, Education, Cloud, Node, Engineer and Linux:

$57,500 - $85,000/year

#Benefits
* 401(k)
* Distributed team
* Async
* Vision insurance
* Dental insurance
* Medical insurance
* Unlimited vacation
* Paid time off
* 4 day workweek
* 401k matching
* Company retreats
* Coworking budget
* Learning budget
* Free gym membership
* Mental wellness budget
* Home office budget
* Pay in crypto
* Pseudonymous
* Profit sharing
* Equity compensation
* No whiteboard interview
* No monitoring system
* No politics at work
* We hire old (and young)

#Location
San Francisco, California, United States
Chan Zuckerberg Biohub - San Francisco is hiring a Remote HPC Principal Engineer
The Opportunity

The Chan Zuckerberg Biohub has an immediate opening for a High Performance Computing (HPC) Principal Engineer. The CZ Biohub is a one-of-a-kind independent non-profit research institute that brings together three leading universities - Stanford, UC Berkeley, and UC San Francisco - into a single collaborative technology and discovery engine. Along with the world-class engineering team at the Chan Zuckerberg Initiative, the CZ Biohub supports over 100 of the brightest, boldest engineers, data scientists, and biomedical researchers in the Bay Area, with the mission of understanding the underlying mechanisms of disease through the development of tools and technologies and the application to therapeutics and diagnostics.

This position will be tasked with strengthening and expanding the scientific computational capacity to further the Biohub's expanding global scientific leadership. The HPC Principal Engineer will also provide IT capabilities and consulting support to science and technical programs.
This position will work closely with many different science teams simultaneously to translate experimental descriptions into software and hardware requirements across all phases of the scientific lifecycle, including data ingest, analysis, management and storage, computation, authentication, tool development and many other IT needs expressed by scientific projects.

This position reports to the Director for Scientific Computing and will be hired at a level commensurate with the skills, knowledge, and abilities of the successful candidate.

What You'll Do

* Work with a wide community of scientific disciplinary experts to identify emerging and essential information technology needs and translate those needs into information technology requirements
* Build an on-prem HPC infrastructure supplemented with cloud computing to support the expanding IT needs of the Biohub
* Support the efficiency and effectiveness of capabilities for data ingest, data analysis, data management, data storage, computation, identity management, and many other IT needs expressed by scientific projects
* Plan, organize, track and execute projects
* Foster cross-domain community and knowledge-sharing between science teams with similar IT challenges
* Research, evaluate and implement new technologies on a wide range of scientific compute, storage, networking, and data analytics capabilities
* Promote and assist researchers with the use of Cloud Compute Services (AWS, GCP primarily), containerization tools, etc.
to scientific clients and research groups
* Work on problems of diverse scope where analysis of data requires evaluation of identifiable factors
* Assist in cost & schedule estimation for the IT needs of scientists, as part of supporting architecture development and scientific program execution
* Support Machine Learning capability growth at the CZ Biohub
* Provide scientist support in deployment and maintenance of developed tools
* Plan and execute all above responsibilities independently with minimal intervention

What You'll Bring

Essential -

* Bachelor's Degree in Biology or Life Sciences is preferred. Degrees in Computer Science, Mathematics, Systems Engineering or a related field or equivalent training/experience also acceptable. An advanced degree is strongly desired.
* A minimum of 8 years of experience designing and building web-based working projects using modern languages, tools, and frameworks
* Experience building on-prem HPC infrastructure and capacity planning
* Experience and expertise working on complex issues where analysis of situations or data requires an in-depth evaluation of variable factors
* Experience supporting scientific facilities, and prior knowledge of scientific user needs, program management, data management planning or lab-bench IT needs
* Experience with HPC and cloud computing environments
* Ability to interact with a variety of technical and scientific personnel with varied academic backgrounds
* Strong written and verbal communication skills to present and disseminate scientific software developments at group meetings
* Demonstrated ability to reason clearly about load, latency, bandwidth, performance, reliability, and cost and make sound engineering decisions balancing them
* Demonstrated ability to quickly and creatively implement novel solutions and ideas

Technical experience includes -

* Proven ability to analyze, troubleshoot, and resolve complex
problems that arise in the HPC production storage hardware, software systems, storage networks and systems
* Configuring and administering parallel, network attached storage (Lustre, NFS, ESS, Ceph) and storage subsystems (e.g. IBM, NetApp, DataDirect Network, LSI, etc.)
* Installing, configuring, and maintaining job management tools (such as SLURM, Moab, TORQUE, PBS, etc.)
* Red Hat Enterprise Linux, CentOS, or derivatives and Linux services and technologies like dnsmasq, systemd, LDAP, PAM, sssd, OpenSSH, cgroups
* Scripting languages (including Bash, Python, or Perl)
* Virtualization (ESXi or KVM/libvirt), containerization (Docker or Singularity), configuration management and automation (tools like xCAT, Puppet, kickstart) and orchestration (Kubernetes, docker-compose, CloudFormation, Terraform)
* High performance networking technologies (Ethernet and InfiniBand) and hardware (Mellanox and Juniper)
* Configuring, installing, tuning and maintaining scientific application software
* Familiarity with source control tools (Git or SVN)

The Chan Zuckerberg Biohub requires all employees, contractors, and interns, regardless of work location or type of role, to provide proof of full COVID-19 vaccination, including a booster vaccine dose, if eligible, by their start date. Those who are unable to get vaccinated or obtain a booster dose because of a disability, or who choose not to be vaccinated due to a sincerely held religious belief, practice, or observance must have an approved exception prior to their start date.

Compensation

* Principal Engineer = $212,000 - $291,500

New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. To determine starting pay, we consider multiple job-related factors including a candidate's skills, education and experience, market demand, business needs, and internal parity. We may also adjust this range in the future based on market data.
Your recruiter can share more about the specific pay range during the hiring process.

#Salary and compensation
No salary data was published by the company, so we estimated the salary based on similar jobs related to Consulting, Education, Cloud, Engineer and Linux:

$50,000 - $85,000/year

#Location
San Francisco, California, United States
This job post is closed and the position is probably filled. Please do not apply.
ABOUT US:

Braintrust is a user-owned talent network that connects you with great jobs with no fees or membership costs, so you keep 100% of what you earn.

ABOUT THE HIRING PROCESS:

When you join Braintrust, you will be invited to a screening process for Braintrust to learn more about your previous work experiences. Once completed, you will have access to the employer for this role and other top companies that seek high-quality talent. Apply to this job to kick off the process.

* JOB TYPE: Direct Hire Position (no agencies/C2C - see notes below)
* LOCATION: Remote - United States only
* SALARY: $150,000 - $160,000/yr
* ESTIMATED DURATION: 40hr/week - Long term

THE OPPORTUNITY

Requirements

Qualification:

To perform this job successfully, an individual must be able to perform each essential job duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required.

Required Skills, Education and Certifications:
* 5+ years experience in enterprise-level DevOps
* GCP (Google Cloud Platform) experience
* Kubernetes experience (includes Docker, containers, GKE, EKS, Docker compose)
* HashiCorp product experience (e.g., Terraform, Packer, Vault)
* Knowledge in Continuous Integration & Delivery (+ testing)
* Deep knowledge of the Linux operating system
* Knowledge in git-scm

Bonus Skills:
* Security Practitioner / SecOps experience
* Knowledge of NodeJS
* Knowledge of DataDog
* AWS experience
* Ansible experience
* Knowledge of SAFe 3 Agile
* Knowledge in semver

Essential Duties and Responsibilities:
* Write clean, scalable infrastructure-as-code
* Conduct code reviews
* Independently design cloud infrastructure solutions
* Monitor and debug production cloud infrastructure
* Collaborate with engineering team to identify opportunities to improve
engineering workflow
* Learn new software languages, tools, and frameworks needed to implement technical solutions
* Mentor team members
* Additional duties assigned by your supervisor

What you'll be working on

Job Summary / Purpose:
We are looking for a strong DevOps engineer to join our infrastructure team. We are open to talent being located anywhere worldwide so long as you have 6+ hours overlap with working hours in the PST timezone (9AM - 5PM PST). This is a highly collaborative role. You will be engaging in regular conversation with our DevOps Lead and the rest of the DevOps team to design and implement solutions to novel problems. This role will also require close collaboration with our engineering leads to design and implement infrastructure in tandem with software development.

The first deliverables will be:
* Automate 100% of our existing and future Cloud Infrastructure
* Build & Improve our current CI/CD workflow
* Assess and secure our Cloud Infrastructure
* Create and Improve our integration with 3rd party platforms

Who you are:
* Entrepreneurial by nature
* Loves DevOps
* Team player
* Service mindset
* Enjoys collaborating on technical solutions

The information contained here is not intended to be an all-inclusive list of the duties and responsibilities of the job, nor is it intended to be an all-inclusive list of the skills and abilities required to do the job. Reasonable accommodations may be made to assist qualified disabled persons to perform the essential functions of the job. Management may, at its discretion, assign or reassign duties and responsibilities to this job at any time.
The job description does not constitute an employment agreement between the employer and employee and is subject to change by the employer as the needs of the employer and requirements of the job change.

Apply Now!

Braintrust Job ID: 6501

C2C Candidates: This role is not available to C2C candidates working with an agency. If you are a professional contractor who has created an LLC/corp around their consulting practice, this is well aligned with Braintrust and we'd welcome your application.

Braintrust values the multitude of talents and perspectives that a diverse workforce brings. All qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status.

#Salary and compensation
No salary data was published by the company, so we estimated the salary based on similar jobs related to Design, Docker, DevOps, Cloud, Senior, Engineer, Linux and Digital Nomad:

$60,000 - $115,000/year

#Location
Canada
# How do you apply?

This job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.