As the shape and demands of large-scale computing environments have evolved, so have the needs of those responsible for keeping them in top shape. HPC administrators are challenged with knowing what the data on their system looks like, who is doing what to that data, and how to track jobs on the system. In this talk we'll cover how the VAST Data Platform's powerful structured data component and analytics make these tasks easy.
AI-assisted protein interaction modeling, pioneered by AlphaFold and RoseTTAFold, has become more diverse, both in the programs that perform it and in how users run them. In this talk, we will cover the programs supported at the University of Utah, namely AlphaFold2, AlphaFold3, ColabFold, Boltz-1, RFdiffusion, and other tools from the Baker lab, the choices we have made in deploying them, and our experiences using them. With respect to the ways to run them, we will go over the standard SLURM scripts that run AlphaFold in two stages (a CPU-only MSA search followed by GPU-accelerated inference), using the ColabFold server for a faster MSA search, and using Google Colab running on compute nodes for interactive modeling in a notebook interface. Attendees should leave this talk with ideas on how to set up and support these tools and with contacts among UofU staff for further questions.
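As an illustration of the two-stage pattern mentioned above, here is a minimal sketch that chains a CPU-only MSA search job to a GPU inference job with a SLURM dependency; the batch script names, partitions, and resource requests are placeholders, not the University of Utah's actual scripts.

    #!/usr/bin/env python3
    """Sketch: submit AlphaFold-style jobs in two stages, where GPU inference
    starts only after the CPU-only MSA search completes successfully.
    Script names, partitions, and resources are hypothetical placeholders."""
    import subprocess

    def submit(args):
        # "sbatch --parsable" prints just the job ID of the submitted job
        out = subprocess.run(["sbatch", "--parsable"] + args,
                             check=True, capture_output=True, text=True)
        return out.stdout.strip().split(";")[0]

    # Stage 1: CPU-only MSA search
    msa_job = submit(["--partition=cpu", "--cpus-per-task=16",
                      "--time=08:00:00", "msa_search.sh"])

    # Stage 2: GPU-accelerated inference, held until stage 1 succeeds
    submit(["--dependency=afterok:" + msa_job, "--partition=gpu",
            "--gres=gpu:1", "--time=04:00:00", "inference.sh"])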
Managing data at scale in high-performance computing (HPC) environments requires efficient storage and retrieval strategies. Automated tiered storage solutions enable seamless migration of aged data to lower-cost archival tiers while maintaining accessibility. Enriched metadata—spanning tagging, search, discovery, and data provenance—enhances data usability and long-term value. This approach not only optimizes storage costs but also empowers researchers with better data discovery and reuse. Real-world HPC use cases demonstrate how metadata-driven workflows streamline research, ensuring that critical datasets remain accessible and actionable over time.
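To make the idea concrete, a small sketch of one such metadata-driven tiering pass follows, assuming hypothetical hot and archive tier paths and a simple JSON catalog; a production deployment would rely on the storage platform's own policy engine and metadata store rather than a script like this.

    #!/usr/bin/env python3
    """Sketch of a metadata-driven tiering pass: files not accessed within a
    cutoff period move to an archival tier, and a small catalog entry (owner,
    size, original path, tags) is recorded so the data remains discoverable.
    Paths, the cutoff, and the tagging scheme are illustrative only."""
    import json
    import shutil
    import time
    from pathlib import Path

    HOT_TIER = Path("/scratch/project")      # placeholder active tier
    ARCHIVE_TIER = Path("/archive/project")  # placeholder archival tier
    CUTOFF_SECONDS = 180 * 86400             # roughly six months

    catalog = []
    now = time.time()
    for path in list(HOT_TIER.rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        if now - stat.st_atime < CUTOFF_SECONDS:
            continue  # still "hot" -- leave it on the fast tier
        dest = ARCHIVE_TIER / path.relative_to(HOT_TIER)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(dest))
        # Enriched metadata: enough to search, rediscover, and trace provenance
        catalog.append({
            "original_path": str(path),
            "archived_path": str(dest),
            "owner_uid": stat.st_uid,
            "size_bytes": stat.st_size,
            "last_access": stat.st_atime,
            "tags": ["aged", "auto-tiered"],
        })

    (ARCHIVE_TIER / "catalog.json").write_text(json.dumps(catalog, indent=2))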
High-performance computing (HPC) research thrives on collaboration, yet institutional data silos often hinder progress. These silos extend beyond the storage hardware itself to the departments and institutions that hold the data. Accelerating scientific progress and innovation requires secure data discovery and sharing not only across institutions but across scientific disciplines as well. Federated access models, controlled permissions, and distributed compute environments enable seamless yet secure collaboration. By facilitating data discovery across disciplines and optimizing shared infrastructure, organizations can break down barriers, improve research efficiency, and drive the cross-disciplinary insights that push the boundaries of scientific advancement.
Modern scientific computing demands flexible and scalable solutions that bring computing power closer to data while maintaining security and ease of use. We propose to present our solution, which leverages Kubernetes to provide a platform for our employees, university members, and partner organizations that meets those demands and complements our existing HPC system. This presentation will cover how we use Continuous Integration and Continuous Delivery (CI/CD), coupled with GitOps and DevOps practices, to provide a robust and secure platform for hosting container-based workloads. These workloads include interactive web visualizations, JupyterHub instances, science gateways, data assimilation and analysis tools, and workloads that require access to GPUs, including but not limited to AI/ML. The presentation will also cover how Cilium network policies and Kyverno access policies are implemented to secure the platform, and how GitHub Actions is used to test and build codebases into containers that run on the platform. Attendees will learn why we chose Kubernetes as our platform, as well as practical strategies for implementing similar solutions and common pitfalls to avoid.
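For a flavor of what such policies look like in practice, the sketch below applies an illustrative CiliumNetworkPolicy through the official kubernetes Python client; the namespace, labels, and rule are assumptions made for the example and do not reproduce the presenters' actual Cilium or Kyverno configuration.

    #!/usr/bin/env python3
    """Sketch: create a CiliumNetworkPolicy (a Cilium custom resource) that
    limits ingress to single-user notebook pods. Requires the "kubernetes"
    package, cluster credentials, and a cluster with the Cilium CRDs installed.
    All names and labels here are illustrative assumptions."""
    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() inside a pod

    policy = {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": "restrict-singleuser-ingress",
                     "namespace": "jupyterhub"},
        "spec": {
            # Applies to pods labeled as single-user notebook servers
            "endpointSelector": {"matchLabels": {"component": "singleuser-server"}},
            # Only endpoints carrying the access label may connect
            "ingress": [{"fromEndpoints": [{"matchLabels": {
                "hub.jupyter.org/network-access-singleuser": "true"}}]}],
        },
    }

    client.CustomObjectsApi().create_namespaced_custom_object(
        group="cilium.io", version="v2", namespace="jupyterhub",
        plural="ciliumnetworkpolicies", body=policy)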