Publication of research datasets is now a requirement of most funding agencies and journals. Data curation is the process of ensuring that these datasets are findable, accessible, and usable. In the era of Big Data, the generation of datasets hundreds of gigabytes in size and larger is increasingly common. Such large datasets create challenges for both the curation and publishing of data, as they often cannot be accessed on standard computer hardware or hosted in traditional online repositories. This presentation provides an overview of a collaborative process between the CU Boulder Libraries and CU Boulder Research Computing in which high-performance computing infrastructure is used to curate and publish gigabyte- and terabyte-scale datasets in a manner that makes them accessible to the research community.
The University of North Dakota (UND) Genomics Core has launched GenomEX 2.0, the first comprehensive and user-friendly bioinformatics platform powered by Oracle Cloud Infrastructure. This innovative platform enables biologists to seamlessly install over 13,000 bioinformatics tools, generate and execute custom code or command lines, and receive real-time guidance from an AI-based bioinformatics assistant, all through intuitive, one-click processes. To support the computing requirements of any bioinformatics tool, the platform runs on Oracle Cloud Infrastructure, which provides fully secured (built-in security features and compliance certifications), personalized (adjustable CPU/GPU counts and memory/storage capacity), dedicated (resources available 24/7 without any queue), and customizable (users have administrator rights) cloud-based high-performance computing environments at unbeatable pricing. Through the combined expertise of the UND Genomics Core and Oracle, GenomEX 2.0 emerges as a powerful and unique bioinformatics platform, providing every biologist with the freedom to explore biological data independently, regardless of their coding proficiency.
This talk shares the story of an NSF-funded experiential learning opportunity for undergraduate and graduate students at RMACC institutions. Students developed practical skills in HPC system administration by learning from and shadowing CU Boulder Research Computing staff. A total of 17 students participated across two in-person experiences and took part in various aspects of system design, deployment, and teardown.
Perhaps management is in your future. Or you have recently been cast in that role and would appreciate some suggestions. Attend this session to collect some useful references and discuss the importance of our role and how to maximize the impact of our teams.
In the presentation, we will share information on topics such as metrics, our user survey, and other approaches. As part of the talk, we hope to encourage a discussion and exchange of ideas about what other sites do to measure the effectiveness of their HPC environments.
Field-Programmable Gate Arrays (FPGAs) are gaining traction in research for their ability to deliver high-performance, energy-efficient computing across a range of domains, from machine learning and data analytics to signal processing and scientific simulations. However, integrating FPGA workflows into a shared university compute cluster presents unique challenges in terms of hardware management, toolchain support, and user access. This session will explore the practical aspects of supporting FPGA applications in a multi-user academic environment. We will cover available FPGA platforms, commonly used development workflows (such as Xilinx Vivado and Vitis, Intel Quartus and OpenCL, and HLS), and the architectural and administrative considerations for cluster integration. Real-world use cases will illustrate how researchers and academics are leveraging FPGAs, and we will share lessons learned in enabling productive FPGA development. Attendees will gain insight into both the technical setup and the support models that foster a thriving FPGA user community on campus.
Modern single-cell and single-nucleus RNA sequencing allows us to profile every cell within a brain tumor, uncovering the diverse lineages and cell signals that drive growth and therapy resistance. However, each experiment can produce terabytes of raw reads and millions of barcodes, demanding significant CPU, GPU, and memory resources, far beyond the limits of a laptop. This talk will show how high-performance computing (HPC) systems transform that data deluge into biological insight.
I will walk through an end-to-end analysis pipeline that pairs the Cell Ranger aligner with an nf-core workflow for efficient, reproducible processing on CPU and GPU nodes. Interactive exploration then transitions to Seurat, where large-memory nodes accelerate dimensionality reduction, clustering, differential expression analysis, and integration of hundreds of thousands of cells. HPC infrastructure also enables RNA velocity calculations and trajectory analysis that would be impractical on local workstations.
Benchmarks will illustrate how parallel job arrays and optimized space management can cut runtimes from days to hours while lowering costs. My goal is to provide researchers and students with a clear roadmap for harnessing supercomputers to advance neuro-oncology and other data-intensive areas of life science.
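To make the job-array pattern concrete, the sketch below shows how per-sample Cell Ranger runs can be dispatched as independent Slurm array tasks. This is an illustrative fragment only: the sample list, reference path, array size, and resource requests are all hypothetical placeholders, not values from the talk.

```
#!/bin/bash
#SBATCH --job-name=cellranger-array
#SBATCH --array=0-95            # hypothetical: one task per sample for 96 samples
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=12:00:00

# Look up this task's sample from a plain-text list (one sample name per line).
# samples.txt and the paths below are placeholders for illustration.
SAMPLE=$(sed -n "$((SLURM_ARRAY_TASK_ID + 1))p" samples.txt)

cellranger count \
  --id="$SAMPLE" \
  --transcriptome=/path/to/reference \
  --fastqs=/path/to/fastq/"$SAMPLE" \
  --localcores="$SLURM_CPUS_PER_TASK" \
  --localmem=64
```

Because each array task runs independently, the scheduler can spread samples across many nodes at once, which is how runtimes drop from days to hours compared with processing samples serially on a single workstation.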
AI4WY is an NSF MRI-supported project seeking to build regional partnerships that transform the research landscape by acquiring a state-of-the-art high-performance computing (HPC) system. Led by the University of Wyoming in collaboration with Colorado State University and the Rocky Mountain Advanced Computing Consortium (RMACC), the AI4WY cluster will feature NVIDIA Grace Hopper Superchips to empower AI-driven research and big data modeling across key domains such as environment, agriculture, society, and energy. In this presentation, we will provide an update on the system acquisition, on expanding HPC access to RMACC through the NSF ACCESS program, and on plans for fostering regional collaboration.