Data Science Research Institute HPC Activity Group
The Data Science Research Institute supports two activity groups surrounding the subjects of high-performance computing and artificial intelligence.
The HPC Activity Group currently meets twice a month to discuss HPC needs, expansion, and resources.
If you wish to join one of the activity groups, please reach out!
For any questions regarding the activity group scope, time dedication, and current events, please reach out!
- AI Activity Group: V. Chandra
- HPC Activity Group: Michael Kirby
- Events: Brandon Shanks
Previous Meeting Notes
HPC Activity Group 7/25/24
HPC Lunch Discussion Notes
Michael Kirby opening remarks
- Increase engagement with DSRI and HPC: HPC for all
- CC* grant brief intro
Want to identify gaps
Documentation: how to use HPC (not just Riviera)
Set up GitHub page with documentation ▪ Starting point for documentation efforts
FAQ
Jesse Wilson (laser and nonlinear optics)
- Documentation or resources for students to get them started on basics
- undergrads would be really productive with this kind of materials
- How to set up Jupyter or PyTorch
- How to reserve a node
- Getting MATLAB installed on Riviera
- Potentially using Open OnDemand
Chandra
- Wants to see people not having to wait for years for data transfer
- Get faculty representatives to get
- Use this group of people who have expertise in HPC to help others ▪ Use the people in the room as a “club” that can help build this
- Build visibility at admin level
Ashok Prasad
- HPC for professors across campus
- Increase Riviera resources to keep up with new users
- Documentation that can get people started
Kelly Wrighton
- Also continue to write the activity group needs into grants for community building efforts
- Turnaround on jobs can be slow when the jobs don’t fit the optimal Riviera resources
- Potentially add in guidelines to help smaller jobs get through faster and get research done
- Biological software interfacing with Riviera. Resources for this?
- Potentially has a pipeline that others can use for their needs. ▪ Sharing tools with activity group so that people can see what is out there and save time if someone has already created the resource
- Ohio has great documentation and resources ▪ It was built by the researchers in the specific areas (writing in your expertise)
- Getting information on condo models would be useful
- Space for resources
- Lots of data coming in and going out
- More memory
- Queuing optimization to let big jobs not hold up smaller ones
- How to share the resource effectively
- Priorities to certain groups (or at least some nodes) for a specific timeframe
- Unification of documentation: how to set up their dependencies
- How to do heterogeneous models
- How to play with the scheduler
- Room for dynamic allocation
- Communication and documentation
- User group listserv for announcements
- As we build community and resources share them into the listserv
- Put the minutes on the website and send them out to the group
- Ticketing system for Bill
- Send the easy tickets (or difficult problems) to the activity group
- Grad student volunteers for easy tickets?
- Representing the entry level user and Vetlab
- Given a hammer but don’t know how to use it
- More opportunities to help people ease the steep entry curve
- Is here to listen and help figure out how to let smart people do what they need instead of having to take time doing stuff outside their domain
- Communication about what is available to the university
- How to make it sustainable
- Getting access to higher RAM machines to work with
Sudeep Pasricha
- Capacity is biggest need
- Alpine is oversaturated and people are waiting in line
- Think about the next generation Alpine / planning for the future
- Creating a system that is accessible to students for classes
- Points students to Colab (Google cloud service), a web-based dashboard ▪ Students would have projects terminated because of the free version of Colab
- More GPU and plan bigger for where Riviera goes next
- Potentially consider maintaining relationship with Boulder (building off what Alpine and Summit have done)
- Sustainability can be benefited by DASHBOARD of what is going on and what has happened in the past
- Follow trends to know what needs will arise and when to be strategic about getting new hardware
- Copying Michigan’s dashboard
- Applications dashboard potentially needed as well
- Working to 2x the system
- CSU will need to be agile with resource constraints
Major Themes
- Documentation
- Communication to University and about HPC
- Capacity
- Community of Users
- Educational Materials
HPC Activity Group 9/5/24
HPC Meeting 9/5/24
Attendance:
Michael Kirby, Bill Carpenter, Brandon Shanks, David King, Mark Stenglein, Kevin Worthington, Marcelo Melo, Chris Snow, Bryan Wilson
DSRI YouTube Account Link
https://www.youtube.com/@DataScienceResearchInstitute
(also on Webpage home page)
Update on DSRI activities coming up for computing networking (specifically with UW)
- UW Computing Symposium with Drone Showcase September 19 – 20
- Affiliate Meeting in October with Cass and possibly department head from Computing School in UW
- Future HPC Lunch Meetings will shift to HPC instruction and teaching across campus, to understand:
- Who is teaching what
- What is missing
- Core materials
- Best practices for HPC and instruction
- Both for credit and non-credit classes
Upcoming DSRI HPC workshops starting September 12 @ 10:00 A.M. (Lory Student Center room 302)
- Get word out to both HPC user group and DSRI affiliates
- Planning on running on a 4 week cycle moving forward
- The plan for future workshops will be based on needs requested by users and campus
Discussion on load speeds across campus
- Updated and fastest connections will be around 100 Gbps
- Other networks will be slower, even down to 10 Gbps or below
- Old switches might be a problem
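As a rough illustration of what these link speeds mean for moving data, the sketch below estimates ideal transfer times. It assumes the figures above are in gigabits per second and ignores protocol overhead, disk speed, and contention on shared links. The 1 TB dataset size and the function name `transfer_seconds` are hypothetical, chosen only for the example.

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Ideal time to move size_bytes over a link rated at link_gbps gigabits/s."""
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


dataset_bytes = 1e12  # hypothetical 1 TB dataset (decimal terabyte)

# Compare an upgraded link with slower legacy links (assumed speeds).
for gbps in (100, 10, 1):
    secs = transfer_seconds(dataset_bytes, gbps)
    print(f"{gbps:>3} Gbps: {secs:,.0f} s (~{secs / 60:.1f} min)")
```

Real transfers will be slower than these best-case numbers, and the slowest hop on the path (for example an old switch) sets the effective rate.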
- Continued involvement from all perspectives across campus
- Invite users of Riviera, Alpine, and other HPC systems used in the past
- Getting input from as many people as possible (HPC group invite interested individuals)
- Issue of having money for GPU power but debating between options based on space and utility
- Also issue of data transfer and scheduling
Potentially having the HPC meetings at Energy Institute and Library rooms in the future
HPC Activity group 9/26/24
HPC Activity Group Lunch Discussion (9/26/24)
Introductions
- Introduction slides o Add yours here if you haven’t had a chance to yet
- Brainstorming the most needed curriculum updates in HPC
- Giving students the opportunity / resources to be comfortable working on high-performance computing systems across the continent o Open students’ minds to the possibilities of HPC
- Containerization and reproducibility
- Communication about HPC resources and student opportunities
- Students getting started with research on HPC can be unclear or seem daunting
- Roadblocks for students getting proficient at HPC
- Large interest and not enough face-to-face command line and Linux instruction
- Resources for parallelization in R
- Interest in sequencing in complex ways
- Agriculture focused topics that require special resources
- Helping students navigate their learning goals and how to support them in that path
- Things that sometimes hold students back can be related to a sense of belonging: “I am not the coder type” or “I am the coder type, but not good enough”
- Project based classes and workshops. Not just lectures but projects for experience
- Maybe have students submit projects / topics to cover to give practical applications and answers to questions
- Instead of a hackathon, do an “install-athon”
- Increasing participation from different departments across campus o Not just the folks that are already doing HPC work
Most needed / priority
- CLI skills
- Getting students motivated about skills needed for current job market o HPC center on campus / practical and visual appeal (touring to see face-to-face?)
- UW and Anschutz have something like that
- Alpine makes a whole day event for users to see system, get users connected, and build community
Homework for everyone to think about
- What would a curriculum look like to cover suggestions and needs outlined? o Ideas, things you are thinking about, and thoughts about what you would like to see in the future
DSRI HPC workshops starting today and recurring through the rest of the semester
- First workshop in BSB 103 at 3:30 – 4:30 (9/26)
Head of school of computing from UW coming October 24 to meet with researchers on campus with DSRI
HPC Activity Group 10/9/24
HPC Activity Group Meeting 10/9/24
Attendance: Michael Kirby, Matt Koslovsky, Corey Broeckling, Frances Davenport, Wolfgang Bangerth, Anne Mook, Brandon Shanks
Ideas of HPC needs for campus and curriculum:
- In statistics, more student interest in HPC and in understanding Linux
- More learning opportunities for students with focus on computation
- Students generally need more help with Linux
Most pressing needs in curriculum updates
- Students can’t program and won’t get into HPC without programming knowledge
- Eventually students will need to program in their careers, so it will be a useful skill to acquire in university
- How to prepare students for more advanced classes in computation and programming
- Coding at all levels, not just graduate but undergraduate as well
- GS510 has been taught for years, but has needed improvement and people to offer the course
- Idea could be starting with basics, getting on to a system, and creating a track that will lead to proficiency on multiple HPCs.
Core concepts for teaching the introduction course
- Subject of pipelines (tools for doing it and understanding)
- Idea of reproducible research: a lot of funding opportunities require the research to be reproducible, so it should be taught to students who will be working in those areas
- MapReduce (a lot of large-scale databases use this mapping)
- Design patterns, knowledge of what things are called for computing
- Getting a better understanding of what is used in the professional world
- Shared vs. distributed memory systems: thinking about where the data is
- Kokkos system, out of the national labs and used by major packages: makes it easy to map where things are run
- What are the underlying concepts that we can teach, because things (e.g., OpenACC) will become obsolete in the future?
- You can take the perspective of teaching how to write software for HPC or the perspective of how to use HPC and introductions to what it can be
- Applications course (no prerequisites) and programming course to follow
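To make the MapReduce idea from the list above concrete, here is a minimal single-machine sketch of the pattern: a word count over a few text chunks. It is an illustration only, not how any particular database or framework implements it. The names `map_chunk` and `merge_counts` and the sample chunks are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk: str) -> Counter:
    """Map step: count words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: combine two partial results into one."""
    return left + right


# Hypothetical input, already split into chunks as it might be across nodes.
chunks = ["hpc for all", "documentation for hpc", "hpc capacity"]

# Each map call touches only its own chunk, so these could run in parallel.
partials = [map_chunk(c) for c in chunks]

# Fold the partial counts together into the final result.
totals = reduce(merge_counts, partials, Counter())
print(totals.most_common(3))
```

The point for an introductory course is the separation of concerns: the map step needs no shared state, which is what lets the same program scale from one process to many nodes.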
How do you teach / offer a GS course?
The HPC activity group is excited to gain more input about HPC courses at CSU
DSRI and UWYO HPC Meeting 10/24/24
HPC Activity Group Meeting (10/24/24)
Attendance: Nikhil Krishnaswamy, Marcelo Melo, Chris Snow, Asa Ben-Hurr, Gabrielle Allen, Michael Kirby, Andrew Kirby, Bill Carpenter, Brandon Shanks, Melissa Reynolds
Introduction: Andrew Kirby; Research Scientist in the School of Computing (UWYO)
- International HPC Summer School (IHPCSS)
- Yearly paid program that runs for a week during the summer. Highly recommended for strong undergraduate students
- This year either in Spain or Portugal
- Computing Capabilities
- Medicine Bow (44 nodes)
- 2.01 pflops (FP64)
- 7.53 pflops (FP32)
- 4224 CPU cores (AMD EPYC)
- 48x NVIDIA H100
- Wyoming / NCAR Alliance
- 13% compute share
- 318 million CPU core-hours
- 328,000 GPU hours
- 3rd university partner with Argonne National Laboratory
NSF MRI (AI4WY)
- 3.9M
- NVIDIA Superchips
- Compute resources
- UWYO 75%
- CSU 15%
- RMAC 10%
- 15 different departments on award
- 33+ institutions
- Project Objective
- Build new state-of-the-art HPC system
- System Option 1 (Grace Hopper System)
- 48 NVIDIA GH200
- 400 Terabytes of storage
- 1.63 pflops
- System Option 2 (Grace Blackwell System)
- Grace Blackwell GPU
- 144 NVIDIA GB200
- 1 petabyte of storage
- 6.48 pflops
Looking for CSU involvement. Getting people on board to build use-case argument for local and regional support
Specifically, would faculty at CSU plan to use system and what would they aim to use it for?
Interested faculty can reach out to Andrew Kirby or DSRI for ways to get connected
HPC Activity Group 11/14/24
HPC Activity Group Meeting 11/14/24
Attendance: Michael Kirby, Anne Mook, Sunetra Das, Brandon Shanks, Bill Carpenter, Jesse Burkhardt, Chris Snow, Simon Tavener, David King, Nanadini Nim
Discussion on HPC education objectives:
- Vet students coming in with no experience to bioinformatics and HPC. Looking for easy entry with fast path to computing
- For the business school, not integrated with high-level data science / HPC
- DS510, 511, 512 with final course being on bioinformatics
- Simon has had discussions with business school in terms of both undergraduate and graduate. Introducing synergy with graduate certificates
- Data science general concentration with minor in another thing such as business
- HPC 4-week workshops are helpful and encouraged
- When it comes to ML and AI, working on entry level and avoiding duplication of resources at a small university
- When talking about HPC, are ML and AI synonymous?
Discussion of there being lots of resources across different departments for subjects in ML and AI and not HPC
- One option to solve this problem would be a foundational course that would apply to all further topics for ML
- Risk of repetition and waste of resources for single foundational class taught by one department.
- Trouble comes when you need resources to teach foundational material for specialized departments and topics
- Important to also feature foundational teaching on HPC (how to use these skills on systems)
- RNAC 4-week class that does 2 weeks of HPC (zero command line experience needed)
o After that class is when the question of “what comes next” arises. Specialization is needed
- Coding Club meets twice weekly (contact: David King)
- Idea of DSRI working with other programs to create a core class on HPC and Linux (offering for credit or not for credit still in discussion)
- Teaching a user guide first, then following up with a programmer guide
- Bill noted that, from his perspective, teaching virtual environments and containerization is the next step needed for instruction
- This will lead into GPU education
- Michael noted that GPU usage on Riviera is still under-utilized and that no courses are currently being offered that teach GPU computing on HPC
How are graduate students using the GPUs of Riviera?
- Either poorly or self-taught
- Follow-up question: what is the minimal amount of material you would offer in a course to be effective?
- Practical aspects of using GPU code are a higher need than writing the code for GPU
How can we engage students to feel comfortable using HPC?
- Giving them opportunities to dive in and use an HPC to inspire interest in learning more
o Students want to see results in their own particular area. Very difficult for a single course. Again, the idea of basic and then specialization training
- 3 modules: first two on basics, final one on specific topic (single instructor would not be able to do all this)
- Jesse mentioned being able to take the first two modules would be great to then go and teach grad students in his own department (third module)
- First modules could also emphasize the jargon related to HPC and coding
- ChatGPT has been very helpful in certain circumstances to fill in gaps / increase comfort with coding
Next question is inference. How do you know if the code is garbage or not? DS335 (taught by statistics) teaches inference with data
- For other disciplines (outside traditional HPC users), HPC will need to become necessary in order to get students engaged with it
HPC won’t be the end goal, but a tool to achieve their specific research questions