Paper 1 Groups and Systems

The following are the group and system type assignments for each group (please note your group number as it is used in documents you will edit):

  • Group 1: GPU (GPGPU): Tillie, Katie, Jesse

  • Group 2: Cluster, non-IBM Power system: Alice, Emily, Theo U.

  • Group 3: Cluster, IBM Power system: George, Theo S., Tai

  • Group 4: MPP system: Luca, Ian, Tarang

  • Group 5: Green Supercomputer: Max, Matt, Ghazi, Marshall

  • Group 6: Cloud system: Owen, Aron, Sojin

Picking a Specific System

Your group will pick a specific system of your assigned type to present to the class. Take a look at some options and pick a system early and quickly, and then move on to investigating your system and planning your presentation.

In Systems Suggestions for each Group, I’ve included some suggestions of specific systems for each group to consider. It is important that each group picks a unique system to present. Thus, if your group picks a machine from outside my list of suggestions, you should coordinate with the other group(s) that are doing a similar system to ensure that no duplicate systems are selected (groups 2-5 should talk to each other about the specific systems they plan to present, and the specific in-depth focus, to avoid duplication). You may not change the system type of your assigned group.

In addition, groups 2, 3 and 4 should pick a system from the June 2023 Top 500 list, and Group 5 should pick a system from near the top of the June 2023 Green 500 list.

You can find machines of different types on the Top 500 list in two ways:

  • Check the architecture, which is listed with each machine.

  • Use the sublist generator (under Statistics menu, choose SublistGenerator) to generate a list of only specific architectures (choose MPP or cluster).

Usually systems at national labs, universities or supercomputing centers have more on-line information than systems at companies. Also, there is often more public information and articles about machines at the top of the list.

If you click on a system on the list, you will get a page with details that usually includes a link to the organization’s webpage about the system.

Systems Suggestions for each Group

Group 1: GPUs (GPGPU focus)

The GPU Architecture and GPGPU Programming Model should be the focus of your presentation. Do not present GPUs in the context of their use for graphics processing. Instead, focus on GPUs for general purpose parallel programming.

Look at Nvidia’s site for documentation about GPU architecture. CUDA is Nvidia’s language for General-Purpose programming on a GPU (GPGPU). You may include a high-level discussion of programming languages for GPGPU computing, but focus more of your talk on the GPU architecture and system for supporting GPGPU programming and not on the details of languages (like CUDA) for GPGPU programming; the GPU architecture is very interesting.
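To give a feel for what the GPGPU programming model looks like at a high level, here is a minimal sketch in plain Python (not CUDA). It only simulates the idea that one small function (the "kernel") runs once per thread, and that each thread uses its block and thread indices to pick the data element it works on. The helper names (global_index, vector_add_kernel, launch) are made up for illustration, and the loops run sequentially, whereas a real GPU runs the threads in parallel.

```python
# Plain-Python simulation of CUDA-style data-parallel indexing.
# All function names here are hypothetical; this is not the CUDA API.

def global_index(block_idx, block_dim, thread_idx):
    """Element handled by one thread, mirroring the common CUDA idiom
    blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def vector_add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """The 'kernel': the work done by a single thread."""
    i = global_index(block_idx, block_dim, thread_idx)
    if i < len(out):  # the last block may have threads with no element
        out[i] = a[i] + b[i]


def launch(kernel, num_blocks, block_dim, *args):
    """Run the kernel once per (block, thread) pair.
    A GPU would run these in parallel; this simulation is sequential."""
    for block_idx in range(num_blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


n = 10
a = list(range(n))
b = [10 * x for x in range(n)]
out = [0] * n

block_dim = 4                                   # threads per block
num_blocks = (n + block_dim - 1) // block_dim   # enough blocks to cover n

launch(vector_add_kernel, num_blocks, block_dim, a, b, out)
print(out)
```

The point for your talk is the mapping from many lightweight threads to data elements; how the hardware schedules and executes those threads is the architecture story to focus on.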

You could also talk about other accelerator computing devices, but focus on GPUs. See Chapter 15.1 of Dive into Systems for a high-level overview. Search the web for GPGPU. There is a lot out there.

Groups 2 and 3: Clusters

If you pick a machine near the top of the list, you may find more information about it. Your two groups should coordinate to make sure you select different systems. (DO NOT pick system (40) Amazon EC2 cluster instances):

Group 2 (focus on non-IBM Power systems):


 7. Tianhe-2A - TH-IVB-FEP Cluster, Intel Xeon ...
    National Super Computer Center in Guangzhou, China
    http://en.nscc-gz.cn/
    (also Wikipedia and google search for articles)
    here is an article about Tianhe-2:
    www.netlib.org/utk/people/JackDongarra/PAPERS/tianhe-2-dongarra-report.pdf

 10. Frontera - Dell C6420 ...
    Intel Omni-Path, Dell EMC
    Texas Advanced Computing Center/Univ. of Texas, US
    https://www.tacc.utexas.edu/

 17. SuperMUC-NG - Think System ...
    Leibniz Rechenzentrum, Germany
    https://www.lrz.de/english/

... there are a lot of non-IBM Power cluster systems on the Top500 list

These are okay options, but with a couple caveats:

 8. JUWELS Booster Module
    (** Check with group 5 to make sure you don't both pick this)

 6. Selene - Nvidia, AMD
    Nvidia Corp, US
    https://www.nvidia.com/en-us/data-center/dgx-superpod/
    (** Check with group 1 about the focus of your talks so you do not duplicate foci)

Group 3 (focus on IBM Power systems):


  2.  Summit - IBM Power System AC922, ....
      DOE/SC/Oak Ridge National Laboratory, US
      https://www.ornl.gov/directorate/ccsd

  3. Sierra - IBM Power System AC922, ....
      DOE/NNSA/LLNL, US
      https://hpc.llnl.gov/hardware/platforms/sierra

  14. Marconi-100, CINECA, Italy
  20. Lassen, LLNL, US
  21. PANGEA III, France

  48. AiMOS. RPI, US
  ...

Group 4: MPP Systems

Some suggestions from the Top500 list (and look on the list for others):

 1. Supercomputer Fugaku - Supercomputer Fugaku, A64FX 48C
     RIKEN Center for Computational Science, Japan

 4. Sunway TaihuLight - Sunway MPP, ...
    National Supercomputing Center in Wuxi, China
    http://www.nsccwx.cn/  (upper-right link to English)

 15.  Piz Daint - Cray XC50, ...
     Swiss National Supercomputing Center
     https://www.cscs.ch/computers/decommissioned/piz-daint-piz-dora/

 16.  Trinity - Cray XC40, ...
     DOE/NNSA/LANL/SNL, US
     https://www.lanl.gov/projects//trinity/index.php

 30. Cori - Cray XC40, ...
    DOE/SC/LBNL/NERSC, US
    https://www.nersc.gov/systems/cori/

... and lots more after Cori

You could also pick this one if Group 5 doesn't

 5. Perlmutter - HPE Cray EX235n, AMD EPYC 7763 64C 2.45GHz, NVIDIA A100
     DOE/SC/LBNL/NERSC, US
     https://www.nersc.gov/systems/perlmutter/
     (** check with group 5 that you don't both pick this system)

If you pick a system with a Cray architecture, looking at Cray’s website may be useful too.

Looking at IBM’s website for Blue Gene information may be useful too.

Group 5: Green Supercomputer

You should define and briefly discuss what green computing means, and the criteria for ordering machines on the Green500 list. Then pick one or two systems near the top to discuss some details of how they achieve power efficiency. If a system is near the top of the Green500 list, there is likely some documentation on its webpage promoting it and also some articles about it. Start by doing some web searches for some of these systems to help you find a good machine or two to discuss in detail. It is fine to just cover one machine, but if information is sparse, you may want to add another that is different in a significant way.
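As a rough sketch of the ordering criterion: the Green500 ranks systems by energy efficiency, measured as performance per watt (GFlops/watt) on the benchmark run. The Python below shows the arithmetic with made-up numbers. The system names, Rmax values, and power figures are hypothetical and are not taken from any real list, and the function name is just for illustration.

```python
# Energy-efficiency ranking sketch. All systems and numbers are hypothetical.

def gflops_per_watt(rmax_tflops, power_kw):
    """Convert Rmax (TFlop/s) and power (kW) to GFlops per watt."""
    gflops = rmax_tflops * 1000.0   # 1 TFlop/s = 1000 GFlop/s
    watts = power_kw * 1000.0       # 1 kW = 1000 W
    return gflops / watts


# (name, Rmax in TFlop/s, power in kW) -- made-up example values
systems = [
    ("System A", 20000.0, 2000.0),
    ("System B", 2000.0, 80.0),
    ("System C", 400000.0, 30000.0),
]

# Sort from most to least energy efficient.
ranked = sorted(
    systems,
    key=lambda s: gflops_per_watt(s[1], s[2]),
    reverse=True,
)

for name, rmax, power in ranked:
    print(f"{name}: {gflops_per_watt(rmax, power):.2f} GFlops/W")
```

Note that the hypothetical System B, which has by far the lowest raw performance, ranks first here: an efficiency ordering can look very different from a raw-performance ordering.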

Many of these systems use NVIDIA hardware. Don’t focus on the GPU architecture of these systems, as another group is presenting on this.

Some suggestions of a few from the Green 500 list that may have a bit more information include:

   1. MN-3, Japan
      https://projects.preferred.jp/supercomputers/en/

   2. HiPerGator AI, Univ. Florida, US
      https://www.rc.ufl.edu/

   7. JUWELS and/or 8. JURECA, Germany
      https://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/supercomputers_node.html
      (on top of their webpage is an option to get an English translation)
      (** check with group 2 that you don't both pick this system)

  5: NVIDIA SuperPod, Nvidia, US (this may have more marketing info and
     less detailed systems info)

  6: Perlmutter, LBNL, US
     https://www.nersc.gov/systems/perlmutter/
       (** check with group 4 that you don't both pick this system).

  ... and lots more

Group 6: Cloud System

Part of this presentation should involve definition(s) of Cloud computing (what is it? what are common features? what are goals of cloud computing? what are the features of the system you are presenting?) The Wikipedia cloud computing page may be a good place to start. Chapter 15.3 of Dive into Systems has a high-level overview of cloud computing too. You could define and talk about some or all of IaaS, PaaS, SaaS.

The other part should involve either picking a Cloud system to present, or presenting a specific software system for cloud management or cloud application development. If you do the latter, you should still include the definitions of a cloud system in your presentation.

Some commercial cloud systems may not have a lot of public information about their system. Try to pick one that has a reasonable amount, and try to search for articles about the system. If there isn’t much out there, pick a different system with more documentation about it.

Here are a few suggestions; you are free to find others:

Some example commercial cloud systems:

  Amazon EC2, AWS
  Microsoft Azure
  Google App Engine
  IBM Cloud Services

  Amazon EC2 Instance Cluster (#40 on Top500)
  https://aws.amazon.com/ec2/instance-types/

  ABCI AI Bridging Cloud Infrastructure (#12 on Top500)
  https://abci.ai/
  (I do not know how closely this may fit a presentation of
  cloud systems. If you think you may want to pick this one,
  I suggest looking at it briefly to see if it is a good
  example to discuss as you talk about Cloud computing
  in general.  I suspect one of the above systems will be more
  applicable, and also an easier example system to present.)

Some open SW for cloud management or cloud application development:

  OpenStack (software for cloud system): openstack.org
  Eucalyptus
  Rackspace
  Salesforce