Clusters

Runko has been successfully compiled on many different computing clusters. Module loading and environment variable initialization scripts are provided in archs/ folder.

As a rule of thumb:

  • Most important thing to look at cluster performance-wise is to check the memory per core. - PIC simulations are most often memory limited so this number is the actual limiting factor.

  • Second important thing is the inter connector because MPI communications are costly.

  • Thirdly, modern Intel CPUs vectorize mesh calculations efficiently and so all the filtering operations are more efficient on them.

Rusty

Flatiron Institute’s jack-of-all-trades at New York. Heterogeneous cluster:

  • 360 nodes = 11500 cores

  • 240 nodes of 2x14 Broadwell cores = 6700 cores

    • 512GB/node (18GB/core)

  • 120 nodes of 2x20 Skylake cores = 4800 cores

    • 768GB/node (19.2GB/core)

  • 100GB/s Omnipath

Beskow

https://www.pdc.kth.se/hpc-services/computing-systems/beskow

Swedish PDC Cray XC40 machine:

  • 11 cabinets = 515 blades = 2,060 compute nodes

  • 67,456 cores in total, 2 x Intel CPUs per node

  • 9 cabinets of Xeon E5-2698v3 Haswell 2.3 GHz CPUs (2x16 cores per node),

    • 64GB/node (2GB/core)

  • 2 cabinets of Xeon E5-2695v4 Broadwell 2.1 GHz CPUs (2x18 cores per node)

    • 128/node (3.6GB/core)

  • High speed network Cray Aries (Dragonfly topology)

Kebnekaise

https://www.hpc2n.umu.se/resources/hardware/kebnekaise

Swedish HPC2N Umeå heterogeneous Lenovo machine:

  • 15 racks = 602 nodes

  • 19,288 cores (of which 2,448 cores are KNL-cores)

  • 432 nodes of 2x14 Intel Xeon

    • 128GB/node (4.6GB/core)

  • 52 nodes of 2x14 Intel Skylake,

    • 192GB/node (6.7GB/core)

  • Infiniband FDR/EDR

Tegner

https://www.pdc.kth.se/hpc-services/computing-systems/tegner-1.737437

Heterogeneous post-processing cluster for Beskow.

Note

Actual simulation part is not working, only the analysis scripts are functional.