Benchmarks for VASP (version 4.6.31 serial)

VASP's scaling behavior (how it speeds up with the number of CPUs) is discussed in a separate section.

The topic of libraries and VASP is also covered in another section.

In the table below, we compare the CPU time obtained on nano (nano.cnsi.ucsb.edu) and on QSR (qsr.cnsi.ucsb.edu) using vasp4.6.31-serial. The differences given in the last column were obtained using 100*[CPU(nano)-CPU(QSR)]/CPU(QSR). A negative number indicates faster execution on nano than on QSR.
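As a minimal sketch of the formula above, the following Python snippet recomputes the difference for two rows of the table. The function name percent_difference is hypothetical and chosen only for this illustration; the CPU times are taken from the O2 and TiO2_424_Au1_O2_1V01Ns3 rows.

```python
def percent_difference(cpu_nano, cpu_qsr):
    """Return 100*[CPU(nano)-CPU(QSR)]/CPU(QSR), the difference in percent.

    A negative value means the job ran faster on nano than on QSR.
    (Hypothetical helper name, used only for this illustration.)
    """
    return 100.0 * (cpu_nano - cpu_qsr) / cpu_qsr


# CPU times in seconds, copied from two rows of the table: (nano, QSR)
timings = {
    "O2": (2766, 1943),
    "TiO2_424_Au1_O2_1V01Ns3": (140915, 148705),
}

for system, (nano, qsr) in timings.items():
    print(f"{system}: {percent_difference(nano, qsr):+.1f} %")
```

With these inputs the sign convention is easy to check: the O2 job, which is slower on nano, gives a positive value, and the TiO2 job, which is faster on nano, gives a negative one.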

Execution is faster on nano in only two cases, corresponding to the calculations demanding the smallest and the largest amounts of memory. The slower execution of the largest job on QSR might be due to the fact that each QSR node has 4 CPUs. Our job therefore has to share the memory and the swap space with up to three other jobs. Depending on the size of the other calculations, our job may have to use a larger amount of swap, which slows it down compared to the limiting case of our job running alone on the node. In principle, this could be tested by blocking all other users from a node and running our job there.

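Assuming the result for the largest job is correct, the difference as a function of memory varies like an open umbrella: it is negative for the smallest job, rises to a maximum of about +61% near 500 MB, and then falls back to a negative value for the largest job.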

CPU time (in seconds) for the execution of VASP on various systems (see the description below) using vasp4.6.31-serial on nano and QSR.

System                     Memory (MB)   Nano (s)    QSR (s)   Difference (%)
Clathrate                          151      2,898      4,427            -34.4
O2                                 396      2,766      1,943            +42.4
Au5_C3H6_03Ns1                     406     10,750      7,917            +35.6
HOO                                434      5,508      3,892            +41.4
CO                                 511      6,724      4,176            +61.0
CO2                                515      1,017        633            +60.7
nafion                             703      6,746      5,668            +19.0
ZnO_Wurzite_332_0V01Ns0            966     19,540     16,386            +19.2
TiO2_424_Au1_O2_1V01Ns3          2,806    140,915    148,705             -5.2

We also have other extensive benchmark results, but have decided not to publish them on the web. If you are running VASP on the CNSI systems, please contact us for information on which system will likely run your code most efficiently.