Compiling issue: self-compiled NAMD 2.14 multicore version factor ~2x slower

From: René Hafner TUK (
Date: Sat Aug 07 2021 - 14:16:44 CDT

    Dear NAMD maintainers,

     I tried implementing a new colvar (which was successful) but
wondered about speed reduction by it.

     Therefore I compared plain MD simulations (ultimately without
colvars) run with my self-compiled version against the precompiled
binary from the website.

     The only change in the code is in the colvars module files, which
are not active for the following comparison.

     I obtain the following simulation speeds for a single standard
simulation (cutoff etc.; membrane + water, 7k atoms):

         Precompiled: 300 ns/day (4 fs timestep, HMR)

         Self-compiled: 162 ns/day (4 fs timestep, HMR)

This is not CUDA version dependent, as the result is the same with both
CUDA 11.3 and CUDA 10.1 (the latter was used for the precompiled
binary).

Any help is appreciated.

Kind regards


I compiled it with the following settings:


# building charm++
module purge
module load gcc/8.4
./build charm++ multicore-linux-x86_64 gcc  -j16 --with-production

module purge
module load gcc/8.4
module load nvidia/10.1
./config Linux-x86_64-g++ --charm-arch multicore-linux-x86_64-gcc \
  --with-tcl --with-python --with-fftw --with-cuda --arch-suffix
cd Linux-x86_64-g++
# append the line CXXOPTS=-lstdc++ -std=c++11 to Make.config
## if no CXXOPTS are defined at all (e.g. via --with-debug) the build will not work
echo "CXXOPTS=-lstdc++ -std=c++11" >> Make.config

echo "showing Make.config"
cat Make.config
# then run it
make -j 12 | tee
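One possible cause of the ~2x slowdown (an assumption, not a confirmed diagnosis): in GNU make, a later plain "CXXOPTS = ..." assignment replaces the value set earlier, so the line appended to Make.config may override the optimized flags (e.g. -O3) that the Linux-x86_64-g++ arch file defines, leaving the self-compiled binary built without optimization. A minimal sketch of the difference, using "+=" to extend the flags instead of replacing them; the default flag values below are illustrative stand-ins, not the exact contents of the arch file:

```shell
# Sketch: extend CXXOPTS without discarding the default optimization flags.
cd "$(mktemp -d)"

# Stand-in for the arch file's defaults (assumed values for illustration):
printf 'CXXOPTS = -m64 -O3 -fexpensive-optimizations -ffast-math\n' > Make.config

# "+=" appends to the existing value; a plain "=" here would replace it
# entirely and silently drop -O3:
printf 'CXXOPTS += -lstdc++ -std=c++11\n' >> Make.config

# Let make show the flags it would actually use:
printf 'include Make.config\nshow:\n\t@echo $(CXXOPTS)\n' > Makefile
make -s show
```

After rebuilding, it is worth checking that -O3 actually appears in the g++ command lines printed during make.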


Dipl.-Phys. René Hafner
TU Kaiserslautern

This archive was generated by hypermail 2.1.6 : Fri Dec 31 2021 - 23:17:11 CST