
Linux Clusters with InfiniBand or Other High-Performance Networks

Charm++ provides a special ibverbs network layer that uses InfiniBand networks directly through the OpenFabrics OFED ibverbs library. This avoids efficiency and portability issues associated with MPI. Look for pre-built ibverbs NAMD binaries or specify ibverbs when building Charm++. The newer verbs network layer should offer equivalent performance to the ibverbs layer, plus support for multi-copy algorithms (replicas).

Intel Omni-Path networks are incompatible with the pre-built ibverbs NAMD binaries. Charm++ for verbs can be built with --with-qlogic to support Omni-Path, but the Charm++ MPI network layer performs better than the verbs layer. Hangs have been observed with Intel MPI but not with OpenMPI, so OpenMPI is preferred. See "Compiling NAMD" below for MPI build instructions. NAMD MPI binaries may be launched directly with mpiexec rather than via the provided charmrun script.
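
As a minimal sketch of a direct launch, assume an MPI build of namd2 is on the PATH and that your mpiexec accepts the common -n option. The process count and configuration file name are hypothetical placeholders. The command is printed rather than executed so it can be checked first:

```shell
# Hypothetical values; replace with your own process count and config file.
procs=16
configfile=sim.namd

# Direct launch of an MPI build of NAMD, without charmrun.
launch_cmd="mpiexec -n $procs namd2 $configfile"

# Print the command for inspection; to run it, execute: $launch_cmd
echo "$launch_cmd"
```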

Writing batch job scripts to run charmrun in a queueing system can be challenging. Since most clusters provide directions for using mpiexec to launch MPI jobs, charmrun provides a ++mpiexec option that uses mpiexec to launch non-MPI binaries. If "mpiexec -n procs ..." is not sufficient to launch jobs on your cluster, you will need to write an executable mympiexec script like the following from TACC:

  #!/bin/csh
  shift; shift; exec ibrun $*

The job is then launched (with full paths where needed) as:

  charmrun +p<procs> ++mpiexec ++remote-shell mympiexec namd2 <configfile>
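
To see what the wrapper does with its arguments, the following sketch replaces exec with echo. It assumes charmrun calls the wrapper the way it would call mpiexec, that is, with "-n procs" ahead of the command to run. The function name mympiexec_dryrun and the argument values are hypothetical.

```shell
# Stand-in for the mympiexec wrapper: prints the command instead of exec'ing it.
mympiexec_dryrun() {
  shift; shift        # discard the leading "-n" and "<procs>" arguments
  echo ibrun "$@"     # the real wrapper would exec ibrun here
}

# Assumed calling convention: -n <procs> <command> <args...>
mympiexec_dryrun -n 64 ./namd2 sim.namd
# prints: ibrun ./namd2 sim.namd
```

The process count is dropped because ibrun is expected to take it from the queueing system rather than from the command line.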

Charm++ now provides the option ++mpiexec-no-n for the common case where mpiexec does not accept "-n procs" and instead derives the number of processes to launch directly from the queueing system:

  charmrun +p<procs> ++mpiexec-no-n ++remote-shell ibrun namd2 <configfile>
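
The two launch styles above can be combined into the launch step of a batch job script. The sketch below is illustrative only: scheduler directives are left out because they differ between queueing systems, and the variable names (PROCS, CONFIG, MPIEXEC_TAKES_N, LAUNCHER, DRY_RUN) and their values are hypothetical. By default it only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical job settings; replace with values for your cluster.
PROCS=64
CONFIG=sim.namd
MPIEXEC_TAKES_N=no   # "yes" if the site's mpiexec accepts "-n procs"
LAUNCHER=ibrun       # site launcher used when "-n procs" is not accepted

if [ "$MPIEXEC_TAKES_N" = yes ]; then
  # charmrun passes the process count to mpiexec itself.
  CMD="charmrun +p$PROCS ++mpiexec namd2 $CONFIG"
else
  # The launcher derives the process count from the queueing system.
  CMD="charmrun +p$PROCS ++mpiexec-no-n ++remote-shell $LAUNCHER namd2 $CONFIG"
fi

# Print by default; set DRY_RUN=no in the job environment to launch.
if [ "${DRY_RUN:-yes}" = yes ]; then
  echo "$CMD"
else
  $CMD   # unquoted on purpose so the string splits into separate arguments
fi
```

Use full paths to charmrun, namd2, and the configuration file where the batch environment requires them.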

For workstation clusters and other massively parallel machines with special high-performance networking, NAMD uses the system-provided MPI library (with a few exceptions), and jobs are launched with standard system tools such as mpirun. Since MPI libraries are very often incompatible between versions, you will likely need to recompile NAMD and its underlying Charm++ libraries to use these machines in parallel (the provided non-MPI binaries should still work for serial runs). The provided charmrun program for these platforms is only a script that attempts to translate charmrun options into mpirun options, but given the diversity of MPI libraries, it often fails to work.
