Re: How to run on multi-node environment

From: Vermaas, Josh
Date: Wed Dec 22 2021 - 08:56:03 CST

That nodelist looks funky to me. I'm betting NAMD expects the hosts to be listed one per line (as in the user guide), sees only one line, and assumes that you have only one node. It then nicely generates the configuration you requested on a single node and goes about its merry way... until it realizes that you've assigned 8 tasks to 4 GPUs and warns you that it doesn't like sharing.
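
As a rough sketch of the one-host-per-line layout, the nodelist might look like the block below. The host names and options are taken from the original file; repeating ++cpus and ++shell on each host line is an assumption on my part, so check it against the user guide before relying on it.

```text
group main
host andraton11 ++cpus 40 ++shell ssh
host andraton12 ++cpus 40 ++shell ssh
```

If both nodes are picked up, ++p 72 with ++ppn 9 gives 72 / 9 = 8 processes in total, which should come to 4 per node, one for each of the 4 GPUs on that node.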


On 12/22/21, 9:46 AM, Luis Cebamanos wrote:

    Hello all,

    Trying to run on a multinode/multi-GPU environment (namd built with
    Charm-verbs, cuda SMP and Intel). Each node with 4 GPUs, 40 CPUs:

    charmrun ++nodelist nodeListFile.txt ++p 72 ++ppn 9 namd2 +devices
    0,1,2,3 +isomalloc_sync +setcpuaffinity +idlepoll +pemap
    1-9,11-19,21-29,31-39 +commap 0,10,20,30 stmv.namd

    where my nodeListFile.txt looks like:

    group main
    host andraton11 host andraton12 ++cpus 40 ++shell ssh

    I am getting the following error:

    FATAL ERROR: Number of devices (4) is not a multiple of number of
    processes (8). Sharing devices between processes is inefficient.
    Specify +ignoresharing (each process uses all visible devices) if
    not all devices are visible to each process, otherwise adjust number
    of processes to evenly divide number of devices, specify subset of
    devices with +devices argument (e.g., +devices 0,2), or multiply list
    shared devices (e.g., +devices 0,1,2,0).

    If not using +ignoresharing, how should I run this correctly?

