From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Fri Jan 16 2015 - 16:54:35 CST
Please try the released ibverbs-smp-CUDA binary.
CUDA + MPI is slow with or without SMP, which is why we put effort into
network-specific Charm++ machine layers like ibverbs. The ibverbs port
can be launched with "charmrun ++mpiexec", which will use the system
mpiexec internally while allowing us to distribute portable binaries that
don't require extra scripting to adapt to a queueing system.
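As a rough sketch of what such a launch line could look like for the cluster described below (8 nodes, 2 GPUs per node): the install path, the input file name, and the choice of one process per GPU are all placeholders or assumptions here, not tested settings. The script only assembles and prints the command, so it can be checked before going into a job script.

```shell
#!/bin/sh
# Dry-run sketch: build a "charmrun ++mpiexec" command line and print it.
# NAMD_DIR and CONFIG are hypothetical placeholders; adjust for your site.
NAMD_DIR=/path/to/namd
CONFIG=sim.namd

# Cluster shape taken from the original question.
NODES=8
GPUS_PER_NODE=2

# Assumption: one process per GPU, so 16 in total.
TOTAL_PES=$((NODES * GPUS_PER_NODE))

# +devices lists the CUDA device IDs available on each node.
CMD="$NAMD_DIR/charmrun ++mpiexec +p$TOTAL_PES $NAMD_DIR/namd2 +devices 0,1 $CONFIG"

echo "$CMD"
```

With the smp binary you would probably also want more than one worker thread per process (the ++ppn option); please check the release notes for the exact form in your version.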
On Thu, 15 Jan 2015, David Chin wrote:
> I am having some difficulty getting NAMD 2.10 to use all GPU units on an
> 8-node cluster (2 GPU devices per node) when using MVAPICH2-GDR 2.1a. Is
> the use of MVAPICH2-GDR supported? It seems to launch the appropriate
> number of CPU processes on all nodes, but only uses 4 GPUs (2 GPUs on 2
> separate nodes).
> * RHEL 6.x - kernel 2.6.32-358.23.2.el6
> * Intel Composer XE 2015
> * CUDA 6.5
> * Mellanox OFED 2.1
> Thanks in advance,
> David Chin, Ph.D.
> david.chin_at_drexel.edu Sr. Systems Administrator, URCF, Drexel U.
> 215.221.4747 (mobile)