From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Oct 14 2013 - 01:15:13 CDT
Hi,
you are trying to start 256 processes. That is not a problem in itself, but most Linux
distributions don't allow that many connections by default. If you have an
SMP build of NAMD, you should set "+p" as you did before, but additionally use
"+ppn" to make use of NAMD's shared-memory parallelism layer; this will
save you processes. You need to benchmark the best ratio between distributed
memory (processes) and shared memory (threads) for your configuration. A
good starting point is to start one process per CPU socket and let it run
threads on all of that socket's cores. This is the easiest way to solve the problem.
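To make the process/thread arithmetic concrete, here is a minimal sketch that only prints a candidate command line instead of running it. The helper name smp_cmdline is hypothetical, the values 256 and 4 are illustrative (4 taken from the "4way SMP" nodes in the original post), and the flag spelling follows what is quoted in this thread; please check it against the notes shipped with your NAMD build.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a charmrun command line for an SMP build.
# $1 = total number of worker threads ("+p"), $2 = threads per process ("+ppn").
smp_cmdline() {
    np=$1
    ppn=$2
    # Refuse ratios that do not divide evenly, to keep the arithmetic obvious.
    if [ $((np % ppn)) -ne 0 ]; then
        echo "error: $np is not a multiple of $ppn" >&2
        return 1
    fi
    # Number of processes that would have to be started remotely.
    echo "processes=$((np / ppn))"
    # $CHARM, $NAMD and $CONF are left unexpanded, as in the original post.
    echo "\$CHARM ++timeout 120 ++nodelist nodes +p $np +ppn $ppn \$NAMD \$CONF"
}

smp_cmdline 256 4
```

With 4 threads per process, 256 cores need only 64 remote processes, which is the process count that already starts without the error. Whether 4 is actually the best ratio still has to be benchmarked.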
Another option, since it seems you are currently using rsh to connect the
processes, is to pass "++remote-shell ssh" to charmrun.
Even then, you will have to modify the ssh settings to allow more
connections. I guess it's something like "MaxStartups" in
/etc/ssh/sshd_config.
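A quick way to see whether that setting is present is sketched below. The helper name show_maxstartups is hypothetical, and the path is just the one mentioned above; it may differ on your distribution. In OpenSSH, MaxStartups limits the number of concurrent unauthenticated connections, so it can matter when many processes connect at once.

```shell
#!/bin/sh
# Hypothetical helper: print the active MaxStartups line from an
# sshd_config-style file, or a note if no uncommented line is found.
show_maxstartups() {
    file=${1:-/etc/ssh/sshd_config}
    # Match only uncommented lines, case-insensitively.
    grep -i '^[[:space:]]*MaxStartups' "$file" 2>/dev/null \
        || echo "MaxStartups not set in $file"
}

show_maxstartups
```

If nothing is set, sshd uses its built-in default. A suitable value depends on your site, and sshd has to reload its configuration before a change takes effect.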
Good luck
Norman Geist.
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Thomas C. Bishop
Sent: Monday, October 14, 2013 05:30
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: rcmd: socket: All ports in use w/
NAMD_2.9_Linux-x86_64-ibverbs
Dear NAMD,
I'm trying to run a job on 256 cores
(4-way SMP systems with InfiniBand)
using NAMD_2.9_Linux-x86_64-ibverbs
and the following command line:
$CHARM ++timeout 120 ++nodelist nodes ++p $NP $NAMD $CONF
where $CHARM and $NAMD are the charmrun and namd executables
and nodes is the list of nodes allocated by PBS.
NOTE: nodes lists each host 4x, since the systems are 4-way SMP.
I get messages similar to the following when I try to start a job on 256 cores,
but not when I use, say, 64 cores. Suggestions?
rcmd: socket: All ports in use
rcmd: socket: All ports in use
rcmd: socket: All ports in use
rcmd: socket: All ports in use
Charmrun> Error 1 returned from rsh (poseidon106:247)
Thanks in advance,
Tom
--
*******************************
Thomas C. Bishop
Tel: 318-257-5209
Fax: 318-257-3823
www.latech.edu/~bishop
********************************