From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Tue Nov 03 2015 - 00:25:15 CST
Assuming you are redirecting stdout and stderr to a log file, similar to:
mpirun […] namd2 job.in 2> job.e > job.out
You should have a look at the end of those files to find the reason why NAMD stopped. The mpirun message about killed processes does not point to the cause; it only tells you that the job has been cancelled.
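As a rough sketch of that check, assuming the log names job.out and job.e from the command above: the helper below prints the last lines of each log file it is given. The function name show_log_tails and the 30-line window are my own choices for illustration, nothing NAMD or mpirun provides.

```shell
#!/bin/sh
# Print the last lines of each log file passed as an argument.
# show_log_tails is a hypothetical helper name; job.out and job.e are
# the file names assumed in the redirect example above.
show_log_tails() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "==== last 30 lines of $f ===="
            tail -n 30 "$f"
        else
            # Report missing files instead of letting tail fail.
            echo "==== $f not found ===="
        fi
    done
}

show_log_tails job.out job.e
```

With several runs in separate directories, the same function can be pointed at all of them at once, for example show_log_tails */job.out */job.e, if the logs sit one directory level down.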
Another reason might be a walltime limit on the cluster you are using.
Norman Geist
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf Of Shalton Evans
Sent: Monday, 2 November 2015 21:34
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: 16 total processes killed (some possibly by mpirun during cleanup)
Good Day All,
I am attempting a Monte Carlo approach to a dynamics run. I am executing 20 copies of the same script and required files, each in a different directory. At some point, however, the runs fail one by one with the message "16 total processes killed (some possibly by mpirun during cleanup)". The dynamics runs are failing after days of computer time, and the only reason I can think of is that they are running for too long.
Does anyone know why this is happening? Help would be appreciated.
-Shalton
This archive was generated by hypermail 2.1.6 : Tue Dec 27 2016 - 23:21:30 CST