From: Ivan Gregoretti (ivangreg_at_gmail.com)
Date: Mon Mar 10 2014 - 18:45:17 CDT
Hi Aron and Roy,
First, thank you for replying to my post about NAMD crashing without an error message.
I ran this simulation directly on the machine, without going through a job-submission queue.
Now, it gets interesting: I re-ran it on another machine that is far
more modest in speed, number of cores (8), and memory (32 GB). On this rather
slow machine the run completed flawlessly.
I will retry on the 32-core machine, but this time I plan on nicing the job
while still specifying +p32 to namd2. It's the sensible thing to do anyway. So, instead of
namd2 +p32 mysimulation.conf > mysimulation.log
I will run
nice namd2 +p32 mysimulation.conf > mysimulation.log
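Since the symptom in this thread is a run that stops with nothing in the log, it can help to capture the exit status of namd2 as well: a process killed externally (for example by the kernel's OOM killer) exits with status 128 plus the signal number, which the log file alone will not show. Below is a minimal sketch, not from the original thread; the helper name run_niced is made up for illustration, and the namd2 invocation in the comment reuses the command from the message above.

```shell
# Hypothetical helper: run a command under nice and report if it died on a
# signal. Exit codes >= 128 conventionally mean "killed by signal (code-128)";
# 137 = 128 + 9 (SIGKILL), which is what an OOM kill typically looks like.
run_niced() {
  nice "$@"
  local status=$?
  if [ "$status" -ge 128 ]; then
    echo "killed by signal $((status - 128))" >&2
  fi
  return "$status"
}

# Intended use for the run discussed above (names from the original message):
#   run_niced namd2 +p32 mysimulation.conf > mysimulation.log
```

If the helper reports signal 9 after a silent stop, checking the kernel log (e.g. dmesg) for "Out of memory" lines would confirm or rule out the OOM-killer hypothesis raised later in this thread.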
Hopefully I will have good news to report to this thread.
Ivan Gregoretti, PhD
On Sat, Mar 8, 2014 at 11:03 AM, Roy Fernando <roy.nandos_at_gmail.com> wrote:
> Hi Ivan,
> I run my simulations on a cluster. I sometimes notice that a simulation
> stops, with no error message in the log file, while attempting to connect to all
> the processors. However, it picks up when I run it again. I wonder if you
> are seeing a similar situation?
> On Fri, Mar 7, 2014 at 12:45 PM, Morgan, Brittany <
> Brittany.Morgan_at_umassmed.edu> wrote:
>> I've had similar output if I hit a hard quota on physical memory. In my
>> experience, if there is no error message it means that the problem is
>> external to NAMD.
>> *From:* owner-namd-l_at_ks.uiuc.edu [owner-namd-l_at_ks.uiuc.edu] On Behalf Of
>> Ivan Gregoretti [ivangreg_at_gmail.com]
>> *Sent:* Friday, March 07, 2014 12:36 PM
>> *To:* namd-l_at_ks.uiuc.edu list
>> *Subject:* namd-l: NAMD 2.9 quits early without error message.
>> Hello everybody,
>> I bring you a riddle and hope you can help me solve it.
>> I am trying to run a 100,000-step simulation but NAMD 2.9 quits early.
>> My configuration file requests
>> # dynamics
>> numsteps 100000
>> but the last line of the output to the log file states
>> WRITING COORDINATES TO DCD FILE AT STEP 67000
>> The velocity and coordinate files did not get written. It is clearly an
>> unwanted early termination, with no error message.
>> The head of the log file says
>> Charm++: standalone mode (not using charmrun)
>> Converse/Charm++ Commit ID: v6.4.0-beta1-0-g5776d21
>> CharmLB> Load balancer assumes all CPUs are same.
>> Charm++> Running on 1 unique compute nodes (32-way SMP).
>> Charm++> cpu topology info is gathered in 0.094 seconds.
>> Info: NAMD 2.9 for Linux-x86_64-multicore
>> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
>> Info: for updates, documentation, and support information.
>> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
>> Info: in all publications reporting results obtained with NAMD.
>> Info: Based on Charm++/Converse 60400 for multicore-linux64-iccstatic
>> Info: Built Mon Apr 30 14:00:48 CDT 2012 by jim on naiad.ks.uiuc.edu
>> Info: 1 NAMD 2.9 Linux-x86_64-multicore 32 inca igregore
>> Info: Running on 32 processors, 1 nodes, 1 physical nodes.
>> Info: CPU topology information available.
>> Info: Charm++/Converse parallel runtime startup completed at 0.169809 s
>> Info: 2265.71 MB of memory in use based on /proc/self/stat
>> I am running this on a 32-core 64-bit Linux machine with 256 GB of RAM. Note
>> that this is the NAMD 2.9 release.
>> Has anybody seen anything like this?
>> Thank you,
>> Ivan Gregoretti, PhD
This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:22:13 CST