From: Thomas C. Bishop (bishop_at_latech.edu)
Date: Tue Nov 11 2014 - 12:41:11 CST
The following may be related to the recent colvars post, though that seems unlikely.
Has anyone seen similar problems, or is anyone willing to run a test on a similar hardware/kernel configuration?
I recently found that my Supermicro machine (H8DG6 motherboard, AMD Opteron(TM) 6272 processors) running the Linux 3.11.10-21 x86_64 kernel (openSUSE 13.1) has a memory problem that crashes shared-memory runs with NAMD 2.9/2.10:
- The same kernel/OS/simulation/NAMD versions work fine on Intel-based machines.
- The same simulation/OS/NAMD versions work fine on the Supermicro/AMD machine with the desktop-3.11.6-4.1.x86_64 kernel.
It seems something changed between desktop-3.11.6-4.1.x86_64 and 3.11.10-21 x86_64 that breaks my shared-memory NAMD runs, possibly specific to the AMD Opteron 6272 or the Supermicro H8DG6. Runs launched through charmrun work in all cases (see the example below).
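For clarity, by "shared memory run" I mean a multicore NAMD build launched directly as a single process; the command lines below are a minimal sketch (binary paths, core counts, and the config file name are illustrative):

    # multicore (shared-memory) build, one process using 16 cores -- this is what crashes:
    ./namd2 +p16 run.namd

    # network build launched through charmrun, separate processes -- this works in all cases:
    ./charmrun +p16 ./namd2 run.namd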
Thanks,
Tom
On 11/04/2014 10:44 PM, Leili Zhang wrote:
Dear all: I recently compiled NAMD-2.10b1 for Linux-x86_64-MPI. I ran normal MD simulations perfectly fine with 16-128 CPU cores. However, when I tried to start metadynamics simulations, I got the following error messages:
...
colvars: Collective variables biases initialized, 1 in total.
colvars: ----------------------------------------------------------------------
colvars: Collective variables module initialized.
colvars: ----------------------------------------------------------------------
Info: Startup phase 10 took 0.015816 s, 381.293 MB of memory in use
Info: Startup phase 11 took 0.000250816 s, 381.293 MB of memory in use
Info: useSync: 1 useProxySync: 0
Info: Startup phase 12 took 0.000249147 s, 381.293 MB of memory in use
Info: Finished startup at 2.20578 s, 381.293 MB of memory in use
TCL: Running for 10000000 steps
colvars: Error: NAMD does not have yet a way to communicate atom velocities to the colvars.
colvars: If this error message is unclear, try recompiling with -DCOLVARS_DEBUG.
FATAL ERROR: Error in the collective variables module: exiting.
: Success
[0] Stack Traceback:
[0:0] _Z8NAMD_errPKc+0xde [0x61345e]
[0:1] _ZN16colvarproxy_namd11fatal_errorERKSs+0x52 [0xa67232]
[0:2] _ZN12colvarmodule4atom13read_velocityEv+0x2c [0xa631ac]
[0:3] _ZN12colvarmodule10atom_group15read_velocitiesEv+0x1fc [0xa2d04c]
[0:4] _ZN6colvar4calcEv+0x11a [0x9ee14a]
[0:5] _ZN12colvarmodule4calcEv+0x55 [0x9bff85]
[0:6] _ZN16colvarproxy_namd9calculateEv+0x5e2 [0xa64be2]
[0:7] _ZN12GlobalMaster11processDataEPiS0_P6VectorS2_S2_PdS3_S0_S0_S2_S0_S0_S2_+0x6e [0x9979be]
[0:8] _ZN18GlobalMasterServer11callClientsEv+0xcfc [0x99a48c]
[0:9] _ZN18GlobalMasterServer8recvDataEP20ComputeGlobalDataMsg+0x67c [0x998eac]
[0:10] _Z15_processHandlerPvP11CkCoreState+0x705 [0xcb5da5]
[0:11] CsdScheduler+0x47d [0xe15bdd]
[0:12] _ZN9ScriptTcl7Tcl_runEPvP10Tcl_InterpiPPc+0x2c5 [0xba44d5]
[0:13] TclInvokeStringCommand+0x88 [0xe712a8]
[0:14] [0xe73ec7]
[0:15] [0xe752e2]
[0:16] Tcl_EvalEx+0x16 [0xe75b06]
[0:17] Tcl_FSEvalFileEx+0x151 [0xed7cb1]
[0:18] Tcl_EvalFile+0x2e [0xed7e6e]
[0:19] _ZN9ScriptTcl4loadEPc+0xf [0xba126f]
[0:20] main+0x3e7 [0x617ac7]
[0:21] __libc_start_main+0xfd [0x300081ecdd]
[0:22] [0x57ccf9]
The same input files also worked fine with NAMD-2.9 on, say, the Gordon or Stampede clusters. Unfortunately, I have not been able to compile NAMD-2.9 on our current cluster after several tries, so I cannot say whether 2.9 would show the same problem here.
Thanks in advance for any advice!
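(For reference, a minimal metadynamics input of the kind that exercises this code path might look like the sketch below; the colvar name, atom numbers, and bias parameters are illustrative, not the actual inputs from this run. The keywords are standard colvars configuration options.)

    # in the NAMD config file:
    colvars on
    colvarsConfig metad.in

    # metad.in -- one distance colvar with a metadynamics bias:
    colvar {
      name d
      distance {
        group1 { atomNumbers 1 }
        group2 { atomNumbers 100 }
      }
    }
    metadynamics {
      colvars d
      hillWeight 0.1
      newHillFrequency 100
    }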