From: Artem Zhmurov (zhmurov_at_gmail.com)
Date: Mon Mar 01 2010 - 14:47:14 CST
Daria,
Try using an earlier version of NAMD; 2.7b2 is a beta release.
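On the time-seed question: the log quoted below reports RANDOM NUMBER SEED 1267454429, which reads like a Unix timestamp. A minimal Python sketch, assuming the seed is seconds since the epoch (the helper name seed_as_utc is made up for illustration), converts it to a date:

```python
from datetime import datetime, timezone


def seed_as_utc(seed: int) -> datetime:
    """Interpret a seed as seconds since the Unix epoch (an assumption)."""
    return datetime.fromtimestamp(seed, tz=timezone.utc)


# Seed copied from the "RANDOM NUMBER SEED" line of the quoted log.
print(seed_as_utc(1267454429).isoformat())  # 2010-03-01T14:40:29+00:00
```

The result falls on the day the message was posted, so the default seed probably does change from run to run. Setting the `seed` keyword to a fixed value in the configuration file would be one way to check whether that matters; the log by itself does not show that the seed is related to the crash.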
Artem
2010/3/1 Дарья Шалаева <skorovesna_at_inbox.ru>:
>
>
> Hello NAMD users,
>
> I'm trying to run protein MD with the Amber force field using NAMD 2.7b2 on 4 local CPUs. The problem is that I have to launch NAMD several times before a run starts successfully.
> That is, starting NAMD under exactly the same conditions (same folder, same input files, and the same files present in the folder; FFTW_NAMD_2.7b2_Linux-x86_64-MPI.txt is manually removed after each unsuccessful run) gives different results.
>
> Do you have any suggestions as to how this could happen? Could AMBER support in NAMD depend on a pseudo-random number generated from a time-based seed?
>
> Thanks,
> Daria
>
> PS
> An unsuccessful start looks like this:
>
>
> da_shal_at_linux-s6ps:~/SUBVERSION/AMBER_PREPARE/1l8t> mpirun -n 4 namd2 1l8t.em1.conf
> WARNING: Unable to read mpd.hosts or list of hosts isn't provided. MPI job will be run on the current machine only.
> Charm++> Running on MPI version: 2.0 multi-thread support: MPI_THREAD_SINGLE (max supported: MPI_THREAD_SINGLE)
> Warning> Randomization of stack pointer is turned on in kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try run with '+isomalloc_sync'.
> Charm++> cpu topology info is being gathered.
> Charm++> Running on 1 unique compute nodes (4-way SMP).
> Info: NAMD 2.7b2 for Linux-x86_64-MPI
> Info:
> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
> Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
> Info:
> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
> Info: in all publications reporting results obtained with NAMD.
> Info:
> Info: Based on Charm++/Converse 60200 for mpi-linux-amd64
> Info: Built Sun Jan 24 00:59:32 MSK 2010 by sda on linux-s6ps
> Info: 1 NAMD 2.7b2 Linux-x86_64-MPI 4 linux-s6ps da_shal
> Info: Running on 4 processors.
> Info: CPU topology information available.
> Info: Charm++/Converse parallel runtime startup completed at 0.0023849 s
> Info: 36.3047 MB of memory in use based on /proc/self/stat
> Info: Configuration file is 1l8t.em1.conf
> TCL: Suspending until startup complete.
> Info: SIMULATION PARAMETERS:
> Info: TIMESTEP 2
> Info: NUMBER OF STEPS 0
> Info: STEPS PER CYCLE 10
> Info: PERIODIC CELL BASIS 1 74.125 0 0
> Info: PERIODIC CELL BASIS 2 0 74.393 0
> Info: PERIODIC CELL BASIS 3 0 0 75.109
> Info: PERIODIC CELL CENTER 0.0719321 -0.145147 -0.0488507
> Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
> Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
> Info: LOAD BALANCE STRATEGY New Load Balancers -- ASB
> Info: LDB PERIOD 2000 steps
> Info: FIRST LDB TIMESTEP 50
> Info: LAST LDB TIMESTEP -1
> Info: LDB BACKGROUND SCALING 1
> Info: HOM BACKGROUND SCALING 1
> Info: PME BACKGROUND SCALING 1
> Info: MAX SELF PARTITIONS 20
> Info: MAX PAIR PARTITIONS 8
> Info: SELF PARTITION ATOMS 154
> Info: SELF2 PARTITION ATOMS 154
> Info: PAIR PARTITION ATOMS 318
> Info: PAIR2 PARTITION ATOMS 637
> Info: MIN ATOMS PER PATCH 100
> Info: INITIAL TEMPERATURE 0
> Info: CENTER OF MASS MOVING INITIALLY? NO
> Info: DIELECTRIC 1
> Info: EXCLUDE SCALED ONE-FOUR
> Info: 1-4 SCALE FACTOR 1
> Info: DCD FILENAME 1l8t.em1.dcd
> Info: DCD FREQUENCY 100
> Info: DCD FIRST STEP 100
> Info: DCD FILE WILL CONTAIN UNIT CELL DATA
> Info: XST FILENAME 1l8t.em1.xst
> Info: XST FREQUENCY 100
> Info: NO VELOCITY DCD OUTPUT
> Info: OUTPUT FILENAME 1l8t.em1
> Info: BINARY OUTPUT FILES WILL BE USED
> Info: RESTART FILENAME 1l8t.em1.restart
> Info: RESTART FREQUENCY 100
> Info: BINARY RESTART FILES WILL BE USED
> Info: SWITCHING ACTIVE
> Info: SWITCHING ON 10
> Info: SWITCHING OFF 12
> Info: PAIRLIST DISTANCE 14
> Info: PAIRLIST SHRINK RATE 0.01
> Info: PAIRLIST GROW RATE 0.01
> Info: PAIRLIST TRIGGER 0.3
> Info: PAIRLISTS PER CYCLE 2
> Info: PAIRLISTS ENABLED
> Info: MARGIN 2.5
> Info: HYDROGEN GROUP CUTOFF 2.5
> Info: PATCH DIMENSION 19
> Info: ENERGY OUTPUT STEPS 100
> Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
> Info: TIMING OUTPUT STEPS 1000
> Info: PRESSURE OUTPUT STEPS 100
> Info: LANGEVIN DYNAMICS ACTIVE
> Info: LANGEVIN TEMPERATURE 0
> Info: LANGEVIN DAMPING COEFFICIENT IS 5 INVERSE PS
> Info: LANGEVIN DYNAMICS NOT APPLIED TO HYDROGENS
> Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
> Info: TARGET PRESSURE IS 1.01325 BAR
> Info: OSCILLATION PERIOD IS 100 FS
> Info: DECAY TIME IS 50 FS
> Info: PISTON TEMPERATURE IS 300 K
> Info: PRESSURE CONTROL IS GROUP-BASED
> Info: INITIAL STRAIN RATE IS 0 0 0
> Info: CELL FLUCTUATION IS ISOTROPIC
> Info: PARTICLE MESH EWALD (PME) ACTIVE
> Info: PME TOLERANCE 1e-06
> Info: PME EWALD COEFFICIENT 0.257952
> Info: PME INTERPOLATION ORDER 4
> Info: PME GRID DIMENSIONS 80 80 80
> Info: PME MAXIMUM GRID SPACING 1
> Info: Attempting to read FFTW data from FFTW_NAMD_2.7b2_Linux-x86_64-MPI.txt
> Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
> Info: Writing FFTW data to FFTW_NAMD_2.7b2_Linux-x86_64-MPI.txt
> Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 2
> Info: USING VERLET I (r-RESPA) MTS SCHEME.
> Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
> Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
> Info: RIGID BONDS TO HYDROGEN : ALL
> Info: ERROR TOLERANCE : 1e-08
> Info: MAX ITERATIONS : 100
> Info: RIGID WATER USING SETTLE ALGORITHM
> Info: RANDOM NUMBER SEED 1267454429
> Info: USE HYDROGEN BONDS? NO
> Info: Using AMBER format force field!
> Info: AMBER PARM FILE ./1l8t.prmtop
> Info: AMBER COORDINATE FILE ./1l8t.inpcrd
> Info: Exclusions will be read from PARM file!
> Info: SCNB (VDW SCALING) 2
> Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
> Reading parm file (./1l8t.prmtop) ...
> PARM file in AMBER 7 format
> Warning: Encounter 10-12 H-bond term
> Warning: Found 11041 H-H bonds.
> Info: SUMMARY OF PARAMETERS:
> Info: 43 BONDS
> Info: 89 ANGLES
> Info: 42 DIHEDRAL
> Info: 0 IMPROPER
> Info: 0 CROSSTERM
> Info: 0 VDW
> Info: 171 VDW_PAIRS
> Info: TIME FOR READING PDB FILE: 9.53674e-07
> Info:
> Info: ****************************
> Info: STRUCTURE SUMMARY:
> Info: 37437 ATOMS
> Info: 37440 BONDS
> Info: 7814 ANGLES
> Info: 16431 DIHEDRALS
> Info: 0 IMPROPERS
> Info: 0 CROSSTERMS
> Info: 56553 EXCLUSIONS
> Info: 35222 RIGID BONDS
> Info: 77089 DEGREES OF FREEDOM
> Info: 13256 HYDROGEN GROUPS
> Info: TOTAL MASS = 230892 amu
> Info: TOTAL CHARGE = -8.17166e-06 e
> Info: MASS DENSITY = 0.925718 g/cm^3
> Info: ATOM DENSITY = 0.0903883 atoms/A^3
> Info: *****************************
> Info:
> Info: Entering startup at 0.190294 s, 48.1016 MB of memory in use
> Info: Startup phase 0 took 0.000167847 s, 48.1016 MB of memory in use
> ------------- Processor 0 Exiting: Caught Signal ------------
> Signal: 11
> [0] Stack Traceback:
> [0:0] /lib64/libc.so.6 [0x7facbf080560]
> [0:1] memcpy+0xa0 [0x7facbf0cd990]
> [0:2] _ZN8MOStream3PutEPcm+0x73 [0x82ae63]
> [0:3] _ZN10Parameters15send_ParametersEP8MOStream+0x17d9 [0x867cd9]
> [0:4] _ZN4Node11namdOneSendEv+0x92 [0x858aa2]
> [0:5] _ZN4Node7startupEv+0x738 [0x85bd38]
> [0:6] CkDeliverMessageFree+0x34 [0x9508c3]
> [0:7] _Z15_processHandlerPvP11CkCoreState+0x2c3 [0x95523c]
> [0:8] CmiHandleMessage+0x27 [0x9c101c]
> [0:9] CsdScheduleForever+0x5e [0x9c31f7]
> [0:10] CsdScheduler+0xd [0x9c3284]
> [0:11] _ZN9ScriptTcl12Tcl_minimizeEPvP10Tcl_InterpiPPc+0x28 [0x8af748]
> [0:12] TclInvokeStringCommand+0x84 [0x9e38a9]
> [0:13] namd2 [0xa1b001]
> [0:14] Tcl_EvalEx+0x173 [0xa1c393]
> [0:15] Tcl_EvalFile+0x1b0 [0xa13942]
> [0:16] _ZN9ScriptTcl3runEPc+0x14 [0x8af464]
> [0:17] _Z18after_backend_initiPPc+0x29c [0x50f4cc]
> [0:18] main+0x22 [0x50f562]
> [0:19] __libc_start_main+0xfd [0x7facbf06ca7d]
> [0:20] namd2 [0x50a079]
> [cli_0]: [cli_1]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> [cli_2]: [cli_3]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 3
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
> rank 3 in job 1 linux-s6ps_46591 caused collective abort of all ranks
> exit status of rank 3: killed by signal 9
> rank 2 in job 1 linux-s6ps_46591 caused collective abort of all ranks
> exit status of rank 2: killed by signal 9
>
>
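The traceback above uses C++ mangled names. Assuming the usual Itanium C++ ABI encoding (length-prefixed identifiers), a minimal Python sketch can recover the scope-qualified function names of the top frames. qualified_name is a hypothetical helper written for this note, not a full demangler; a tool such as c++filt handles the general case.

```python
import re


def qualified_name(symbol: str) -> str:
    """Return the scope-qualified name of a simple mangled C++ symbol.

    Handles only plain `_Z<len><name>` and nested `_ZN<len><name>...E` forms.
    Argument types, templates, and substitutions are ignored.
    """
    if not symbol.startswith("_Z"):
        return symbol  # not mangled, e.g. a C function such as memcpy
    nested = symbol.startswith("_ZN")
    pos = 3 if nested else 2
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        digits = re.match(r"\d+", symbol[pos:]).group()
        start = pos + len(digits)
        end = start + int(digits)
        parts.append(symbol[start:end])
        pos = end
        if not nested:
            break  # a non-nested name has exactly one component
    return "::".join(parts)


# Frames 2 to 5 of the traceback, with the "+0x..." offsets stripped.
frames = [
    "_ZN8MOStream3PutEPcm",
    "_ZN10Parameters15send_ParametersEP8MOStream",
    "_ZN4Node11namdOneSendEv",
    "_ZN4Node7startupEv",
]
for frame in frames:
    print(qualified_name(frame))
```

Read this way, the frames are MOStream::Put, Parameters::send_Parameters, Node::namdOneSend, and Node::startup, which suggests the segmentation fault occurred while processor 0 was sending the force-field parameters during startup, before any minimization step ran.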