Re: Is there a problem of ORCA running for NAMD MPI?

From: Gerard Rowe (GerardR_at_usca.edu)
Date: Tue Nov 20 2018 - 08:12:49 CST

I found a quirk in the way resources get allocated when running QM/MM calculations. On a single machine with 8 cores, if I launch NAMD with +p8, ORCA runs extremely slowly during the QM phase because NAMD is still holding onto the cores allocated to it at launch. When I drop NAMD down to 2 processors and run ORCA with PAL6, the calculations run much more quickly. It's important to recognize that ORCA runs pretty much independently of NAMD, in its own working folder. If your calculation is taking a very long time to get through one cycle, you can check the .TmpOut file generated in that working directory to see how far ORCA has gotten.
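To make the core split concrete, here is a minimal Python sketch of the bookkeeping, not anything NAMD or ORCA provides. The function names are hypothetical. It also assumes that ORCA's simple PALn keywords only go up to PAL8 and that larger counts need a %pal block, so check that against the manual for your ORCA version.

```python
from typing import Tuple


def split_cores(total_cores: int, namd_cores: int) -> Tuple[int, int]:
    """Return (namd_cores, orca_cores) so the two programs never share a core."""
    if not 1 <= namd_cores < total_cores:
        raise ValueError("namd_cores must leave at least one core for ORCA")
    return namd_cores, total_cores - namd_cores


def orca_parallel_setting(orca_cores: int) -> str:
    """Return the ORCA input fragment requesting orca_cores processes."""
    if orca_cores < 1:
        raise ValueError("ORCA needs at least one core")
    if orca_cores == 1:
        return ""  # serial run, no parallel keyword needed
    if orca_cores <= 8:
        # Assumption: simple keywords PAL2..PAL8 are accepted by this ORCA version.
        return f"PAL{orca_cores}"
    # Assumption: above 8 processes a %pal block is required instead.
    return f"%pal nprocs {orca_cores} end"


# The 8-core workstation case described above: 2 cores for NAMD, 6 for ORCA.
namd_cores, orca_cores = split_cores(8, 2)
namd_flag = f"+p{namd_cores}"
orca_keyword = orca_parallel_setting(orca_cores)
print(namd_flag, orca_keyword)
```

The point is only that the NAMD core count and the ORCA process count should add up to no more than the cores physically present on the machine running the QM step.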

You can tell a NAMD issue from an ORCA issue by copying the contents of the QM working directory to another location and running ORCA directly on the input file. For a system as small as yours, a single-point B3LYP/6-31G calculation shouldn't take 3 hours.
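A rough Python sketch of that standalone test follows. The function name and the output file name standalone.out are made up for illustration, and the directory and input file names you pass in depend on what NAMD actually wrote under your QM base directory.

```python
import shutil
import subprocess
import time
from pathlib import Path


def run_orca_standalone(qm_dir, scratch_dir, orca_exe, input_name):
    """Copy the QM working directory and run ORCA on its input, outside NAMD.

    Returns (exit_code, elapsed_seconds). ORCA's output is written to
    scratch_dir/standalone.out (a name chosen here, not an ORCA convention).
    """
    src = Path(qm_dir)
    dst = Path(scratch_dir)
    # copytree refuses to overwrite, so the NAMD-owned directory stays untouched.
    shutil.copytree(src, dst)
    start = time.monotonic()
    with open(dst / "standalone.out", "w") as out:
        result = subprocess.run(
            [str(orca_exe), input_name],
            cwd=dst,
            stdout=out,
            stderr=subprocess.STDOUT,
        )
    return result.returncode, time.monotonic() - start
```

You would call it with the full path to the ORCA binary (the one reported as "QM EXECUTABLE PATH" in the NAMD log), since ORCA generally wants to be invoked by full path for parallel runs. If the standalone run finishes in minutes while the same input stalls under NAMD, the problem is in how the two programs share the machine rather than in the QM setup itself.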

Gerard Rowe

University of South Carolina Aiken

________________________________
From: owner-namd-l_at_ks.uiuc.edu <owner-namd-l_at_ks.uiuc.edu> on behalf of Francesco Pietra <chiendarret_at_gmail.com>
Sent: Tuesday, November 20, 2018 4:35:17 AM
To: NAMD
Subject: namd-l: Is there a problem of ORCA running for NAMD MPI?

Hi
Running the Example1 QM/MM tutorial, I wonder whether there is a problem on my cluster with ORCA running under NAMD MPI. After failing to proceed beyond

TCL: Minimizing for 100 steps
Info: List of ranks running QM simulations: 2

on one node (36 tasks, 1 CPU per task), I am now trying four nodes (144 tasks, 1 CPU per task), with little hope given the small size of Example1. After 3 hours, the QM calculation is still running. The log file is below. I hope to get some advice on what I am unable to detect.
francesco pietra
Charm++> Running on MPI version: 3.0
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
Charm++> Running in non-SMP mode: numPes 144
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
Warning> Randomization of stack pointer is turned on in kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try run with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 4 unique compute nodes (36-way SMP).
Charm++> cpu topology info is gathered in 0.042 seconds.
Info: NAMD 2.12 for Linux-x86_64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60701 for mpi-linux-x86_64
Info: Built mar 7 mar 2017, 17.38.45, CET by propro01 on node165
Info: 1 NAMD 2.12 Linux-x86_64-MPI 144 node419 fpietra0
Info: Running on 144 processors, 144 nodes, 4 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.229772 s
Info: 695.176 MB of memory in use based on /proc/self/stat
Info: Configuration file is namd_ORCA-01.conf
Info: Working in the current directory /gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_4nodes
TCL: Suspending until startup complete.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 0.5
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 1
Info: PERIODIC CELL BASIS 1 29 0 0
Info: PERIODIC CELL BASIS 2 0 34 0
Info: PERIODIC CELL BASIS 3 0 0 28
Info: PERIODIC CELL CENTER -0.021 0.008 0.108
Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCER Centralized
Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
Info: LDB PERIOD 200 steps
Info: FIRST LDB TIMESTEP 5
Info: LAST LDB TIMESTEP -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: REMOVING LOAD FROM NODE 0
Info: REMOVING PATCHES FROM PROCESSOR 0
Info: MIN ATOMS PER PATCH 40
Info: INITIAL TEMPERATURE 300
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 1
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME PolyAla_out.dcd
Info: DCD FREQUENCY 1
Info: DCD FIRST STEP 1
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME PolyAla_out.xst
Info: XST FREQUENCY 1
Info: NO VELOCITY DCD OUTPUT
Info: NO FORCE DCD OUTPUT
Info: OUTPUT FILENAME PolyAla_out
Info: RESTART FILENAME PolyAla_out.restart
Info: RESTART FREQUENCY 100
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON 10
Info: SWITCHING OFF 12
Info: PAIRLIST DISTANCE 14
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLISTS ENABLED
Info: MARGIN 0.495
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 16.995
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS 1
Info: PRESSURE OUTPUT STEPS 1
Info: QM FORCES ACTIVE
Info: QM PDB PARAMETER FILE: PolyAla-qm.pdb
Info: QM SOFTWARE: orca
Info: QM ATOM CHARGES FROM QM SOFTWARE: MULLIKEN
Info: QM EXECUTABLE PATH: /cineca/prod/opt/applications/orca/4.0.1/binary/bin/orca
Info: QM COLUMN: beta
Info: QM BOND COLUMN: occ
Info: QM WILL DETECT BONDS BETWEEN QM AND MM ATOMS.
Info: QM-MM BOND SCHEME: Charge Shift.
Info: QM BASE DIRECTORY: /gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_100GB_1node
Info: QM CONFIG LINE: ! B3LYP 6-31G Grid4 PAL4 EnGrad TightSCF
Info: QM CONFIG LINE: %%output PrintLevel Mini Print[ P_Mulliken ] 1 Print[P_AtCharges_M] 1 end
Info: QM POINT CHARGES WILL BE SELECTED EVERY 1 STEPS.
Info: QM Point Charge Switching: ON.
Info: QM Point Charge SCHEME: none.
Info: QM executions per node: 1
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 300
Info: LANGEVIN USING BBK INTEGRATOR
Info: LANGEVIN DAMPING COEFFICIENT IS 50 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1.01325 BAR
Info: OSCILLATION PERIOD IS 200 FS
Info: DECAY TIME IS 100 FS
Info: PISTON TEMPERATURE IS 300 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS 0 0 0
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.257952
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 32 36 28
Info: PME MAXIMUM GRID SPACING 1
Info: Attempting to read FFTW data from system
Info: Attempting to read FFTW data from FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RANDOM NUMBER SEED 7910881
Info: USE HYDROGEN BONDS? NO
Info: COORDINATE PDB PolyAla.pdb
Info: STRUCTURE FILE PolyAla.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS CHARMpars/toppar_all36_carb_glycopeptide.str
Info: PARAMETERS CHARMpars/toppar_water_ions_namd.str
Info: PARAMETERS CHARMpars/toppar_all36_na_nad_ppi_gdp_gtp.str
Info: PARAMETERS CHARMpars/par_all36_carb.prm
Info: PARAMETERS CHARMpars/par_all36_cgenff.prm
Info: PARAMETERS CHARMpars/par_all36_lipid.prm
Info: PARAMETERS CHARMpars/par_all36_na.prm
Info: PARAMETERS CHARMpars/par_all36_prot.prm
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SUMMARY OF PARAMETERS:
Info: 937 BONDS
Info: 2734 ANGLES
Info: 6671 DIHEDRAL
Info: 203 IMPROPER
Info: 6 CROSSTERM
Info: 357 VDW
Info: 6 VDW_PAIRS
Info: 0 NBTHOLE_PAIRS
Info: TIME FOR READING PSF FILE: 0.0370231
Info: Reading pdb file PolyAla.pdb
Info: TIME FOR READING PDB FILE: 0.034543
Info:
Info: Using the following PDB file for QM parameters: PolyAla-qm.pdb
Info: Number of QM atoms (excluding Dummy atoms): 20
Info: We found 2 QM-MM bonds.
Info: Applying user defined multiplicity 1 to QM group ID 1
Info: 1) Group ID: 1 ; Group size: 20 atoms ; Total charge: 0
Info: MM-QM pair: 24:30 -> Value (distance or ratio): 1.09 (QM Group 0 ID 1)
Info: MM-QM pair: 50:44 -> Value (distance or ratio): 1.09 (QM Group 0 ID 1)
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 2279 ATOMS
Info: 1546 BONDS
Info: 879 ANGLES
Info: 199 DIHEDRALS
Info: 15 IMPROPERS
Info: 6 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 6837 DEGREES OF FREEDOM
Info: 773 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 773 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 13773.9 amu
Info: TOTAL CHARGE = 2.98023e-08 e
Info: MASS DENSITY = 0.82848 g/cm^3
Info: ATOM DENSITY = 0.0825485 atoms/A^3
Info: *****************************
Info:
Info: Entering startup at 0.70037 s, 799.754 MB of memory in use
Info: Startup phase 0 took 0.00322795 s, 799.754 MB of memory in use
Info: The QM region will remove 19 bonds, 31 angles, 37 dihedrals, 3 impropers and 1 crossterms.
Info: ADDED 2624 IMPLICIT EXCLUSIONS
Info: Startup phase 1 took 0.709255 s, 799.887 MB of memory in use
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 769 POINTS
Info: INCONSISTENCY IN FAST TABLE ENERGY VS FORCE: 0.000325096 AT 11.9556
Info: INCONSISTENCY IN SCOR TABLE ENERGY VS FORCE: 0.000324844 AT 11.9556
Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 4.59334e-32 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 7.4108e-17 AT 11.9974
Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 1.53481e-26 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 7.96691e-18 AT 11.9974
Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
Info: Startup phase 2 took 0.0194581 s, 804.121 MB of memory in use
Info: Startup phase 3 took 0.000361919 s, 804.121 MB of memory in use
Info: Startup phase 4 took 0.00718594 s, 804.121 MB of memory in use
Info: Startup phase 5 took 0.000344038 s, 804.121 MB of memory in use
Info: PATCH GRID IS 3 (PERIODIC) BY 4 (PERIODIC) BY 3 (PERIODIC)
Info: PATCH GRID IS 2-AWAY BY 2-AWAY BY 2-AWAY
Info: REMOVING COM VELOCITY -0.188499 0.149382 0.0208025
Info: LARGEST PATCH (17) HAS 78 ATOMS
Info: TORUS A SIZE 144 USING 0 36 72 108
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 109 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.0212991 s, 805.082 MB of memory in use
Info: PME using 16 and 18 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 7 15 23 31 43 51 59 67 79 87 ...
Info: PME TRANS LOCATIONS: 11 19 27 35 39 47 55 63 71 83 ...
Info: PME USING 16 GRID NODES AND 18 TRANS NODES
Info: Startup phase 7 took 0.113867 s, 805.75 MB of memory in use
Info: Startup phase 8 took 0.00489211 s, 805.75 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 9 took 0.0102289 s, 805.75 MB of memory in use
Info: CREATING 2736 COMPUTE OBJECTS
Info: Startup phase 10 took 0.0117202 s, 805.75 MB of memory in use
Info: useSync: 1 useProxySync: 0
Info: Building spanning tree ... send: 1 recv: 0 with branch factor 4
Info: Startup phase 11 took 0.00923896 s, 805.75 MB of memory in use
Info: Startup phase 12 took 0.000352859 s, 805.75 MB of memory in use
Info: Finished startup at 1.6118 s, 805.75 MB of memory in use

TCL: Minimizing for 100 steps
Info: List of ranks running QM simulations: 2.
...............................
