Re: Is there a problem of ORCA running for NAMD MPI?

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Tue Nov 20 2018 - 09:32:34 CST

OK. As I reported in my mail of a few days ago about the same simulation
on one node, I could reach ORCA step 7 by running NAMD 2.12 QM/MM on a
4-core desktop, surely within a few hours at most, although I did not note
the time (it was last year). I have now killed the simulation after 8 hr.

The final part of the .TmpOut file, after 8 hr on 4 nodes or 24 hr on one
node, reads:

Checking for AutoStart:
> The File:
> /gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_100GB_1node/0/qmmm_0.input.gbw
> exists
> Trying to determine its content:
> ... Fine, the file contains calculation information
> ... Fine, the calculation information was read
> ... Fine, the file contains a basis set
> ... Fine, the basis set was read
> ... Fine, the file contains a geometry
> ... Fine, the geometry was read
> ... The file does not contain orbitals - skipping AutoStart

Does that tell you anything?
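
One way to tell whether ORCA is still making progress is to look at the
tail of the .TmpOut file and at when it was last written. A minimal Python
sketch, assuming the file is named qmmm_0.input.TmpOut (by analogy with the
qmmm_0.input.gbw path above). The function names and the 10-minute idle
limit are illustrative only, not something NAMD or ORCA provides:

```python
import os
import time
from collections import deque


def tail_lines(path, n=5):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


def seconds_since_modified(path, now=None):
    """Seconds elapsed since the file was last written to."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def looks_stalled(path, idle_limit_s=600.0, now=None):
    """True if the file has not changed for at least idle_limit_s seconds.

    The 600 s default is an arbitrary assumption; a long SCF step can
    legitimately leave the file untouched for longer than that.
    """
    return seconds_since_modified(path, now=now) >= idle_limit_s
```

Calling tail_lines("qmmm_0.input.TmpOut") and then
looks_stalled("qmmm_0.input.TmpOut") from the QM working directory would
show the last lines and whether the file has been idle. An idle file only
hints at a stall; it does not say why ORCA stopped.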

Thanks a lot for your very useful intervention. I hope that the bugs will
be found in the way I set up the simulation. The systems in my project are
so large that NAMD can only be run on a cluster.
francesco

On Tue, Nov 20, 2018 at 3:14 PM Gerard Rowe <GerardR_at_usca.edu> wrote:

> I found a quirk in the way resources get allocated when running QM/MM
> calculations. On a single machine with 8 cores, if I launch NAMD with +p8,
> orca runs extremely slowly during the QM phase because NAMD is still
> holding onto the resources allocated to it during launch. When I drop NAMD
> down to 2 processors and run orca with PAL6, the calculations run much more
> quickly. It's important to recognize that Orca is running pretty much
> independently of NAMD in its own working folder. If your calculation is
> taking a very long time to get through one cycle, you can check the .TmpOut
> file generated in the working directory.
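
The core split described above (8 cores in total, 2 for NAMD, 6 for ORCA)
is simple arithmetic, sketched here in Python. The helper name split_cores
is hypothetical; +pN is the NAMD launch flag quoted above and PALn is the
ORCA keyword that appears in the QM CONFIG LINE of the log further down.
ORCA accepts the PALn shorthand only for certain values of n, so the ORCA
manual should be checked for the count needed. This covers the
single-machine case only.

```python
def split_cores(total_cores, namd_cores):
    """Split a machine's cores between NAMD and ORCA so they do not overlap.

    Returns the NAMD launch flag and the ORCA parallel keyword as strings.
    """
    if not 0 < namd_cores < total_cores:
        raise ValueError("namd_cores must be between 1 and total_cores - 1")
    orca_cores = total_cores - namd_cores
    return f"+p{namd_cores}", f"PAL{orca_cores}"


# The case from the message above: 8 cores, 2 kept for NAMD.
namd_flag, orca_keyword = split_cores(8, 2)
print(namd_flag, orca_keyword)  # +p2 PAL6
```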
>
>
> You can distinguish between a NAMD and Orca issue by copying the contents
> of the QM working directory to another location and running Orca directly
> on the input file. For a system as small as yours, a single point
> B3LYP/6-31G shouldn't take 3 hours.
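
The standalone check suggested above can be scripted. A minimal Python
sketch, assuming the input file is named qmmm_0.input (inferred from the
qmmm_0.input.gbw path earlier in this thread) and that ORCA is started as
"orca <input>" with its output on stdout. stage_copy, run_orca and
standalone.out are hypothetical names:

```python
import shutil
import subprocess
from pathlib import Path


def stage_copy(qm_dir, scratch_dir):
    """Copy the QM working directory to a new location and return its path.

    Fails if the destination already exists, so a previous test copy is
    never overwritten silently.
    """
    dest = Path(scratch_dir) / Path(qm_dir).name
    shutil.copytree(qm_dir, dest)
    return dest


def run_orca(orca_exe, workdir, input_name="qmmm_0.input"):
    """Run ORCA on input_name inside workdir, independently of NAMD.

    orca_exe should be the full path to the binary; ORCA parallel runs
    (PALn) are normally started that way. Output is written to
    standalone.out in workdir. Returns (return code, output path).
    """
    out_path = Path(workdir) / "standalone.out"
    with open(out_path, "w") as out:
        result = subprocess.run(
            [str(orca_exe), input_name],
            cwd=workdir,
            stdout=out,
            stderr=subprocess.STDOUT,
        )
    return result.returncode, out_path
```

If the standalone run finishes quickly, the slowdown is more likely in how
NAMD and ORCA share the node than in the ORCA input itself.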
>
>
> Gerard Rowe
>
> University of South Carolina Aiken
> ------------------------------
> *From:* owner-namd-l_at_ks.uiuc.edu <owner-namd-l_at_ks.uiuc.edu> on behalf of
> Francesco Pietra <chiendarret_at_gmail.com>
> *Sent:* Tuesday, November 20, 2018 4:35:17 AM
> *To:* NAMD
> *Subject:* namd-l: Is there a problem of ORCA running for NAMD MPI?
>
> Hi
> While running the Example1 QM-MM tutorial, I wonder whether there is a
> problem on my cluster with ORCA running under NAMD MPI. After failing to
> proceed beyond
>
> TCL: Minimizing for 100 steps
> Info: List of ranks running QM simulations: 2
>
>
> on one node, 36 tasks, 1 cpu per task, I am trying on four nodes, 144
> tasks, 1 cpu per task, with little hope, given the small size of Example1.
> After 3 hrs, QM is still running. The log file is below. I hope to get
> some advice on what I am unable to detect.
> francesco pietra
> Charm++> Running on MPI version: 3.0
> Charm++> level of thread support used: MPI_THREAD_SINGLE (desired:
> MPI_THREAD_SINGLE)
> Charm++> Running in non-SMP mode: numPes 144
> Charm++> Using recursive bisection (scheme 3) for topology aware partitions
> Converse/Charm++ Commit ID:
> v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
> Warning> Randomization of stack pointer is turned on in kernel, thread
> migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space'
> as root to disable it, or try run with '+isomalloc_sync'.
> CharmLB> Load balancer assumes all CPUs are same.
> Charm++> Running on 4 unique compute nodes (36-way SMP).
> Charm++> cpu topology info is gathered in 0.042 seconds.
> Info: NAMD 2.12 for Linux-x86_64-MPI
> Info:
> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
> Info: for updates, documentation, and support information.
> Info:
> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
> Info: in all publications reporting results obtained with NAMD.
> Info:
> Info: Based on Charm++/Converse 60701 for mpi-linux-x86_64
> Info: Built mar 7 mar 2017, 17.38.45, CET by propro01 on node165
> Info: 1 NAMD 2.12 Linux-x86_64-MPI 144 node419 fpietra0
> Info: Running on 144 processors, 144 nodes, 4 physical nodes.
> Info: CPU topology information available.
> Info: Charm++/Converse parallel runtime startup completed at 0.229772 s
> Info: 695.176 MB of memory in use based on /proc/self/stat
> Info: Configuration file is namd_ORCA-01.conf
> Info: Working in the current directory
> /gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_4nodes
> TCL: Suspending until startup complete.
> Info: SIMULATION PARAMETERS:
> Info: TIMESTEP 0.5
> Info: NUMBER OF STEPS 0
> Info: STEPS PER CYCLE 1
> Info: PERIODIC CELL BASIS 1 29 0 0
> Info: PERIODIC CELL BASIS 2 0 34 0
> Info: PERIODIC CELL BASIS 3 0 0 28
> Info: PERIODIC CELL CENTER -0.021 0.008 0.108
> Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
> Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
> Info: LOAD BALANCER Centralized
> Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
> Info: LDB PERIOD 200 steps
> Info: FIRST LDB TIMESTEP 5
> Info: LAST LDB TIMESTEP -1
> Info: LDB BACKGROUND SCALING 1
> Info: HOM BACKGROUND SCALING 1
> Info: PME BACKGROUND SCALING 1
> Info: REMOVING LOAD FROM NODE 0
> Info: REMOVING PATCHES FROM PROCESSOR 0
> Info: MIN ATOMS PER PATCH 40
> Info: INITIAL TEMPERATURE 300
> Info: CENTER OF MASS MOVING INITIALLY? NO
> Info: DIELECTRIC 1
> Info: EXCLUDE SCALED ONE-FOUR
> Info: 1-4 ELECTROSTATICS SCALED BY 1
> Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
> Info: DCD FILENAME PolyAla_out.dcd
> Info: DCD FREQUENCY 1
> Info: DCD FIRST STEP 1
> Info: DCD FILE WILL CONTAIN UNIT CELL DATA
> Info: XST FILENAME PolyAla_out.xst
> Info: XST FREQUENCY 1
> Info: NO VELOCITY DCD OUTPUT
> Info: NO FORCE DCD OUTPUT
> Info: OUTPUT FILENAME PolyAla_out
> Info: RESTART FILENAME PolyAla_out.restart
> Info: RESTART FREQUENCY 100
> Info: BINARY RESTART FILES WILL BE USED
> Info: SWITCHING ACTIVE
> Info: SWITCHING ON 10
> Info: SWITCHING OFF 12
> Info: PAIRLIST DISTANCE 14
> Info: PAIRLIST SHRINK RATE 0.01
> Info: PAIRLIST GROW RATE 0.01
> Info: PAIRLIST TRIGGER 0.3
> Info: PAIRLISTS PER CYCLE 2
> Info: PAIRLISTS ENABLED
> Info: MARGIN 0.495
> Info: HYDROGEN GROUP CUTOFF 2.5
> Info: PATCH DIMENSION 16.995
> Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
> Info: TIMING OUTPUT STEPS 1
> Info: PRESSURE OUTPUT STEPS 1
> Info: QM FORCES ACTIVE
> Info: QM PDB PARAMETER FILE: PolyAla-qm.pdb
> Info: QM SOFTWARE: orca
> Info: QM ATOM CHARGES FROM QM SOFTWARE: MULLIKEN
> Info: QM EXECUTABLE PATH:
> /cineca/prod/opt/applications/orca/4.0.1/binary/bin/orca
> Info: QM COLUMN: beta
> Info: QM BOND COLUMN: occ
> Info: QM WILL DETECT BONDS BETWEEN QM AND MM ATOMS.
> Info: QM-MM BOND SCHEME: Charge Shift.
> Info: QM BASE DIRECTORY:
> /gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_100GB_1node
> Info: QM CONFIG LINE: ! B3LYP 6-31G Grid4 PAL4 EnGrad TightSCF
> Info: QM CONFIG LINE: %%output PrintLevel Mini Print[ P_Mulliken ] 1
> Print[P_AtCharges_M] 1 end
> Info: QM POINT CHARGES WILL BE SELECTED EVERY 1 STEPS.
> Info: QM Point Charge Switching: ON.
> Info: QM Point Charge SCHEME: none.
> Info: QM executions per node: 1
> Info: LANGEVIN DYNAMICS ACTIVE
> Info: LANGEVIN TEMPERATURE 300
> Info: LANGEVIN USING BBK INTEGRATOR
> Info: LANGEVIN DAMPING COEFFICIENT IS 50 INVERSE PS
> Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
> Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
> Info: TARGET PRESSURE IS 1.01325 BAR
> Info: OSCILLATION PERIOD IS 200 FS
> Info: DECAY TIME IS 100 FS
> Info: PISTON TEMPERATURE IS 300 K
> Info: PRESSURE CONTROL IS GROUP-BASED
> Info: INITIAL STRAIN RATE IS 0 0 0
> Info: CELL FLUCTUATION IS ISOTROPIC
> Info: PARTICLE MESH EWALD (PME) ACTIVE
> Info: PME TOLERANCE 1e-06
> Info: PME EWALD COEFFICIENT 0.257952
> Info: PME INTERPOLATION ORDER 4
> Info: PME GRID DIMENSIONS 32 36 28
> Info: PME MAXIMUM GRID SPACING 1
> Info: Attempting to read FFTW data from system
> Info: Attempting to read FFTW data from
> FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
> Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
> Info: Writing FFTW data to FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
> Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
> Info: USING VERLET I (r-RESPA) MTS SCHEME.
> Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
> Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
> Info: RANDOM NUMBER SEED 7910881
> Info: USE HYDROGEN BONDS? NO
> Info: COORDINATE PDB PolyAla.pdb
> Info: STRUCTURE FILE PolyAla.psf
> Info: PARAMETER file: CHARMM format!
> Info: PARAMETERS CHARMpars/toppar_all36_carb_glycopeptide.str
> Info: PARAMETERS CHARMpars/toppar_water_ions_namd.str
> Info: PARAMETERS CHARMpars/toppar_all36_na_nad_ppi_gdp_gtp.str
> Info: PARAMETERS CHARMpars/par_all36_carb.prm
> Info: PARAMETERS CHARMpars/par_all36_cgenff.prm
> Info: PARAMETERS CHARMpars/par_all36_lipid.prm
> Info: PARAMETERS CHARMpars/par_all36_na.prm
> Info: PARAMETERS CHARMpars/par_all36_prot.prm
> Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
> Info: SKIPPING rtf SECTION IN STREAM FILE
> Info: SKIPPING rtf SECTION IN STREAM FILE
> Info: SKIPPING rtf SECTION IN STREAM FILE
> Info: SUMMARY OF PARAMETERS:
> Info: 937 BONDS
> Info: 2734 ANGLES
> Info: 6671 DIHEDRAL
> Info: 203 IMPROPER
> Info: 6 CROSSTERM
> Info: 357 VDW
> Info: 6 VDW_PAIRS
> Info: 0 NBTHOLE_PAIRS
> Info: TIME FOR READING PSF FILE: 0.0370231
> Info: Reading pdb file PolyAla.pdb
> Info: TIME FOR READING PDB FILE: 0.034543
> Info:
> Info: Using the following PDB file for QM parameters: PolyAla-qm.pdb
> Info: Number of QM atoms (excluding Dummy atoms): 20
> Info: We found 2 QM-MM bonds.
> Info: Applying user defined multiplicity 1 to QM group ID 1
> Info: 1) Group ID: 1 ; Group size: 20 atoms ; Total charge: 0
> Info: MM-QM pair: 24:30 -> Value (distance or ratio): 1.09 (QM Group 0 ID
> 1)
> Info: MM-QM pair: 50:44 -> Value (distance or ratio): 1.09 (QM Group 0 ID
> 1)
> Info: ****************************
> Info: STRUCTURE SUMMARY:
> Info: 2279 ATOMS
> Info: 1546 BONDS
> Info: 879 ANGLES
> Info: 199 DIHEDRALS
> Info: 15 IMPROPERS
> Info: 6 CROSSTERMS
> Info: 0 EXCLUSIONS
> Info: 6837 DEGREES OF FREEDOM
> Info: 773 HYDROGEN GROUPS
> Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
> Info: 773 MIGRATION GROUPS
> Info: 4 ATOMS IN LARGEST MIGRATION GROUP
> Info: TOTAL MASS = 13773.9 amu
> Info: TOTAL CHARGE = 2.98023e-08 e
> Info: MASS DENSITY = 0.82848 g/cm^3
> Info: ATOM DENSITY = 0.0825485 atoms/A^3
> Info: *****************************
> Info:
> Info: Entering startup at 0.70037 s, 799.754 MB of memory in use
> Info: Startup phase 0 took 0.00322795 s, 799.754 MB of memory in use
> Info: The QM region will remove 19 bonds, 31 angles, 37 dihedrals, 3
> impropers and 1 crossterms.
> Info: ADDED 2624 IMPLICIT EXCLUSIONS
> Info: Startup phase 1 took 0.709255 s, 799.887 MB of memory in use
> Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
> Info: NONBONDED TABLE SIZE: 769 POINTS
> Info: INCONSISTENCY IN FAST TABLE ENERGY VS FORCE: 0.000325096 AT 11.9556
> Info: INCONSISTENCY IN SCOR TABLE ENERGY VS FORCE: 0.000324844 AT 11.9556
> Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 4.59334e-32 AT 11.9974
> Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 7.4108e-17 AT 11.9974
> Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
> Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 1.53481e-26 AT 11.9974
> Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 7.96691e-18 AT 11.9974
> Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
> Info: Startup phase 2 took 0.0194581 s, 804.121 MB of memory in use
> Info: Startup phase 3 took 0.000361919 s, 804.121 MB of memory in use
> Info: Startup phase 4 took 0.00718594 s, 804.121 MB of memory in use
> Info: Startup phase 5 took 0.000344038 s, 804.121 MB of memory in use
> Info: PATCH GRID IS 3 (PERIODIC) BY 4 (PERIODIC) BY 3 (PERIODIC)
> Info: PATCH GRID IS 2-AWAY BY 2-AWAY BY 2-AWAY
> Info: REMOVING COM VELOCITY -0.188499 0.149382 0.0208025
> Info: LARGEST PATCH (17) HAS 78 ATOMS
> Info: TORUS A SIZE 144 USING 0 36 72 108
> Info: TORUS B SIZE 1 USING 0
> Info: TORUS C SIZE 1 USING 0
> Info: TORUS MINIMAL MESH SIZE IS 109 BY 1 BY 1
> Info: Placed 100% of base nodes on same physical node as patch
> Info: Startup phase 6 took 0.0212991 s, 805.082 MB of memory in use
> Info: PME using 16 and 18 processors for FFT and reciprocal sum.
> Info: PME GRID LOCATIONS: 7 15 23 31 43 51 59 67 79 87 ...
> Info: PME TRANS LOCATIONS: 11 19 27 35 39 47 55 63 71 83 ...
> Info: PME USING 16 GRID NODES AND 18 TRANS NODES
> Info: Startup phase 7 took 0.113867 s, 805.75 MB of memory in use
> Info: Startup phase 8 took 0.00489211 s, 805.75 MB of memory in use
> LDB: Central LB being created...
> Info: Startup phase 9 took 0.0102289 s, 805.75 MB of memory in use
> Info: CREATING 2736 COMPUTE OBJECTS
> Info: Startup phase 10 took 0.0117202 s, 805.75 MB of memory in use
> Info: useSync: 1 useProxySync: 0
> Info: Building spanning tree ... send: 1 recv: 0 with branch factor 4
> Info: Startup phase 11 took 0.00923896 s, 805.75 MB of memory in use
> Info: Startup phase 12 took 0.000352859 s, 805.75 MB of memory in use
> Info: Finished startup at 1.6118 s, 805.75 MB of memory in use
>
> TCL: Minimizing for 100 steps
> Info: List of ranks running QM simulations: 2.
> ................................
