Re: Bug in NAMD

From: Pawel Kedzierski (pawel.kedzierski_at_pwr.edu.pl)
Date: Sat Jun 22 2019 - 04:52:43 CDT

W dniu 21.06.2019 o 18:27, Marcelo C. R. Melo pisze:
> Hi All,
> I am not an expert on the internals of the minimizer in NAMD, but is
> it possible that it has a heuristic that "walks back" if it notices an
> increase in energy?

Such heuristics would not be reasonable. Same coordinates generate same
forces, so the cycle never ends.

> It looks like the energies Pawel is getting in between repeated
> coordinates are higher in energy, then it "re-sets" to a previous
> coordinate set that had lower energy and tries again.

This is not dependent on the minimization algorithm either - I did test
both the default cg and the older and simpler velocity quenching, no change.

Conjugate gradients is expected to avoid directions of movement which
increase the energy, but not this way.

> The QM-Only DCD writes the exact coordinate that NAMD is sending for
> QM calculations, so somehow NAMD is producing the exact same coordinates.

Well, that is the essence of the bug.

I tried to look at the source code but I didn't even understand fully
the details of what is going on. I understand thar regular NAMD
calculations are distributed into batches (or patches?) to be calculated
on many cores or processors. Before NAMD starts the QM process, it has
to collect the coordinates of all QM atoms from these different batches,
which is done by message passing (I think).

Since this is may be the only thing specific to QM calculations and the
bug do not depend on ANY other NAMD setting, I suspect the bug may be
there in the coordinate collection. Or in the code which distributes
back the forces calculated using QM software. Could it be the case, that
the master process is getting *OLD* coordinates (outdated by a few
steps)? It may be a synchronization issue, or use of some buffer/cache
which is not properly updated. But my knowledge of C++ is rather basic,
that of parallel programming is almost zero, so I feel not competent
enough to debug it myself.

Cheers,

Pawel

>
> Best,
> Marcelo
>
> On Fri, 21 Jun 2019 at 08:01, Pawel Kedzierski
> <pawel.kedzierski_at_pwr.edu.pl <mailto:pawel.kedzierski_at_pwr.edu.pl>> wrote:
>
> W dniu 20.06.2019 o 16:46, Marcelo C. R. Melo pisze:
>> 2) How are you checking the coordinates sent to the QM software?
>> Are you saving a DCD at every step?
>
> I have confirmed that the coordinates saved by NAMD in the QM-only
> DCD file are repeated every few steps.
>
> Python code used to check this is pasted below.
>
>> 3) I noticed you are using the NAMD build from Nov/2018.  (NAMD
>> 2.13 for Linux-x86_64-multicore, Built Fri Nov 9 14:39:16 CST
>> 2018 by jim on ganymede.ks.uiuc.edu <http://ganymede.ks.uiuc.edu/>)
>> Do you have access to a new one? Could you download the
>> pre-compiled one from the NAMD website for tests?
>
> Downloaded and tested this binary build:
>
> NAMD Git-2019-06-21 for Linux-x86_64-multicore
> Based on Charm++/Converse 60800 for multicore-linux64-iccstatic
> Built Fri Jun 21 02:03:40 CDT 2019 by jim on belfast.ks.uiuc.edu
> <http://belfast.ks.uiuc.edu>
>
> Same problem.
>
> Best regards,
>
> Paweł Kędzierski
>
> #!/usr/bin/env python3
> # Load coordinates from DCD trajectory and compare them. Requires catdcd,
> # Python >=3.5 and numpy Python library
> # Arguments: file.psf file.dcd
> import numpy as np, subprocess as sp, sys
>
> inpsf, indcd = sys.argv[1:3]
> sp.run(['catdcd', '-o', 'tmp.xyz <http://tmp.xyz>', '-otype', 'xyz', '-s', inpsf,
> '-stype', 'psf', indcd], check=True, stdout=sp.PIPE)
> prec = None
> frames = []
> with open('tmp.xyz <http://tmp.xyz>', 'r') as xyz:
> for line in xyz:
> words = line.split()
> if len(words)==1: # the line with atom count
> natoms = int(words[0])
> next(xyz) # skip comment
> coords = np.zeros((natoms,3), dtype=float) # array for coordinates
> for i in range(natoms):
> words = next(xyz).split()
> if not prec: # recognize precision of coords in xyz file
> prec = len(words[1].split('.')[1])
> coords[i] = words[1:4] # read x,y,z line by line
> frames.append(coords) # remember this frame of coordinates
> N = len(frames)
> print("\nSuccessfully read %d frames with precision to the %d'th decimal point."
> % (N, prec))
>
> # Now, compare the frames
> eps = eval('1e-%d' % prec)
> found = False
> for i in range(N-1):
> anynew = False
> for j in range(i+1, N):
> if np.all(np.abs(frames[i] - frames[j]) < eps):
> if not found:
> print('Pairs of identical frames (0-based indices):')
> found = True
> anynew = True
> print('%d-%d' % (i,j), end=' ')
> if anynew:
> print() # flush line
> if not found:
> print('None of the frames were identical.')
>


This archive was generated by hypermail 2.1.6 : Tue Dec 10 2019 - 23:20:50 CST