From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Mon Dec 17 2018 - 09:12:25 CST
Since you are asking Slurm for 10 tasks with 1 cpu-per-task, it is possible
that all 34 threads are running on a single core. You can check this with
top (hit "1" to see per-core load) if you can ssh to the execution host.
You should probably request --ntasks=1 --cpus-per-task=34 (or 36) so that
Slurm will allocate all of the cores you wish to use. The number of cores
used by NAMD is controlled by +p10 and you will need THREADS=24 for MOPAC.
It is a good idea to use top to confirm that all cores are being used.
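
A minimal sketch of a batch script along these lines, assuming a single node and the thread counts discussed in this thread. The name "namd2" is a placeholder for the NAMD launch command, which the quoted script does not show; THREADS=24 is set in qmConfigLine inside namd-01.conf, not in the batch script.

```shell
#!/bin/bash
# One task that owns all requested cores, instead of 10 single-core tasks.
#SBATCH --nodes=1
#SBATCH --ntasks=1
# 10 NAMD threads + 24 MOPAC threads = 34 (or request 36 for the whole node).
#SBATCH --cpus-per-task=34

# Assumption: "namd2" stands in for the local NAMD multicore binary; the
# actual launch command is not shown in the thread.
# +p10 sets the number of NAMD threads. MOPAC's thread count comes from
# THREADS=24 in qmConfigLine in namd-01.conf.
namd2 namd-01.conf +p10 > namd-01.log
```

While the job runs, top on the execution host (with "1" pressed) shows whether the load is spread over the allocated cores.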
On Sun, 16 Dec 2018, Francesco Pietra wrote:
> I had earlier taken the relative number of threads into account, by
> imposing them on MOPAC as well.
> One of the many such trials, namd.config:
> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
> GEO-OK THREADS=24"
> qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
> corresponding SLURM:
> #SBATCH --nodes=1
> #SBATCH --ntasks=10
> #SBATCH --cpus-per-task=1
> namd-01.conf +p10 > namd-01.log
> Thus, 24+10=34, while the number of cores on the node was 36. Again,
> execution took nearly two hours, slower than on my vintage VAIO with two
> cores (an hour and a half).
> As to MKL_NUM_THREADS, I am lost: there is no such environment variable
> in MOPAC's list. On the other hand, the NAMD nightly build I used performs as
> well as it should with classical MD simulations on one node of the
> same cluster.
> On Fri, Dec 14, 2018 at 4:29 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>> The performance of a QM/MM simulation is typically limited by the QM
>> program, not the MD program. Do you know how many threads MOPAC is
>> launching? Do you need to set the MKL_NUM_THREADS environment variable?
>> You want the number of NAMD threads (+p#) plus the number of MOPAC threads
>> to be less than the number of cores on your machine.
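
The rule above can be sketched as a small check to run before submitting a job. The function name check_thread_budget is hypothetical (it is not part of NAMD, MOPAC, or Slurm), and the numbers are the ones reported later in this thread: +p10, THREADS=24, and a 36-core node.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if NAMD threads plus MOPAC threads
# stay below the number of cores on the node.
check_thread_budget() {
    local namd_threads=$1 mopac_threads=$2 cores=$3
    local total=$((namd_threads + mopac_threads))
    if [ "$total" -lt "$cores" ]; then
        echo "ok: $total threads on $cores cores"
        return 0
    fi
    echo "oversubscribed: $total threads on $cores cores"
    return 1
}

# Values from this thread: +p10 for NAMD, THREADS=24 for MOPAC, 36 cores.
check_thread_budget 10 24 36
```

Such a check only confirms the arithmetic; it does not show whether the scheduler actually granted those cores to the job.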
>> On Fri, 14 Dec 2018, Francesco Pietra wrote:
>>> Hi all
>>> I resumed my attempts at finding the best settings for running NAMD QM/MM
>>> on a cluster (I used Example1, Polyala).
>>> In order to use the NAMD 2.13 multicore nightly build, I was limited to a single
>>> multicore node, 2*18-core Intel(R) Xeon(R) E5-2697 v4 @ 2.30GHz and 128
>>> GB RAM (Broadwell)
>>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
>>> qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
>>> of course, on the cluster the simulation can't be run on shm
>>> execution line
>>> namd-01.conf +p# > namd-01.log
>>> where # was 4, 10, 15, or 36
>>> With either 36 or 15 cores: segmentation fault
>>> With either 4 or 10 cores, execution of the 20,000 steps of Example 1 took
>>> nearly two hours. From the .ou file in folder /0, the execution took 0.18
>>> My question is what is wrong in my attempts to rationalize such
>>> disappointing performance.
>>> Thanks for advice
>>> francesco pietra