RE: FATAL ERROR with "File exists"

From: Vermaas, Joshua (Joshua.Vermaas_at_nrel.gov)
Date: Wed Jan 16 2019 - 13:51:51 CST

Two things I can think of quickly:
(1) Have you tried passing the -np option to mpirun? Based on your Slurm script (2 nodes, 16 tasks per node), I would expect something like: mpirun -np 32 ~/NAMD2/NAMD_2.13_Linux-x86_64/namd2 md1_3PTBP.namd
(2) Are you absolutely, positively sure that this is an MPI build of NAMD? Based on the directory name, it looks like the ethernet build, which will do funny things like run 32 independent-but-identical simulations that all fight to rename the same output files. That would be consistent with the rename error you are seeing. The Info lines at the top of the log are going to be the most useful check, and for an MPI build they will look something like this:

Info: NAMD 2.13 for Linux-x86_64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60800 for mpi-linux-x86_64-mpicxx
Info: Built Tue Nov 27 19:19:47 MST 2018 by jvermaas on ed1
Info: 1 NAMD 2.13 Linux-x86_64-MPI 36 r2i4n28 jvermaas
Info: Running on 36 processors, 36 nodes, 1 physical nodes.
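
To check a log against the Info lines above from the shell, a rough sketch like the following might help. The function names namd_build_kind and namd_banner_count are made up for illustration, and treating any Charm++ platform string that does not start with "mpi-" as non-MPI is an assumption based only on the MPI example above. Likewise, the idea that several startup banners in one log point to several independent copies is a guess, not something NAMD documents.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting a NAMD log; not part of NAMD itself.

# Print "mpi", "non-mpi", or "unknown" based on the Charm++ platform line.
namd_build_kind() {
    # First "Based on Charm++/Converse" line in the log, if there is one.
    # In a basic regex the "+" characters are matched literally.
    line=$(grep -m 1 '^Info: Based on Charm++/Converse' "$1" || true)
    case "$line" in
        "")            echo unknown ;;   # no such line found
        *" for mpi-"*) echo mpi ;;       # e.g. mpi-linux-x86_64-mpicxx
        *)             echo non-mpi ;;   # assumption: anything else is not MPI
    esac
}

# Count startup banners such as "Info: NAMD 2.13 for Linux-x86_64-MPI".
# More than one in a single log hints at several copies running side by side.
namd_banner_count() {
    grep -c '^Info: NAMD [0-9.]* for ' "$1" || true
}

# Usage (log name follows the --output pattern in the quoted Slurm script):
#   namd_build_kind md1_3PTBP_<jobid>.log
#   namd_banner_count md1_3PTBP_<jobid>.log
```

If the first call does not report "mpi", or the second reports a number greater than 1, that would point toward explanation (2).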

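For point (1), here is a rough sketch of deriving the rank count from Slurm's own environment rather than hard-coding 32. SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE are standard Slurm variables (the second is only set when --ntasks-per-node is requested). The helper name total_ranks and the fallback values 2 and 16 are illustrative assumptions taken from the quoted Slurm script.

```shell
#!/bin/bash
# Hypothetical helper: total MPI ranks = nodes * tasks per node.
# Falls back to the values from the quoted #SBATCH lines when run outside Slurm.
total_ranks() {
    local nodes="${SLURM_JOB_NUM_NODES:-2}"
    local per_node="${SLURM_NTASKS_PER_NODE:-16}"
    echo $(( nodes * per_node ))
}

# Print the launch line for inspection; remove the leading "echo" to run it.
echo mpirun -np "$(total_ranks)" ~/NAMD2/NAMD_2.13_Linux-x86_64/namd2 md1_3PTBP.namd
```

This only helps if the namd2 binary really is an MPI build; with a non-MPI binary, mpirun would just start that many separate copies.
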
-Josh

On 2019-01-16 11:37:44-07:00 owner-namd-l_at_ks.uiuc.edu wrote:

Hello

I now have my system running across several nodes using Slurm, but it terminates with this error:

OPENING EXTENDED SYSTEM TRAJECTORY FILE
ERROR: Error on renaming file md1Solv_New3PTBP.xst to md1Solv_New3PTBP.xst.BAK: No such file or directory
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Unable to open text file md1Solv_New3PTBP.xst: File exists
=====================================

I assume this has something to do with running on several nodes: the file is being rewritten from the other nodes, and NAMD refuses to do that. Will the dcd file pose the same problem? Do I need to make NAMD overwrite the output files?

Below is my "Slurm Input" script...

#!/bin/bash

#SBATCH --job-name=Seibold # Job name
#SBATCH --partition=mridata # Partition Name (Required)
##SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
##SBATCH --mail-user=stevesi_at_ku.edu # Where to send mail
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=24:00:00
#SBATCH --output=md1_3PTBP_%j.log # Standard output and error log

pwd; hostname; date

#module load namd/2.12_multicore

echo "Running on $SLURM_CPUS_ON_NODE cores"

#namd2 md7_3BP.namd

mpirun ~/NAMD2/NAMD_2.13_Linux-x86_64/namd2 md1_3PTBP.namd

======================
