Diagnosing CUDA (non)performance

From: Chris Goedde (chris.goedde_at_gmail.com)
Date: Mon Oct 31 2016 - 07:48:31 CDT

Hi all,

tl;dr: When I run the f1atpase benchmark on a machine with a GTX 1080, I get a speedup of about 7.5x with the CUDA version of namd 2.11. When I run my own system on the same machine, I get essentially no speedup. I’m trying to figure out why.

I’m running namd 2.11 on a Linux box with a 16-core processor and a GTX 1080 GPU. When I run the f1atpase benchmark, I get the following results:

CUDA: 2.3 ns/day
Non-CUDA: 0.3 ns/day

The speedup is approximately 7.5x.

I’m using this machine to simulate water flowing through a carbon nanotube. It’s a fairly small system, a few thousand carbon atoms and a few hundred waters. When I run my system on the same machine, I get:

CUDA: 39 ns/day
Non-CUDA: 36 ns/day

For a speedup of 1.08x. I’m wondering why this is, and whether there’s anything I can do about it, or whether it’s just the particulars of my system and how namd allocates resources. Any insight would be greatly appreciated. I’ve included my .conf file below.
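
For reference, the speedups quoted above are simply the ratio of the CUDA throughput to the non-CUDA throughput. A minimal sketch of that arithmetic in Python follows; the function name `speedup` and the `results` dictionary are only illustrative and are not part of namd. Note that the rounded ns/day figures give roughly 7.7x for f1atpase rather than 7.5x, presumably because the throughput numbers listed here are rounded.

```python
def speedup(cuda_ns_per_day, cpu_ns_per_day):
    """Return the ratio of CUDA throughput to non-CUDA throughput."""
    return cuda_ns_per_day / cpu_ns_per_day


# Throughput figures (ns/day) as reported in this message: (CUDA, non-CUDA).
results = {
    "f1atpase": (2.3, 0.3),
    "nanotube": (39.0, 36.0),
}

for name, (cuda, cpu) in results.items():
    print(f"{name}: {speedup(cuda, cpu):.2f}x")
```

With the numbers above this prints about 7.67x for f1atpase and 1.08x for the nanotube system.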


Chris Goedde

## Start of namd configuration file

# Limit the length of the log file

outputEnergies 100000

# Set up periodic boundary conditions

cellBasisVector1 50.000 0.000 0.000
cellBasisVector2 0.000 50.000 0.000
cellBasisVector3 0.000 0.000 491.200
cellOrigin 0 0 0

wrapWater off
wrapAll off

# Set the input files

structure Config.psf
coordinates Config.pdb

# Set the force field parameters

paraTypeCharmm on
parameters par_all27_prot_lipid.prm
exclude scaled1-4
cutoff 12.0
pairlistdist 14.0
switching on
switchdist 10.0
PME yes
PMEGridSpacing 1.0
FFTWWisdomFile FFTW_NAMD_2.11_nanotube.txt

# Set the integration parameters

timestep 1
nonbondedFreq 2
fullElectFrequency 4
stepspercycle 20
rigidBonds water

restartfreq 1000
dcdfreq 1000
veldcdfreq 1000
forcedcdfreq 1000

# Set the restraints on the carbon

constraints on
consref Config-restraint.pdb
conskfile Config-restraint.pdb
conskcol O

# Set the external forcing

constantForce yes
consForceScaling 0.014392621
consForceFile Config-forcing.pdb

# Set the Langevin thermostat

langevin on
langevinFile Config-langevin.pdb
langevinCol O
langevinTemp 300

# Set the execution parameters

outputname Data
bincoordinates Config.restart.coor
binvelocities Config.restart.vel
run 10000000

