Re: Segmentation fault for the tutorial

From: David Hardy (dhardy_at_ks.uiuc.edu)
Date: Wed Mar 07 2018 - 10:51:44 CST

Hi Mahmood,

It looks like you are building from the NAMD 2.12 source that was released back in Dec 2016.

Unfortunately, the new CUDA kernels released at that time introduced a bug in which MSM was not getting properly initialized, hence the seg fault you are seeing. The bug was discovered and fixed around March 2017.

See this earlier namd-l thread regarding the problem (and there are others):
http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2017-2018/0747.html

Since you are building NAMD, why not use an up-to-date version of the source? You can get it with this command:
git clone https://charm.cs.illinois.edu/gerrit/namd.git

(Besides which, support for CUDA 9 was not even possible back in 2016.)
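
As a quick sanity check before rebuilding, a sketch along these lines reports the CUDA toolkit release and the driver/SMI versions on the build machine. The helper names cuda_release and smi_versions are hypothetical, made up for illustration. Only nvcc --version and nvidia-smi are real commands, and the sed patterns assume the output layout shown in the quoted message below.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NAMD or CUDA); they only parse text on stdin.

# Print the toolkit release (e.g. "9.0") found in `nvcc --version` output.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Print "<smi version> <driver version>" found in the nvidia-smi banner line.
smi_versions() {
    sed -n 's/.*NVIDIA-SMI \([0-9.][0-9.]*\) .*Driver Version: \([0-9.][0-9.]*\).*/\1 \2/p'
}

# Only query the tools if they are actually installed on this machine.
if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
fi
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "SMI / driver versions: $(nvidia-smi | smi_versions)"
fi
```

If the toolkit release printed here is 9.x, the Dec 2016 source is too old for it, which is one more reason to build from the current repository instead.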

See http://www.ks.uiuc.edu/Research/namd/development.html for more details on building NAMD.

Best regards,
Dave

--
David J. Hardy, Ph.D.
Beckman Institute
University of Illinois at Urbana-Champaign
405 N. Mathews Ave., Urbana, IL 61801
dhardy_at_ks.uiuc.edu, http://www.ks.uiuc.edu/~dhardy/
> On Mar 7, 2018, at 9:48 AM, Mahmood Naderan <mahmood.nt_at_gmail.com> wrote:
> 
> Joshua,
> As you can see below, the driver and smi versions are the same
> 
> mahmood_at_orca:~$ nvidia-smi
> Wed Mar  7 19:15:13 2018
> +-----------------------------------------------------------------------------+
> | NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
> |-------------------------------+----------------------+----------------------+
> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
> |===============================+======================+======================|
> |   0  Quadro M2000        Off  | 00000000:23:00.0  On |                  N/A |
> | 56%   40C    P0    24W /  75W |    230MiB /  4035MiB |      3%      Default |
> +-------------------------------+----------------------+----------------------+
> 
> +-----------------------------------------------------------------------------+
> | Processes:                                                       GPU Memory |
> |  GPU       PID   Type   Process name                             Usage      |
> |=============================================================================|
> |    0      1144      G   /usr/lib/xorg/Xorg                           129MiB |
> |    0      1627      G   compiz                                        98MiB |
> +-----------------------------------------------------------------------------+
> mahmood_at_orca:~$ nvcc --version
> nvcc: NVIDIA (R) Cuda compiler driver
> Copyright (c) 2005-2017 NVIDIA Corporation
> Built on Fri_Sep__1_21:08:03_CDT_2017
> Cuda compilation tools, release 9.0, V9.0.176
> 
> 
> 
> 
> Any more ideas?
> 
> Regards,
> Mahmood
> 
> 
> 
> 
> On Wed, Mar 7, 2018 at 11:03 AM, Mahmood Naderan <mahmood.nt_at_gmail.com> wrote:
>> The OS is Ubuntu and I installed the cuda-9 driver which comes from
>> the toolkit package. The installation was successful and I can work
>> with nvcc and other things. For example, the NVIDIA X Server
>> Settings window correctly shows the device, utilization, temperature
>> and other things.
>> 
>> Your statement that the driver version must match the smi version is
>> not clear to me. Can you explain more? In what situations might they
>> be different?
>> 
>> Regards,
>> Mahmood
>> 

This archive was generated by hypermail 2.1.6 : Wed Dec 11 2019 - 23:19:42 CST