From: Chris Harrison (charris5_at_gmail.com)
Date: Fri Aug 31 2012 - 13:14:02 CDT
How reproducible is the error, and does it occur on other GPU boards? I
ask because if you have a system where it occurs reproducibly at ~320K
steps, or very close to that, we would ask you to send us the inputs so
we can use them to track down the problem.
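One rough way to check reproducibility is to compare the last step each run reached before it stopped. The sketch below is only an illustration and not part of NAMD. It assumes the logs contain the usual "ENERGY:" lines, with the timestep as the first field, and that errors appear on lines starting with "ERROR:" or "FATAL ERROR:". The function names are hypothetical.

```python
import re

# Assumption: each energy output line starts with "ENERGY:" and the first
# field after the tag is the timestep.
ENERGY_RE = re.compile(r"^ENERGY:\s+(\d+)", re.MULTILINE)

# Assumption: error lines start with "ERROR:" or "FATAL ERROR:".
ERROR_RE = re.compile(r"^(?:FATAL )?ERROR:", re.MULTILINE)


def last_energy_step(log_text):
    """Return the last timestep seen on an ENERGY: line, or None if there is none."""
    steps = ENERGY_RE.findall(log_text)
    return int(steps[-1]) if steps else None


def has_error(log_text):
    """Return True if any line of the log starts with ERROR: or FATAL ERROR:."""
    return ERROR_RE.search(log_text) is not None


def summarize(logs):
    """Map each run name to (last step reached, whether an error was logged)."""
    return {
        name: (last_energy_step(text), has_error(text))
        for name, text in logs.items()
    }
```

Passing summarize() a dict of run names to log contents (for example, read with open(path).read()) gives one (step, error) pair per run. If the failing runs all stop at nearly the same step, the crash is likely reproducible.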
Ashley Chew <ashley.chew_at_uwa.edu.au> writes:
> Date: Fri, 31 Aug 2012 17:02:48 +0800
> From: Ashley Chew <ashley.chew_at_uwa.edu.au>
> To: "namd-l_at_ks.uiuc.edu" <namd-l_at_ks.uiuc.edu>
> Subject: namd-l: NAMD 2.9 with CUDA runs
> Hi, this is my first post regarding NAMD.
> I was wondering if anyone in the community has had problems with NAMD built with CUDA (using a single Tesla M2075 6 GB; the node has 72 GB of RAM) once a run passes a certain point (in our researcher's case, past 320K steps).
> One of our researchers noticed that the errors returned in the output are the common internal errors associated with unstable simulations. However, if they checkpoint and stop the runs before 320K steps, and then restart from the restart files generated by NAMD, the restarted simulation runs past the previous crash point.
> I have even rebuilt NAMD from the CVS 20120828 source with fftw3 (which works), but it did pretty much the same thing once it passed a certain point.
> Ashley Chew
> HPC System Administrator
> iVEC_at_UWA (MBDP: M024)
> The University of Western Australia
> 35 Stirling Highway
> CRAWLEY WA 6009
> E: ashley.chew_at_uwa.edu.au
> P: +61 8 6488 8742
> F: +61 8 6488 1015
> CRICOS Provider Code: 00126G
--
Chris Harrison, Ph.D.
NIH Center for Macromolecular Modeling and Bioinformatics
Theoretical and Computational Biophysics Group
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave., Urbana, IL 61801

http://www.ks.uiuc.edu/Research/namd    Voice: 773-570-6078
http://www.ks.uiuc.edu/~char            Fax: 217-244-6078
This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:22:29 CST