From: Lenz Fiedler (l.fiedler_at_hzdr.de)
Date: Tue Apr 05 2022 - 01:29:09 CDT

Hi John,

Thanks for the clarification, that makes sense.

I have tried multiple setups, but in all cases I am using the full
memory of a node and run only one rank (with 1 CPU) per node. So VMD
gets 1, 2 or 4 CPUs, each with the full 360GB of memory of its node.
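
A layout like this could be checked from inside VMD itself. The following is only a minimal sketch, assuming an MPI-enabled VMD build that provides the "parallel" Tcl command with its noderank, nodecount and nodename subcommands; the variable names are illustrative:

```tcl
# Every MPI rank executes this script independently, so each rank
# prints one line describing itself.
set rank   [parallel noderank]   ;# index of this rank, starting at 0
set nranks [parallel nodecount]  ;# total number of ranks in the job
set host   [parallel nodename]   ;# name of the host this rank runs on
puts "VMD rank $rank of $nranks running on $host"
```

With one rank per node, each printed line should show a different host name.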

I have two representations in my file: a VDW representation for the
~130000 atoms and an isosurface for their electronic density.
This is the Tcl script I am using (I deleted some rotation commands in
between), which I created by plotting a smaller file locally and piping
the Tcl commands into a file:

menu files off
menu files on
display resetview
display resetview
mol addrep 0
display resetview
mol new {Be131072_density.cube} type {cube} first 0 last -1 step 1 waitfor 1 volsets {0 }
animate style Loop
menu files off
menu graphics off
menu graphics on
mol modstyle 0 0 VDW 1.000000 12.000000
mol modstyle 0 0 VDW 0.900000 12.000000
mol modstyle 0 0 VDW 0.800000 12.000000
mol modstyle 0 0 VDW 0.700000 12.000000
mol modstyle 0 0 VDW 0.600000 12.000000
mol modstyle 0 0 VDW 0.500000 12.000000
mol modstyle 0 0 VDW 0.400000 12.000000
mol modstyle 0 0 VDW 0.300000 12.000000
mol modstyle 0 0 VDW 0.200000 12.000000
mol modstyle 0 0 VDW 0.100000 12.000000
mol modmaterial 0 0 BrushedMetal
mol modcolor 0 0 ColorID 0
mol modcolor 0 0 ColorID 12
mol color ColorID 12
mol representation VDW 0.100000 12.000000
mol selection all
mol material BrushedMetal
mol addrep 0
mol modstyle 1 0 Isosurface 0.000000 0 2 2 1 1
mol modstyle 1 0 Isosurface 0.000000 0 0 2 1 1
mol modstyle 1 0 Isosurface 0.000000 0 0 0 1 1
mol modmaterial 1 0 Transparent
mol modcolor 1 0 ColorID 31
mol modstyle 1 0 Isosurface 0.038714 0 0 0 1 1
render TachyonInternal vmdscene.tga display %s

If I comment out everything after the second "mol addrep 0" up until the
rendering, the file renders fine, showing only the atoms without the density.
The file is a large beryllium cell in slightly disordered hcp geometry.
I would be very grateful for ideas on how to render this file!
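
Because the script above is a recorded GUI session, it repeats many commands whose effect is overwritten by later ones. The following is a condensed sketch that keeps only the final value of each setting. It assumes the cube file is loaded as molecule 0 and that "mol new" creates representation 0 automatically. The reading of the six Isosurface parameters in the comments is my understanding of the VMD representation syntax, not something stated in the original script. The trailing "display %s" post-render viewer command is left out on the assumption that it is not needed for a batch run:

```tcl
# Load the cube file together with its volumetric data set.
mol new {Be131072_density.cube} type {cube} first 0 last -1 step 1 waitfor 1 volsets {0}

# Representation 0: the atoms as small VDW spheres.
mol modstyle    0 0 VDW 0.100000 12.000000
mol modcolor    0 0 ColorID 12
mol modmaterial 0 0 BrushedMetal

# Representation 1: the density isosurface.
# Parameters (assumed order): isovalue, volume id, show mode,
# draw mode, grid step, size.
mol addrep 0
mol modstyle    1 0 Isosurface 0.038714 0 0 0 1 1
mol modcolor    1 0 ColorID 31
mol modmaterial 1 0 Transparent

# Render the scene with the built-in Tachyon ray tracer.
render TachyonInternal vmdscene.tga
```

Removing the four lines of representation 1 corresponds to the atoms-only case that renders fine.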

Kind regards,
Lenz

-- 
Lenz Fiedler, M. Sc.
PhD Candidate | Matter under Extreme Conditions
Tel.: +49 3581 37523 55
E-Mail: l.fiedler_at_hzdr.de
https://www.casus.science
CASUS - Center for Advanced Systems Understanding
Helmholtz-Zentrum Dresden-Rossendorf e.V. (HZDR)
Untermarkt 20
02826 Görlitz
Vorstand: Prof. Dr. Sebastian M. Schmidt, Dr. Diana Stiller
Vereinsregister: VR 1693 beim Amtsgericht Dresden

On 4/4/22 18:37, John Stone wrote:
> Hi,
>    Right, using the non-MPI Tachyon within VMD is correct.
> It will result in the individual MPI ranks doing their own Tachyon
> renderings, which is the right thing for most typical VMD+MPI
> workloads like movie renderings.
>
> If you're running out of node memory, there are a few ways
> we might "tame" the memory use in VMD/Tachyon for your cube file
> scenario.  The 9GB cube file doesn't sound like it should result
> in a scene that would create a huge memory footprint.  Are you running
> multiple VMD MPI ranks on the same machine still?  If so, then I would
> begin by avoiding that, so that each MPI process gets the full node
> memory.
>
> Regarding the rendering of the cube file, what representations are you
> using?  Just isosurface, or do you have lots of other representations
> as well?  Is there any other molecular geometry?
>
> I might have suggestions for you to try to reduce that memory footprint
> assuming you've already switched to running only one MPI rank per node.
>
> Best,
>    John Stone
>
>
> On Mon, Apr 04, 2022 at 05:45:52PM +0200, Lenz Fiedler wrote:
>> Hi John,
>>
>>
>> Thank you so much - the error was indeed caused by the MPI version of
>> Tachyon! It was just as you described: I had compiled the MPI version
>> of both VMD and Tachyon. After switching to the serial version of the
>> latter, I don't get the crash anymore! :)
>>
>> Does this mean then that the rendering will be done in serial only
>> on rank 0? I am trying to render an image based on a very large
>> (9GB) .cube file (with isosurface), and so far runs on 1, 2 and 4
>> nodes with 360GB shared memory have all resulted in a segmentation
>> fault. I assume it is memory related, because I can render smaller
>> files just fine.
>>
>>
>> Also thanks for the info regarding the threading, I will keep that in mind!
>>
>>
>> Kind regards,
>>
>> Lenz
>>
>>
>> On 4/4/22 17:04, John Stone wrote:
>>> Hi,
>>>    The MPI bindings for VMD are really intended for multi-node runs
>>> rather than for dividing up the CPUs within a single node.  The output
>>> you're seeing shows that VMD is counting 48 CPUs (hyperthreading, no doubt)
>>> for each MPI rank, even though they're all being launched on the same node.
>>> The existing VMD startup code doesn't automatically determine when sharing
>>> like this occurs, so it's just behaving the same way it would if you had
>>> launched the job on 8 completely separate cluster nodes.  You can set some
>>> environment variables to restrict the number of shared memory threads
>>> VMD/Tachyon use if you really want to run all of your ranks on the same node.
>>>
>>> The warning you're getting from OpenMPI about multiple initialization
>>> is interesting.  When you compiled VMD, you didn't compile both VMD
>>> and the built-in Tachyon with MPI enabled, did you?  If Tachyon is also
>>> trying to call MPI_Init() or MPI_Init_thread() that might explain
>>> that particular error message.  Have a look at that and make sure
>>> that (for now at least) you're not compiling the built-in Tachyon
>>> with MPI turned on, and let's see if we can rid you of the
>>> OpenMPI initialization errors+warnings.
>>>
>>> Best,
>>>    John Stone
>>>    vmd_at_ks.uiuc.edu
>>>
>>> On Mon, Apr 04, 2022 at 04:39:17PM +0200, Lenz Fiedler wrote:
>>>> Dear VMD users and developers,
>>>>
>>>>
>>>> I am facing a problem in running VMD using MPI.
>>>>
>>>> I compiled VMD from source (alongside Tachyon, which I would like to
>>>> use for rendering). I first checked everything in serial, where it
>>>> worked. Now, after compiling in parallel, I struggle to run VMD.
>>>>
>>>> For example, I allocate 8 CPUs on a cluster node that has 24 CPUs in
>>>> total. Afterwards, I run:
>>>>
>>>> mpirun -np 8 vmd
>>>>
>>>> and I get this output:
>>>>
>>>> Info) VMD for LINUXAMD64, version 1.9.3 (April 4, 2022)
>>>> Info) http://www.ks.uiuc.edu/Research/vmd/
>>>> Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
>>>> Info) Please include this reference in published work using VMD:
>>>> Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
>>>> Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
>>>> Info) -------------------------------------------------------------
>>>> Info) Initializing parallel VMD instances via MPI...
>>>> Info) Found 8 VMD MPI nodes containing a total of 384 CPUs and 0 GPUs:
>>>> Info)    0:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    1:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    2:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    3:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    4:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    5:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    6:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> Info)    7:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
>>>> --------------------------------------------------------------------------
>>>> Open MPI has detected that this process has attempted to initialize
>>>> MPI (via MPI_INIT or MPI_INIT_THREAD) more than once.  This is
>>>> erroneous.
>>>> --------------------------------------------------------------------------
>>>> [gv002:139339] *** An error occurred in MPI_Init
>>>> [gv002:139339] *** reported by process [530644993,2]
>>>> [gv002:139339] *** on a NULL communicator
>>>> [gv002:139339] *** Unknown error
>>>> [gv002:139339] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>> [gv002:139339] ***    and potentially your MPI job)
>>>>
>>>>
>>>> From the output it seems to me that each of the 8 MPI ranks assumes
>>>> it is rank zero? At least the fact that each rank gives 48 CPUs
>>>> (24*2 I assume?) makes me believe that.
>>>>
>>>> Could anyone give me a hint on what I might be doing wrong? The
>>>> OpenMPI installation I am using has been used for many other
>>>> programs on this cluster, so I would assume it is working correctly.
>>>>
>>>>
>>>> Kind regards,
>>>>
>>>> Lenz
>>>>
>>>
>
>