From: John Stone (johns_at_ks.uiuc.edu)
Date: Tue Apr 26 2022 - 15:41:32 CDT

Hi,
  So what is described there isn't what we ended up implementing.
There is functionality along these lines in developmental parts of
VMD now, but not in the general case yet, for a few reasons:
1) I ended up deciding against mmap(), and instead we use another API
   known as direct I/O, which is portable across operating systems
   rather than being Unix-only.

2) To get the performance we want for out-of-core I/O (via any API),
   it ultimately requires a more purpose-designed trajectory format,
   which is what I ended up doing in the so-called ".js" file format,
   an early version of which is described here:
     https://link.springer.com/chapter/10.1007/978-3-642-24031-7_1

3) As we've implemented an increasing number of analytical and
   visualization features with GPU acceleration, ensuring that we
   had a way of supporting GPUs with out-of-core I/O became an
   increasingly important consideration that was not met by any
   existing approach. There is now a prototype implementation in
   VMD using a combination of 2) and 3) here, that can achieve
   massive I/O rates (over 70GB/sec for example, to network
   attached storage from a single DGX-2 compute node). This
   requires further specialization of the trajectory I/O code,
   and I've done it for some early cases, but it needs to become
   pervasive through VMD, and this is something that will be done
   using modern C++ constructs that require C++14 or later.
   Again, here we will still need special file formats to do it well,
   so at the outset, it will only be the ".js" format that is supported.
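
As a rough illustration of the direct I/O idea in 1), here is a minimal
Python sketch. It is not VMD's implementation. It assumes Linux, where
direct I/O is requested with the O_DIRECT open flag and reads must use
aligned buffers, offsets, and sizes; other operating systems expose
unbuffered I/O through their own interfaces. The names read_aligned and
BLOCK, and the 4096-byte alignment, are made up for this example.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; a real device may require another value


def _read_once(fd, offset, nbytes):
    # An anonymous mmap is page-aligned, which direct I/O needs for its buffer.
    buf = mmap.mmap(-1, nbytes)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        got = os.readv(fd, [buf])  # may return fewer bytes near end of file
        return buf[:got]           # slicing copies the data out as bytes
    finally:
        buf.close()


def read_aligned(path, offset, nbytes):
    """Read nbytes at offset, bypassing the page cache when the OS allows it."""
    if offset % BLOCK or nbytes <= 0 or nbytes % BLOCK:
        raise ValueError("offset and nbytes must be positive multiples of BLOCK")
    direct = getattr(os, "O_DIRECT", 0)  # 0 where the flag does not exist
    last_error = None
    # Try direct I/O first, then fall back to ordinary buffered reads,
    # since some filesystems reject O_DIRECT.
    for flags in (os.O_RDONLY | direct, os.O_RDONLY):
        try:
            fd = os.open(path, flags)
        except OSError as exc:
            last_error = exc
            continue
        try:
            return _read_once(fd, offset, nbytes)
        except OSError as exc:
            last_error = exc
        finally:
            os.close(fd)
    raise last_error
```

The alignment requirement is one reason a purpose-designed file layout
helps: if every frame starts on a block boundary, a reader can fetch a
frame with a single aligned read.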

For the time being, with DCD files or other legacy file formats,
the bigDCD script or your own "for" loop script is going to be the
best way to go. A lot of the readers for these existing trajectory
formats can't do full random access as currently implemented, so to
get reasonable performance the frames have to be processed "in-order"
at present.
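
The "in-order" pattern can be sketched as follows. This is an
illustration only: it is not the bigDCD script and it is not a DCD
reader. It assumes a hypothetical flat file of little-endian float32
x,y,z coordinates with a fixed number of atoms per frame, and the names
iter_frames, centroid and centroids are invented here. Only one frame
is held in memory at a time.

```python
import struct


def iter_frames(path, natoms):
    """Yield one frame at a time, front to back, as a flat tuple of floats."""
    frame_bytes = natoms * 3 * 4  # 3 coordinates per atom, 4 bytes each
    fmt = f"<{natoms * 3}f"
    with open(path, "rb") as fh:
        while True:
            raw = fh.read(frame_bytes)
            if len(raw) < frame_bytes:
                break  # end of file, or a truncated trailing frame
            yield struct.unpack(fmt, raw)


def centroid(frame):
    """Geometric center of one frame laid out as x0, y0, z0, x1, y1, z1, ..."""
    n = len(frame) // 3
    return tuple(sum(frame[k::3]) / n for k in range(3))


def centroids(path, natoms):
    """Per-frame analysis; each frame is discarded once it has been processed."""
    return [centroid(frame) for frame in iter_frames(path, natoms)]
```

Memory use stays at roughly one frame no matter how long the trajectory
is, which is the same trade-off the scripted approach makes: the result
of each per-frame calculation is kept, and the coordinates are not.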
    
If you have specific needs that require random access, let me know more
of the details. So far I haven't heard anything that would be an argument
against using BigDCD or similar methods with simple scripting approaches.

Best,
  John

On Tue, Apr 26, 2022 at 04:41:29PM -0300, Leonardo Palmieri wrote:
> Well, I also found this:
>
> I think it was from you, John.
>
> From: John Stone (johns_at_ks.uiuc.edu)
> Date: Fri Feb 15 2008 - 17:03:15 CST
>
> "...I'm also working on a future design change for the VMD internals that will
> enable it to work with trajectories that are far larger than the amount
> of physical memory in the machine through a new out-of-core trajectory
> plugin API. I will likely implement this first using my own special
> trajectory format and use mmap() and related kernel VM calls to allow
> VMD to map monstrously huge MD trajectories into virtual memory.
> The trick will be to add code for pre-fetching threads during trajectory
> analysis and playback, and to give the OS kernel "hints" about which
> timesteps need to be in-core and which ones can optionally be paged out.
> Later on, I hope to have a more general implementation that can work with
> any reasonable trajectory format (and without the need for mmap()), where
> VMD will keep a working set of frames in-core, and will dynamically
> load/free frames as analysis/visualization operations demand. This too
> will attempt to use scout threads to prefetch frames on-the-fly before
> they are needed so that the user "feels" like they were already in memory.
> I don't have a timeline for these developments yet, I'll know more once
> my experiments with my initial Unix-specific mmap() based implementation
> have made significant progress."
>
> That's what I'm talking about...
>
> 2022-04-26 16:35 GMT-03:00, Leonardo Palmieri <leopalmieri1_at_gmail.com>:
> > BigDCD and scripts work well, though there are sometimes problems...
> >
> > but the point is:
> >
> > I'm interested in using extensions from Extensions > Analysis, in graphics
> > mode, interactively accessing a remote node on the computer where
> > the trajectory is stored.
> >
> > I'm using compressed X11 forwarding to have the graphical VMD working
> > remotely, but the memory available per node cannot hold the entire
> > trajectory, and VMD crashes when it runs out of memory.
> >
> > That's the reason.
> >
> > Thanks!
> >
> > 2022-04-26 16:17 GMT-03:00, John Stone <johns_at_ks.uiuc.edu>:
> >> Can you tell us why the bigDCD script isn't a choice for you?
> >>
> >> Best,
> >> John
> >>
> >> On Tue, Apr 26, 2022 at 04:12:18PM -0300, Leonardo Palmieri wrote:
> >>> Hi everybody,
> >>>
> >>> Is there a way to analyse a trajectory without loading the entire
> >>> trajectory file into the computer's RAM?
> >>>
> >>> I know it is possible by choosing a subset of frames, choosing
> >>> a larger stride, or scripting with BigDCD, but none of those is an
> >>> option for me. Is there another way?
> >>>
> >>> Thanks in advance!
> >>>
> >>>
> >>> --
> >>> att
> >>>
> >>> Leonardo Palmieri
> >>
> >> --
> >> NIH Center for Macromolecular Modeling and Bioinformatics
> >> Beckman Institute for Advanced Science and Technology
> >> University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
> >> http://www.ks.uiuc.edu/~johns/ Phone: 217-244-3349
> >> http://www.ks.uiuc.edu/Research/vmd/
> >>
> >
> >
> > --
> > att
> >
> > Leonardo Palmieri
> >
> > Pai de gente
> > Pai de planta
> >

-- 
NIH Center for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
http://www.ks.uiuc.edu/~johns/           Phone: 217-244-3349
http://www.ks.uiuc.edu/Research/vmd/