Questions on Non-equilibrium TI Phase with Slurm Scheduler in 'AZtutorial.py' Script

jungyong · November 21, 2023, 10:01am

Hello PMX community,

I have been using a customized ‘AZtutorial.py’ script from the tutorial, and I’ve come across a few questions related to non-equilibrium TI phase calculations with the Slurm scheduler.

1. Absence of Slurm’s array argument in the ‘AZtutorial.py’ script: I noticed that the _submission_script() method in ‘AZtutorial.py’ has no Slurm array argument, whereas the _submission_script() method in ‘jobscript.py’ in the pmx source does. Was the array left out of ‘AZtutorial.py’ intentionally, perhaps to keep the tutorial simple?
2. Performance increase with Slurm arrays: I would like to use Slurm’s array feature via the _submission_script() method from ‘jobscript.py’ in the pmx source. It seems that using an array could give a significant performance increase. Have others in the community experimented with this, and if so, what were the outcomes?
3. Differences in results or TI calculations with arrays: When using a Slurm array, do the results differ from those obtained with a for loop? Are there any considerations or potential issues to be aware of when using the array feature, especially for the non-equilibrium TI calculation?
4. System size consistency for unbound and bound states: For non-equilibrium transitions in the unbound state and the bound state (complex), do the two simulation boxes need to have the same system size? Does keeping the system sizes equal matter for the accuracy of the results?

Thank you!

bgroot · November 21, 2023, 10:43am

Hi,

1. I think the tutorial is older; the Slurm array argument was probably added later and included in the pmx distribution.
2. I haven’t tried this, but please go ahead and test (and report back).
3. The calculations should be independent, so the results should not be affected. Sanity-checking is always a good idea, though, so if you see any differences, please report back.
4. No, this is not necessary, as dH/dl is computed only for the particles involved in the transition. So as long as the systems are “large enough”, there is no need for the bound and unbound systems to be the same size.

Bert

jungyong · December 4, 2023, 9:37am

I have observed that harnessing Slurm arrays can yield superior performance compared to running a traditional for loop.

Let’s consider a scenario with 80 frames in a production run (TI step) using pmx.

Instead of looping 80 times over grompp and mdrun, one frame at a time with all 96 cores, I used a Slurm array: 16 array jobs run concurrently with 6 cores each, so each job loops over only 5 frames (80 / 16). The total core count is unchanged (16 × 6 = 96), but the result is an approximately 2x increase in performance.
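The frame bookkeeping behind this layout can be sketched as follows. This is only an illustration, not how pmx or ‘jobscript.py’ distributes the work: the function name frames_for_task and the round-robin split are assumptions. The only Slurm feature relied on is the SLURM_ARRAY_TASK_ID environment variable, which Slurm sets for each task of an array job.

```python
import os


def frames_for_task(task_id: int, n_tasks: int, n_frames: int) -> list[int]:
    """Return the frame indices one array task handles (round-robin split).

    Hypothetical helper for illustration; not part of pmx.
    """
    return list(range(task_id, n_frames, n_tasks))


if __name__ == "__main__":
    # Slurm sets SLURM_ARRAY_TASK_ID inside each array task;
    # fall back to 0 so the sketch also runs outside Slurm.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    frames = frames_for_task(task_id, n_tasks=16, n_frames=80)
    print(f"task {task_id} handles {len(frames)} frames: {frames}")
```

With 80 frames and 16 tasks, every task gets exactly 5 frames, so all tasks should finish at roughly the same time.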

A configuration with 48 array jobs and 2 cores per job gave a similar twofold performance boost.
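To try different combinations of array size and cores per job, one could generate the batch script from two parameters. The sketch below is a minimal stand-in, not the _submission_script() method from pmx: the function array_job_script and the placeholder command are assumptions. The #SBATCH options used (--array, --ntasks, --cpus-per-task) are standard sbatch options.

```python
def array_job_script(n_tasks: int, cpus_per_task: int, command: str) -> str:
    """Build the text of a Slurm array job script.

    Hypothetical helper for illustration; not the pmx implementation.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --array=0-{n_tasks - 1}",          # one array task per index
        "#SBATCH --ntasks=1",                        # each array task is a single process
        f"#SBATCH --cpus-per-task={cpus_per_task}",  # cores given to each array task
        "",
        command,                                     # placeholder for the per-task work
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The two layouts mentioned above both use 96 cores in total.
    for n_tasks, cpus in [(16, 6), (48, 2)]:
        print(array_job_script(n_tasks, cpus, 'echo "task $SLURM_ARRAY_TASK_ID"'))
```

Note that 80 frames do not divide evenly over 48 tasks, so with this layout some tasks would process one more frame than others.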

Note that the effectiveness of this strategy depends on factors such as the nature of the data, the data structure, and the molecular dynamics parameters, and results may vary with the number of frames set in the simulation. These factors can have a considerable impact on overall performance and should be considered carefully when applying such optimizations.

In my case, I observed a ‘significant’ reduction in execution time.

It’s worth mentioning that this method of reducing the number of cores while calculating frames in parallel appears particularly effective in the TI stage, where fast transitions are crucial.

jungyong
