Ken Jordan's Theoretical Chemistry Group

NCSA Nanotechnology Initiative

Additional details for running GMIN in parallel

Platforms supported

The parallel code has been tested on the NT cluster at NCSA (communications via Myrinet) and on the IBM RS6000 cluster at CMMS (communications via 100 Mbps Ethernet). Implementation and testing on a parallel LINUX cluster are underway.

Input format

See the online manual for the scalar version of GMIN.

Up to 36 GMIN optimization searches may be run in parallel.

Each search can differ in step size, temperature or the initial coordinates.
The number of atoms in each initial coordinate set must be the same.
It is best to make sure that there are sufficient processors available so that each optimization takes place on a separate node. If this is not done, the output data from two searches with different initial conditions (i.e. different initial coordinates, step size or temperature) will appear in the same output file, making the analysis difficult.
The "coords.serial_search_number" file contains (number of different coordinate sets*number of different step sizes) sets of coordinates (see example below).

Up to 100 optimization jobs (parallel or scalar) may be combined into one overall job, each part of which runs in serial.

The data file used is the same in each job.
Therefore, the system being studied must be the same in each serial job.
The combination of temperatures and step sizes utilized for the parallel searches currently must be the same for each serial job.
The number of atoms in the cluster may change for each of the serial jobs.

Examples of jobs
1. No. of serial jobs = 1, consisting of a parallel search on 6 processors.
The data file needed for this input is given as example 1 under Examples.

First job: 3 sets of coordinates for an 80 atom cluster, running at 2 different temperatures for each coordinate set and using only 1 step size.
coords.0 file contains just the coordinates for each initial configuration of the 80-atom cluster.

2. No. of serial jobs = 3, each consisting of searches running in parallel on 36 processors.
The data file needed for this input is given as example 2 under Examples.

First job: 4 sets of coordinates for 80-atom cluster, with searches for each set at 3 different temperatures and 3 different step sizes.
Total number of searches in parallel = (Number of different sets of coordinates * Number of different step sizes * Number of different temperatures for search)
coords.0 file contains 3*coordinate_set_1 followed by 3*coordinate_set_2 followed by 3*coordinate_set_3 followed by 3*coordinate_set_4 (one copy of each set per step size).
Second job: 4 sets of coordinates for 81-atom cluster, with searches for each set at 3 different temperatures and 3 different step sizes.
coords.1 file contains 3*coordinate_set_1 followed by 3*coordinate_set_2 followed by 3*coordinate_set_3 followed by 3*coordinate_set_4, all for 81-atom cluster.
Third job: 4 sets of coordinates for 82-atom cluster, with searches for each set at 3 different temperatures and 3 different step sizes.
coords.2 file contains 3*coordinate_set_1 followed by 3*coordinate_set_2 followed by 3*coordinate_set_3 followed by 3*coordinate_set_4, all for 82-atom cluster.
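
The counting rules used in these examples can be checked with a few lines of Python. This is only a sketch: the helper names searches_per_job and sets_in_coords_file are hypothetical and are not part of GMIN, and the limit of 36 is simply the one quoted under Input format.

```python
MAX_PARALLEL_SEARCHES = 36  # limit quoted in this document, not read from GMIN


def searches_per_job(n_coord_sets, n_step_sizes, n_temperatures):
    """Total number of searches run in parallel within one serial job."""
    total = n_coord_sets * n_step_sizes * n_temperatures
    if total > MAX_PARALLEL_SEARCHES:
        raise ValueError(
            f"{total} searches exceeds the limit of {MAX_PARALLEL_SEARCHES}"
        )
    return total


def sets_in_coords_file(n_coord_sets, n_step_sizes):
    """Number of coordinate sets expected in one coords.N file."""
    return n_coord_sets * n_step_sizes


if __name__ == "__main__":
    print(searches_per_job(3, 1, 2))    # first example: 3 sets, 1 step size, 2 temperatures
    print(searches_per_job(4, 3, 3))    # second example: 4 sets, 3 step sizes, 3 temperatures
    print(sets_in_coords_file(4, 3))    # coordinate sets in coords.0 for the second example
```

The number of processors requested at submission should match the value returned by searches_per_job, so that each search gets its own node.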

Input files required for the parallel version are:

"data" file containing keywords - see Examples of data files
"coords.serial_run_number" - Coordinates files for each of the serial jobs - the first file is always coords.0.
Number of sets of atom coordinates in file is number of different coordinate sets*number of different step sizes.
Directories, one for each serial run and each named with the run number counted from zero (0, 1, 2, ...), must be created prior to execution of GMIN.
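
Setting up the working directory can be scripted. The sketch below is an assumption about how one might do it, not a tool shipped with GMIN: the function name prepare_run is hypothetical, and it only creates the numbered directories and reports which of the required input files are absent.

```python
from pathlib import Path


def prepare_run(workdir, nserial):
    """Create directories 0 .. nserial-1 and return the names of missing input files."""
    root = Path(workdir)
    missing = []
    # The keyword file is shared by every serial job.
    if not (root / "data").is_file():
        missing.append("data")
    for run in range(nserial):
        # One output directory per serial run, numbered from zero.
        (root / str(run)).mkdir(exist_ok=True)
        coords = f"coords.{run}"
        if not (root / coords).is_file():
            missing.append(coords)
    return missing


if __name__ == "__main__":
    for name in prepare_run(".", 3):
        print("missing input file:", name)
```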

Flags required in the Makefiles

The following flags must appear in the makefile in order to ensure that the GMIN program compiles and runs successfully on these platforms. The flags may also be generally applicable.

On the NCSA NT supercluster:

Compiler and options:
F77 = fl32
FFLAGS = /4L132 /G5 /Ox /nologo

/4L132 allows the line length to extend to 132 characters.
/G5 is an optimization flag for a Pentium.
/nologo suppresses the compiler's startup banner.

To ensure successful linking:
LD = link /NODEFAULTLIB:libc.lib /NODEFAULTLIB:libcmt.lib /out:$(OUTNM)
where OUTNM = gmin.x

Set HPVM_HOME variable to the correct location:
HPVM_HOME = D:\apps\hpvm
MS_HOME = D:\apps\ms\DevStudio\VC

Libraries and files which are required:

MPI_INCLUDE = /I $(HPVM_HOME)\include\mpif.h

MPI_LIB = $(HPVM_HOME)\lib\mpi.lib
FM_LIB = $(HPVM_HOME)\lib\fm.lib
ADVAP_LIB = D:\apps\ms\DevStudio\VC\lib\advapi32.lib
KERNEL_LIB= D:\apps\ms\DevStudio\VC\lib\kernel32.lib
WSOCK_LIB = D:\apps\ms\DevStudio\VC\lib\wsock32.lib
LIBS = $(MPI_LIB) $(FM_LIB) $(ADVAP_LIB) $(KERNEL_LIB) $(WSOCK_LIB) $(IMPLEMENTATION_LIBS)

Compilation:
$(PROG): $(SRCS)
	$(F77) -c /D_WIN32 $(MPI_INCLUDE) $(SOURCE_FILES)
	$(LD) /OUT:$(OUTNM) $(OBJECT_FILES) $(LIBS)

Dependencies on files included in subroutines are not required in the makefile.

On an IBM RS6000 cluster running the AIX operating system:

We used the mpxlf Fortran compiler which invokes a shell script to compile the Fortran programs while linking in the Partition Manager, Message Passing Interface (MPI) and Library (MPL).

FFLAGS = -qfixed=132 -O2 -qarch=pwr3
LIBS = -lblas

The code does not run properly if compiled with the -O3 flag.

Dependencies on files included in subroutines must be listed in the makefile:
e.g., file.o: mpif.h params.h commons.h

On the parallel LINUX cluster at UNM:

We used the mpif77 compile script which compiles the code with g77 and automatically links in all the MPI libraries and include files.

FFLAGS = -ffixed-line-length-132 -O2
LIBS = -llapack -lblas

The code does not run properly if compiled with the -funroll-loops flag.

It is important to remember that, in general, libf2c in g77 accepts file unit numbers only in the range 0 through 99. Any OPEN, READ or WRITE statement that uses a unit number of 100 or above (e.g. UNIT=100) will therefore crash the program at run time. See the GNU Fortran online manual for more detail.
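
Before building with g77 it may help to search the source for offending unit numbers. The Python sketch below is a rough heuristic under stated assumptions: the name find_large_units is hypothetical, it only recognizes literal unit numbers written directly after OPEN, READ or WRITE, and it skips fixed-form comment lines. Units held in variables or PARAMETER constants are not detected.

```python
import re

# OPEN/READ/WRITE followed by a literal unit number, with or without "UNIT=".
UNIT_RE = re.compile(
    r"\b(OPEN|READ|WRITE)\s*\(\s*(?:UNIT\s*=\s*)?(\d+)", re.IGNORECASE
)


def find_large_units(source_text, max_unit=99):
    """Return (line number, statement, unit) for literal units above max_unit."""
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        if line[:1] in ("C", "c", "*", "!"):
            continue  # fixed-form comment line
        for match in UNIT_RE.finditer(line):
            unit = int(match.group(2))
            if unit > max_unit:
                hits.append((lineno, match.group(1).upper(), unit))
    return hits


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path) as handle:
            for lineno, stmt, unit in find_large_units(handle.read()):
                print(f"{path}:{lineno}: {stmt} uses unit {unit}")
```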

Job submission

On the NCSA NT supercluster:
Using the HPVM Client server:

Command line: \\ntsc-file1\home\username\gmin.x data -np 36 -key Trial
Output: \\ntsc-file1\home\username\output.err

On the AIX cluster:

Using LoadLeveler:
See the IBM manual for Job Command File examples

Additional keywords

Some adaptations and additions have been made to keywords for the parallel version.

PARALLEL npar nserial: This indicates that a number of searches will be done in parallel. npar is the (number of different coordinate sets * number of different initial step sizes). nserial is the number of parallel jobs to be run in serial.
This keyword must be at the top of the data file since the TEMPERATURE and STEP flags are adjusted if this is set.
STEPS mcsteps1 tfac1 mcsteps2 tfac2 mcsteps3 tfac3 : determines the number of steps in the MC searches through the integers mcstepsn, and the annealing protocols through the real variables tfacn. The temperature is multiplied by the corresponding tfac after every step in each search, unless TADD is set.
In parallel searches, the step size and temperature for the first three parallel searches are determined by the input on the STEP and STEPS lines. These values are then used in the same order to determine the initial conditions in the other parallel searches (e.g. MC searches 4, 7 and 10 will have tfac=tfac1, etc.).
TADD: This keyword indicates that the real value tfacn on the STEPS line will be added to the temperature set by the keyword TEMPERATURE at the beginning of the MC search.
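
The two meanings of tfac can be written out as a short Python sketch. It follows the wording of the STEPS and TADD descriptions only and is not taken from the GMIN source; the function name temperature_at_step is hypothetical, and the assumption that the temperature stays fixed once TADD has been applied comes from the first example in this document.

```python
def temperature_at_step(t_start, tfac, step, tadd=False):
    """Temperature after `step` MC steps for a search starting at t_start."""
    if tadd:
        # TADD: tfac is added once at the start of the search; the
        # temperature is then assumed to stay constant.
        return t_start + tfac
    # Without TADD: the temperature is multiplied by tfac after every step.
    return t_start * tfac ** step


if __name__ == "__main__":
    print(temperature_at_step(0.6, 0.2, 100, tadd=True))  # constant offset
    print(temperature_at_step(0.6, 0.2, 2))               # geometric scaling
```

With TEMPERATURE 0.6 and tfac 0.2, the TADD branch gives a search temperature of about 0.8, while the scaling branch reduces the temperature by a factor of five at every step.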

Examples

1. This is an example of a data input file for running 6 searches in parallel. There are three different sets of coordinates for an n-atom LJ cluster in the coords.0 file.

PARALLEL 3
STEPS 10000 0.0 10000 0.2
STEP 0.35 0.4
TADD
TEMPERATURE 0.6
BASIN 0.1 0.01
SAVE 5
EDIFF 0.001
QMAX 1.0D-9 1.0D-3
MAXIT 250 500
SORT

Each basin-hopping search:

10000 MC steps
The STEP keyword specifies the maximum change of any Cartesian coordinate to 0.35. The tolerance on the binding energy of individual atoms (only for Morse and LJ potentials) below which an angular step is taken for an atom is 0.4.
The reduced temperature is set to 0.6 in the searches on the first 3 processors (nodes 0-2) for coordinate sets 1-3, by the "0.6" specified on the TEMPERATURE line.
The TADD keyword in combination with the "0.2" specified on the STEPS line sets the reduced temperature to 0.8 in the searches in parallel on an additional 3 processors (nodes 3-5) for each of the coordinate sets 1-3.
The temperature is held constant and the step size is altered every 50 steps to achieve the default acceptance ratio of 0.5 for the MC steps.
If the TADD keyword were removed, the "0.2" specified on the STEPS line would be the scaling factor for the temperature at each MC step.
The BASIN keyword indicates that the convergence criteria for the change in energy between successive steps and the RMS force for the basin-hopping quenches are 0.1 and 0.01 respectively. It is equivalent to the SLOPPYCONV keyword in the most recent scalar version of GMIN.
At the end of the 10000 MC steps, GMIN will save the 5 lowest-energy minimum structures and converge these more tightly.
Quench minima are only considered to be different if their energies in the basin-hopping quenches differ by at least 0.001, as given in the EDIFF line.
The QMAX keyword sets the convergence criteria for the change in energy between successive steps and the RMS force in the final set of quenches to 10^-9 and 10^-3 respectively. It is equivalent to the TIGHTCONV keyword in the most recent scalar version of GMIN.
The maximum number of iterations (MAXIT) in the energy quenches is 250 for the basin-hopping quenches and 500 for the final quenches.
The SORT keyword is useful for pairwise potentials such as the LJ potential. It sorts the coordinates printed in the "lowest" files from the most to least strongly bound.

Initial conditions for search on each processor node
	Number of processor node
	0	1	2	3	4	5
Temperature	T₁	T₁	T₁	T₂	T₂	T₂
Coord. Set	1	2	3	1	2	3
Step	1	1	1	1	1	1

2. Three jobs in serial, each consisting of 36 searches in parallel:
1st job: LJ_n atom cluster - 4 sets of different coordinates into coords.0
2nd job: LJ_n+1 atom cluster - 4 sets of different coordinates into coords.1
3rd job: LJ_n+2 atom cluster - 4 sets of different coordinates into coords.2

PARALLEL 12 3
STEPS 10000 0.0 10000 0.2 10000 0.4
STEP 0.25 0.4 0.35 0.4 0.45 0.4
TADD
TEMPERATURE 0.6
BASIN 0.1 0.01
SAVE 5
EDIFF 0.001
QMAX 1.0D-9 1.0D-3
MAXIT 250 500
SORT

Each basin-hopping search:

10000 MC steps
The step size and temperature for the first three parallel searches are determined by the input on the STEP and STEPS lines. These values are then used in the same order to determine the initial conditions in the other parallel searches.
Temperature is 0.6 for searches on nodes 0-11.
Temperature is 0.8 for searches on nodes 12-23.
Temperature is 1.0 for searches on nodes 24-35.
The maximum linear step size is 0.25 for each set of coordinates at each temperature,
i.e. on nodes 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33.
The maximum linear step size is 0.35 for each set of coordinates at each temperature,
i.e. on nodes 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34.
The maximum linear step size is 0.45 for each set of coordinates at each temperature,
i.e. on nodes 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35.
See Summary of output format for an explanation of how the output from each node and each serial run may be identified.

Initial conditions for search on each processor node
	Number of processor node
	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	...	24	25	26	27	...	35
Temperature	T₁	T₁	T₁	T₁	T₁	T₁	T₁	T₁	T₁	T₁	T₁	T₁	T₂	T₂	T₂	T₂	...	T₃	T₃	T₃	T₃	...	T₃
Coord. Set	1	1	1	2	2	2	3	3	3	4	4	4	1	1	1	2	...	1	1	1	2	...	4
Step	1	2	3	1	2	3	1	2	3	1	2	3	1	2	3	1	...	1	2	3	1	...	3
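
The layout in this table can be expressed as index arithmetic. The Python sketch below is inferred from the tables in this document rather than from the GMIN source, so treat the ordering as an assumption; node_conditions is a hypothetical helper name. It returns 1-based indices matching the table rows.

```python
def node_conditions(node, n_coord_sets, n_step_sizes, n_temperatures):
    """Return 1-based (temperature, coordinate set, step size) indices for a node."""
    npar = n_coord_sets * n_step_sizes  # searches per temperature (the PARALLEL npar value)
    if not 0 <= node < npar * n_temperatures:
        raise ValueError("node index out of range")
    # Temperatures change slowest: one block of npar nodes per temperature.
    temperature = node // npar + 1
    local = node % npar
    # Within a block, each coordinate set is repeated once per step size.
    coord_set = local // n_step_sizes + 1
    step = local % n_step_sizes + 1
    return temperature, coord_set, step


if __name__ == "__main__":
    # 4 coordinate sets, 3 step sizes, 3 temperatures: 36 nodes in total.
    for node in (0, 3, 12, 35):
        print(node, node_conditions(node, 4, 3, 3))
```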

Summary of output format

A search which utilizes the coordinates in the "coords.serial_run_number" file writes its output to "output.serial_run_number.processor_number" files in the directory labeled "serial_run_number".

For the first example shown above:

Create directory 0/ and files data and coords.0 before starting GMIN.
Submit job.
Error data, details of which processors are being used and notification of the start of the next job in the serial run are sent to the standard output file stated in the submission script.
In directory 0/, the files output.0.0 - output.0.2 contain the details of every 100th quench and the final quench for each coordinate set at a temperature of 0.6.
Similarly, the files output.0.3 - output.0.5 contain the details of every 100th quench and the final quench for each coordinate set at a temperature of 0.8.
The files lowest.0.0 - lowest.0.2 contain the energy and points files for the 5 lowest energy minima found from the searches for each coordinate set at a temperature of 0.6.
Similarly, the files lowest.0.3 - lowest.0.5 contain the energy and points files for the 5 lowest energy minima found from the searches for each coordinate set at a temperature of 0.8.

For the second example shown above:

Create directories 0/, 1/ and 2/ and files data and coords.0, coords.1 and coords.2 before starting GMIN.
Submit job.
Error data, details of which processors are being used and notification of the start of the next job in the serial run are sent to the standard output file stated in the submission script.
In directory 0/, output.0.0 - output.0.35 and lowest.0.0 - lowest.0.35 files will be created with the initial conditions shown in the table for the second example for the sets of coordinates for LJ_n.
Similarly the output for LJ_n+1 and LJ_n+2 will be found in directories 1/ and 2/.
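
When collecting results it can be convenient to generate the expected file names rather than type them. The sketch below assumes only the naming scheme described in this section; expected_outputs is a hypothetical helper and does not check that the files exist.

```python
def expected_outputs(serial_run, n_searches):
    """List the output.* and lowest.* paths expected for one serial run."""
    paths = []
    for node in range(n_searches):
        for prefix in ("output", "lowest"):
            # Files live in the directory named after the serial run number.
            paths.append(f"{serial_run}/{prefix}.{serial_run}.{node}")
    return paths


if __name__ == "__main__":
    # Three serial runs of 36 parallel searches each.
    for run in range(3):
        files = expected_outputs(run, 36)
        print(run, len(files), files[0], files[-1])
```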

Kenneth D. Jordan
Dept. of Chemistry, University of Pittsburgh,
219 Parkman Avenue, Pittsburgh, PA 15260
Phone: (412) 624-8690 FAX: (412) 624-8611 email: jordan at pitt.edu