NCSA Nanotechnology Initiative
Additional details for running GMIN in parallel
The parallel code has been tested on the NT cluster at NCSA (communications via Myrinet)
and on the IBM RS6000 cluster (communications via 100 Mbps Ethernet) at CMMS.
Implementation and testing on a parallel LINUX cluster are underway.
Online manual for the scalar version of GMIN.
Up to 36 GMIN optimization searches may be run in parallel.
- Each search can differ in step size, temperature or the initial coordinates.
- The number of atoms in each initial coordinate set must be the same.
- It is best to make sure that there are sufficient processors available so that each optimization
takes place on a separate node. Otherwise, the output data from two searches with different
initial conditions (i.e. different initial coordinates, step size or temperature)
will appear in the same output file, making the analysis difficult.
- The "coords.serial_search_number" file contains
(number of different coordinate sets * number of different step sizes) sets
of coordinates (see example below).
Up to 100 optimization jobs (parallel or scalar) may be combined into one overall
job, each part of which runs in serial.
- The data file used is the same in each job.
- Therefore, the system being studied must be the same in each serial job.
- The combination of temperatures and step sizes utilized for the parallel searches currently
must be the same for each serial job.
- The number of atoms in the cluster may change for each of the serial jobs.
Examples of jobs
1. No. of serial jobs = 1, consisting of a parallel search on 6 processors.
The data file needed for this input is the first of the example data files given below.
First job: 3 sets of coordinates for an 80 atom cluster, running at 2 different temperatures
for each coordinate set and using only 1 step size.
coords.0 file contains just the coordinates for each initial configuration of the 80-atom cluster.
2. No. of serial jobs = 3, each consisting of searches running in parallel on 36 processors.
The data file needed for this input is the second of the example data files given below.
First job: 4 sets of coordinates for 80-atom cluster, with searches
for each set at 3 different temperatures
and 3 different step sizes.
Total number of searches in parallel = (Number of different sets of coordinates * Number of different step sizes
* Number of different temperatures for search)
coords.0 file contains 3*coordinate_set_1 followed by 3*coordinate_set_2 followed by 3*coordinate_set_3 followed by 3*coordinate_set_4.
Second job: 4 sets of coordinates for 81-atom cluster, with searches
for each set at 3 different temperatures
and 3 different step sizes.
coords.1 file contains 3*coordinate_set_1 followed by 3*coordinate_set_2 followed by 3*coordinate_set_3
followed by 3*coordinate_set_4, all for 81-atom cluster.
Third job: 4 sets of coordinates for 82-atom cluster, with searches
for each set at 3 different temperatures
and 3 different step sizes.
coords.2 file contains 3*coordinate_set_1 followed by 3*coordinate_set_2 followed by 3*coordinate_set_3
followed by 3*coordinate_set_4, all for 82-atom cluster.
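The counting rules used in these examples can be checked with a few lines of Python. This is only an illustrative sketch and is not part of GMIN; the function names count_searches and sets_in_coords_file are hypothetical.

```python
def count_searches(n_coord_sets, n_step_sizes, n_temperatures):
    """Total number of searches run in parallel within one serial job."""
    return n_coord_sets * n_step_sizes * n_temperatures


def sets_in_coords_file(n_coord_sets, n_step_sizes):
    """Number of coordinate sets each coords.N file must hold (npar)."""
    return n_coord_sets * n_step_sizes


# Example 1: 3 coordinate sets, 1 step size, 2 temperatures
print(count_searches(3, 1, 2), sets_in_coords_file(3, 1))   # 6 3
# Example 2: 4 coordinate sets, 3 step sizes, 3 temperatures
print(count_searches(4, 3, 3), sets_in_coords_file(4, 3))   # 36 12
```

The first number is the processor count needed for one search per node; the second is the value that would be given as npar on the PARALLEL line.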
Input files required for the parallel version are:
- "data" file containing keywords - see
Examples of data files
- "coords.serial_run_number" - Coordinates files for each of the serial jobs - the first file
is always coords.0.
The number of sets of atom coordinates in each file is
(number of different coordinate sets * number of different step sizes).
- A directory for each serial run, labeled "number_of_serial_run-1" (i.e. 0, 1, 2, ...),
must be created prior to execution of GMIN.
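The preparation described in this list could be scripted. The sketch below is an assumption about how one might do so in Python and is not part of the GMIN distribution; the function name prepare_run is hypothetical. It creates the numbered directories and reports any input file that is missing.

```python
import os


def prepare_run(workdir, nserial):
    """Create directories 0 .. nserial-1 and return the names of missing input files."""
    missing = []
    for run in range(nserial):
        # One directory per serial run, numbered from 0
        os.makedirs(os.path.join(workdir, str(run)), exist_ok=True)
        coords = "coords.%d" % run
        if not os.path.isfile(os.path.join(workdir, coords)):
            missing.append(coords)
    # The same "data" file is shared by all serial runs
    if not os.path.isfile(os.path.join(workdir, "data")):
        missing.append("data")
    return missing
```

Calling prepare_run(".", 3) before submitting the second example job would create 0/, 1/ and 2/ and return an empty list if data, coords.0, coords.1 and coords.2 are all present.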
The following flags must appear in the makefile in order to ensure that the GMIN program compiles
and runs successfully on these platforms. The flags may also be generally applicable.
On the NCSA NT supercluster:
Compiler and options:
F77 = fl32
FFLAGS = /4L132 /G5 /Ox /nologo
- /4L132 allows the line length to extend to 132 characters.
- /G5 is an optimization flag for a Pentium.
- /nologo suppresses the compiler's startup banner.
To ensure successful linking:
LD = link /NODEFAULTLIB:libc.lib /NODEFAULTLIB:libcmt.lib /out:$(OUTNM)
where OUTNM = gmin.x
Set the HPVM_HOME and MS_HOME variables to the correct locations:
HPVM_HOME = D:\apps\hpvm
MS_HOME = D:\apps\ms\DevStudio\VC
Libraries and files which are required:
- MPI_INCLUDE = /I $(HPVM_HOME)\include\mpif.h
- MPI_LIB = $(HPVM_HOME)\lib\mpi.lib
- FM_LIB = $(HPVM_HOME)\lib\fm.lib
- ADVAP_LIB = D:\apps\ms\DevStudio\VC\lib\advapi32.lib
- KERNEL_LIB= D:\apps\ms\DevStudio\VC\lib\kernel32.lib
- WSOCK_LIB = D:\apps\ms\DevStudio\VC\lib\wsock32.lib
- LIBS = $(MPI_LIB) $(FM_LIB) $(ADVAP_LIB) $(KERNEL_LIB) $(WSOCK_LIB) $(IMPLEMENTATION_LIBS)
Compilation:
$(PROG): $(SRCS)
$(F77) -c /D_WIN32 $(MPI_INCLUDE) $(SOURCE_FILES)
$(LD) /OUT:$(OUTNM) $(OBJECT_FILES) $(LIBS)
Dependencies on files included in subroutines are not required in the makefile.
On an IBM RS6000 cluster running the AIX operating system:
We used the mpxlf Fortran compiler, which invokes a shell script to compile the Fortran programs
while linking in the Partition Manager, the Message Passing Interface (MPI) and the Message Passing Library (MPL).
FFLAGS = -qfixed=132 -O2 -qarch=pwr3
LIBS = -lblas
The code does not run properly if compiled with the -O3 flag.
Dependencies on files included in subroutines must be listed in the makefile,
e.g., file.o: mpif.h params.h commons.h
On the parallel LINUX cluster at UNM:
We used the mpif77 compile script, which compiles the code with g77 and automatically
links in all the MPI libraries and include files.
FFLAGS = -ffixed-line-length-132 -O2
LIBS = -llapack -lblas
The code does not run properly if compiled with the -funroll-loops flag.
It is important to remember that, in general, libf2c in g77 accepts file
unit numbers only in the range 0 through 99.
Therefore any OPEN, WRITE or READ statement that uses a unit number of 100 or above
(e.g. Unit=100) will lead to a run-time crash in the program.
See the GNU Fortran online manual for more detail.
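One way to look for such statements before compiling with g77 is to scan the source for literal unit numbers above 99. The following Python sketch is an illustration only: the function name find_large_units and the sample Fortran lines are made up, and the scan only catches unit numbers written as literals, not units held in variables.

```python
import re

# OPEN(, READ( or WRITE( followed by a literal unit number,
# with or without the UNIT= keyword
UNIT_RE = re.compile(r"\b(OPEN|READ|WRITE)\s*\(\s*(?:UNIT\s*=\s*)?(\d+)",
                     re.IGNORECASE)


def find_large_units(source, limit=99):
    """Return (line_number, statement, unit) for literal units above `limit`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if line[:1] in ("C", "c", "*", "!"):  # skip fixed-form comment lines
            continue
        for match in UNIT_RE.finditer(line):
            unit = int(match.group(2))
            if unit > limit:
                hits.append((lineno, match.group(1).upper(), unit))
    return hits


# Made-up sample source, for illustration only
sample = """\
      OPEN(UNIT=100,FILE='lowest',STATUS='UNKNOWN')
      WRITE(6,*) 'ok'
C     READ(101,*) X
      READ (150, *) X
"""
print(find_large_units(sample))  # [(1, 'OPEN', 100), (4, 'READ', 150)]
```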
On the NCSA NT supercluster:
Using the HPVM Client server:
Command line: \\ntsc-file1\home\username\gmin.x data -np 36 -key Trial
Output: \\ntsc-file1\home\username\output.err
On the AIX cluster:
Using LoadLeveler:
See the IBM manual for Job Command File examples.
Some adaptations and additions have been made to keywords for the parallel version.
- PARALLEL npar nserial: This indicates that a number of searches
will be done in parallel.
npar is the (number of different coordinate sets * number of different initial step sizes).
nserial is the number of parallel jobs to be run in serial.
This keyword must be at the top of the data file since the TEMPERATURE and STEP flags are adjusted
if this is set.
- STEPS mcsteps1 tfac1 mcsteps2 tfac2 mcsteps3 tfac3 :
determines the number of steps in the MC searches through the
integers mcstepsn, and the annealing protocols through the real variables tfacn.
The temperature is multiplied by the corresponding tfac after every step in each search,
unless TADD is set.
In parallel searches, the step size and temperature for the first three parallel searches are determined by the input
on the STEP and STEPS lines. These values are then used in the same order to determine
the initial conditions in the other parallel searches.
(e.g. MC searches 4, 7 and 10 will have tfac=tfac1, etc.)
- TADD: This keyword indicates that the real value tfacn on the STEPS line will be added
to the temperature set by the keyword TEMPERATURE at the beginning of the MC search,
instead of being used as a scaling factor.
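The two ways in which tfac can act on the temperature may be easier to see in code. The Python sketch below only illustrates the rule as described above and is not taken from the GMIN source; the function name search_temperature and the 0-based step counter are assumptions.

```python
def search_temperature(temperature, tfac, mc_step, tadd=False):
    """Temperature in use at MC step `mc_step` (counted from 0) of one search."""
    if tadd:
        # TADD set: tfac is added once to the TEMPERATURE value at the start
        # of the search, and the temperature is then held constant
        return temperature + tfac
    # TADD not set: the temperature is multiplied by tfac after every step
    return temperature * tfac ** mc_step


print(search_temperature(1.0, 0.5, 3))              # 0.125
print(search_temperature(1.0, 0.5, 3, tadd=True))   # 1.5
```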
1. This is an example of a data input file for running 6 searches in parallel.
There are three different sets of coordinates for an n-atom LJ cluster
in the coords.0 file.
PARALLEL 3
STEPS 10000 0.0 10000 0.2
STEP 0.35 0.4
TADD
TEMPERATURE 0.6
BASIN 0.1 0.01
SAVE 5
EDIFF 0.001
QMAX 1.0D-9 1.0D-3
MAXIT 250 500
SORT
Each basin-hopping search:
- 10000 MC steps
- The STEP keyword sets the maximum change of any Cartesian
coordinate to 0.35. The tolerance on the binding energy of individual
atoms (only for Morse and LJ potentials) below which an angular step is
taken for an atom is 0.4.
- The reduced temperature is set at 0.6 in the searches on the first 3 processors (nodes 0-2)
for coordinate sets 1-3, by the "0.6" specified on the TEMPERATURE line.
- The TADD keyword in combination with the "0.2" specified on the STEPS line
sets the reduced temperature to 0.8 in the searches in parallel on an additional 3 processors (nodes 3-5) for each
of the coordinate sets 1-3.
- The temperature is held constant and the step size is altered every 50 steps
to achieve the default acceptance ratio of 0.5 for the MC steps.
- If the TADD keyword were removed,
the "0.2" specified on the STEPS line would be the scaling factor for the temperature at each MC step.
- The BASIN keyword indicates that the convergence criteria for the change in energy
between successive steps and the RMS force for the basin-hopping quenches are 0.1 and 0.01 respectively.
This is equivalent to the SLOPPYCONV keyword in the most recent scalar version of GMIN.
- At the end of the 10000 MC steps, GMIN will save the 5 lowest-energy minimum structures and converge these
more tightly.
- Quench minima are only considered to be different if their energies in the basin-hopping quenches
differ by at least 0.001, as given in the EDIFF line.
- The QMAX keyword sets the convergence criteria for the change in energy between successive
steps and the RMS force in the final set of quenches to 10^-9 and 10^-3 respectively.
This is equivalent to the TIGHTCONV keyword in the most recent scalar version of GMIN.
- The maximum number of iterations (MAXIT) in the energy quenches is 250 for the basin-hopping quenches
and 500 for the final quenches.
- The SORT keyword is useful for pairwise potentials such as the LJ potential. It sorts
the coordinates printed in the "lowest" files from the most to least strongly bound.
Initial conditions for search on each processor node
Number of processor node | 0  | 1  | 2  | 3  | 4  | 5
Temperature              | T1 | T1 | T1 | T2 | T2 | T2
Coord. Set               | 1  | 2  | 3  | 1  | 2  | 3
Step                     | 1  | 1  | 1  | 2  | 2  | 2
2. This is an example of a data input file for running 3 jobs in serial, each consisting of 36 searches in parallel:
1st job: LJn atom cluster - 4 sets of different coordinates into coords.0
2nd job: LJn+1 atom cluster - 4 sets of different coordinates into coords.1
3rd job: LJn+2 atom cluster - 4 sets of different coordinates into coords.2
PARALLEL 12 3
STEPS 10000 0.0 10000 0.2 10000 0.4
STEP 0.25 0.4 0.35 0.4 0.45 0.4
TADD
TEMPERATURE 0.6
BASIN 0.1 0.01
SAVE 5
EDIFF 0.001
QMAX 1.0D-9 1.0D-3
MAXIT 250 500
SORT
Each basin-hopping search:
- 10000 MC steps
- The step size and temperature for the first three parallel searches are determined by the input
on the STEP and STEPS lines. These values are then used in the same order to determine
the initial conditions in the other parallel searches.
- Temperature is 0.6 for searches on nodes 0-11.
- Temperature is 0.8 for searches on nodes 12-23.
- Temperature is 1.0 for searches on nodes 24-35.
- The maximum linear step size is 0.25 for one search on each set of coordinates at each temperature,
i.e. on nodes 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33.
- The maximum linear step size is 0.35 for one search on each set of coordinates at each temperature,
i.e. on nodes 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34.
- The maximum linear step size is 0.45 for one search on each set of coordinates at each temperature,
i.e. on nodes 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35.
- See Summary of output format for an explanation
of how the output from each node and each serial run may be identified.
Initial conditions for search on each processor node
Number of processor node | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 | ... | 24 | 25 | 26 | 27 | ... | 35
Temperature              | T1 | T1 | T1 | T1 | T1 | T1 | T1 | T1 | T1 | T1 | T1 | T1 | T2 | T2 | T2 | T2 | ... | T3 | T3 | T3 | T3 | ... | T3
Coord. Set               | 1  | 1  | 1  | 2  | 2  | 2  | 3  | 3  | 3  | 4  | 4  | 4  | 1  | 1  | 1  | 2  | ... | 1  | 1  | 1  | 2  | ... | 4
Step                     | 1  | 2  | 3  | 1  | 2  | 3  | 1  | 2  | 3  | 1  | 2  | 3  | 1  | 2  | 3  | 1  | ... | 1  | 2  | 3  | 1  | ... | 3
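The assignment of initial conditions to nodes in this example can be reproduced with a short Python sketch. It is not GMIN code: the function name node_conditions is hypothetical, and the ordering (step size varying fastest, then coordinate set, then temperature) is inferred from the node lists given for this example.

```python
def node_conditions(n_coord_sets, step_sizes, tadd_values, temperature):
    """Return a list of (node, temperature, coord_set, step_size) in node order."""
    rows = []
    node = 0
    for tadd in tadd_values:                            # slowest index: temperature
        for coord_set in range(1, n_coord_sets + 1):    # then coordinate set
            for step in step_sizes:                     # fastest index: step size
                # round() removes floating-point noise such as 0.6 + 0.2
                rows.append((node, round(temperature + tadd, 6), coord_set, step))
                node += 1
    return rows


# Second example: 4 coordinate sets, STEP 0.25/0.35/0.45,
# TADD offsets 0.0/0.2/0.4 and TEMPERATURE 0.6
for row in node_conditions(4, [0.25, 0.35, 0.45], [0.0, 0.2, 0.4], 0.6):
    print(row)
```

With these inputs, node 13 comes out as temperature 0.8, coordinate set 1 and step size 0.35, which agrees with the lists of nodes given for this example.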
A search which utilizes the coordinates in the "coords.serial_run_number" file
will write its output
to "output.serial_run_number.processor_number" files in the directory labeled "serial_run_number".
For the first example shown above:
- Create directory 0/ and files data and coords.0 before starting GMIN.
- Submit job.
- Error data, details of which processors are being used and notification of starting the next job
in the serial run are sent to the standard output file stated in the submission script.
- In directory 0/, the files output.0.0 - output.0.2 contain the details
of every 100th quench and the final quench for each coordinate set at a temperature of 0.6.
- Similarly, the files output.0.3 - output.0.5 contain the details
of every 100th quench and the final quench for each coordinate set at a temperature of 0.8.
- The files lowest.0.0 - lowest.0.2 contain the energy and points files
for the 5 lowest energy minima found from the searches for each coordinate set at a temperature of 0.6.
- Similarly, the files lowest.0.3 - lowest.0.5 contain the energy and points files
for the 5 lowest energy minima found from the searches for each coordinate set at a temperature of 0.8.
For the second example shown above:
- Create directories 0/, 1/ and 2/ and files data and coords.0,
coords.1 and coords.2 before starting GMIN.
- Submit job.
- Error data, details of which processors are being used and notification of starting the next job
in the serial run are sent to the standard output file stated in the submission script.
- In directory 0/, output.0.0 - output.0.35 and lowest.0.0 - lowest.0.35
files will be created with the initial conditions shown in the table for the second example above,
for the sets of coordinates for LJn.
- Similarly the output for LJn+1 and LJn+2 will be found in directories 1/ and
2/.
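The file naming described above can be used to check that a job has finished writing all of its output. The Python sketch below simply builds the expected names from the "output.serial_run_number.processor_number" pattern; the function name expected_files is hypothetical and the check itself is not part of GMIN.

```python
def expected_files(nserial, n_nodes):
    """Relative paths of the output.* and lowest.* files for every serial run."""
    paths = []
    for run in range(nserial):
        for node in range(n_nodes):
            for prefix in ("output", "lowest"):
                # e.g. 0/output.0.5 for serial run 0, processor node 5
                paths.append("%d/%s.%d.%d" % (run, prefix, run, node))
    return paths


files = expected_files(3, 36)
print(len(files), files[0], files[-1])  # 216 0/output.0.0 2/lowest.2.35
```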