Running Elan MPI jobs on QsNet Clusters
---------------------------------------

If built with --with-qshell or --with-mqshell, pdsh may be used to run
MPI jobs on a QsNet interconnect.  This option requires that you have
the Elan user space libraries installed (qsnetlibs or qswelan-r RPM for
Linux) and that your kernel be patched to run the 'elan3' or 'elan4' and
'rms' device drivers. Pdsh can run independently of the RMS product (the
'rms' kernel module, which is used by pdsh, is a distinct facility from
the RMS product).

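Before building with --with-qshell or --with-mqshell, it can help to
confirm that the drivers named above are loaded. The following is a
minimal sketch and not part of pdsh: the function name
qsnet_drivers_loaded is hypothetical, and it assumes that the loaded
kernel modules appear in /proc/modules under the same names as the
drivers (elan3, elan4, rms).

```shell
# Hypothetical helper, not shipped with pdsh.
# Succeeds if the module list (in /proc/modules format) contains an Elan
# driver (elan3 or elan4) and the rms driver.
# Assumption: module names match the driver names given in this README.
qsnet_drivers_loaded() {
    modfile="${1:-/proc/modules}"
    grep -Eq '^(elan3|elan4) ' "$modfile" 2>/dev/null || return 1
    grep -q '^rms ' "$modfile" 2>/dev/null || return 1
}

# Example use:
#   qsnet_drivers_loaded || echo "Elan/rms drivers not loaded" >&2
```
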
Quadrics has provided a PDSH FAQ which may answer some common questions
about getting the qshell module to run MPI jobs. Please see

  http://web1.quadrics.com/twiki/bin/view/FAQs/SetupPDSH


rms pdsh module
---------------------------------------
Pdsh can also be run via the Quadrics RMS 'allocate' command such that
allocate takes care of the node reservations and passes a batch ID through
to pdsh via the RMS_RESOURCEID environment variable.  Pdsh retrieves
the list of allocated nodes out of the RMS database using the rmsquery
command. This functionality is provided by the "rms" pdsh module
(--with-rms).

slurm pdsh module
---------------------------------------
Similar to the rms pdsh module, the slurm module allows pdsh to target
nodes based on SLURM allocations, either by targeting an already running
job or by running under ``srun --allocate''. The SLURM jobid can be passed
to pdsh using the `-j' option provided by the module, or via the
SLURM_JOBID environment variable, which is set by --allocate.

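The two ways of passing the jobid described above can be sketched as a
small shell helper. This is only an illustration: the function name
pdsh_slurm_cmd is hypothetical, and it prints the pdsh command line
rather than running it, so it can be tried on a machine without pdsh or
SLURM installed.

```shell
# Hypothetical helper: prints the pdsh command line instead of running it.
# Usage: pdsh_slurm_cmd JOBID COMMAND...
# Pass "" as JOBID to rely on the SLURM_JOBID environment variable.
pdsh_slurm_cmd() {
    jobid="${1:-}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    if [ -n "$jobid" ]; then
        # Explicit jobid: hand it to the slurm module with -j.
        printf 'pdsh -j %s %s\n' "$jobid" "$*"
    elif [ -n "${SLURM_JOBID:-}" ]; then
        # Inside an allocation: the jobid is taken from SLURM_JOBID.
        printf 'pdsh %s\n' "$*"
    else
        echo "no jobid given and SLURM_JOBID is not set" >&2
        return 1
    fi
}
```

For example, ``pdsh_slurm_cmd 1234 hostname'' prints the command that
targets the nodes of job 1234, while an empty first argument falls back
to the environment variable.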

The `/etc/elanhosts' config file
---------------------------------------
Pdsh uses a simple config file, /etc/elanhosts, to describe hosts
containing Elan adapters (and on which Elan MPI jobs may be run). The
config file is also used by the daemons qshd and mqshd to initialize
the Elan network error resolver thread. Parsing of the /etc/elanhosts
file is accomplished by using the libelanhosts(3) library, upon which
pdsh depends (when building for QsNet). See the elanhosts(5)
man page for a description of the /etc/elanhosts file format.

The libelanhosts package may be obtained from

  ftp://ftp.llnl.gov/pub/linux/libelanhosts/
