MPC++/ULT/SCore-S Getting Started


This manual describes how a C/C++ program can be converted to an MPC++ program, how the converted program is compiled, and how the compiled MPC++ program is invoked on a workstation cluster.

Porting the "Hello, World" program to MPC++

Here is how the simplest and most famous C program, "Hello, World", is converted to an MPC++ program.

#include <stdio.h>
#include <stdlib.h>
main() {
    printf("hello, world\n");
    exit( 0 );
}

First, include the header file mpcxx.h. Then rename the main() function to mpc_main(). Finally, rename every exit() call in the C/C++ program to mpc_exit(). Essentially, this is all you have to do.

#include <stdio.h>
#include <mpcxx.h>
mpc_main() {
    printf("hello, world\n");
    mpc_exit( 0 );
}

Compilation

Compiling an MPC++ program is as easy as compiling a normal C/C++ program. Let us assume the MPC++ program converted above is saved as hello.cc. The MPC++ compiler only accepts source files with the suffix .cc, so a file named hello.c cannot be compiled.

# mpc++ hello.cc
compiling  hello.cc
#

Just as a normal C/C++ compiler does, the MPC++ compiler produces an executable file named a.out. The MPC++ compiler accepts most of the compiler options that the C/C++ compiler accepts.

Running a program on a workstation

The compiled hello program can run on your workstation even if you do not have a workstation cluster. In this case, however, no actual parallel execution takes place.

# a.out
<0> SCore: Single processor mode
hello, world
#

You get the same result as the normal C/C++ program gives.

Running a program on a workstation cluster

First, invoke msgb to find free hosts in the workstation cluster. Then log in to one of the free hosts. Finally, invoke the program. Do not forget to add command arguments such as -score pe=2. The first argument, -score, tells the SCore-S runtime library that the next argument is an option string for SCore-S. The second argument in this example, pe=2, means that the program requires 2 nodes to run.

# a.out -score pe=2
<0> SCore: 2 processors are ready
hello, world
#

The result is exactly the same as the run on one host. What is wrong? Normally, the MPC++ runtime library calls the mpc_main() function only on the host that is assigned node number 0. To get a parallel execution of the hello program, you have to add SPMD_MODE after including mpcxx.h.

#include <stdio.h>
#include <mpcxx.h>
SPMD_MODE;
mpc_main() {
    printf("hello, world\n");
    mpc_exit( 0 );
}

Do not forget to recompile the modified file. Let us try again.

# a.out -score pe=2
<0> SCore: 2 processors are ready
hello, world
hello, world
#

Now you get two answers, because the mpc_main() function is called on each host. If you specify four nodes, you will get four answers. To confirm which node prints which answer, the hello program is now slightly modified.

#include <stdio.h>
#include <mpcxx.h>
SPMD_MODE;
mpc_main() {
    printf("hello, world (from node %d)\n", myNode);
    mpc_exit( 0 );
}

The variable myNode is defined in the MPC++/ULT runtime library and contains the node number. The integer variable numNode holds the number of nodes allocated for the parallel execution. Note that the value of numNode can differ from the number of nodes specified in the command argument. Therefore, an MPC++ program should not assume that numNode has any specific value. If your program can only run with a specific number of nodes, check the value of numNode at the very beginning of your program.
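The check described above might be sketched as follows. This is only an illustration in plain C++: the variables myNode and numNode are declared here as stand-ins so that the sketch compiles without mpcxx.h, and the helper name check_node_count() is hypothetical, not part of the MPC++/ULT runtime library.

```cpp
#include <cstdio>

// Stand-ins for the runtime variables described above, so that this sketch
// compiles without mpcxx.h. A real MPC++ program uses the runtime's own.
int myNode = 0;
int numNode = 4;

// Hypothetical helper: returns true when the allocated node count is the one
// the program was written for. Only node 0 reports a mismatch, so the
// message is printed once rather than once per node.
bool check_node_count(int required) {
    if (numNode == required) {
        return true;
    }
    if (myNode == 0) {
        std::fprintf(stderr, "need %d nodes, but %d were allocated\n",
                     required, numNode);
    }
    return false;
}
```

In an MPC++ program, mpc_main() would presumably call such a helper first and leave through mpc_exit() with a nonzero status when it returns false.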

However, you can give hints about the number of nodes to allocate in your program. SCORE_HINT_PENUM(MIN,MAX) passes these hints to the MPC++/ULT runtime library. MIN is the minimum number of nodes, and MAX is the maximum number of nodes.

#include <stdio.h>
#include <mpcxx.h>
SPMD_MODE;
SCORE_HINT_PENUM(4,16);
mpc_main() {
    printf("hello, world (from node %d)\n", myNode);
    mpc_exit( 0 );
}

Now, let us run the modified program again. This time, the request for the number of nodes can be omitted, and the minimum number of nodes, four in this case, will be allocated.

# a.out
<0> SCore: 4 processors are ready
hello, world (from node 0)
hello, world (from node 2)
hello, world (from node 1)
hello, world (from node 3)
#

Now you get four answers, exactly one from each node. Note the order of the answers. Since the mpc_main() function is called and executed on each node in parallel, there is no way to predict which answer comes first, which comes second, and so on. This is the nature of parallel computation.
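The same effect can be reproduced on a single machine. The following is a minimal sketch in standard C++ that uses threads in place of cluster nodes, which is an assumption made purely for illustration; it does not use MPC++ or SCore-S, and the function name run_hello() is hypothetical. Each thread produces one answer, and the order in which the answers are recorded depends on how the threads happen to be scheduled.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: starts `nodes` threads, each standing in for one node,
// and returns the answers in the order in which the threads recorded them.
std::vector<std::string> run_hello(int nodes) {
    std::vector<std::string> lines;
    std::mutex lock;  // protects `lines` against concurrent push_back
    std::vector<std::thread> workers;
    for (int node = 0; node < nodes; ++node) {
        workers.emplace_back([node, &lines, &lock] {
            std::string line =
                "hello, world (from node " + std::to_string(node) + ")";
            std::lock_guard<std::mutex> guard(lock);
            lines.push_back(line);
        });
    }
    // Wait for every stand-in node to finish before returning.
    for (std::thread& worker : workers) {
        worker.join();
    }
    return lines;
}
```

Calling run_hello(4) always yields four answers, one per stand-in node, but their order may change from run to run. A program that needs a fixed order has to impose it explicitly.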

SEE ALSO

SCore Cluster System Manuals


hori@rwcp.or.jp

Last Updated: 4/15/97

Parallel Distributed System Software Research Lab.

Tsukuba Research Center, Real World Computing Partnership