Our Books: Parallel and Distributed Programming Using C++

In numericCombinations() function in Example 6.4, the PVM task is sent an array of floating point numbers as opposed to an array of bytes representing strings. So the colorCombinations() functions send its data to the PVM tasks using:

pvm_pkbyte(Buffer,strlen(Buffer),1);

pvm_send(TaskId,MessageId);

The numericCombination() function sends it data to the PVM tasks using:

pvm_pkdouble(ImportantNumbers,5,1);

pvm_send(TaskId,MessageId);

The colorCombinations() function in Example 6.4 builds a string of colors and then copies that string of colors into an array of char called Buffer. The array of char is then packed and sent to the PVM task using the pvm_pkbyte()and pvm_send() functions. The numericCombinations() function in Example 6.5 creates an array of doubles and sends it to the PVM task using the pvm_pkdouble()and pvm_send() functions. One function sends a character array, the other function sends an array of doubles. In both cases the PVM tasks are executing same program pvm_generic_combination. This is where the advantage of using C++ templates and genericity comes in. The same tasks are able to do work not only with different data but on different data types without a code change. The template facility in C++ helps to make the SPMD model more flexible and efficient. The pvm_generic_combination program is almost unaware of what data types it will be working with. The use of C++ container classes allows it to generate combinations of any vector<T> of objects. The pvm_generic_combinationprogram does know that it will be working with two data types. Example 6.6 shows a section of code from the pvm_generic_combination program.



// Example 6.6 Using the MessageId tag to distinguish data types

pvm_bufinfo(N,&NumBytes,&MessageId,&Ptid);
if(MessageId == 1){
   vector Source;
   Buf = new char[NumBytes];
   pvm_upkbyte(Buf,NumBytes,1);
   strstream Buffer;
   Buffer << Buf << ends;
   while(Buffer.good())
   {
      Buffer >> Color;
      if(!Buffer.eof()){
         Source.push_back(Color);
      }
   }
   generateCombinations(Source,Ptid,Value);
   delete Buf;
}
if(MessageId == 2){
   vector Source;
   double *ImportantNumber;
   NumBytes = NumBytes / sizeof(double);
   ImportantNumber = new double[NumBytes];
   pvm_upkdouble(ImportantNumber,NumBytes,1);
   copy(ImportantNumber,ImportantNumber +(NumBytes + 1),
   inserter(Source,Source.begin()));
   generateCombinations(Source,Ptid,Value);
   delete ImportantNumber;

}

Here we use theMessageId tag to distinguish which data type we are working with. But in C++ we can do better. If the MessageIdtag contains a 1 then we are working with strings therefore we make the declaration:

vector<string> Source;

If the MessageId tag contains a 2 then we know we are working with floating point numbers, and we make the declaration:

vector<double> Source;

Once we declare what type of data the vector Source will contain, the rest of the function in the pvm_generic_combination is generalized. Notice in Example 6.6 that each if() statement calls the

generateCombinations() function. This generateCombinations() function is a template function. This template architecture helps us to achieve the genericity that will extend the SPMD and the MPMD scenarios for our PVM programs. We will come back to the discussion of our pvm_generic_combination program after we present the basic mechanics of the PVM environment. It is important to note that C++ container classes, stream classes, and template algorithms add flexibility to PVM programming that cannot be easily implemented in other PVM environments. This flexibility creates opportunities for sophisticated yet elegant parallel architectures.

6.2.5.2. Using the MPMD (MIMD) Model with PVM and C++

Whereas the SPMD model uses the pvm_spawn() function to create some number of tasks executing the same program but on potentially different data or resources, the MPMD model will use the pvm_spawn() function to create tasks that are executing different programs each with their own data sets. Example 6.7 shows how a single C++ program could implement a MPMD model of computation using PVM calls.



// Example 6.7 Using PVM to implement MPMD model of computation

int main(int argc, char *argv[])
{
   int Task1[20];
   int Task2[50];
   int Task3[30];
   //...
   pvm_spawn("pvm_generic_combination", NULL,1,"host1",20,Task1);
   pvm_spawn("generate_plans",argv,0,"",50,Task2);
   pvm_spawn("agent_filters",argv++,1,"host 3",30,&Task3);
   //...
}

The code in Example 6.7 creates 100 tasks. The first twenty tasks are generating combinations. The next fifty tasks are generating plans from the combinations as the combinations are being created, The last thirty tasks are filtering the best plans from the set of plans being generated by the set of fifty tasks. This is in contrast to the SPMD model where all of the programs spawned by the pvm_spawn() function were the same. Here, we have pvm_generic_combination, generate_plans, and agent_filters performing the work of the PVM tasks. They are all executing concurrently. They each have their own set of data. Although, they are working with transformations of the data. The pvm_generic_combination transforms its input into something that generate_plans can use. The generate_plans program transforms its input into something that agent_filters can use. Obviously these tasks will send messages to each other. The messages will represent input and control information between the processes. Also notice in Example 6.7 that we used the pvm_spawn() routine to allocate 20 pvm_generic_combination on a computer named host1. The generate_plans task was allocated to 50 anonymous processors, but each of the fifty tasks received the same command line argument through the argv parameter. The agent_filters tasks were also directed to a particular computer. In this case, the computer was host 3, and each task received the same command line argument through the argv parameter. This emphasizes the flexibility and power of the PVM library. Figure 6-5 shows some options available for MPMD configurations using the PVM environment.

Figure 6-5

We can take advantage of particular resources of particular computers if so desired. We can use arbitrary anonymous computers in other cases. In addition we can assign different work to different tasks simultaneously. In Figure 6-5 Computer A is a MPP (Massively Parallel Processor) computer, and Computer B has a number of specialized numeric processors. Also notice that the PVM in Figure 6-5 consists of PowerPCs, Sparcs, Crays, etc. In some cases we don't care what specific capabilities of the computers in a PVM are but in other cases we do. The pvm_spawn() routine allows the C++ programmer to use the anonymous approach by simply not specifying which computer to create the tasks on. On the other hand if there is something special about some member of the PVM then that feature can be exploited by specifying the particular member using pvm_spawn().

<<Sidebar 6.1 Combination Notation>>

Suppose we wish to choose a team of 8 programmers from a pool of 24 candidates. How many different teams of 8 programmers could we come up with? One of the results that follow from the Fundamental Principle of Counting tells us there are 735,471 different teams consisting of 8 programmers that can be selected from a pool of 24. The notation C(n,r) is read the number of combination of n choose r. That is the number of choices taken r at a time from n items. C(n,r) is calculated by the formula (S-Figure 6-1):

Sidebar Figure 6-1

When we have a set that represents a combination e.g, {a,b,c} would be considered the same as the set {b,a,c}, or {c,b,a}. That is we don't care about the order of the members in the set. We are only concerned about the members in the set. Many parallel programs, search algorithms, heuristics, and artificial intelligence based programs have to deal with large sets of combinations and their close relative, permutations.

<<End Sidebar 6.1>>

6.3. The Basic Mechanics of the PVM

The PVM environment consists of two components:

· The PVM daemon (pvmd)

· The pvmd library

One pvmd daemon runs on each host computer in the virtual machine. The pvmd serves as a message router and controller. A pvmd is used to start additional pvmds. Each pvmd manages the lists of PVM tasks on its host machine. The pvmd performs process control, some minimal authentication and fault tolerance. Usually the first pvmd is started manually. This pvmd then starts the other pvmds. Only the original pvmd may start additional pvmds. Only the original pvmd may unconditionally stop another pvmd.

The pvmd library contains the routines that allow one PVM task to interact with other PVM tasks.The library also contains routines that allow the PVM task to communicate with its pvmd. Figure 6-6 shows the basic architecture of the PVM environment.

Figure 6-6

The PVM environment will consist of two or more PVM tasks. Each task will contain one or more send buffers. However, only one send buffer may be active at a time. This is called the active send buffer. Each task has an active receive buffer. Notice in Figure 6-6 that communication between PVM tasks is actually accomplished using TCP sockets. The pvm_send() routines make socket access transparent. The programmer does not access the TCP socket calls directly. Figure 6-6 also shows PVM tasks communicating to their pvmds using TCP sockets and pvmds communicating between each themselves using UDP sockets. Again the socket calls are performed by the PVM routines. The programmer does not have to do low-level socket programming. The PVM routines we use in this book fit into four categories:

· Process Management & Control

· Messaging Packing & Sending

· Message Unpacking & Receiving

· Message Buffer Management

While there are other categories of PVM routines such as the Information and Utility Functions and the Group Operations, we focus on the message processing and process management routines. We will discuss any other routines in the context of the programs in which they are used.

Part 2: Adding Parallel Programming Capabilities to C++ Through the PVM