Our Books: Parallel and Distributed Programming Using C++

Chapter Summaries

Chapter 1: The Joys of Concurrent Programming

Chapter 2: The Challenges of Parallel and Distributed Programming

Chapter 3: Dividing C++ Programs into Multiple Tasks

Chapter 4: Dividing C++ Programs into Multiple Threads

Chapter 5: Synchronizing Concurrency between Tasks

Chapter 6: Adding Parallel Programming Capabilities to C++ through the PVM

Chapter 7: Error Handling, Exceptions, and Software Reliability

Chapter 8: Distributed Object-Oriented Programming in C++

Chapter 9: SPMD and MPMD Using Templates and the MPI

Chapter 10: Visualizing Concurrent and Distributed System Design

Chapter 11: Designing Components That Support Concurrency

Chapter 12: Implementing Agent-Oriented Architectures

Chapter 13: Blackboard Architectures Using PVM, Threads, and C++ Components

Publisher: Addison Wesley
ISBN: 0-13-101376-9
Year: 2004
Page Numbers: 691
Chapters: 13

Chapter 1: The Joys of Concurrent Programming

Throughout this book we present a modeling and architectural approach to parallel and distributed programming. The emphasis is placed on uncovering the natural parallelism within a problem and its solution. This parallelism is captured within the software model for the solution. We suggest object-oriented methods to help manage the complexity of parallel and distributed programming. Our mantra is function follows form. We use the library approach to provide parallelism support for the C++ language. The libraries we recommend are based on national and international standards. Each library is freely available and in wide use. Techniques and concepts presented in the book are vendor independent, non-proprietary, and rely on open standards and open architectures. The C++ programmer and software developer can use different parallel models to serve different needs because each parallelism model is captured within a library. The library approach to parallel and distributed programming gives the C++ programmer the greatest possible flexibility. While parallel and distributed programming can be fun and rewarding, it presents several challenges. In the next chapter we provide an overview of the most common challenges to parallel and distributed programming.

Chapter 2: The Challenges of Parallel and Distributed Programming

Parallel and distributed programming present challenges in several areas. New approaches to software design and architecture must be adopted. Many of the fundamental assumptions held in the sequential model of programming don't apply in the realm of parallel and distributed programming. The four primary coordination problems (data race, indefinite postponement, deadlock, and communication synchronization) are among the major obstacles to programs that require concurrency. When the requirements include parallelism or distribution, every aspect of the software development life cycle is affected, from the initial design down to testing and documentation. In this book, we present architectural approaches to many of these problems. In addition to the architectural approach, we take advantage of the multi-paradigm capabilities of C++ to provide techniques for managing the complexity of parallel and distributed programs.

Chapter 3: Dividing C++ Programs into Multiple Tasks

Concurrency in a C++ program is accomplished by factoring your program into either multiple processes or multiple threads. A process is a unit of work created by the operating system. It is an artifact of the operating system, whereas programs are artifacts of the developer. A program may consist of multiple processes, and some processes might not be associated with any particular program. Operating systems are capable of managing hundreds or even thousands of concurrently loaded processes. Information about a process and its attributes is stored in the process control block (PCB), which the operating system uses to identify and manage each process. The operating system multitasks between processes by performing a context switch: it saves the current state of the executing process and its context to the PCB save area so that the process can be restarted the next time it is assigned to the CPU. When a process is using a processor, it is in the running state. When it is waiting to use the CPU, it is in the ready state. The ps utility can be used to monitor the executing processes on the system.

Processes that create other processes have a parent-child relationship with the created process. The creator of the process is the parent and the created process is the child. Child processes inherit many attributes from the parent. The parent's key responsibility is to wait for the child process so that the child can exit the system. There are several calls that can be used to create processes: fork(), fork-exec, system(), and posix_spawn(). The fork(), fork-exec, and posix_spawn() calls create processes that are asynchronous to the parent process, whereas system() creates a child process that is synchronous to the parent process. Asynchronous parents can call the wait() function and at that point synchronously wait for child processes to terminate, or retrieve the exit codes of child processes that have already terminated.
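
The parent-child relationship described above can be sketched as follows. This is a minimal illustration, assuming a POSIX system; the names run_in_child and sample_task are hypothetical and do not come from the book. The parent creates an asynchronous child with fork() and then waits synchronously with waitpid() to retrieve the child's exit code.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>   // assert(), used only in the usage check below

// Hypothetical helper (not from the book): runs `work` in a child process
// and returns the child's exit code, or -1 on failure.
int run_in_child(int (*work)())
{
    pid_t pid = fork();          // the child runs asynchronously to the parent
    if (pid < 0) {
        return -1;               // fork() failed, so no child was created
    }
    if (pid == 0) {
        _exit(work());           // child: perform the task and terminate
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {   // parent: wait synchronously
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Example task, assumed purely for illustration.
int sample_task() { return 7; }
```

A caller could check the round trip with assert(run_in_child(sample_task) == 7). Because the exit code travels through the process status word, only small non-negative values survive this way; larger results would need pipes or another IPC mechanism.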

A program can be divided into several processes. These processes can be spawned from a parent process, or launched from a shell script as separate binaries. Dedicated processes can spawn other processes as needed that only perform certain types of work. Processes can be spawned from functions or from methods.

Chapter 4: Dividing C++ Programs into Multiple Threads

In a sequential program, work can be divided between routines within the program, where one task finishes before another task begins. In other programs, work is executed as mini-programs within the main program, and the mini-programs execute concurrently with the main program. These mini-programs can be executed as processes or threads. Each process has its own address space, so processes require interprocess communication if they are to communicate. Threads share the address space of their process and do not require special communication techniques to reach other threads of the same process. Synchronization mechanisms such as mutexes are needed to protect shared memory in order to control race conditions.

There are several models that can be used to delegate work among threads and to manage when threads are created and canceled. In the delegation model, a single thread (the boss) creates the threads (the workers) and assigns each a task. The boss thread waits until each worker thread completes its task. In the peer-to-peer model, a single thread initially creates all the threads needed to perform the tasks, but that thread is then considered a worker thread and does no delegation. All threads have equal status. The pipeline model is characterized as an assembly line in which a stream of items is processed in stages. At each stage, a thread performs work on a unit of input. The input moves from one thread to the next until processing is complete. The last stage, or thread, produces the result of the pipeline. In the producer-consumer model, a producer thread produces data to be consumed by a consumer thread. The data is stored in a block of memory shared between the producer and consumer threads. Objects can be made multithreaded. The threads are declared within the object. A member function can create a thread that executes a free-floating function that in turn invokes one of the member functions of the object.

The Pthread library can be used to create and manage the threads of a multithreaded application. The Pthread library is based on a standardized programming interface for the creation and maintenance of threads. The thread interface has been specified by the IEEE standards committee in the POSIX 1003.1c standard. Third-party vendors supply implementations that adhere to the POSIX standard.
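
As an illustration of the delegation model using the Pthread library, here is a minimal sketch. It assumes a POSIX threads implementation; the names SliceTask, sum_slice, and delegate_sum, and the fixed count of four workers, are hypothetical choices made for this example. The boss splits an array into slices, creates one worker per slice, and waits for every worker before combining the results.

```cpp
#include <pthread.h>
#include <cassert>   // assert(), used only in the usage check below
#include <cstddef>

// Hypothetical task record that the boss hands to one worker.
struct SliceTask {
    const int  *data;    // first element of the worker's slice
    std::size_t count;   // number of elements in the slice
    long        result;  // written by the worker, read by the boss after join
};

// Worker routine with the signature that pthread_create() expects.
extern "C" void *sum_slice(void *arg)
{
    SliceTask *task = static_cast<SliceTask *>(arg);
    long total = 0;
    for (std::size_t i = 0; i < task->count; ++i) {
        total += task->data[i];
    }
    task->result = total;
    return nullptr;
}

// Boss routine: creates the workers, assigns each a slice, waits for all.
long delegate_sum(const int *data, std::size_t n)
{
    const std::size_t kWorkers = 4;          // assumed worker count
    pthread_t threads[kWorkers];
    SliceTask tasks[kWorkers];
    bool      started[kWorkers];
    const std::size_t chunk = n / kWorkers;

    for (std::size_t i = 0; i < kWorkers; ++i) {
        const std::size_t begin = i * chunk;
        const std::size_t end = (i + 1 == kWorkers) ? n : begin + chunk;
        tasks[i].data = data + begin;
        tasks[i].count = end - begin;
        tasks[i].result = 0;
        started[i] =
            pthread_create(&threads[i], nullptr, sum_slice, &tasks[i]) == 0;
        if (!started[i]) {
            sum_slice(&tasks[i]);   // creation failed: the boss does the work
        }
    }

    long total = 0;
    for (std::size_t i = 0; i < kWorkers; ++i) {
        if (started[i]) {
            pthread_join(threads[i], nullptr);   // boss waits for the worker
        }
        total += tasks[i].result;
    }
    return total;
}
```

No mutex is needed in this particular sketch because each worker writes only to its own SliceTask and the boss reads the results only after joining. Workers that updated a shared total directly would need a mutex around that update.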

Chapter 5: Synchronizing Concurrency between Tasks

Synchronization can be used to coordinate the order of execution of processes and threads, called task synchronization, as well as access to shared data, called data synchronization. There are four basic task synchronization relationships. A start-to-start (SS) relationship means task A cannot start until task B starts. A finish-to-start (FS) relationship means task A cannot finish until task B starts. A start-to-finish (SF) relationship means task A cannot start until task B finishes. A finish-to-finish (FF) relationship means task A cannot finish until task B finishes. The POSIX standard defines a condition variable of type pthread_cond_t that can be used to implement these task synchronization relationships.
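
One of these relationships, in which task A cannot start until task B finishes, can be sketched with a pthread_cond_t as follows. This is a minimal illustration rather than the book's own listing; the names SyncState, task_a, task_b, and a_waits_for_b are hypothetical. A flag protected by a mutex records that B has finished, and A waits on the condition variable until the flag is set.

```cpp
#include <pthread.h>
#include <cassert>   // assert(), used only in the usage check below

// Hypothetical shared state for two tasks, A and B.
struct SyncState {
    pthread_mutex_t mutex;
    pthread_cond_t  b_finished_cond;
    bool            b_finished;
    char            order[3];   // execution log, "BA" when the rule holds
    int             next;
};

extern "C" void *task_a(void *arg)
{
    SyncState *s = static_cast<SyncState *>(arg);
    pthread_mutex_lock(&s->mutex);
    while (!s->b_finished) {     // re-check the flag after every wakeup
        pthread_cond_wait(&s->b_finished_cond, &s->mutex);
    }
    s->order[s->next++] = 'A';   // A's work begins only now
    pthread_mutex_unlock(&s->mutex);
    return nullptr;
}

extern "C" void *task_b(void *arg)
{
    SyncState *s = static_cast<SyncState *>(arg);
    pthread_mutex_lock(&s->mutex);
    s->order[s->next++] = 'B';   // stand-in for B's real work
    s->b_finished = true;
    pthread_cond_signal(&s->b_finished_cond);
    pthread_mutex_unlock(&s->mutex);
    return nullptr;
}

// Returns true when the log shows that B finished before A started.
bool a_waits_for_b()
{
    SyncState s;
    pthread_mutex_init(&s.mutex, nullptr);
    pthread_cond_init(&s.b_finished_cond, nullptr);
    s.b_finished = false;
    s.next = 0;
    s.order[0] = s.order[1] = s.order[2] = '\0';

    pthread_t a, b;
    // A is deliberately created first; it still has to wait for B.
    bool a_ok = pthread_create(&a, nullptr, task_a, &s) == 0;
    bool b_ok = pthread_create(&b, nullptr, task_b, &s) == 0;
    if (!b_ok) {
        task_b(&s);              // run B in this thread so A is released
    }
    if (a_ok) pthread_join(a, nullptr);
    if (b_ok) pthread_join(b, nullptr);

    pthread_cond_destroy(&s.b_finished_cond);
    pthread_mutex_destroy(&s.mutex);
    return a_ok && s.order[0] == 'B' && s.order[1] == 'A';
}
```

The while loop around pthread_cond_wait() matters: POSIX permits spurious wakeups, so the waiting task must test the flag itself rather than trust the wakeup.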

The algorithm types of the PRAM model can be used to describe data synchronization. The EREW (exclusive read, exclusive write) access policy can be implemented with a mutex semaphore. The mutex semaphore protects the critical section by serializing entry into it, so only one read or one write access is allowed at a time. The POSIX standard defines a mutex semaphore of type pthread_mutex_t that can be used to implement an EREW access policy. Read-write locks can be used to implement the CREW (concurrent read, exclusive write) access policy, which allows multiple concurrent reads of data but only an exclusive write to that data. The POSIX standard defines a read-write lock of type pthread_rwlock_t. An object-oriented approach to data synchronization embeds synchronization inside the data object.
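
The idea of embedding synchronization inside the data object can be sketched with a CREW policy built on pthread_rwlock_t. This is a minimal example under stated assumptions: the class SharedCounter and the functions add_thousand and count_with_writers are hypothetical names, not taken from the book. Readers take the lock in shared mode and writers take it exclusively, so callers never touch the lock themselves.

```cpp
#include <pthread.h>
#include <cassert>   // assert(), used only in the usage check below

// Hypothetical data object that embeds its own CREW synchronization.
class SharedCounter {
public:
    SharedCounter() : value_(0) { pthread_rwlock_init(&lock_, nullptr); }
    ~SharedCounter() { pthread_rwlock_destroy(&lock_); }
    SharedCounter(const SharedCounter &) = delete;
    SharedCounter &operator=(const SharedCounter &) = delete;

    long read() const
    {
        pthread_rwlock_rdlock(&lock_);   // many readers may hold this at once
        long current = value_;
        pthread_rwlock_unlock(&lock_);
        return current;
    }

    void add(long delta)
    {
        pthread_rwlock_wrlock(&lock_);   // writers get exclusive access
        value_ += delta;
        pthread_rwlock_unlock(&lock_);
    }

private:
    mutable pthread_rwlock_t lock_;      // mutable so read() can stay const
    long value_;
};

// Writer task: performs 1000 exclusive updates on the shared object.
extern "C" void *add_thousand(void *arg)
{
    SharedCounter *counter = static_cast<SharedCounter *>(arg);
    for (int i = 0; i < 1000; ++i) {
        counter->add(1);
    }
    return nullptr;
}

// Runs up to 8 writer threads against one counter and returns the total.
long count_with_writers(int writers)
{
    const int kMax = 8;
    if (writers > kMax) writers = kMax;
    SharedCounter counter;
    pthread_t threads[kMax];
    bool started[kMax];
    for (int i = 0; i < writers; ++i) {
        started[i] =
            pthread_create(&threads[i], nullptr, add_thousand, &counter) == 0;
        if (!started[i]) add_thousand(&counter);   // fall back to this thread
    }
    for (int i = 0; i < writers; ++i) {
        if (started[i]) pthread_join(threads[i], nullptr);
    }
    return counter.read();
}
```

Replacing the read-write lock with a pthread_mutex_t taken in both read() and add() would turn the same class into an EREW object; the interface seen by callers would not change.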

Chapter 6: Adding Parallel Programming Capabilities to C++ through the PVM

The PVM library is a flexible library that supports the major models of parallel programming. The advantage of the PVM environment is its ability to work with heterogeneous collections of computers that may differ in processor speed, size, and architecture. Besides hardware compatibility, it works nicely with the C++ standard library and with the UNIX/Linux system library. When PVM is combined with the C++ template capabilities, object-oriented programming capabilities, and collection of algorithms, the power of the PVM environment increases considerably. The template facility has a nice application to SPMD programming. The containers and algorithms can be used to enhance the MIMD (MPMD) capabilities of the PVM. In Chapter 13, we dig a little deeper into the PVM and show how it can be used to help implement blackboards using C++. The blackboard is one of our primary choices for implementing parallel problem solving.

Chapter 7: Error Handling, Exceptions, and Software Reliability

Producing reliable software is serious business. Exception handling and defect removal should be approached with extreme rigor. Thorough testing, followed by debugging of a software component, should be the primary defense against software defects. Exception handling should be added to the software system or subsystem after the software has undergone rigorous testing. Throwing exceptions should not be used as a generic error handling technique because it destroys the flow of control of the program. Exceptions should only be thrown after all other measures have been exhausted. The standard exception handling classes should be used as architectural road maps by the programmer who wishes to design more complete and useful exception classes. If not specialized through inheritance, the standard classes can only report errors. More useful exception classes can be built that have corrective functionality as well as more information. In general, both the termination and resumption models allow the program to continue to execute. Both models resist simply aborting the program when an error occurs. For a more complete discussion of exception handling, see The Design and Evolution of C++ (Stroustrup, 1994).
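
The point about specializing the standard exception classes can be illustrated with a short sketch. The class ResourceError and the functions checked_buffer_size and buffer_size_or_fallback are hypothetical names invented for this example, and the 4096 fallback is an arbitrary assumption. The derived class carries the failing component's name and a corrective value in addition to the message that std::runtime_error already reports, and the handler continues after the try block in the style of the termination model.

```cpp
#include <cassert>   // assert(), used only in the usage check below
#include <stdexcept>
#include <string>

// Hypothetical exception class: adds the failing component's name and a
// corrective fallback value to what std::runtime_error can report.
class ResourceError : public std::runtime_error {
public:
    ResourceError(const std::string &message,
                  const std::string &component,
                  int fallback)
        : std::runtime_error(message),
          component_(component),
          fallback_(fallback) {}

    const std::string &component() const { return component_; }
    int fallback() const { return fallback_; }   // corrective information

private:
    std::string component_;
    int fallback_;
};

// Illustrative operation that cannot proceed with a non-positive size.
int checked_buffer_size(int requested)
{
    if (requested <= 0) {
        throw ResourceError("invalid buffer size", "io_buffer", 4096);
    }
    return requested;
}

// The handler uses the information carried by the exception to recover
// and lets the program continue instead of aborting.
int buffer_size_or_fallback(int requested)
{
    try {
        return checked_buffer_size(requested);
    } catch (const ResourceError &error) {
        return error.fallback();
    }
}
```

In production code a bad argument like this would normally be caught by testing or a plain return code, in line with the advice above; the throw is used here only to keep the example small.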

Chapter 8: Distributed Object-Oriented Programming in C++

Distributed programming involves programs that execute in different processes. Each process can potentially reside on a different computer and possibly on a different network with different network protocols. Distributed programming techniques allow the developer to divide an application into separately executing modules that have either a producer-consumer relationship or a peer-to-peer relationship. The modules each have their own address space and computer resources. Distributed programming can be used to take advantage of special processors, peripherals, and other computer resources (e.g., database servers, application servers, and e-mail servers). CORBA is the standard for distributed object-oriented programming. We provide an introduction to the basics of CORBA programming. However, this chapter barely scratches the surface of the CORBA specification and CORBA services. It provides only enough to see what the basic components look like and how a simple distributed program can be constructed. The CORBA specifications for Web services, MAF, Naming Services, etc. can be obtained from www.omg.org. Michi Henning and Steve Vinoski provide a detailed resource in their Advanced CORBA Programming with C++. The Naming and Trader graphs provide the basis for powerful distributed knowledge representation that can be used in conjunction with multiagent programming. They provide the basis for the next level of smart web services.

Chapter 9: SPMD and MPMD Using Templates and the MPI

Implementations of the SPMD and MPMD models of concurrency have much to gain from using templates and taking advantage of polymorphism. While the MPI does include bindings for C++, it does not take advantage of object-oriented programming techniques. This presents both an opportunity and a challenge to developers using the MPI standard. Inheritance and polymorphism can be used to simplify MPMD programming. Parameterized (generic) programming, which is supported by the template facilities of C++, can be used to simplify SPMD programming with the MPI. Dividing a program's work between objects is a natural way to discover and exploit the parallelism within an application. Families of objects can be associated with communicators to facilitate communication in the MPI between multiple groups that have different work responsibilities. Operator overloading can be used to maintain a stream metaphor with the MPI. Using object-oriented programming techniques and parameterized programming techniques within the same MPI application is a multiparadigm approach that simplifies and in most cases shortens the code. It leads to programs that are easier to debug, test, and maintain. MPI tasks implemented by template functions tend to be more reliable across different data types than separately defined functions that have to perform type casting.

Chapter 10: Visualizing Concurrent and Distributed System Design

A model of a system is the body of information gathered for the purpose of studying the system. Documentation is a tool used in modeling a system. The UML (Unified Modeling Language), created by Grady Booch, James Rumbaugh, and Ivar Jacobson, is a graphical notation used to design, visualize, model, and document the artifacts of a software system. It is the de facto standard for communicating and modeling object-oriented systems. The UML can be used to model concurrent and distributed systems from the structural and behavioral perspectives.

UML diagrams can be used to model everything from the most basic unit, the object, to the whole system. An object is the basic unit used in many UML diagrams. Dependency, inheritance, aggregation, and composition are some of the relationships that can exist between objects. Interaction diagrams are used to show the behavior of an object and identify concurrency in the system. Objects can interact with other objects by communicating and invoking methods. Collaboration diagrams depict the interactions between objects working together to perform some particular task. Sequence diagrams are used to represent the interactions between objects in time sequence. Statecharts are used to depict the actions of a single object over its lifetime. Objects that are distributed can be tagged with the location of the node on which they reside.

Deployment diagrams are used to model the delivered system. The basic units of a deployment diagram are nodes and components. A node represents hardware and a component represents software. Nodes can be depicted to show which objects or components reside on them. When modeling the whole system, the basic unit is a package. A package can be used to represent systems and subsystems. Packages can have relationships with other packages, such as composition or some type of association.

Chapter 11: Designing Components That Support Concurrency

The challenges to parallel programming introduced in Chapter 2 can be reasonably approached using the building blocks introduced in this chapter. The importance of the interface class in simplifying the use of function libraries cannot be overstated. The interface class introduces a consistent API by wrapping the function calls of libraries such as MPI or PVM. Interface classes also introduce type safety and reuse. The interface class allows the programmer to work with a familiar metaphor, as in the case of PVM streams or MPI streams. IPC is simplified by connecting pipe or message streams to iostreams and overloading the << insertion and >> extraction operators for user-defined classes. The ostream_iterator class proves to be very useful in sending entire container objects and their contents between processes. The ostream_iterator and the istream_iterator also provide the glue between the standard algorithms and IPC components and techniques. Since a large number of parallel or distributed applications use the message passing model, any technique that simplifies passing various datatypes between processes will simplify the programming required for the application. The iostreams, the ostream_iterator, and the istream_iterator provide this simplification. The framework class is introduced here as the basic building block of concurrent applications. We consider classes like the mutex class, the condition variable classes, and the stream classes to be low-level components that should be hidden from the programmer within the framework class (where possible!). When building medium- to large-scale applications that require concurrency, the programmer should not have to focus on these low-level components. Ideally, the framework will be the base-level building block for the concurrency approaches that we introduce in the remainder of this book.

The framework will provide us with patterns for peer-to-peer interaction and client-server interaction. We can have numeric frameworks, database frameworks, agent frameworks, blackboard frameworks, GUI frameworks, etc. The approach that we advocate for implementing concurrency requirements builds applications from a collection of frameworks that already have the proper synchronization components wired into the proper relationships. In Chapters 12 and 13, we take a closer look at frameworks that support concurrency. We also introduce the use of standard C++ algorithms, containers, and function objects to manage the creation and spawning of multiple tasks or threads in applications that require concurrency.
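
The stream-based IPC technique described in this summary can be sketched as follows. This is a minimal illustration under stated assumptions: the type Reading and the functions send_all and receive_all are hypothetical, and a std::stringstream stands in for the pipe or message stream that a real application would connect between processes. The << and >> operators are overloaded for the user-defined type, and ostream_iterator and istream_iterator move a whole container through the stream with the standard copy algorithm.

```cpp
#include <algorithm>
#include <cassert>   // assert(), used only in the usage check below
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical user-defined type to be passed between processes.
struct Reading {
    int    sensor;
    double value;
};

// Insertion operator: writes one Reading as two whitespace-separated fields.
std::ostream &operator<<(std::ostream &out, const Reading &r)
{
    return out << r.sensor << ' ' << r.value;
}

// Extraction operator: reads the two fields back in the same order.
std::istream &operator>>(std::istream &in, Reading &r)
{
    return in >> r.sensor >> r.value;
}

// Sender side: streams every element of the container, one per line.
void send_all(std::ostream &channel, const std::vector<Reading> &items)
{
    std::copy(items.begin(), items.end(),
              std::ostream_iterator<Reading>(channel, "\n"));
}

// Receiver side: rebuilds the container until the stream is exhausted.
std::vector<Reading> receive_all(std::istream &channel)
{
    std::vector<Reading> items;
    std::copy(std::istream_iterator<Reading>(channel),
              std::istream_iterator<Reading>(),
              std::back_inserter(items));
    return items;
}
```

Because send_all and receive_all depend only on std::ostream and std::istream, the same code would work unchanged with any stream class that wraps a pipe, socket, or message-passing library, which is the consistency the interface class is meant to provide.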

Chapter 12: Implementing Agent-Oriented Architectures

Agents are rational objects. Agent-oriented programming is another important approach to parallel and distributed programming. It provides a fresh approach to dealing with the age-old problems of decomposition, communication, and synchronization that are part of every parallel programming or distributed programming project. The C++ support for operator overloading, containers, and templates provides effective tools for implementing a wide range of agent classes. Future massively parallel and large complex distributed systems will rely on agent-oriented implementations because there is almost no other way to competently approach such systems. While the agent examples and techniques presented in this chapter were introductory, they provide the basis for understanding how practical agent systems can be built. The POSIX thread API, MICO, PVM, and MPI libraries, which are freely available and in wide use, can be used to deploy multiagent systems. Multiagent systems can be used to implement either solutions that require parallel programming or solutions that require distributed programming. This book advocates two primary architectures for parallel programming and distributed programming. Agents provide the first architecture, and blackboards (which assume agents) provide the second. The next chapter provides a discussion of how to use blackboards to implement parallel and distributed programming solutions.

Chapter 13: Blackboard Architectures Using PVM, Threads, and C++ Components

The blackboard model supports concurrency. The concurrency is inherent in the structure of the blackboard, in the relationship between the blackboard and the knowledge sources, and in the relationships among the knowledge sources themselves. The blackboard is a problem-solving model. The problem is divided into knowledge-specific areas. Each area is assigned a knowledge source or problem solver. The knowledge sources and problem solvers are typically self-contained and require little interaction with the other knowledge sources. The communication that is necessary occurs through the blackboard. Therefore, the knowledge sources and problem solvers serve to modularize the processing within the program. These modules can be treated separately, and they can execute concurrently without complex synchronization needs. The blackboard may be implemented using CORBA objects. When the blackboard is implemented as CORBA objects, the knowledge sources can be distributed across intranets or the Internet. The blackboard acts as a kind of shared distributed memory for tasks within a PVM-type environment. The blackboard model easily supports the MPMD (MIMD) and SPMD (SIMD) models. The concept of the blackboard motivates the designer to break down the work that a program needs to do along knowledge lines. This results in the program having a work breakdown structure (WBS) of knowledge specialists. The blackboard will contain software models of the problem domain and the solution space. These software models help the designer and developer to discover any parallelism that is necessary within a program that will be implemented. Alongside the classic client-server model of distributed programming, the blackboard model is one of the most powerful models available for both distributed and parallel programming. The knowledge sources or problem solvers in the blackboard model are often implemented as agents. In the next chapter we will take a closer look at how to implement agents and how to deploy multiagent systems to achieve concurrency.
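
The structure of the blackboard model can be sketched in a few lines using threads. This is a deliberately small illustration and not the book's implementation: the class Blackboard, the struct KnowledgeSource, the area names, and the squaring step that stands in for real problem solving are all assumptions made for the example. Each knowledge source runs as its own thread, works only on its own area, and communicates solely by posting to the shared blackboard.

```cpp
#include <pthread.h>
#include <cassert>   // assert(), used only in the usage check below
#include <map>
#include <string>

// Hypothetical blackboard: a shared store of partial solutions, keyed by
// knowledge area, with its synchronization hidden inside the object.
class Blackboard {
public:
    Blackboard() { pthread_mutex_init(&mutex_, nullptr); }
    ~Blackboard() { pthread_mutex_destroy(&mutex_); }
    Blackboard(const Blackboard &) = delete;
    Blackboard &operator=(const Blackboard &) = delete;

    void post(const std::string &area, int solution)
    {
        pthread_mutex_lock(&mutex_);
        entries_[area] = solution;
        pthread_mutex_unlock(&mutex_);
    }

    // Copies the solution for `area` into `solution` if one was posted.
    bool read(const std::string &area, int &solution) const
    {
        pthread_mutex_lock(&mutex_);
        std::map<std::string, int>::const_iterator it = entries_.find(area);
        const bool found = (it != entries_.end());
        if (found) solution = it->second;
        pthread_mutex_unlock(&mutex_);
        return found;
    }

private:
    mutable pthread_mutex_t mutex_;
    std::map<std::string, int> entries_;
};

// Hypothetical knowledge source: owns one area of the problem.
struct KnowledgeSource {
    Blackboard *board;
    const char *area;
    int         input;
};

extern "C" void *solve_area(void *arg)
{
    KnowledgeSource *ks = static_cast<KnowledgeSource *>(arg);
    // Squaring is a stand-in for the real problem solving in this area.
    ks->board->post(ks->area, ks->input * ks->input);
    return nullptr;
}

// Runs three knowledge sources concurrently and combines their postings.
int solve_on_blackboard()
{
    Blackboard board;
    KnowledgeSource sources[3] = {
        {&board, "geometry", 2}, {&board, "physics", 3}, {&board, "cost", 4}};
    pthread_t threads[3];
    bool started[3];
    for (int i = 0; i < 3; ++i) {
        started[i] =
            pthread_create(&threads[i], nullptr, solve_area, &sources[i]) == 0;
        if (!started[i]) solve_area(&sources[i]);   // fall back to this thread
    }
    for (int i = 0; i < 3; ++i) {
        if (started[i]) pthread_join(threads[i], nullptr);
    }
    int total = 0;
    int solution = 0;
    for (int i = 0; i < 3; ++i) {
        if (board.read(sources[i].area, solution)) total += solution;
    }
    return total;
}
```

The knowledge sources never call each other, so the only synchronization in the program is the mutex inside the blackboard. A distributed version would replace the in-memory map with CORBA objects or PVM messages while keeping the same post and read interface.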

Sample chapter, summaries, captions, table of contents, code example and listings are provided for your information. Copyright 2003 Addison Wesley. All rights reserved. No part of these materials may be duplicated or reproduced, in any form or by any means, without the written permission of the publisher.