Design and Analysis of Distributed Algorithms focuses on developing problem-solving skills and fully exploiting design tools and techniques. Moreover, the author helps readers develop the analytical tools and skills needed to evaluate the costs of complex designs and protocols.
This text is based on a simple and fully reactive computational model that allows for intuitive comprehension and logical designs. The principles and techniques that readers learn can be applied to any distributed computing environment (e.g., distributed systems, communication networks, data networks, grid networks, and the Internet). Based on a method developed and refined during the author's twenty years of teaching experience, the text provides a wealth of unique material and learning aids that help readers design algorithms and protocols that solve problems and perform tasks efficiently in a distributed computing environment. Features include:
- Emphasis on developing problem-solving skills and fully leveraging design tools and techniques, presented in a straightforward, easy-to-follow writing style
- Chapter on distributed data and structures, an important area not covered in comparable texts
- Detailed coverage of synchronous computations, a necessary element for "energy aware" computing
- Theoretical and experimental exercises in each chapter that allow readers to apply their newfound skills
All algorithms and protocols presented in the text, as well as those in the exercises, are easily and immediately programmable. References at the end of each chapter lead readers to additional materials for further study.
With its emphasis on problem solving, this book is a natural textbook for upper-level undergraduates and graduate students. It is also ideal for system-protocol designers and communications software engineers and developers, enabling them to understand the principles of designing workable, efficient protocols in any distributed computing environment.
1. Distributed Computing Environments.
1.3 Axioms and Restrictions.
1.4 Cost and Complexity.
1.4.1 Amount of Communication Activities.
1.5 An Example: Broadcasting.
1.6 States and Events.
1.6.1 Time and Events.
1.6.2 States and Configurations.
1.7 Problems and Solutions (∗).
1.8.1 Levels of Knowledge.
1.8.2 Types of Knowledge.
1.9 Technical Considerations.
1.9.3 Communication Mechanism.
1.10 Summary of Definitions.
1.11 Bibliographical Notes.
1.12 Exercises, Problems, and Answers.
1.12.1 Exercises and Problems.
1.12.2 Answers to Exercises.
2. Basic Problems And Protocols.
2.1.1 The Problem.
2.1.2 Cost of Broadcasting.
2.1.3 Broadcasting in Special Networks.
2.2.1 Generic Wake-Up.
2.2.2 Wake-Up in Special Networks.
2.3.1 Depth-First Traversal.
2.3.2 Hacking (∗).
2.3.3 Traversal in Special Networks.
2.3.4 Considerations on Traversal.
2.4 Practical Implications: Use a Subnet.
2.5 Constructing a Spanning Tree.
2.5.1 SPT Construction with a Single Initiator: Shout.
2.5.2 Other SPT Constructions with Single Initiator.
2.5.3 Considerations on the Constructed Tree.
2.5.4 Application: Better Traversal.
2.5.5 Spanning-Tree Construction with Multiple Initiators.
2.5.6 Impossibility Result.
2.5.7 SPT with Initial Distinct Values.
2.6 Computations in Trees.
2.6.1 Saturation: A Basic Technique.
2.6.2 Minimum Finding.
2.6.3 Distributed Function Evaluation.
2.6.4 Finding Eccentricities.
2.6.5 Center Finding.
2.6.6 Other Computations.
2.6.7 Computing in Rooted Trees.
2.7.1 Summary of Problems.
2.7.2 Summary of Techniques.
2.8 Bibliographical Notes.
2.9 Exercises, Problems, and Answers.
2.9.3 Answers to Exercises.
3.1.1 Impossibility Result.
3.1.2 Additional Restrictions.
3.1.3 Solution Strategies.
3.2 Election in Trees.
3.3 Election in Rings.
3.3.1 All the Way.
3.3.2 As Far As It Can.
3.3.3 Controlled Distance.
3.3.4 Electoral Stages.
3.3.5 Stages with Feedback.
3.3.6 Alternating Steps.
3.3.7 Unidirectional Protocols.
3.3.8 Limits to Improvements (∗).
3.3.9 Summary and Lessons.
3.4 Election in Mesh Networks.
3.5 Election in Cube Networks.
3.5.1 Oriented Hypercubes.
3.5.2 Unoriented Hypercubes.
3.6 Election in Complete Networks.
3.6.1 Stages and Territory.
3.6.2 Surprising Limitation.
3.6.3 Harvesting the Communication Power.
3.7 Election in Chordal Rings (∗).
3.7.1 Chordal Rings.
3.7.2 Lower Bounds.
3.8 Universal Election Protocols.
3.8.2 Analysis of Mega-Merger.
3.8.4 Lower Bounds and Equivalences.
3.9 Bibliographical Notes.
3.10 Exercises, Problems, and Answers.
3.10.3 Answers to Exercises.
4. Message Routing and Shortest Paths.
4.2 Shortest Path Routing.
4.2.1 Gossiping the Network Maps.
4.2.2 Iterative Construction of Routing Tables.
4.2.3 Constructing Shortest-Path Spanning Tree.
4.2.4 Constructing All-Pairs Shortest Paths.
4.2.5 Min-Hop Routing.
4.2.6 Suboptimal Solutions: Routing Trees.
4.3 Coping with Changes.
4.3.1 Adaptive Routing.
4.3.2 Fault-Tolerant Tables.
4.3.3 On Correctness and Guarantees.
4.4 Routing in Static Systems: Compact Tables.
4.4.1 The Size of Routing Tables.
4.4.2 Interval Routing.
4.5 Bibliographical Notes.
4.6 Exercises, Problems, and Answers.
4.6.3 Answers to Exercises.
5. Distributed Set Operations.
5.2 Distributed Selection.
5.2.1 Order Statistics.
5.2.2 Selection in a Small Data Set.
5.2.3 Simple Case: Selection Among Two Sites.
5.2.4 General Selection Strategy: RankSelect.
5.2.5 Reducing the Worst Case: ReduceSelect.
5.3 Sorting a Distributed Set.
5.3.1 Distributed Sorting.
5.3.2 Special Case: Sorting on an Ordered Line.
5.3.3 Removing the Topological Constraints: Complete Graph.
5.3.4 Basic Limitations.
5.3.5 Efficient Sorting: SelectSort.
5.3.6 Unrestricted Sorting.
5.4 Distributed Sets Operations.
5.4.1 Operations on Distributed Sets.
5.4.2 Local Structure.
5.4.3 Local Evaluation (∗).
5.4.4 Global Evaluation.
5.4.5 Operational Costs.
5.5 Bibliographical Notes.
5.6 Exercises, Problems, and Answers.
5.6.3 Answers to Exercises.
6. Synchronous Computations.
6.1 Synchronous Distributed Computing.
6.1.1 Fully Synchronous Systems.
6.1.2 Clocks and Unit of Time.
6.1.3 Communication Delays and Size of Messages.
6.1.4 On the Unique Nature of Synchronous Computations.
6.1.5 The Cost of Synchronous Protocols.
6.2 Communicators, Pipeline, and Transformers.
6.2.1 Two-Party Communication.
6.3 Min-Finding and Election: Waiting and Guessing.
6.3.3 Double Wait: Integrating Waiting and Guessing.
6.4 Synchronization Problems: Reset, Unison, and Firing Squad.
6.4.1 Reset/Wake-Up.
6.4.3 Firing Squad.
6.5 Bibliographical Notes.
6.6 Exercises, Problems, and Answers.
6.6.3 Answers to Exercises.
7. Computing in Presence of Faults.
7.1.1 Faults and Failures.
7.1.2 Modelling Faults.
7.1.3 Topological Factors.
7.1.4 Fault Tolerance, Agreement, and Common Knowledge.
7.2 The Crushing Impact of Failures.
7.2.1 Node Failures: Single-Fault Disaster.
7.2.2 Consequences of the Single-Fault Disaster.
7.3 Localized Entity Failures: Using Synchrony.
7.3.1 Synchronous Consensus with Crash Failures.
7.3.2 Synchronous Consensus with Byzantine Failures.
7.3.3 Limit to Number of Byzantine Entities for Agreement.
7.3.4 From Boolean to General Byzantine Agreement.
7.3.5 Byzantine Agreement in Arbitrary Graphs.
7.4 Localized Entity Failures: Using Randomization.
7.4.1 Random Actions and Coin Flips.
7.4.2 Randomized Asynchronous Consensus: Crash Failures.
7.4.3 Concluding Remarks.
7.5 Localized Entity Failures: Using Fault Detection.
7.5.1 Failure Detectors and Their Properties.
7.5.2 The Weakest Failure Detector.
7.6 Localized Entity Failures: Pre-Execution Failures.
7.6.1 Partial Reliability.
7.6.2 Example: Election in Complete Network.
7.7 Localized Link Failures.
7.7.1 A Tale of Two Synchronous Generals.
7.7.2 Computing With Faulty Links.
7.7.3 Concluding Remarks.
7.7.4 Considerations on Localized Entity Failures.
7.8 Ubiquitous Faults.
7.8.1 Communication Faults and Agreement.
7.8.2 Limits to Number of Ubiquitous Faults for Majority.
7.8.3 Unanimity in Spite of Ubiquitous Faults.
7.9 Bibliographical Notes.
7.10 Exercises, Problems, and Answers.
7.10.3 Answers to Exercises.
8. Detecting Stable Properties.
8.2 Deadlock Detection.
8.2.2 Detecting Deadlock: Wait-for Graph.
8.2.3 Single-Request Systems.
8.2.4 Multiple-Requests Systems.
8.2.5 Dynamic Wait-for Graphs.
8.2.6 Other Requests Systems.
8.3 Global Termination Detection.
8.3.1 A Simple Solution: Repeated Termination Queries.
8.3.2 Improved Protocols: Shrink.
8.3.3 Concluding Remarks.
8.4 Global Stable Property Detection.
8.4.1 General Strategy.
8.4.2 Time Cuts and Consistent Snapshots.
8.4.3 Computing A Consistent Snapshot.
8.4.4 Summary: Putting All Together.
8.5 Bibliographical Notes.
8.6 Exercises, Problems, and Answers.
8.6.3 Answers to Exercises.
9. Continuous Computations.
9.2 Keeping Virtual Time.
9.2.1 Virtual Time and Causal Order.
9.2.2 Causal Order: Counter Clocks.
9.2.3 Complete Causal Order: Vector Clocks.
9.2.4 Concluding Remarks.
9.3 Distributed Mutual Exclusion.
9.3.1 The Problem.
9.3.2 A Simple And Efficient Solution.
9.3.3 Traversing the Network.
9.3.4 Managing a Distributed Queue.
9.3.5 Decentralized Permissions.
9.3.6 Mutual Exclusion in Complete Graphs: Quorum.
9.3.7 Concluding Remarks.
9.4 Deadlock: System Detection and Resolution.
9.4.1 System Detection and Resolution.
9.4.2 Detection and Resolution in Single-Request Systems.
9.4.3 Detection and Resolution in Multiple-Requests Systems.
9.5 Bibliographical Notes.
9.6 Exercises, Problems, and Answers.
9.6.3 Answers to Exercises.