This master's thesis proposes a design for distributing the generation controller, a component in the FAST search architecture. The purpose of the generation controller is to ensure that the system maintains a consistent index while new content is introduced. Content is processed, indexed and queried in a fully distributed and parallel manner, which makes some synchronization scheme necessary. Today's design and implementation of the generation controller builds on two-phase commit. The current generation controller is a single-node solution, and it is regarded as insufficient for large installations that require both fault tolerance and scalable performance.
The most important finding in this thesis is Binomial Two-Phase Commit, a commit protocol intended for scenarios where a large number of participants are involved in each transaction and the overhead of receiving votes from every participant is too large for a single coordinator to handle. The coordinator only needs to receive votes from a subset of the participants, and the size of that subset is independent of the total number of participants. As a result, the number of votes the coordinator must handle is O(1) in the number of participants. The protocol accomplishes this by giving up the guarantee that a transaction is committed atomically.
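
The trade-off described above can be illustrated with a small sketch. It is not the protocol as specified in the thesis. It only assumes, for illustration, that the coordinator reads a uniformly random sample of fixed size from the votes, and that each participant votes "no" independently with the same probability. The names `sampled_commit`, `undetected_abort_probability`, `sample_size` and `p_no` are hypothetical.

```python
import random


def sampled_commit(votes, sample_size, rng):
    """Decide commit from a fixed-size sample of votes.

    Illustrative assumption: the sample is drawn uniformly at random.
    The coordinator inspects at most `sample_size` votes, no matter how
    many participants there are.
    """
    k = min(sample_size, len(votes))
    sampled = rng.sample(range(len(votes)), k)
    return all(votes[i] for i in sampled)


def undetected_abort_probability(n, sample_size, p_no):
    """Probability that every sampled vote is "yes" while at least one
    unsampled participant voted "no".

    Illustrative assumption: each of the n participants votes "no"
    independently with probability p_no.
    """
    k = min(sample_size, n)
    all_sampled_yes = (1 - p_no) ** k
    some_unsampled_no = 1 - (1 - p_no) ** (n - k)
    return all_sampled_yes * some_unsampled_no


if __name__ == "__main__":
    rng = random.Random(0)
    votes = [True] * 1000
    print(sampled_commit(votes, sample_size=20, rng=rng))
    print(undetected_abort_probability(n=1000, sample_size=20, p_no=0.001))
```

Under these assumptions, the coordinator's work stays constant as `n` grows, while the second function makes the cost visible: whenever the sample is smaller than the participant set, there is a nonzero chance that a "no" vote goes unseen, which is the loss of the atomicity guarantee mentioned above.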