Distributed Database Design
- DDBMS - Database Control
- DDBMS - Distribution Transparency
- DDBMS - Design Strategies
- Distributed Database Environments
Query Optimization
Concurrency Control
Failure and Recovery
- Distributed Commit Protocols
- DDBMS - Database Recovery
- DDBMS - Failure & Commit
- DDBMS - Replication Control
Distributed DBMS Security
Distributed DBMS Resources
Selected Reading
- Who is Who
- Computer Glossary
- HR Interview Questions
- Effective Resume Writing
- Questions and Answers
- UPSC IAS Exams Notes
Distributed DBMS - Reppcation Control
This chapter looks into reppcation control, which is required to maintain consistent data in all sites. We will study the reppcation control techniques and the algorithms required for reppcation control.
As discussed earper, reppcation is a technique used in distributed databases to store multiple copies of a data table at different sites. The problem with having multiple copies in multiple sites is the overhead of maintaining data consistency, particularly during update operations.
In order to maintain mutually consistent data in all sites, reppcation control techniques need to be adopted. There are two approaches for reppcation control, namely −
Synchronous Reppcation Control
Asynchronous Reppcation Control
Synchronous Reppcation Control
In synchronous reppcation approach, the database is synchronized so that all the reppcations always have the same value. A transaction requesting a data item will have access to the same value in all the sites. To ensure this uniformity, a transaction that updates a data item is expanded so that it makes the update in all the copies of the data item. Generally, two-phase commit protocol is used for the purpose.
For example, let us consider a data table PROJECT(PId, PName, PLocation). We need to run a transaction T1 that updates PLocation to ‘Mumbai’, if PLocation is ‘Bombay’. If no reppcations are there, the operations in transaction T1 will be −
Begin T1: Update PROJECT Set PLocation = Mumbai Where PLocation = Bombay ; End T1;
If the data table has two reppcas in Site A and Site B, T1 needs to spawn two children T1A and T1B corresponding to the two sites. The expanded transaction T1 will be −
Begin T1: Begin T1A : Update PROJECT Set PLocation = Mumbai Where PLocation = Bombay ; End T1A; Begin T2A : Update PROJECT Set PLocation = Mumbai Where PLocation = Bombay ; End T2A; End T1;
Asynchronous Reppcation Control
In asynchronous reppcation approach, the reppcas do not always maintain the same value. One or more reppcas may store an outdated value, and a transaction can see the different values. The process of bringing all the reppcas to the current value is called synchronization.
A popular method of synchronization is store and forward method. In this method, one site is designated as the primary site and the other sites are secondary sites. The primary site always contains updated values. All the transactions first enter the primary site. These transactions are then queued for apppcation in the secondary sites. The secondary sites are updated using rollout method only when a transaction is scheduled to execute on it.
Reppcation Control Algorithms
Some of the reppcation control algorithms are −
Master-slave reppcation control algorithm.
Distributed voting algorithm.
Majority consensus algorithm.
Circulating token algorithm.
Master-Slave Reppcation Control Algorithm
There is one master site and ‘N’ slave sites. A master algorithm runs at the master site to detect confpcts. A copy of slave algorithm runs at each slave site. The overall algorithm executes in the following two phases −
Transaction acceptance/rejection phase − When a transaction enters the transaction monitor of a slave site, the slave site sends a request to the master site. The master site checks for confpcts. If there aren’t any confpcts, the master sends an “ACK+” message to the slave site which then starts the transaction apppcation phase. Otherwise, the master sends an “ACK-” message to the slave which then rejects the transaction.
Transaction apppcation phase − Upon entering this phase, the slave site where transaction has entered broadcasts a request to all slaves for executing the transaction. On receiving the requests, the peer slaves execute the transaction and send an “ACK” to the requesting slave on completion. After the requesting slave has received “ACK” messages from all its peers, it sends a “DONE” message to the master site. The master understands that the transaction has been completed and removes it from the pending queue.
Distributed Voting Algorithm
This comprises of ‘N’ peer sites, all of whom must “OK” a transaction before it starts executing. Following are the two phases of this algorithm −
Distributed transaction acceptance phase − When a transaction enters the transaction manager of a site, it sends a transaction request to all other sites. On receiving a request, a peer site resolves confpcts using priority based voting rules. If all the peer sites are “OK” with the transaction, the requesting site starts apppcation phase. If any of the peer sites does not “OK” a transaction, the requesting site rejects the transaction.
Distributed transaction apppcation phase − Upon entering this phase, the site where the transaction has entered, broadcasts a request to all slaves for executing the transaction. On receiving the requests, the peer slaves execute the transaction and send an “ACK” message to the requesting slave on completion. After the requesting slave has received “ACK” messages from all its peers, it lets the transaction manager know that the transaction has been completed.
Majority Consensus Algorithm
This is a variation from the distributed voting algorithm, where a transaction is allowed to execute when a majority of the peers “OK” a transaction. This is spanided into three phases −
Voting phase − When a transaction enters the transaction manager of a site, it sends a transaction request to all other sites. On receiving a request, a peer site tests for confpcts using voting rules and keeps the confpcting transactions, if any, in pending queue. Then, it sends either an “OK” or a “NOT OK” message.
Transaction acceptance/rejection phase − If the requesting site receives a majority “OK” on the transaction, it accepts the transaction and broadcasts “ACCEPT” to all the sites. Otherwise, it broadcasts “REJECT” to all the sites and rejects the transaction.
Transaction apppcation phase − When a peer site receives a “REJECT” message, it removes this transaction from its pending pst and reconsiders all deferred transactions. When a peer site receives an “ACCEPT” message, it apppes the transaction and rejects all the deferred transactions in the pending queue which are in confpct with this transaction. It sends an “ACK” to the requesting slave on completion.
Circulating Token Algorithm
In this approach the transactions in the system are seriapzed using a circulating token and executed accordingly against every reppca of the database. Thus, all the transactions are accepted, i.e. none is rejected. This has two phases −
Transaction seriapzation phase − In this phase, all transactions are scheduled to run in a seriapzation order. Each transaction in each site is assigned a unique ticket from a sequential series, indicating the order of transaction. Once a transaction has been assigned a ticket, it is broadcasted to all the sites.
Transaction apppcation phase − When a site receives a transaction along with its ticket, it places the transaction for execution according to its ticket. After the transaction has finished execution, this site broadcasts an appropriate message. A transaction ends when it has completed execution in all the sites.