The concept and architecture of distributed databases. Homogeneous and heterogeneous distributed databases. Data distribution strategies in distributed databases

Lecture




A distributed database (DDB) is a set of logically related shared data, together with its descriptions, that is physically distributed across several computers (nodes) in a computer network. Each table in a DDB can be divided into parts called fragments. Fragments can be horizontal, vertical, or mixed: a horizontal fragment is a subset of the rows, a vertical fragment is a subset of the columns, and a mixed fragment combines both. Each fragment is placed on one or more nodes.

To improve data availability and system performance, individual fragments can be replicated, that is, an up-to-date copy of a fragment is maintained on several nodes. Replicas are multiple physical copies of a database object that are kept synchronized with a designated “master copy” according to rules defined in the database.

There are several alternative strategies for placing data in the system: separate (fragmented) placement, placement with full replication, and placement with selective replication. With separate placement, the database is divided into non-overlapping fragments, each stored on exactly one node. A failure of a node then makes only the data stored on that node inaccessible. With full replication, a complete copy of the entire database is placed on every node. This maximizes reliability, availability, and read performance, but it also requires the most storage and the most effort to keep the copies synchronized on updates. Selective replication combines fragmentation, replication, and centralization: some data sets are fragmented, others are replicated, and the rest is stored centrally. Because of its flexibility, this strategy is the one most often used.

Information about how the data is distributed is stored in the data distribution directory. It is used when distributing queries and transactions, to determine which copy of a fragment must be accessed to execute them.
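
The fragment types described above can be illustrated with a small sketch. It is not taken from any particular DBMS: the `EMPLOYEES` relation, its columns, and all function names are assumptions made for illustration. It shows that horizontal fragments are recombined by a union and vertical fragments by a join on the key.

```python
# Hypothetical relation used only for illustration; "id" is the primary key.
EMPLOYEES = [
    {"id": 1, "name": "Ann", "city": "Oslo", "salary": 5200},
    {"id": 2, "name": "Bob", "city": "Rome", "salary": 4100},
    {"id": 3, "name": "Eve", "city": "Oslo", "salary": 6100},
]


def horizontal_fragment(rows, predicate):
    """Return the subset of rows that satisfy the predicate."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Return the chosen columns of every row, always keeping the key."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in keep} for row in rows]


def union(*fragments, key="id"):
    """Rebuild a relation from non-overlapping horizontal fragments."""
    rows = [row for fragment in fragments for row in fragment]
    return sorted(rows, key=lambda row: row[key])


def join(left, right, key="id"):
    """Rebuild a relation from two vertical fragments by joining on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragments: subsets of rows.
oslo = horizontal_fragment(EMPLOYEES, lambda r: r["city"] == "Oslo")
other = horizontal_fragment(EMPLOYEES, lambda r: r["city"] != "Oslo")

# Vertical fragments: subsets of columns, each carrying the key.
public = vertical_fragment(EMPLOYEES, ["name", "city"])
payroll = vertical_fragment(EMPLOYEES, ["salary"])

# Mixed fragment: a vertical cut applied to a horizontal fragment.
oslo_payroll = vertical_fragment(oslo, ["salary"])
```

Each vertical fragment keeps the key column, because without it the original rows could not be reassembled by the join.
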
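
A minimal sketch of how such a directory might be consulted is shown below. The fragment names, node names, and the rule that updates go to the master copy while reads may use any reachable copy are assumptions made for this example, not a description of a specific product.

```python
# Hypothetical directory: fragment name -> nodes that hold a copy.
# By assumption, the first node in each list holds the "master copy".
DIRECTORY = {
    "employees_oslo": ["node_a", "node_b"],  # replicated fragment
    "employees_other": ["node_c"],           # fragment stored on one node
    "departments": ["node_a"],               # centrally stored data
}


def locate(fragment, available_nodes, for_update=False):
    """Pick the node that should serve a request for the fragment.

    Updates are sent to the master copy; reads go to the first
    reachable copy. Returns None when no suitable copy is reachable.
    """
    copies = DIRECTORY[fragment]
    if for_update:
        master = copies[0]
        return master if master in available_nodes else None
    for node in copies:
        if node in available_nodes:
            return node
    return None


# node_a is down: the replicated fragment can still be read from node_b.
read_node = locate("employees_oslo", {"node_b", "node_c"})

# node_c is down: the unreplicated fragment stored there is unreachable.
lost = locate("employees_other", {"node_a", "node_b"})
```

The two calls at the end mirror the text: a replicated fragment stays readable when one node fails, while a fragment stored on a single node becomes inaccessible when that node fails.
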

Database replication is affected by:
- the size of the database;
- how frequently the database is used;
- the cost of synchronizing transactions and their parts across the data replicas while maintaining sufficient fault tolerance.

Distributed databases (DDBs) can be classified as homogeneous or heterogeneous. In a homogeneous DDB, all nodes are managed by the same type of DBMS. In a heterogeneous DDB, nodes are managed by different types of DBMS, which may use different data models: relational, network, hierarchical, or object-oriented. A homogeneous DDB is much easier to design and maintain. This approach also allows the DDB to grow gradually by adding new nodes one after another. Heterogeneous DDBs usually arise when independent nodes, each already managed by its own DBMS, are integrated into a newly created DDB. A distributed DBMS (DDBMS) is the set of programs that manages a DDB and makes the distribution of data transparent to end users. The main task of the DDBMS is to integrate the local databases so that the user can access all of them as a single database.
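
The idea of presenting several local databases as a single one can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: both nodes run the same DBMS (a homogeneous setup), the `employees` table is split horizontally between them, and the `GlobalView` class and `make_node` helper are hypothetical names invented for this example.

```python
import sqlite3


def make_node(rows):
    """Create one local database (a node) holding a horizontal fragment."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


class GlobalView:
    """Hypothetical facade that hides which node stores which rows."""

    def __init__(self, nodes):
        self.nodes = nodes

    def query(self, sql, params=()):
        """Run the same query on every node and merge the results."""
        rows = []
        for conn in self.nodes:
            rows.extend(conn.execute(sql, params).fetchall())
        return sorted(rows)


node_a = make_node([(1, "Ann", "Oslo"), (3, "Eve", "Oslo")])
node_b = make_node([(2, "Bob", "Rome")])

# The caller sees one "employees" table and never names a node.
db = GlobalView([node_a, node_b])
result = db.query("SELECT id, name FROM employees WHERE id <= ?", (2,))
```

Merging per-node results in this way is only correct for simple selections and projections over non-overlapping horizontal fragments. Aggregates, joins across nodes, and updates need real distributed query processing and transaction management, which this sketch does not attempt.
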

