Sets, relationships, fact table in relational databases

Lecture



Elements of set theory

Sets

The simplest data structure used in mathematics occurs when there are no interconnections between individual isolated data. The combination of such data is a set . The concept of a set is an undefined concept. The set does not have an internal structure. A set can be imagined as a collection of elements with some common property. In order for some set of elements to be called a set, it is necessary that the following conditions be fulfilled:

  1. There should be a rule to determine whether the specified element belongs to this population.
  2. There should be a rule to distinguish elements from each other. (This, in particular, means that the set cannot contain two identical elements).

Sets are usually denoted by capital Latin letters. If the item   Sets, relationships, fact table in relational databases belongs to many   Sets, relationships, fact table in relational databases then it is denoted by:

  Sets, relationships, fact table in relational databases

If each element of the set   Sets, relationships, fact table in relational databases is also an element of the set   Sets, relationships, fact table in relational databases then they say that a lot   Sets, relationships, fact table in relational databases is a subset of the set   Sets, relationships, fact table in relational databases :

  Sets, relationships, fact table in relational databases

Subset   Sets, relationships, fact table in relational databases sets   Sets, relationships, fact table in relational databases called its own subset if

  Sets, relationships, fact table in relational databases

Using the concept of a set, one can construct more complex and informative objects.

Set operations

The main operations on sets are union , intersection, and difference .

Definition 1 . The union of two sets is a new set.

  Sets, relationships, fact table in relational databases

Definition 2 . The intersection of two sets is a new set.

  Sets, relationships, fact table in relational databases

Definition 3 . The difference of two sets is a new set.

  Sets, relationships, fact table in relational databases

If the class of objects on which different sets are defined denote   Sets, relationships, fact table in relational databases ( Universum ), then the complement of the set   Sets, relationships, fact table in relational databases call the difference

  Sets, relationships, fact table in relational databases

Cartesian product of sets

One of the ways to construct new objects from the already existing sets is the Cartesian product of sets .

Let be   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases - sets. Expression type   Sets, relationships, fact table in relational databases where   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases is called an ordered pair . Equality of the form   Sets, relationships, fact table in relational databases means that   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases . In general, you can consider ordered n-ku   Sets, relationships, fact table in relational databases of the elements   Sets, relationships, fact table in relational databases . Ordered n-ki otherwise called sets or tuples .

Definition 4 . Cartesian (direct) product of sets   Sets, relationships, fact table in relational databases is called the set of ordered n-ok (sets, tuples) of the form

  Sets, relationships, fact table in relational databases

Definition 5 . Cartesian degree   Sets, relationships, fact table in relational databases is called the number of sets n included in this Cartesian product.

Remark If all the sets   Sets, relationships, fact table in relational databases are the same then use the designation

  Sets, relationships, fact table in relational databases .

Attitude

Definition 6 . Subset   Sets, relationships, fact table in relational databases Cartesian product of sets   Sets, relationships, fact table in relational databases called the ratio of degree n ( n-ary ratio ).

Definition 7 . The power of a set of tuples related to   Sets, relationships, fact table in relational databases is called the ratio power   Sets, relationships, fact table in relational databases .

Remark The concept of relationship is very important not only from a mathematical point of view. The concept of relationship actually underlies the entire relational theory of databases. As will be shown below, relations are the mathematical analogue of tables . The term "relational data representation", first introduced by Codd [43], comes from the term relation , which is understood precisely in the sense of this definition.

Because Any set can be considered as a Cartesian product of degree 1, then any subset, like any set, can be considered a ratio of degree 1. This is not a very interesting example, indicating only that the terms "ratio of degree 1" and "subset" are synonymous. The nontriviality of the concept of a relationship is manifested when the degree of a relationship is greater than 1. Two points are key here:

First , all the elements of a relationship are of the same type of tuples. Tuples of the same type make it possible to consider them as analogs of rows in a simple table, i.e. in such a table, in which all rows consist of the same number of cells and the corresponding cells contain the same data types. For example, the relation consisting of the following three tuples {(1, "Ivanov", 1000), (2, "Petrov", 2000), (3, "Sidorov", 3000)} can be considered a table containing information about employees and their salaries. Such a table will have three rows and three columns, with each column containing data of the same type.

In contrast, consider the set {(1), (1, 2), (1, 2, 3)}, consisting of different types of numerical tuples. This set is not a relation to   Sets, relationships, fact table in relational databases nor in   Sets, relationships, fact table in relational databases nor in   Sets, relationships, fact table in relational databases . From the tuples belonging to this set it is impossible to make a simple table. However, this set can be considered a ratio of degree 1 on the set of all possible numerical tuples of all possible degrees   Sets, relationships, fact table in relational databases , but such an interpretation gives nothing new in comparison with the notion of a subset.

Secondly . Except in the extreme case when the relation is the Cartesian product itself.   Sets, relationships, fact table in relational databases The relation does not include all possible tuples from the Cartesian product. This means that for each relationship there is a criterion that allows you to determine which tuples are related and which ones are not. This criterion, in essence, determines for us the meaning (semantics) of the relationship.

Indeed, each relation can be associated with a certain logical expression.   Sets, relationships, fact table in relational databases depending on n parameters (n-place predicate) and determining whether a tuple will be   Sets, relationships, fact table in relational databases belong to a relation   Sets, relationships, fact table in relational databases . This logical expression is called the relationship predicate.   Sets, relationships, fact table in relational databases . More precisely, the tuple   Sets, relationships, fact table in relational databases belongs to the relation   Sets, relationships, fact table in relational databases if and only if the predicate of this relationship   Sets, relationships, fact table in relational databases takes the value "true". In turn, each n-local predicate gives some n-ary relation. Thus, there is a one-to-one correspondence between n-ary relations and n-local predicates.

If it does not cause confusion, it is convenient and the relationship, and its predicate denoted by the same letter. For example, the ratio   Sets, relationships, fact table in relational databases has a predicate   Sets, relationships, fact table in relational databases .

Relationship examples

Binary Relations (Relationships of Degree 2)

In mathematics, binary relations play an important role, i.e. relations defined on the Cartesian product of two sets   Sets, relationships, fact table in relational databases .

Equivalence relation

Definition 8 . Attitude   Sets, relationships, fact table in relational databases on set   Sets, relationships, fact table in relational databases is called an equivalence relation if it has the following properties:

  1.   Sets, relationships, fact table in relational databases for all   Sets, relationships, fact table in relational databases (reflexivity)
  2. If a   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (symmetry)
  3. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (transitivity)

Usually the equivalence relation is denoted by   Sets, relationships, fact table in relational databases or   Sets, relationships, fact table in relational databases and they say that it (the relation) is given on the set   Sets, relationships, fact table in relational databases (and not on   Sets, relationships, fact table in relational databases ). Conditions 1-3 in such designations look more natural:

  1.   Sets, relationships, fact table in relational databases for all   Sets, relationships, fact table in relational databases (reflexivity)
  2. If a   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (symmetry)
  3. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (transitivity)

It is easy to prove that if on the set   Sets, relationships, fact table in relational databases given an equivalence relation, then the set   Sets, relationships, fact table in relational databases divided into mutually disjoint subsets consisting of equivalent elements ( equivalence classes ).

Example 1 Consider on the set of real numbers   Sets, relationships, fact table in relational databases a relation defined simply by equality of numbers. The predicate of this relationship is:

  Sets, relationships, fact table in relational databases , or simply   Sets, relationships, fact table in relational databases

Conditions 1-3 are obviously satisfied, therefore this relation is an equivalence relation. Each equivalence class of this relation consists of one number.

Example 2 Consider a more complex equivalence relation. On the set of integers   Sets, relationships, fact table in relational databases set the relationship "equality modulo n" as follows: two numbers   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases equal modulo n if their residues when divided by n are equal. For example, modulo 5 equals numbers 2, 7, 12, etc.

Conditions 1–3 are easily verified; therefore, modulo equality is an equivalence relation. The predicate of this relationship is:

  Sets, relationships, fact table in relational databases

The equivalence classes of this relation are made up of numbers that give the same residues when divided by n. There are exactly n such classes:

  [0] = {0, n, 2n, ...}
 [1] = {1, n + 1, 2n + 1, ...}
 ...
 [n-1] = {n-1, n + n-1, 2n + n-1, ...}

Relationship order

Definition 9 . Attitude   Sets, relationships, fact table in relational databases on set   Sets, relationships, fact table in relational databases is called an order relation if it has the following properties:

  1.   Sets, relationships, fact table in relational databases for all   Sets, relationships, fact table in relational databases (reflexivity)
  2. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (antisymmetry)
  3. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (transitivity)

Usually the order relationship is indicated by   Sets, relationships, fact table in relational databases . If for two elements   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases performed   Sets, relationships, fact table in relational databases then say that   Sets, relationships, fact table in relational databases "preceded"   Sets, relationships, fact table in relational databases . As for the equivalence relation, conditions 1-3 in such notation look more natural:

  1.   Sets, relationships, fact table in relational databases for all   Sets, relationships, fact table in relational databases (reflexivity)
  2. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (antisymmetry)
  3. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (transitivity)

Example 3 A simple example of an order relation is the relation given by the usual inequality   Sets, relationships, fact table in relational databases on the set of real numbers   Sets, relationships, fact table in relational databases . Note that for any numbers   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases performed either   Sets, relationships, fact table in relational databases either   Sets, relationships, fact table in relational databases i.e. any two numbers are comparable. Such relationships are called full order relationships .

The predicate of this relationship is simply a statement.   Sets, relationships, fact table in relational databases .

Пример 4 . Рассмотрим на множестве   Sets, relationships, fact table in relational databases всех сотрудников некоторого предприятия отношение, задаваемое следующим образом: сотрудник   Sets, relationships, fact table in relational databases предшествует сотруднику   Sets, relationships, fact table in relational databases тогда и только тогда, когда выполняется одно из условий:

  •   Sets, relationships, fact table in relational databases
  •   Sets, relationships, fact table in relational databases является начальником (не обязательно непосредственным)   Sets, relationships, fact table in relational databases

Назовем такое отношение "быть начальником". Легко проверить, что отношение "быть начальником" является отношением порядка. Заметим, что в отличие от предыдущего примера, существуют такие пары сотрудников   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases , для которых не выполняется ни   Sets, relationships, fact table in relational databases , ни   Sets, relationships, fact table in relational databases (for example, if   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases являются сослуживцами). Такие отношения, в которых есть несравнимые между собой элементы, называют отношениями частичного порядка .

Функциональное отношение

Определение 10 . Отношение   Sets, relationships, fact table in relational databases на декартовом произведении двух множеств   Sets, relationships, fact table in relational databases называется функциональным отношением , если оно обладает следующим свойством:

  1. If a   Sets, relationships, fact table in relational databases and   Sets, relationships, fact table in relational databases then   Sets, relationships, fact table in relational databases (однозначность функции).

Обычно, функциональное отношение обозначают в виде функциональной зависимости -   Sets, relationships, fact table in relational databases then and only if   Sets, relationships, fact table in relational databases . Функциональные отношения (подмножества декартового произведения!) называют иначе графиком функции или графиком функциональной зависимости .

Предикат функционального отношения есть просто выражение функциональной зависимости   Sets, relationships, fact table in relational databases .

Еще пример бинарного отношения

Пример 5 . Пусть множество   Sets, relationships, fact table in relational databases есть следующее множество молодых людей: {Вовочка, Петя, Маша, Лена}, причем известны следующие факты:

  1. Little Johnny loves Little Johnny (egoist).
  2. Peter loves Masha (mutually).
  3. Masha loves Petya (mutually).
  4. Masha loves Masha (she does not forget herself).
  5. Lena loves Petya (unhappy love).

Information about the relationship of these young people can be described by the binary "love" relationship given on the set   Sets, relationships, fact table in relational databases . This relationship can be described in several ways.

Method 1. Enumeration of facts in the form of an arbitrary text (as was done above).

Method 2. In the form of a relationship graph:

  Sets, relationships, fact table in relational databases

Figure 1 Relationship Graph

Method 3. Using the relationship matrix:

Whom

Who

Little Johnny Petya Masha Lena

Little Johnny

Loves

Petya

Loves

Masha

Loves Loves

Lena

Loves

Table 1. Relationship Matrix

Method 4. Using the fact table:

Who loves Who love
Little Johnny Little Johnny
Petya Masha
Masha Petya
Masha Masha
Lena Petya

Table 2 Fact Table

From the point of view of relational databases, the fourth method is most preferable, since it allows the most convenient way to store and manipulate information. Indeed, the enumeration of facts as a textual form of information storage is appropriate for a literary work, but it is difficult to process it algorithmatically. The image in the form of a graph is visual, and it is convenient to use it as the final form of presenting information to the user, but it is inconvenient to store data in a graphical form. The relationship matrix is ​​more in line with the requirements of the information system. The matrix is ​​convenient in processing and compactly stored. But one small change, for example, Vasya appeared and fell in love with unfortunate Lena, requires restructuring of the whole matrix, namely, adding both columns and columns. . Таблица фактов свободна от всех этих недостатков - при добавлении новых действующих лиц просто добавляются новые строки.

Что касается предиката данного отношения, то он имеет следующий вид (дизъюнктивная нормальная форма):

R(x,y) = {(x = "Вовочка" AND y = "Вовочка") OR (x = "Петя" AND y = "Маша") OR (x = "Маша" AND y = "Петя") OR (x = "Маша" AND y = "Маша") OR (x = "Лена" AND y = "Петя")}

Замечание. Приведенное отношение не является ни транзитивным, ни симметричным или антисимметричным, ни рефлексивным, поэтому оно не является ни отношением эквивалентности, ни отношением порядка, ни каким-либо другим разумным отношением.

Comment. Most of the world literature exists and makes sense only insofar as the binary relation "to love" is not an equivalence relation. In particular, for this reason, humanity is not divided into equivalence classes of mutually loving individuals. The study of the characteristics of this relationship and the corresponding predicate was engaged in (and continues to be engaged) by a large number of experts, such as Tolstoy L.N., Shakespeare V. and others.

n-ary relations (relations of degree n)

In mathematics, n-ary relations are considered relatively rarely, unlike databases, where the most important are the relations defined on the Cartesian product of more than two sets .

Example 6 . At a university at the Faculty of Mathematics students are students Ivanov, Petrov and Sidorov. Lectures are given to them by the teachers of Pushkins, Tsyganov and Sharipov, and the following facts are known:

  1. Pushnikov lectures on algebra and databases, respectively, 40 and 80 hours per semester.
  2. Tsyganov lectures on geometry, 50 hours per semester.
  3. Sharipov lectures on algebra and geometry, respectively, 40 and 50 hours per semester.
  4. Student Ivanov attends lectures on algebra at Sharipov and on databases at Pushunkov.
  5. Student Petrov attends lectures on algebra in Pushnikov and on geometry in Tsyganov.
  6. Student Sidorov attends lectures on geometry at Tsyganov and on databases at Pushnikov.

Для того чтобы формально описать данную ситуацию (например, в целях разработки информационной системы, учитывающей данные о ходе учебного процесса), введем три множества:

  • Множество преподавателей   Sets, relationships, fact table in relational databases = {Пушников, Цыганов, Шарипов}.
  • Множество предметов   Sets, relationships, fact table in relational databases = {Алгебра, Геометрия, Базы данных}.
  • Множество студентов   Sets, relationships, fact table in relational databases = {Иванов, Петров, Сидоров}.

Имеющиеся факты можно разделить на две группы. 1 группа (факты 1-3) - факты о преподавателях, 2 группа (факты 4-6) - факты о студентах.

Для того чтобы отразить факты 1-3 (характеризующие преподавателей и читаемые ими лекции), введем отношение   Sets, relationships, fact table in relational databases на декартовом произведении   Sets, relationships, fact table in relational databases where   Sets, relationships, fact table in relational databases - множество рациональных чисел. А именно, упорядоченная тройка   Sets, relationships, fact table in relational databases тогда и только тогда, когда преподаватель   Sets, relationships, fact table in relational databases читает лекции по предмету   Sets, relationships, fact table in relational databases в количестве   Sets, relationships, fact table in relational databases часов в семестр. Назовем такое отношение "Читает лекции по…". Множество кортежей, образующих отношение   Sets, relationships, fact table in relational databases удобно представить в виде таблицы:

A (Преподаватель) B (Предмет) Q (Количество часов)
Пушников Algebra 40
Пушников Базы данных 80
Цыганов Geometry 50
Шарипов Algebra 40
Шарипов Geometry 50

Таблица 3 Отношение "Читает лекции по…"

Для того чтобы отразить факты 4-6 (характеризующие посещение студентами лекций), введем отношение   Sets, relationships, fact table in relational databases на декартовом произведении   Sets, relationships, fact table in relational databases . Упорядоченная тройка   Sets, relationships, fact table in relational databases тогда и только тогда, когда студент   Sets, relationships, fact table in relational databases посещает лекции по предмету   Sets, relationships, fact table in relational databases у преподавателя   Sets, relationships, fact table in relational databases . Назовем это отношение "Посещать лекции". Его также представим в виде таблицы:

C (студент) B (предмет) A (Преподаватель)
Ivanov Algebra Шарипов
Ivanov Базы данных Пушников
Петров Algebra Пушников
Петров Geometry Цыганов
Сидоров Geometry Цыганов
Сидоров Базы данных Пушников

Таблица 4 Отношение "Посещать лекции"

Рассмотрим отношение   Sets, relationships, fact table in relational databases подробнее. Оно задано на декартовом произведении   Sets, relationships, fact table in relational databases . Это произведение, содержащее 3*3*3=27 кортежей, можно назвать "Студенты-Лекции-Преподаватели". Множество   Sets, relationships, fact table in relational databases представляет собой совокупность всех возможных вариантов посещения студентами лекций. Отношение же   Sets, relationships, fact table in relational databases показывает текущее состояние учебного процесса. Очевидно, что отношение   Sets, relationships, fact table in relational databases является изменяемым во времени отношением.

Итак, факты о ходе учебного процесса удалось отразить в виде двух отношений третьей степени (3-арных), а сами отношения изобразить в виде таблиц с тремя колонками.

Удобство использования табличной формы для задания отношения определяется в данном случае следующими факторами:

  1. Все используемые множества конечны .
  2. При добавлении или удалении студентов, предметов, преподавателей просто добавляются или удаляются соответствующие строки в таблице.

Нас сейчас не интересует вопрос, хороши ли полученные отношения. Заметим пока только, что, как показывают следующие замечания, не любую строку можно добавить в таблицу "Посещать лекции".

Remark В таблицу "Посещать лекции" нельзя добавить две одинаковые строки, т.к. таблица изображает отношение   Sets, relationships, fact table in relational databases , а в отношении (как и в любом множестве) не может быть двух одинаковых элементов . Это пример синтаксического ограничения - такое ограничение задано в определении понятия отношение (одинаковых строк не может быть ни в одной таблице , задающей отношение).

Remark В таблицу "Посещать лекции" нельзя добавить кортеж (Иванов, Геометрия, Пушников). Действительно, из таблицы "Читает лекции по…", представляющей отношение   Sets, relationships, fact table in relational databases , следует, что Пушников не читает предмет "Геометрия". Оказалось, что таблицы связаны друг с другом, и существенным образом! Это пример семантического ограничения - такое ограничение является следствием нашей трактовки данных, хранящихся в отношении (следствием понимания смысла данных).

Транзитивное замыкание отношений

Введем понятие транзитивного замыкания , связанное с бинарными отношениями, которое понадобится в дальнейшем.

Определение 11. Пусть отношение   Sets, relationships, fact table in relational databases задано на декартовом квадрате   Sets, relationships, fact table in relational databases некоторого множества   Sets, relationships, fact table in relational databases . Транзитивным замыканием отношения   Sets, relationships, fact table in relational databases называется новое отношение   Sets, relationships, fact table in relational databases , состоящее из кортежей   Sets, relationships, fact table in relational databases , для которых выполняется:

  • либо кортеж   Sets, relationships, fact table in relational databases ,
  • либо найдется конечная последовательность элементов   Sets, relationships, fact table in relational databases , такая, что все кортежи   Sets, relationships, fact table in relational databases принадлежат отношению   Sets, relationships, fact table in relational databases .

It's obvious that   Sets, relationships, fact table in relational databases .

Пример 7 . Пусть множество   Sets, relationships, fact table in relational databases представляет собой следующее множество деталей и конструкций:

  Sets, relationships, fact table in relational databases = {Болт, Гайка, Двигатель, Автомобиль, Колесо, Ось}

причем некоторые из деталей и конструкций могут использоваться при сборке других конструкций. Взаимосвязь деталей описывается отношением   Sets, relationships, fact table in relational databases ("непосредственно используется в") и состоит из следующих кортежей:

Design Где используется
Болт Engine
Болт Колесо
Nut Engine
Nut Колесо
Engine Car
Колесо Car
Axis Колесо

Таблица 5 Отношение R

Транзитивное замыкание   Sets, relationships, fact table in relational databases состоит из кортежей (добавленные кортежи помечены серым цветом):

Design Где используется
Болт Engine
Болт Колесо
Nut Engine
Nut Колесо
Engine Car
Колесо Car
Axis Колесо
Болт Car
Nut Car
Axis Car

Таблица 6 Транзитивное замыкание отношения R

Очевидный смысл замыкания   Sets, relationships, fact table in relational databases состоит в описании включения деталей друг в друга не только непосредственно, а через использование их в промежуточных деталях, например, болт используется в автомобиле, т.к. он используется в двигателе, а двигатель используется в автомобиле.

findings

Множество - это неопределяемое понятие, представляющее некоторую совокупность данных. Элементы множества можно отличать друг от друга, а также определять, принадлежит ли данный элемент данному множеству. Над множествами можно выполнять операции объединения, пересечения, разности и дополнения.

Новые множества можно строить при помощи понятия декартового произведения (конечно, есть и другие способы, но они нас в данный момент не интересуют). Декартово произведение нескольких множеств - это множество кортежей , построенный из элементов этих множеств.

Отношение - это подмножество декартового произведения множеств. Отношения состоят из однотипных кортежей. Каждое отношение имеет предикат отношения и каждый n-местный предикат задает n-арное отношение.

Отношение является математическим аналогом понятия "таблица".

Отношения обладают степенью и мощностью . Степень отношения - это количество элементов в каждом кортеже отношения (аналог количества столбцов в таблице). Мощность отношения - это мощность множества кортежей отношения (аналог количества строк в таблице).

В математике чаще всего используют бинарные отношения (отношения степени 2). В теории баз данных основными являются отношения степени   Sets, relationships, fact table in relational databases .In mathematics, as a rule, relations are given on infinite sets and have infinite cardinality. In databases, on the contrary, the power relations are finite (the number of stored rows in the tables is always of course).


Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL

Terms: Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL