5. Transformation of model into relational

Lecture




The transformation of a conceptual model into a relational one is as follows:

    • Build a set of preliminary tables and specify primary keys.
    • To carry out the process of normalization.


We considered the first point in the third lesson, we are not familiar with the second one, but we will get acquainted in practice.

So, we need to build a set of tables. Make it easy, because the tables are our objects, and the fields of the tables are the attributes of the objects. A set of preliminary tables, based on our conceptual model, looks like this:

  5. Transformation of model into relational

Thus, we have defined tables, fields, primary keys (RC) and relationships (FK). Please note that in the Supply Log and Purchase Log tables, primary keys are composite, i.e. consist of two fields. Theoretically, there are tables in which all fields are one composite key.

We proceed to the second point, namely, to the normalization of relations (tables). Normalization is a step-by-step, reversible process of replacing the original schema with another schema, in which the tables have a simpler and more logical structure. What is it for?

First, to eliminate data redundancy. For example, in our example for the forum (from the third lesson), we would leave the following table:

  5. Transformation of model into relational

In the Themes field, the same names are often repeated. Besides the fact that additional storage resources will be required for their storage, when duplicating information, it is very easy to make a mistake when entering attribute values, as a result of which the database goes into an inconsistent state.

In addition, when working with such tables, so-called update anomalies may occur. For example, if we delete the fourth message from this table, then information about the topic will disappear with it. This situation is a deletion anomaly. If we decide to change the name of the topic, then we will have to look through all the lines and in each replace the old topic with a new one. This is the so-called modification anomaly. There are other types of anomalies.

Not always, these shortcomings can be addressed immediately. The normalization process is applied to eliminate them. It includes a set of rules used to validate all database tables. There are:

  • 1NF - the first normal form
  • 2NF - the second normal form
  • 3NF - the third normal form
  • NFBK is the normal form of Boys-Codd
  • 4NF - fourth normal form
  • 5NF - fifth normal form

Each normal form imposes certain restrictions on the data. Each higher level normal form assumes that the table being analyzed is already in normal form one level below the one under consideration. During normalization, the database schema becomes more and more strict, and its tables are less and less susceptible to various kinds of anomalies.

Relational databases require that its tables be in 1NF. Normal forms of higher levels can be used by developers at their discretion. However, a competent specialist tends to bring the database normalization level to at least 3NF, thereby eliminating data redundancy and update anomalies. I must say that NFBK, 4NF and 5NF are used extremely rarely. Therefore, we consider only the first three.

First normal form

A table is in first normal form if all its fields have simple (atomic) values. The very concept of atomicity is difficult to determine. The value atomic in one case may be non-atomic in the other. The general principle here is: the value is not atomic if it is used in parts. It will be clearer by example.

In our Suppliers table there is an Address field. If our store works only with suppliers from one city, then the values ​​of the Address field can be considered atomic, and the table itself - reduced to 1NF.

But what if our suppliers are in different cities? Then, sending the car for goods to a certain city, we must be sure that it will take the goods from all suppliers in this city. Those. We may need information about suppliers located in a particular city. In this case, the values ​​in the Address field are no longer atomic (since we use part of the address), and to bring the table to 1NF, we need to select another field - City:

  5. Transformation of model into relational

Thus it is necessary to analyze all the tables in our database. So, in the Buyer table there is a full name field. If we are going, for example, to congratulate our customers with names (which, as you know, depending on the name), then this field would have to be divided into three: last name, first name and patronymic. Our store is not going to do this, so the full name field can be considered atomic, and the table is reduced to 1NF.

For requests of our store all the other tables are given to 1NF.

Second normal form

This form applies to tables with composite keys. A table whose primary key includes only one field is always in 2NF.

A table is in the second normal form, if it is in the first normal form, and each non-key field is functionally fully dependent on the composite key.

In our database, two tables have a composite key - the Shopping Log and the Shipping Log. The value of the Quantity field depends on both the Delivery (Purchase) and the Goods. This means that our tables are in 2NF.

But suppose that at the conceptual modeling stage of our database, we did not select the Supply and Purchase objects. Then our tables could look like this:

  5. Transformation of model into relational

Now let's look at the table Supply Log: the Quantity field depends on the Product Name and the Delivery Date, but it does not depend on who delivered the goods (the Supplier field). Those. The table is not in 2NF. If at the stage of conceptual modeling of our database, we did not select the Supply and Purchase objects, we would have to do it now. But we allocated them, therefore all our tables are in 2NF.

Third normal form

A table is in third normal form if it is in second normal form, and each non-key field is non-transitively dependent on a primary key.

A transitive relationship is observed if one of the two non-key fields depends on the primary key, and the other depends on the first non-key field. The example will be clearer.

Let's look at our product table. There is a Price field in it, but prices, as is known, tend to change. If we change them right here, then all information about previous prices will disappear. In order not to lose this information, you must add the Date field (when the price has changed). Then our table will look like this:

  5. Transformation of model into relational

Even without resorting to 3NF, it is clear that such a table will contain redundant information. But let's look at its fields: the Name and Date fields depend on the product id, and the Price field also depends on the Date. Those. the table is not in 3NF. To eliminate transitive dependencies, it is necessary to split the object into two:

  5. Transformation of model into relational

All other tables in our database are in 3NF. By the way, in the Goods table it was possible not to enter the product id field, but to make the Name field the primary key, but as already mentioned in the third lesson, surrogate keys are still preferable.

Summarize. The scheme of our database after normalization has changed somewhat and now looks like this:

  5. Transformation of model into relational

Thus, we transformed our conceptual model into a relational one. Further it is necessary to implement this model in a specific DBMS. For this we need the DBMS itself and knowledge of the SQL language. These lessons will be devoted to SQL Lessons.

In the meantime, summarize the lessons of "Basics of the database." Designing the OBD process is usually time-consuming and slow. After all, you need to very well study the subject area to take into account all the nuances, wishes and requirements of users. Then all the collected information is displayed in the form of objects, attributes and links. And this should be done most rationally.

In general, among developers, there are different views on the database design process. Some ignore any theory and are guided only by their experience and common sense. Others see this process as art, relegating the main role of intuition. But in any case, knowledge is not superfluous. And if you add theory to your intuition and common sense, the result will be much better.

Yes, a database is just a data warehouse, but how well you organize this storage will determine the performance of your application using the data. Remember this and do not neglect the theory.

In conclusion, I would like to remind you why you need to be able to design databases. Suppose you decide to organize on your site registration of users in order to provide them access to restricted materials of the site.

To implement this issue, you will need to create a database that will store information about users, their logins and passwords. And also to make html-registration form and enter the closed section.

When a user is registered, this data is programmed (for example, using the PHP language) into the database you created. When a user enters a login and password in the login form in a private section, a request is sent to the database (in SQL) whether there is a user with such data. And if the answer is yes, then the requested page is sent to the user (of course, also using a PHP program).

Thus, in order to implement such applications you need to be able to create databases, build SQL queries to the database, and know some programming language that is applicable for developing dynamic web pages (for example, PHP).

In principle, you can first learn a programming language, and then study the database and SQL. But on this site, training is built in the following order: DB - SQL - PHP. This is done because without the databases, nothing interesting can be done in PHP. So see you in SQL Lessons.

Before moving to the lessons of SQL, you will need to install a MySQL server. In principle, you can install only it, but for full further work you will need the phpMyAdmin interface, and to work with it you need a PHP module and a local server. Therefore, I strongly recommend that you install both the Apache server, the PHP module, and MySQL immediately. In addition, you still have to install them when you start learning PHP. So go to the Preliminary Settings section and install everything in order so that you don’t have to return to it.


Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL

Terms: Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL