PostgreSQL - database overview, basics

Lecture



Standards support, features, features

PostgreSQL is based on SQL and supports many of the features of the SQL: 2011 standard.

Currently, PostgreSQL has the following limitations:

  • Maximum database size: no limit
  • Maximum table size: 32 TB
  • Maximum recording size: 1.6 TB
  • Maximum field size: 1 GB
  • Maximum records in the table: no restrictions
  • Maximum fields in the record: 250 - 1600, depending on field types
  • Maximum indexes in the table: no restrictions

Major PostgreSQL TTH

  • Reliability and stability. The reliability of PostgreSQL is a well-known fact, proved by the example of many projects in which PostgreSQL has been working without failures under high loads for several years.
  • Excellent support. The PostgreSQL community provides expert and quick help. Commercial companies offer their services worldwide.
  • Competitive work with a large load. PostgreSQL uses multi-versioning (MVCC) to ensure reliable and fast performance in competitive conditions under high load.
  • Scalable. PostgreSQL perfectly uses the modern architecture of multi-core processors - its performance grows linearly up to 64 cores. Clustered solutions based on Postgres-XC, Postgres-XL help with horizontal scalability.
  • Cross platform PostgreSQL works under all types of UNIX-like systems, including Linux, FreeBSD, Solaris, HP / UX, Mac OS X, and also under MS Windows.
  • Extensibility PostgreSQL source codes are available, making it possible to add new functionality to your project without additional problems. Extensibility PostgreSQL allows you to create new data types and access methods.
  • Availability. PostgreSQL is distributed under a license close to BSD, which does not impose any restrictions on commercial use and does not require royalties.

Main features

  • High level of compliance with ANSI SQL 92, ANSI SQL 99 and ANSI SQL 2003, 2011.
  • Interfaces for Tcl, Perl, C, C ++, PHP, JSON, ODBC, Embedded SQL in C, Python, Ruby, Java, and others.
  • Integration of data protection with the operating system (SE-Linux).
  • Representations, sequences, inheritance, outer joins, subqueries, referential integrity, window functions, CTE (recursive queries).
  • User functions, stored procedures, triggers.
  • Procedural languages ​​PL / PgSQL, PL / Perl, PL / Python, PL / Java and others.
  • An extensible set of data types with index support (GiST, GIN, SP-GiST).
  • Built-in full-text search with support for all European languages.
  • Native support for semi-structured data (xml, json, jsonb) with index support.
  • Hot backup and replication (synchronous, asynchronous, cascade), PITR, bidirectional (BDR).
  • Full ACID support, isolation levels, efficient transaction serialization.
  • Functional and partial indexes.
  • Internationalization, Unicode and locale support.
  • Downloadable extensions, for example PostGIS, hstore.
  • SSL and Kerberos authentication support.
  • Foreign Data Wrappers (writable), support for all major databases.

Functions

Functions are blocks of code that are executed on the server and not on the database client. Although they can be written in pure SQL, the implementation of additional logic, such as conditional jumps and loops, goes beyond SQL itself and requires the use of some language extensions. Functions can be written using one of the following languages:

  • The built-in procedural language PL / pgSQL, in many respects similar to the PL / SQL language used in the Oracle DBMS;
  • Script languages ​​- PL / Lua, PL / LOLCODE, PL / Perl, PL / PHP, PL / Python, PL / Ruby, PL / sh, PL / Tcl and PL / Scheme;
  • Classic languages ​​- C, C ++, Java (via the PL / Java module);
  • Statistical language R (via PL / R module).

PostgreSQL allows the use of functions that return a set of records, which can then be used in the same way as the result of a regular query.

Functions can be performed both with the rights of their creator, and with the rights of the current user.

Sometimes functions are identified with stored procedures, but there is a difference between these concepts. From the ninth version it is possible to write autonomous blocks that allow you to execute code in procedural languages ​​without writing functions directly in the client.

Triggers

Triggers are defined as functions initiated by DML operations. For example, an INSERT operation may trigger a trigger that checks the added record for compliance with certain conditions. When writing functions for triggers, different programming languages ​​can be used.

Triggers are associated with tables. Multiple triggers are performed in alphabetical order.

Rules and Submissions

The rules mechanism is a mechanism for creating custom handlers not only for DML operations, but also for selection operations. The main difference from the trigger mechanism is that the rules are triggered at the request parsing stage, before choosing the optimal execution plan and the execution process itself. The rules allow you to override the behavior of the system when executing a SQL operation on a table.

A good example is the implementation of the presentation mechanism (views). When creating a view, a rule is created that determines that instead of performing a fetch operation on the view, the system must perform a fetch operation on the base table / tables, taking into account the selection conditions underlying the view definition. To create views that support update operations, the rules for inserting, modifying, and deleting rows must be defined by the user.

Indices

PostgreSQL supports indexes of the following types: B-tree, hash, R-tree, GiST, GIN. If necessary, you can create new types of indexes, although this is not a trivial process.

PostgreSQL indexes have the following properties:

  • it is possible to view the index not only in direct, but also in reverse order - the creation of a separate index is not necessary for the operation of the ORDER BY ... DESC construct;
  • it is possible to create an index over several columns of the table, including over columns of different data types;
  • indexes can be functional, i.e. not be built on the basis of a set of values ​​of a certain column (s), but on the basis of a set of function values ​​from a set of values;
  • indices can be partial, that is, be built only in part of the table (according to some of its projections); in some cases, it helps to create much more compact indices or achieve performance improvements by using different types of indices for different (for example, in terms of update frequency) parts of the table;
  • A query planner can use multiple indexes at the same time to perform complex queries.

Multi-Version (MVCC)

PostgreSQL supports simultaneous modification of a database by several users using the Multiversion Concurrency Control (MVCC) mechanism. Due to this, the ACID requirements are met, and there is practically no need for read locks.

Data types

PostgreSQL supports a large set of built-in data types:

  • Numeric types
    • Whole
    • Fixed point
    • Float
    • Money type (different special output format, but otherwise similar to fixed-point numbers with two decimal places)
  • Arbitrary character types
  • Binary types (including BLOB)
  • Date / time types (fully supporting various formats, accuracy, output formats, including recent changes in time zones)
  • Boolean type
  • Enumeration
  • Geometric primitives
  • Network types
    • IP and IPv6 addresses
    • CIDR format
    • MAC address
  • UUID ID
  • XML data
  • Arrays
  • Json
  • Identifiers of database objects
  • Pseudotypes
  • Types for text search
  • Range Types

Moreover, the user can independently create new types required by him and program indexing mechanisms for them using GiST.

User objects

PostgreSQL can be extended by the user for their own needs in almost any aspect. It is possible to add your own:

  • Type conversion
  • Data types
  • Domains (custom types with constraints initially imposed)
  • Functions (including aggregate)
  • Indices
  • Operators (including overriding existing ones)
  • Procedural languages

Inheritance

Tables can inherit characteristics and field sets from other tables (parent). In this case, the data added to the generated table will automatically participate (if this is not specified separately) in the queries to the parent table.

Commercial extensions

EnterpriseDB created more powerful versions of this DBMS for commercial use based on PostgreSQL — Postgres Plus (consisting entirely of open source products; the board is required only when commercial support for the product is needed) and Postgres Plus Advanced Server (PostgreSQL extension opportunities for compatibility with Oracle Database). The delivery of these products contains a large set of software for developers and DBA:

  • Postgres Studio - a more powerful analogue of pgAdmin;
  • Postgres Plus Debugger - a debugger for PL / pgSQL code integrated with the previous package;
  • Migration Studio - a tool to automatically convert databases from MySQL / Oracle to PostgreSQL

Conclusion

PostgreSQL is currently one of the most (if not the most-most) promising DBMS in the world. Thanks to great architecture, gratuity, great community and great opportunities.

PostgreSQL is usually compared to MySQL. But at the moment, PostgreSQL is far beyond the scope of MySQL. And, due to its capabilities in data analytics and manipulation, it can often compete with the recognized leaders of the enterprise-DBMS market: Oracle and MS SQL.

And thanks to its non-relational data storage capabilities (JSON, text search types, HStore), PostgreSQL directly competes with NoSQL solutions (for example, MongoDB).


Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL

Terms: Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL