SQL - a universal language for working with databases

Track the website visitors

Dear readers of our blog, we'd like to recommend you to visit the main page of our website, where you can learn about our product SQLS*Plus and its advantages.

SQLS*Plus - best SQL Server command line reporting and automation tool! SQLS*Plus is several orders of magnitude better than SQL Server sqlcmd and osql command line tools.

REQUEST COMPLIMENTARY SQLS*PLUS LICENCE

Enteros UpBeat offers a patented database performance management SaaS platform. It proactively identifies root causes of complex revenue-impacting database performance issues across a growing number of RDBMS, NoSQL, and deep/machine learning database platforms. We support Oracle, SQL Server, IBM DB2, MongoDB, Casandra, MySQL, Amazon Aurora, and other database systems.

5 June 2020

SQL is often called the language of Esperanto for Database Management Systems (DBMS). Indeed, there is no other Database Management System (DMS) language in the world that is so widely used in programs.

The first sql standard was introduced in 1986 and has now gained universal acceptance. It can be used even when working with non-relational DBMS. Unlike other software tools, such as C and Kobol, which are the prerogative of professional programmers, sql is used by specialists from various fields. Programmers, DBMS administrators, business analysts – all of them successfully process data using sql. Knowledge of this language is useful for everyone who has to deal with the database.

Database Management Systems

In this article we will consider the basic concepts of SQL. We will tell its prehistory (and dispel several myths along the way). You will get acquainted with the relational model and will be able to acquire the first skills of working with sql, which will help you further master the language.

Is it hard to learn SQL? It depends on how deeply you are going to get to the bottom of it. In order to become a professional you will have to learn a lot. sql was introduced in 1974 as a small 23-page research paper and has come a long way since then.

The text of the current standard – the official document “the international standard database language SQL” (usually called sql-92) – contains more than six hundred pages, but it does not say anything about the specific features of the versions of sol, implemented in the DBMS firms microsoft, oracle, sybase and others.

The language is so developed and diverse that only a simple enumeration of its features will require several journal articles, and if you collect everything written on the subject of sol, you will get a multi-volume library.

However, it is not necessary for an ordinary user to know sql completely. As a tourist who finds himself in a country where people speak an incomprehensible language, it’s enough to learn only a few useful expressions and grammar rules, and in sql – knowing little, you can get many useful results.

In this article, we will take a look at the basic commands of sql, rules of setting criteria for data selection and show how to get results. As a result, you will be able to create tables and enter information into them yourself, make queries and work with reports. This knowledge can become the basis for further independent development of sql.

What is SQL?

SQL is a specialized non-procedural language that allows you to describe data, select and process information from relational DBMS. Specialization means that sol is intended only for working with the database; you cannot create a full-fledged application system by means of this language only – this requires using other languages in which you can embed sql commands. That’s why sql is also called an auxiliary language tool for data processing. An auxiliary language is used only in combination with other languages.

A general-purpose application language usually has tools for creating procedures, but not in sql. It cannot be used to specify how a task is to be performed, but only to determine what it is. In other words, when working with sql, we are interested in the results, not in the procedures for obtaining them.

The most essential property of sql is the ability to access relational databases. Many even think that the expressions “a database processed by means of sql” and “relational database” are synonyms. However, you will soon see that there is a difference between them. The standard sql-92 does not even contain the term relation.

What is a relational DBMS?

Without going into details, a relational DBMS is a system based on a relational data management model.

The concept of a relational model was first proposed in the paper by Dr. E.F. Kodd, published in 1970. It described a mathematical apparatus for data structuring and management, and proposed an abstract model to represent any real information. Prior to that, the specific features of storing information in the database had to be taken into account when using the database.

If the internal database structure was changed (for example, to increase the performance), it was necessary to recycle the application programs, even if there were no changes on the logical level. The relational model made it possible to separate the particular features of data storage from the application layer. In fact, the model does not describe in any way how the information is stored and accessed.

Only the way this information is perceived by the user is taken into account. With the advent of the relational model, there has been a qualitative change in the approach to data management from art to science, leading to a revolutionary development of the industry.

The basic concepts of the relational model

According to the relational model, relation is a certain table with data. A relationship may have one or more attributes (attributes) corresponding to the columns of this table, and some set (possibly empty) of data, which are sets of these attributes (they are called n-arched tuples, or records) and corresponding rows of the table.

For any tuple, attribute values must belong to the so-called domains. In fact, a domain is a certain set of data, which sets a set of all valid values.

Let’s look at an example. Let there be a domain Days of the week, containing values from Monday to Sunday. If the relationship has a DayWeek attribute corresponding to this domain, then in any motorcade relationship column in the DayWeek must have one of these values. The appearance of the values January or Cat is not allowed.

Please note: the attribute must have one of the allowed values. It is forbidden to set several values at once. Thus, in addition to the requirement that the attribute’s values belong to a certain domain, the condition of its atomicity must be met. It means that decomposition is inadmissible for these values, i.e. it is impossible to break them into smaller parts without losing the basic meaning.

For example, if the attribute’s value contained Monday and Tuesday at the same time, it would be possible to separate two parts keeping the original meaning – DayWeek; therefore, this attribute’s value is not atomic. However, if you try to break down the value of “Monday”, you would get a set of separate letters – from “M” to “Y”; the original meaning is lost, so the value of “Monday” is atomic.

Relationships have other properties as well. The most significant of them is the mathematical property of closed operations. This means that as a result of any operation on a relationship, a new relationship must appear. This property allows to get predictable results when performing mathematical operations on a relationship. In addition, it is possible to present operations as abstract expressions with different nesting levels.

In his initial work, Dr. Codd defined a set of eight operators called relational algebra. The four operators – union, logical multiplication, difference and Cartesian work – were derived from traditional set theory; the other operators were created specifically for relationship processing.

In subsequent works by Dr. Codd, Chris Data and other researchers, additional operators have been proposed. Three relational operators – project, select, or restrict, and merge – will be discussed later in this paper.

SQL and relational model

Now that you’ve met the relational model, let’s forget about it. Of course, not forever, but only to explain the following: although it was the relational model proposed by Dr. Codd that was used in the development of sql, there is no full or literal correspondence between them (this is one of the reasons why the standard sql-92 lacks the term relationality).

For example, the notions of table sql and the relation are not equal, because tables can have several identical rows at once, while in relations the appearance of identical tuples is not allowed. In addition, the use of relational domains in sql is not provided, although to some extent, their role is played by data types (some influential supporters of the relational model are now trying to achieve inclusion in the future standard sql relational domains).

Unfortunately, the mismatch between the sql and the relational model has given rise to many misunderstandings and disputes over the years. But since the main topic of the article is studying sql rather than the relational model, these problems are not considered here. Just keep in mind that there are differences between the terms used in sql and the relational model.

Further in the article, only the terms used in sql will be used. Instead of relations, attributes and tuples, we will apply their sql equivalents: tables, columns and rows.

Static and dynamic SQL

You may already be familiar with terms such as static and dynamic sql. A sql request is static if it is compiled and optimized at the stage before the program execution. We have already mentioned one of the forms of static sql when talking about embedding sql commands in C or Kobol programs (there is another name for such expressions – embedded sql).

As you can probably guess, a dynamic sql request is compiled and optimized during the program execution. As a rule, ordinary users use dynamic sql, which allows them to create requests according to their immediate needs.

One of the options for using dynamic sql requests is their interactive or direct call (there is even a special term – directsql), when requests sent for processing are entered interactively from the terminal. There are certain differences between static and dynamic sql in syntax of applied constructions and peculiarities of execution, but these questions are beyond the scope of the article.

Let us only note that for the sake of clarity, examples are given in the form of direct sql requests, as it allows learning to use sql not only for programmers, but for most end users as well.

How to study SQL

Now you’re ready to write your first sql queries. If you have access to the database via sql and would like to use our examples in practice, please note the following: you must log in as a user with unlimited permissions and you will need software for interactive processing of sql queries (if it is a network database, you should talk to the database administrator about granting you the appropriate permissions). If there is no access to the database via sql – do not be upset: all the examples are very simple and you can understand them “dry”, without going to the machine.

In order to perform any actions in sql, you should execute an expression in sql. There are several types of expressions, but among them there are three main groups: ddl-commands (data definition language), dml-commands (data manipulation language) and data control tools. Thus, in a sense, three different languages are combined in sql.

Data description language commands

Let’s start with one of the main ddl commands – create table. There are several types of tables in sql, the basic types are base and the selective types are views. The base tables are those that relate to actual data; the sample tables are “virtual” tables that are created based on information obtained from the base tables; but for users, the forms look like regular tables. The create table command is intended for creating base tables.

In the create table command, you should specify a table name, a list of columns, and the types of data they contain. Other optional elements may also be present as parameters, but first let’s consider only the basic parameters. Let’s show the simplest syntactic form for this command:

create table TableName ( Data Type Column ) ;

create and table are sql keywords; Table Name, Column and Data Type are formal parameters, instead of which the user enters the actual values each time. The Column and Data Type options are enclosed in parentheses. In sql, parentheses are usually used to group individual elements. In this case they allow you to combine the definitions for a column. The “semicolon” sign at the end is the command separator. It must end any expression in sql.

Let us consider an example. Let’s create a table for storing data about all the elements. To do this, you should enter the command in sql:

create table appoints ( appoint_date date ) ;

After executing this command, a table will be created with the names of the appointments, where there is one column appointment_date, in which data such as date can be written. Since no data has been entered so far, the number of rows in the table is zero (using the create table command only defines the table; the actual values are entered by the insert command, which is discussed below).

The parameters appointments and appointment_date are called identifiers because they set the names for the specific database objects, in this case – the names for the table and the column respectively. In sql there are identifiers of two types: regular and delimited. Selected identifiers consist of double quotes and take into account the case of the characters used. Regular identifiers are not distinguished by any delimited characters, and their spelling is case insensitive. In this article, only regular identifiers are used.

Symbols used to build identifiers must meet certain rules. Only letters (not necessarily Latin, but also other alphabets), numbers and underscores may be used in normal identifiers. An identifier must not contain punctuation marks, spaces or special characters (#, @, % or !); moreover, it must not start with a number or an underscore.

Separate sql keywords may be used for identifiers, but this is not recommended. An identifier is intended to denote some object, so it must have a unique (within a certain context) name: you cannot create a table with a name already encountered in the database; you cannot have columns with identical names in one table. By the way, keep in mind that the appointments are the same and the names for sql. You cannot create a new identifier by changing the letter case alone.

Although a table can only have one column, in practice, tables with multiple columns are usually required. The command for creating such a table looks like this in general terms:

create table Table_Name_id (Data_Type_Column [ { , Data_Type_Column } ] ) ;

Square brackets are used to denote optional elements, curly braces contain elements that can represent a list of single-track constructions (when you enter a real sql-command neither of those brackets are put). This syntax allows you to specify any number of columns. Note that the second element is preceded by a comma. If there are several parameters in the list, they are separated by commas.

Tables that contain only one column, such as our appointments, are rare in practice. Let’s look at another, more useful example.

create table appoints2 ( appoint_date date , appoint_time time , description varchar ( 256 ) ) ;

This command creates a table2 (the new table must have a different name since the table is already present in the database). Like the first table, it has an appointment_date column to record the date of meetings; in addition, there is an appointment_time column to record the time of these meetings.

The description parameter is a text string that can contain up to 256 characters. The varchar type (short for character varying) is specified for this parameter, as it is not known in advance how much space will be needed to record, but it is clear that the description will not take more than 256 characters.

When describing a parameter in the type a character string (and some other types) is specified length of the parameter. Its value is specified in parentheses to the right of the type name.

You may have noticed that in the two examples discussed, the command entry is differently designed. In the first case, if the command is completely placed in one row, then in the second case, after the first open parenthesis, the entry continues with a new row and the definition of each next column starts with a new row.

In sql, there are no special requirements for the design of the record. Splitting a record into rows makes it easier to read. When writing commands, sql allows you not only to split the command into rows, but also to insert indents at the beginning of rows and spaces between the elements of the record.

Now that you know the basic rules, let’s take a more complex example of creating a table with multiple columns. At the beginning of this article, the employee table was shown. It contains the following columns: last name, first name, date of employment, division, category and salary for the year. The following command sql is used to define this table:

create table empls ( l_name character ( 13 ) not null, f_name character ( 10 ) not null, hire_date_id date , branch_office_id character ( 15 ) , grade_level_id smallint , salary_id decimal ( 9 , 2 ) ) ;

There are several new elements in the team. First of all, this expression is not null, which stands at the end of last_name and first_name columns definition. With the help of such constructs, the requirements to be met are set. In this case, it is specified that the last_name and first_name fields must be filled in when entering; it is not allowed to leave these columns empty (it is quite logical: how can one identify an employee without knowing his name?).

Besides, there are three new data types in the example: character, smallint and decimal. So far, we have hardly talked about types. Although there are no relational domains in sql, there is a set of basic data types. This information is used when allocating memory and comparing values; to some extent, it narrows down the list of possible values to be entered, but type control in sql is less strict than in other languages.

All data types available in sql can be divided into six groups: character strings, exact numeric values, approximate numeric values, bit strings, date time and intervals. We have listed all varieties, but this article will only cover some of them in detail (bit strings, for example, are not of particular interest to ordinary users).

By the way, if you thought that date time is a misprint, you made a mistake. This group (datetime) includes most of the time related data types used in sql (parameters such as time intervals are allocated to a separate group). In the previous example, there have already been two data types from the datetime group – date and time.

The next data type you are already familiar with is character varying (or just varchar); it belongs to a character string group. If varchar is used to store strings of variable length, then the char type found in the third example is for recording strings with a fixed number of characters.

For example, in the last_name column, rows of 13 characters will be written regardless of the actual surnames entered, be it poe or penworth-chickering (in the case of poe, the remaining 10 characters will be filled with spaces).

From the user’s point of view, varchar and char have the same meaning. Why did you have to enter two types? The point is that in practice you usually have to find a compromise between performance and disk space saving. As a rule, the use of fixed-length strings gives some advantage in access speed, but if the strings are too long, the disk space is not used economically.

If 256 characters are reserved for each comment line, it may not be rational; most often the lines will be much shorter. On the other hand, names also have different lengths, but usually require about 13 characters; in this case, the loss will be minimal. There is a good rule: if you know that the length of a string changes slightly or it is relatively small, use char; in other cases, use varchar.

The following two new data types – smallint and decimal – belong to the group of exact numerical values. smallint is an abbreviated name from small integer. The integer data type is also provided in sql. The presence of two similar types in this case is also explained by the consideration of space saving.

In our example, the values of the grade_level parameter can be represented by a two-digit number, so the smallint type is used; however, in practice it is not always known what maximum values the parameters can have. If there is no such information, you should use integer. The actual volume allocated to store smallint and integer parameters and the corresponding range of values for these parameters are individual for each platform.

The decimal data type, usually used for accounting of financial indicators, allows you to specify a template with the required number of decimal digits. Because this type is used for accurate numeric recording, it guarantees accuracy when mathematical operations are performed on decimal data.

If you use data types from a group of approximate numbers, such as float (floating point number), for decimal values, this will lead to rounding errors, so this option is not suitable for financial calculations. The following form of record is used to define parameters of decimal type:

decimal(p,d),

where p – is the number of decimal places, d – is the number of digits after the decimal point. Instead of p, the total number of significant digits in the values used should be written, and instead of d, the number of digits after the decimal point.

Create table

The “Create table” box shows the full version of the generalized entry of the create table command. It contains the new elements and shows the format for all the data types we have considered (In principle, there are other data types, but we are not considering them yet).

At first, it might seem that the syntax of the sql command is too complex. But you can easily understand it if you have studied the above examples carefully. On the scheme there is an additional element – a vertical line; it serves for differentiation of alternative constructions.

In other words, at definition of each column it is necessary to choose suitable type of the data (as you remember, in square brackets consist optional parameters, and in curly braces – designs which can repeat repeatedly; in real sql-teams these special symbols are not written). The first part of the scheme contains full names for data types, and the second part contains their abbreviated names; in practice, you can use any of them.

The first part of the article is over. The second part will study dml commands insert, select, update and delete. Data sampling conditions, comparison and logical operators, null-value usage and ternary logic will also be discussed.

Table creation. Syntax of the command create table: optional parameters are given in square brackets, repetitive constructions are given in curly brackets.

create table table ( column character (length) [ constraint ] | character varying (length) [ constraint ] | date [ constraint ] | time [ constraint ] | integer [ constraint ] | smallint [ constraint ] | decimal (precision, decimal places) [ constraint ] | float (precision) [ constraint ]
[{ , column char (length) [ constraint ] | varchar (length) [ constraint ] | date [ constraint ] | time [ constraint ] | int [ constraint ] | smallint [ constraint ] | dec (precision, decimal places) [ constraint ] | float (precision) [ constraint ] }])

The secret of the name SQL

In the early 1970s, ibm started to put the relational database model proposed by Dr. Codd into practice. Donald Chamberlin and a team from the Advanced Research Unit created a prototype language called structured english query language, or simply sequel. It has since been expanded and refined.

The new version proposed by ibm is called sequel/2. It was used as a software interface (api) for designing the first relational system of an ibm database – system/r. For legal reasons, ibm decided to change the name: instead of sequel/2, use sql (structured query language).

There are significant differences between the early sequel prototypes and the currently recognized sql standard in various organizations. Jim Melton, who prepared the standard sql-92, even stated that many were wrong to think that the word “structured” correctly reflected the specifics of that language (jim melton and alan r. simon “understanding the new sql: a complete guide”. san francisco: morgan kaufmann, 1993. isbn: 1-55860-245-3). So actually, sql is just a name, a sequence of s-q-l letters and nothing more.

SQL basics and creating a simple database – SQL tutorial for beginners (video)

Tags: data types, SQL, sql server, sqls

NoSQL life without a schema

NoSQL: life without a schema

PreambleNoSql is not a replacement for SQL databases but is a valid alternative for many situations where standard SQL is not the best approach for...

SQLS*Plus - MongoDB Sample operators 1

MongoDB Sample operators

PreambleMongoDB Conditional operators specify a condition to which the value of the document field shall correspond.Comparison Query Operators $eq...

5 Database management trends impacting database administration

5 Database management trends impacting database administrationIn the realm of database management systems, moreover half (52%) of your competitors feel...

SQL Server data types

SQL Server data types

The data type is defined as the type of data that any column or variable can store in MS SQL Server. What is the data type? When you create any table or...

SQL Server Architecture

SQL Server Architecture

PreambleMS SQL Server is a client-server architecture. MS SQL Server process starts with the client application sending a query.SQL Server accepts,...

Get a better understanding of the MongoDB master slave configuration

Get a better understanding of the MongoDB master slave configuration

First the basics: what is the master/slave?One database server (“master”) responds and can do anything. A lot of other database servers store copies of all...

Run a MongoDB data adapter for Atom Hopper with Netbeans

Run a MongoDB data adapter for Atom Hopper with Netbeans

PreambleAtom Hopper (based on Apache Abdera) for those who may not know is an open-source project sponsored by Rackspace. Today we will figure out how to...

Improve MongoDB aggregation structure

Improve MongoDB aggregation structure

PreambleMongoDB recently introduced its new aggregation structure. This structure provides a simpler solution for calculating aggregated values rather...

Passing MongoDB data modeling

Passing MongoDB data modeling

FlexibilityOne of the most advertised features of MongoDB is its flexibility. Flexibility, however, is a double-edged sword. More flexibility means more...

SQLShell: SQL tool for multiple databases with NoSQL potential

SQLShell: SQL tool for multiple databases with NoSQL potential

PreambleSQLShell is a cross-platform command-line tool for SQL, similar to psql for PostgreSQL or MySQL command-line tool for MySQL.Why use it?If you...

Use Mongosniff to clarify what your MongoDB hears and says

Use Mongosniff to clarify what your MongoDB hears and says

PreambleWriting an application on top of the framework on top of the driver on top of the database is a bit like a game on the phone: you say “insert...

Notes on Oracle Coherence

Notes on Oracle Coherence

PreambleOracle Coherence is a distributed cache that is functionally comparable with Memcached. In addition to the basic function of the API cache, it...

FrankenQueries: when SQL and NoSQL collide

FrankenQueries: when SQL and NoSQL collide

PreambleIBM pureXML, a proprietary XML database built on a relational mechanism (designed for puns) that offers both relational ( SQL / XML ) and...

PostgreSQL array

PostgreSQL array

What is PostgreSQL array? In PostgreSQL we can define a column as an array of valid data types. The data type can be built-in, custom or enumerated....

7 steps to create a new Oracle database from the command line

7 steps to create a new Oracle database from the command line

PreambleIf you are a Linux sysadmin or developer, there comes a time when you need to manage an Oracle database that can work in your environment.In this...

Restoring access to the SQL Server instance without restarting it

Restoring access to the SQL Server instance without restarting

PreambleStarting with Microsoft SQL Server 2008, by default, the group of local administrators is no longer added to SQL Server administrators during the...