Data amount processed by modern databases grows steadily. In this regard, there is an acute problem – database performance. Insert, Update and Delete operations have to be performed as fast as possible. Therefore Devart provides several solutions to speed up processing of huge amounts of data. So, for example, insertion of a large portion of data to a DB is supported in the TPgLoader. Unfortunately, TPgLoader allows to insert data only – it can’t be used for updating and deleting data.
The new version of Devart Delphi Data Access Components introduces the new mechanism for large data processing — Batch Operations. The point is that just one parametrized Modify SQL query is executed. The plurality of changes is due to the fact that parameters of such a query will be not single values, but a full array of values. Such approach increases the speed of data operations dramatically. Moreover, in contrast to using TPgLoader, Batch operations can be used not only for insertion, but for modification and deletion as well.
Let’s have a better look at capabilities of Batch operations with an example of the BATCH_TEST table containing attributes of the most popular data types.
CREATE TABLE BATCH_TEST
(
ID INTEGER,
F_INTEGER INTEGER,
F_FLOAT DOUBLE PRECISION,
F_STRING VARCHAR(250),
F_DATE DATE,
CONSTRAINT PK_BATCH_TEST PRIMARY KEY (ID)
)
To insert records into the BATCH_TEST table, we use the following SQL query:
INSERT INTO BATCH_TEST VALUES (:ID, :F_INTEGER, :F_FLOAT, :F_STRING, :F_DATE)
When a simple insertion operation is used, the query parameter values look as follows:
Parameters | ||||
---|---|---|---|---|
:ID | :F_INTEGER | :F_FLOAT | :F_STRING | :F_DATE |
1 | 100 | 2.5 | ‘String Value 1′ | 01.09.2015 |
After the query execution, one record will be inserted into the BATCH_TEST table.
When using Batch operations, the query and its parameters remain unchanged. However, parameter values will be enclosed in an array:
Parameters | ||||
---|---|---|---|---|
:ID | :F_INTEGER | :F_FLOAT | :F_STRING | :F_DATE |
1 | 100 | 2.5 | ‘String Value 1′ | 01.09.2015 |
2 | 200 | 3.15 | ‘String Value 2′ | 01.01.2000 |
3 | 300 | 5.08 | ‘String Value 3′ | 09.09.2010 |
4 | 400 | 7.5343 | ‘String Value 4′ | 10.10.2015 |
5 | 500 | 0.4555 | ‘String Value 5′ | 01.09.2015 |
Now, 5 records are inserted into the table at a time on query execution.
How to implement a Batch operation in the code?
Let’s try to insert 1000 rows to the BATCH_TEST table using a Batch Insert operation:
var
i: Integer;
begin
// describe the SQL query
PgQuery1.SQL.Text := 'INSERT INTO BATCH_TEST VALUES (:ID, :F_INTEGER, :F_FLOAT, :F_STRING, :F_DATE)';
// define the parameter types passed to the query :
PgQuery1.Params[0].DataType := ftInteger;
PgQuery1.Params[1].DataType := ftInteger;
PgQuery1.Params[2].DataType := ftFloat;
PgQuery1.Params[3].DataType := ftString;
PgQuery1.Params[4].DataType := ftDateTime;
// specify the array dimension:
PgQuery1.Params.ValueCount := 1000;
// populate the array with parameter values:
for i := 0 to PgQuery1.Params.ValueCount - 1 do begin
PgQuery1.Params[0][i].AsInteger := i + 1;
PgQuery1.Params[1][i].AsInteger := i + 2000 + 1;
PgQuery1.Params[2][i].AsFloat := (i + 1) / 12;
PgQuery1.Params[3][i].AsString := 'Values ' + IntToStr(i + 1);
PgQuery1.Params[4][i].AsDateTime := Now;
end;
// insert 1000 rows into the BATCH_TEST table
PgQuery1.Execute(1000);
end;
This command will insert 1000 rows to the table with one SQL query using the prepared array of parameter values. The number of inserted rows is defined in the Iters parameter of the Execute(Iters: integer; Offset: integer = 0) method. In addition, you can pass another parameter – Offset (0 by default) – to the method. The Offset parameter points the array element, which the Batch operation starts from.
We can insert 1000 records into the BATCH_TEST table in 2 ways.
All 1000 rows at a time:
PgQuery1.Execute(1000);
2×500 rows:
// insert first 500 rows
PgQuery1.Execute(500, 0);
// insert next 500 rows
PgQuery1.Execute(500, 500);
500 rows, then 300, and finally 200:
// insert 500 rows PgQuery1.Execute(500, 0); // insert next 300 rows starting from 500 PgQuery1.Execute(300, 500); // insert next 200 rows starting from 800 PgQuery1.Execute(200, 800);
With Batch operations we can modify all 1000 rows of our BATCH_TEST table just this simple:
var
i: Integer;
begin
// describe the SQL query
PgQuery1.SQL.Text := 'UPDATE BATCH_TEST SET F_INTEGER=:F_INTEGER, F_FLOAT=:F_FLOAT, F_STRING=:F_STRING, F_DATE=:F_DATE WHERE ID=:OLDID';
// define parameter types passed to the query:
PgQuery1.Params[0].DataType := ftInteger;
PgQuery1.Params[1].DataType := ftFloat;
PgQuery1.Params[2].DataType := ftString;
PgQuery1.Params[3].DataType := ftDateTime;
PgQuery1.Params[4].DataType := ftInteger;
// specify the array dimension:
PgQuery1.Params.ValueCount := 1000;
// populate the array with parameter values:
for i := 0 to 1000 - 1 do begin
PgQuery1.Params[0][i].AsInteger := i - 2000 + 1;
PgQuery1.Params[1][i].AsFloat := (i + 1) / 100;
PgQuery1.Params[2][i].AsString := 'New Values ' + IntToStr(i + 1);
PgQuery1.Params[3][i].AsDateTime := Now;
PgQuery1.Params[4][i].AsInteger := i + 1;
end;
// update 1000 rows in the BATCH_TEST table
PgQuery1.Execute(1000);
end;
Deleting 1000 rows from the BATCH_TEST table looks like the following operation:
var
i: Integer;
begin
// describe the SQL query
PgQuery1.SQL.Text := 'DELETE FROM BATCH_TEST WHERE ID=:ID';
// define parameter types passed to the query:
PgQuery1.Params[0].DataType := ftInteger;
// specify the array dimension
PgQuery1.Params.ValueCount := 1000;
// populate the arrays with parameter values
for i := 0 to 1000 - 1 do
PgQuery1.Params[0][i].AsInteger := i + 1;
// delete 1000 rows from the BATCH_TEST table
PgQuery1.Execute(1000);
end;
The example with BATCH_TEST table allows to analyze execution speed of normal operations with a database and Batch operations:
Operation Type | 25 000 records | ||
---|---|---|---|
Standard Operation (sec.) | Batch Operation (sec.) | Insert | 346.7 | 1.69 |
Update | 334.4 | 4.59 | |
Delete | 373.7 | 2.05 | |
The less, the better. |
It should be noted, that the retrieved results may differ when modifying the same table on different database servers. This is due to the fact that operations execution speed may differ depending on the settings of a particular server, its current workload, throughput, network connection, etc.
When populating the array and inserting records, we accessed query parameters by index. It would be more obvious to access parameters by name:
for i := 0 to 9999 do begin
PgQuery1.Params.ParamByName('ID')[i].AsInteger := i + 1;
PgQuery1.Params.ParamByName('F_INTEGER')[i].AsInteger := i + 2000 + 1;
PgQuery1.Params.ParamByName('F_FLOAT')[i].AsFloat := (i + 1) / 12;
PgQuery1.Params.ParamByName('F_STRING')[i].AsString := 'Values ' + IntToStr(i + 1);
PgQuery1.Params.ParamByName('F_DATE')[i].AsDateTime := Now;
end;
However, the parameter array would be populated slower, since you would have to define the ordinal number of each parameter by its name in each loop iteration. If a loop is executed 10000 times – performance loss can become quite significant.