The importance of the OLAP functions / A importância das funções OLAP
This article is written in English and Portuguese (original version aqui)
Este artigo está escrito em Inglês e Português (versão original aqui)
English version:
Introduction
In a recent customer real situation I was faced with an optimization problem: They had a query that run in a certain time (let's assume around 1H), but they noticed that they were getting "duplicates" (not exactly a row duplicate, but records with duplicate values that should "identify" each row).
This happened because the data model was created in a way that allowed each "object" to be represented more than once in the table (assume different "lives" of the object). And they wanted to remove those "duplicates". Problem was that when they rewrote the query, it would take several hours (it never completed...).
Although I shouldn't use the exact data model, due to privacy restrictions, I'll try to create an equivalent model so that I can show you an example. Let's assume this table structure:
CREATE TABLE customer_tax_code
(
customer_num INTEGER, -- Customer identification
customer_plan CHAR(5), -- Customers are using some plan (INDividual, COMPany, MULTI etc. at each moment, but they may change)
customer_service SMALLINT, -- Customers may have several services active
customer_seq_id SERIAL, -- Each time a customer, in a specific plan, changes to another tax_code, this sequential_id is incremented
customer_tax_code SMALLINT -- The tax code which defines how the customer will be taxed for a specific service. Different tax_codes may contain different discount levels, bonus or promotions etc.
);
The basic query was really something like:SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code
WHERE
customer_service = 1;
Note that I didn't consider any indexes. But this query would be very simple and would have only two possible query plans:
- A full scan if the number of records with customer_service = 1 was the majority of the table
- Some index containing customer_service, if the histograms showed that the percentage of records with customer_service = 1 was relatively low
The solution they had was roughly this:
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT
MAX(b.customer_seq_id)
FROM
customer_tax_code b
WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
);
This would give the desired result.... But a typical query plan would include the options above (full scan or index scan based on customer_service) and for each record would require an index access for the sub-query.To give you some perspective, the percentage of records with customer_service = 1 was roughly 41% of the table. Around 41M rows.
An index like one of these would help a lot:
- (customer_num, customer_plan, customer_num_seq_id)
Would provide the easy access to MAX() for customer_num,customer_plan - (customer_num, customer_plan, customer_service, customer_num_seq_id)
Could improve on the one before as it would include all the filters in the sub-query - (customer_num, customer_plan)
Basic index to easily access all rowids for customer_num/customer_plan pairs
Solution
Then I remembered that we introduced the so called OLAP functions in 12.10. And actually one of them provides what we need for this sort of queries (need to find a MAX(), but we require aditional fields besides that one and the ones we're going to base our GROUP BY).
In fact we can generalize this problem (and hopefully also the solution), as this happens every time the data model is created in a way that maintains "history" and we just want the last image of the object for each "state". In other words, all the situations where besides other filters we want the records where:
some_field = MAX(some_field) and the group for the MAX calculation is defined by a set of columns. Many times "some_field" is a date/datetime field and the set defines an object identification.
The function in question is RANK(). It basically calculates a rank of a field value within a "window" defined in another one or more fields (the ones we'd use for GROUP BY in MAX() calculation).
In our case we can get a "rank" of customer_seq_id for each group of customer_num/customer_plan. By default, the record with the lower (in terms of ORDER BY depending on the datatype) customer_seq_id would get a rank = 1. The next one would get 2 and so on. But we can do the opposite. Assign 1 to the record with the highest customer_seq_id, 2 the the one right below that one and so on.
The query would look like:
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
Note this would not eliminate the "duplicates". It would just "rank" the result set. Actually we need to impose a restriction saying we just want the records that have myrank = 1 (these will be the ones with higher customer_seq_is within the group of records with same customer_num and customer_plan),Because the RANK() is calculated "in the result set", it doesn't allow a condition "myrank = 1" within the WHERE clause.
But ideally we would be able to use the "HAVING clause" like this:
... HAVING myrank = 1
Unfortunately this raises error -217 (column "myrank" not found). We can't use column alias in the HAVING clause (we can for GROUP BY and ORDER BY clauses). Additionally we cannot use the full column expression in the HAVING clause as it will raise the error:-25856 Invalid usage of window aggregate in this context.
You can use OLAP window aggregate functions in the Projection clause
and in the ORDER BY clause of queries. They cannot be present in
the WHERE clause or in the HAVING clause (except within a subquery).
So the way to do that is to wrap the query above as an inline view for an outside query that includes the restriction myrank = 1:SELECT
*
FROM
(
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE my_rank = 1;
This is executed with a single full scan on the table (around 50m). And doesn't depend on hints, complex queries etc. We just take advantage of the new functions. It was the fastest query plan we could achieve for this query.Only one issue: Customer production environment is still in 11.50, which doesn't contain these functions. Just an example of how customers loose time and opportunity by sticking to old versions...
To give you a better insight about this problem, I created a test case with some "random" test data that can prove the point. You may apply it to your environment as I'm testing in a VM and these queries are heavily I/O bound... and I/O is not the best aspect of virtual machines.
Test case
The test data was created with an GAWK script which tries to mimic the original conditions. The script can be found at the bottom of the article. I generated 2.5M rows:
castelo@primary:informix-> echo 2500000 | awk -f generate_data.awk > rank_test_data.unl
castelo@primary:informix->
Then I loaded the data into a table with same schema as above:castelo@primary:informix-> dbaccess -e stores test_ddl.sql
Database selected.
DROP TABLE IF EXISTS customer_tax_code;
Table dropped.
CREATE RAW TABLE customer_tax_code
(
customer_num INTEGER, -- Customer identification
customer_plan CHAR(5), -- Customers are using some plan (INDividual, COMPany, MULTI etc. at each moment, but they may change)
customer_service SMALLINT, -- Customers may have several services active
customer_seq_id SERIAL, -- Each time a customer, in a specific plan, changes to another tax_code, this sequential_id is incremented
customer_tax_code SMALLINT -- The tax code which defines how the customer will be taxed for a specific service. Different tax_codes may contain different discount levels, bonus or promotions etc.
) IN dbs1
EXTENT SIZE 10000 NEXT SIZE 10000
LOCK MODE ROW;
Table created.
BEGIN WORK;
Started transaction.
LOCK TABLE customer_tax_code IN EXCLUSIVE MODE;
Table locked.
LOAD FROM rank_test_data.unl INSERT INTO customer_tax_code;
2500000 row(s) loaded.
COMMIT WORK;
Data committed.
ALTER TABLE customer_tax_code TYPE(STANDARD);
Table altered.
CREATE INDEX ix_customer_tax_code_1 ON customer_tax_code(customer_num,customer_plan,customer_seq_id) IN dbs2;
Index created.
CREATE INDEX ix_customer_tax_code_2 ON customer_tax_code(customer_service) IN dbs2;
Index created.
Database closed.
castelo@primary:informix->
And then I run the following query to have an idea of how data was generated:castelo@primary:informix-> dbaccess -e stores distrib.sql
Database selected.
SELECT
COUNT(*) num, TRUNC((COUNT(*) / 2500000) * 100, 2) percent
FROM
customer_tax_code
WHERE
customer_service = 1;
num percent
1025501 41.02
1 row(s) retrieved.
SELECT
COUNT(*) num, TRUNC((COUNT(*) / 2500000) * 100, 2) percent
FROM (
SELECT
DISTINCT customer_num, customer_plan
FROM
customer_tax_code
WHERE
customer_service = 1
);
num percent
798315 31.93
1 row(s) retrieved.
Database closed.
castelo@primary:informix->
So we should expect a result set with 798315 rows. And we can see that we have 41% of the table data with customer_service = 1 (just like in the real customer situation)In order to show the difference in behavior between both queries I've used a script which I created some time ago and is documented in this article. For that I prepared an SQL script for each query:
castelo@primary:informix-> cat test_query_1.sql
SET EXPLAIN FILE TO "query_option_1.txt";
SET EXPLAIN ON;
UNLOAD TO /dev/null
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT MAX(b.customer_seq_id) FROM customer_tax_code b WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
);
castelo@primary:informix->
castelo@primary:informix-> cat test_query_2.sql
SET EXPLAIN FILE TO "query_option_2.txt";
SET EXPLAIN ON;
UNLOAD TO /dev/null
SELECT
*
FROM (
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE myrank = 1;
castelo@primary:informix->
Note that I've turned on the EXPLAIN and that I'm UNLOADING to /dev/null. This is just a trick to avoid wasting time writing the data to a file and by doing so, I minimize the variables involved that can twist the query processing time. I'd also like to point out that the issue I'm trying to show is basically an I/O bound problem. And because I've just loaded the data, created the indexes and run the queries above, the side effect is that at this point I had a significant portion of the table in the engine cache. To avoid the influence of this in the times, I've restarted the instance (onmode -ky/oninit).Then I run the mentioned scripts:
castelo@primary:informix-> onmode -ky;oninit
castelo@primary:informix-> /usr/informix/bin/ixprofiling -w -i stores test_query_1.sql
Database selected.
Engine statistics RESETed. Query results:
Query start time: 09:51:29.634374000
SET EXPLAIN FILE TO "query_option_1.txt";
Explain set.
SET EXPLAIN ON;
Explain set.
UNLOAD TO /dev/null
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT MAX(b.customer_seq_id) FROM customer_tax_code b WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
);
798315 row(s) unloaded.
Query stop time: 09:56:16.976229000
Thread profiles (SID: 5)
LkReq LkWai DLks TOuts LgRec IsRd IsWrt IsRWr IsDel BfRd BfWrt LgUse LgMax SeqSc Srts DskSr SrtMx Sched CPU Time Name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----------- ------------
496e4 0 0 0 0 831e3 0 0 0 493e4 0 0 0 2 0 0 0 86801 26.71790603 sqlexec
LkWs IOWs nIOW IdxBR Name
------------ ------------ ------------ ------------ -----------------------------------
0.0 251.79098566 16543 0 sqlexec
Session wait statistics:
Thread name Condition Num. waits Cum. time Max wait
----------------------------------- ----------------------------------- ------------ ------------ ------------
sqlexec mutex 1 23.611186280 23.0
sqlexec mt yield 1 6508.7673752 6508.0
sqlexec buffer 17 462525.80710 61764.0
sqlexec mt yield 0 3929 611561.28945 14680.0
sqlexec mt ready 86801 799028.07279 14682.0
sqlexec mt yield n 60 1104496.3218 58390.0
sqlexec condition 2935 2236864.5528 16698.0
sqlexec running 23461 26712075.663 53739.0
sqlexec aio 16517 251627122.20 3787185.0
sqlexec Total time 283560206.29
Partitions profiles (Database: stores)
LkReq LkWai DLks TOuts DskRd DskWr IsRd IsWrt IsRWr IsDel BfRd BfWrt SeqSc Object name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ------------------------------------------------------
[... SOME systables stats were removed for clarity ...]
332e4 0 0 0 26049 0 17641 0 0 0 909e3 0 1 customer_tax_code
164e4 0 0 0 30852 0 163e4 0 0 0 413e4 0 0 customer_tax_code#ix_customer_tax_code_1
Database closed.
real 4m49.367s
user 0m0.390s
sys 0m1.390s
castelo@primary:informix-> cat query_option_1.txt
QUERY: (OPTIMIZATION TIMESTAMP: 03-31-2014 09:51:29)
------
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT MAX(b.customer_seq_id) FROM customer_tax_code b WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
)
Estimated Cost: 2418832
Estimated # of Rows Returned: 512750
1) informix.a: SEQUENTIAL SCAN
Filters: (informix.a.customer_service = 1 AND informix.a.customer_seq_id = <subquery> )
Subquery:
---------
Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.b: INDEX PATH
Filters: informix.b.customer_service = 1
(1) Index Name: informix.ix_customer_tax_code_1
Index Keys: customer_num customer_plan customer_seq_id (Reverse) (Aggregate) (Serial, fragments: ALL)
Lower Index Filter: (informix.b.customer_num = informix.a.customer_num AND informix.b.customer_plan = informix.a.customer_plan )
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 a
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 798315 512750 2500000 04:38.22 2418832
Subquery statistics:
--------------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 b
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 806330 1 821121 04:30.70 5
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 806330 1 806330 04:31.89
castelo@primary:informix->
Some facts about this execution:- It took around 5m
- It did a sequential scan on the table and used the most complex index for the sub-query (this ones provides easy access to the MAX(customer_num_seq_id) by customer_num/customer_plan
- It spend most of the time in I/O wait
castelo@primary:informix-> onmode -ky;oninit
castelo@primary:informix-> /usr/informix/bin/ixprofiling -w -i stores test_query_2.sql
Database selected.
Engine statistics RESETed. Query results:
Query start time: 09:58:23.477004000
SET EXPLAIN FILE TO "query_option_2.txt";
Explain set.
SET EXPLAIN ON;
Explain set.
UNLOAD TO /dev/null
SELECT
*
FROM (
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE myrank = 1;
798315 row(s) unloaded.
Query stop time: 09:59:59.676333000
Thread profiles (SID: 6)
LkReq LkWai DLks TOuts LgRec IsRd IsWrt IsRWr IsDel BfRd BfWrt LgUse LgMax SeqSc Srts DskSr SrtMx Sched CPU Time Name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----------- ------------
250e4 0 0 0 0 19698 102e4 0 0 206e3 3997 0 0 3 1 1 18944 108e3 9.918404867 sqlexec
LkWs IOWs nIOW IdxBR Name
------------ ------------ ------------ ------------ -----------------------------------
0.0 74.245603123 3087 0 sqlexec
Session wait statistics:
Thread name Condition Num. waits Cum. time Max wait
----------------------------------- ----------------------------------- ------------ ------------ ------------
sqlexec mutex 6 76.625269575 19.0
sqlexec sort io 280 641848.77478 94357.0
sqlexec condition 4554 2177075.9388 21014.0
sqlexec mt yield 0 9538 2845631.3908 24543.0
sqlexec mt ready 108699 2931312.6256 24544.0
sqlexec buffer 191 5480268.6835 126467.0
sqlexec running 17658 9915220.0648 47373.0
sqlexec aio 3088 74141204.005 451378.0
sqlexec Total time 98132638.109
Partitions profiles (Database: stores)
LkReq LkWai DLks TOuts DskRd DskWr IsRd IsWrt IsRWr IsDel BfRd BfWrt SeqSc Object name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ------------------------------------------------------
[... SOME systables stats were removed for clarity ...]
250e4 0 0 0 26049 358 10154 0 0 0 179e3 3220 1 customer_tax_code
Database closed.
real 1m38.444s
user 0m0.140s
sys 0m0.750s
castelo@primary:informix->
QUERY: (OPTIMIZATION TIMESTAMP: 03-31-2014 09:58:23)
------
SELECT
*
FROM (
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE myrank = 1
Estimated Cost: 42278
Estimated # of Rows Returned: 102550
1) (Temp Table For Collection Subquery): SEQUENTIAL SCAN
Filters: (Temp Table For Collection Subquery).myrank = 1
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 a
t2 (Temp Table For Collection Subquery)
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 1025501 1025501 2500000 00:07.19 101043
type rows_sort est_rows rows_cons time
-------------------------------------------------
sort 1025501 0 1025501 01:27.73
type it_count time
----------------------------
olap 1025501 01:31.31
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t2 798315 102550 1025501 00:00.47 42278
castelo@primary:informix->
Some facts about this execution:- The query took around 1m40s
- It did the same sequential scan as the one aboive
- It didn't require heavy I/O on any index
- It wasted little more than 1m on I/O wait
This table summarizes the differences between both queries:
Query 1 (sub-query) | Query 2 (OLAP) | |
---|---|---|
Execution time | 4m49.367s | 1m38.444s |
Thread ISAM reads | 831000,000 | 19698,000 |
Thread ISAM writes | 0,000 | 1020000,000 |
Thread buffer reads | 4930000,000 | 206000,000 |
Thread buffer writes | 0,000 | 3997,000 |
Thread SortMax | 0,000 | 18944,000 |
Thread schedules | 86801,000 | 108000,000 |
Thread CPU time | 26,718 | 9,918 |
Table ISAM reads | 17641,000 | 10154,000 |
Table buffer reads | 909000,000 | 179000,000 |
Table buffer writes | 0,000 | 3220,000 |
Table disk reads | 26049,000 | 26049,000 |
Table disk writes | 0,000 | 358,000 |
Index ISAM reads | 1630000,000 | 0,000 |
Index buffer reads | 4130000,000 | 0,000 |
Index disk reads | 30852,000 | 0,000 |
I/O waits | 16543,000 | 3087,000 |
I/O wait time | 251,790 | 74,245 |
As you can see for yourself, the second query has much better values in most common aspects. It does take extra work for sorting and writing the temporary structure. But even so, this seems to have much less impact. There is also another aspect that makes the first option much better in the test data than in the real data. This aspect alone could deserve a dedicated article, but for now I'll just point out that the way the data was created makes the table relatively ordered by customer_num/customer_plan. This is not an irrelevant aspect in the real situation because of these two reasons:
- In the real scenario, the index used in the sub-query only contained customer_num, customer_plan, and did not contain customer_seq_id which in my test allows the engine to solve the MAX() with just the index access
- When we access the index for the sub-query, the rowids we get are "close". In most cases a single disk access will be able to fetch all of them. In the real situation, the records for the same customer_num/customer_plan would require more disk accesses. Even more, since we're doing a sequential scan, when we try to fetch the row data for the records with the same customer_num/customer_plan, chances are that those pages are already place in memory (retrieved by the sequential scan). If the table was not "ordered" this would not happen and the disk I/O impact would be even bigger.
Versão Portuguesa:
Introducão
Numa situação real num cliente, fui confrontado com um problema de otimização: Usavam uma query que demorava cerca de 1H a correr, mas notaram que a mesma gerava "duplicados" (não exatamente duplicados, mas registos repetidos em valores que deveriam identificar um objeto).
Isto acontecia porque o modelo de dados foi criado de forma a permitir que cada "objeto" fosse representado mais que uma vez na tabela (assumia diferentes "vidas"). E o objetivo era remover esses "duplicados" do resultado da query. O problema é que quando re-escreveram a query o tempo de execução passou para várias horas (nunca terminou...)
Não posso usar o modelo de dados exato por questões de privacidade, mas vou tentar criar um modelo que seja funcionalmente equivalente, para que possa exemplificar a situação. Vamos assumir esta estrutura de tabela:
CREATE TABLE customer_tax_code
(
customer_num INTEGER, -- Identificação do cliente
customer_plan CHAR(5), -- Cada cliente está num determinado plano (INDividual, COMPany, MULTI etc.) em cada momento, mas podem mudar
customer_service SMALLINT, -- Os clientes podem ter vários serviços activos
customer_seq_id SERIAL, -- Cada vez que um cliente muda de código de taxação é criado um novo registo com um número de sequência novo
customer_tax_code SMALLINT -- O código de taxação que define a regra de taxação para o serviço
);
A query original era algo como:SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code
WHERE
customer_service = 1;
Repare que não considerei ainda quaisquer índices. Mas
esta query é extremamente simples e só teria dois planos de execução
possíveis:- Um scan total da tabela se o número de registos com customer_service = 1 for uma percentagem significativa da tabela
- Algum índice que contenha o campo customer_service, se os histogramas mostrarem que a percentagem de registos que valida a condição for reduzido (face à dimensão da tabela)
Por outras palavras queriam trazer todos os registos onde o customer_seq_id era o MAX(customer_seq_id) para cada par customer_num/customer_plan (com customer_service=1)
Mas como querem obter também outros dados da linha, o uso do MAX(customer_seq_id) e o respetivo GROUP BY customer_num, customer_plan não pode ser usado (seríamos obrigado a introduzir todos os campos sem função de agregação na cláusula GROUP BY, e isso iria alterar o resultado obtido).
A solução que tinham criado era isto, em traços gerais:
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT
MAX(b.customer_seq_id)
FROM
customer_tax_code b
WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
);
Esta query obtém o resultado esperado... Mas o plano de execução típico incluíria as opções acima (scan completo
ou acesso por índice sobre a coluna customer_service), e para cada
registo obtido teríamos de fazer um acesso indexado para resolver a
sub-query.Para fornecer mais algum contexto, a percentagem de registos com customer_service = 1 rondava os 41% da tabela (cerca de 41M de linhas).
Um índice semelhante a um destes ajuda bastante a query:
- (customer_num, customer_plan, customer_num_seq_id)
Fornece um acesso rápido ao MAX() por customer_num, customer_plan - (customer_num, customer_plan, customer_service, customer_num_seq_id)
Poderia melhorar o anterior pois incluí todos os filtros para a sub-query - (customer_num, customer_plan)
Este seria o índice mais básico para aceder aos rowids das linhas para cada para customer_num/customer_plan
Solução
Nesta altura lembrei-me que introduzimos as chamadas funções OLAP na versão 12.10. E na verdade uma delas dá-nos algo que necessitamos para este tipo de queries (necessidade de encontrar um MAX(), mas necessitamos de trazer mais campos para além dos que usaríamos para o GROUP BY).
Na verdade, este problema pode ser generalizado (e a ideia é que a solução também), dado que isto acontece sempre que o modelo de dados é criado de forma que mantenha histórico e nós só pretendemos o último estado de cada objeto. Por outras palavras, todas as situações onde para além de outros filtros, queremos:
campo_xpto = MAX(campo_xpto) e o grupo para o cálculo do máximo é definido por um conjunto de colunas. Muitas vezes o "campo_xpto" é um date/datetime e o conjunto de colunas define um identificador de objeto.
A função em questão é o RANK(). Sumariamente, calcula uma ordem para um campo, numa "janela" definida por um ou mais campos (os que usaríamos no GROUP BY do cálculo do MAX()).
No nosso caso, podemos obter um "rank" do customer_seq_id, para cada grupo/janela definido pelos campos customer_num/customer_plan.
Por omissão o registo com o campo com valor mais baixo (em termos de ordenação conforme o tipo de dados) no customer_seq_id recebería o rank() = 1. O seguinte receberia o rank 2 e assim por diante. Mas podemos fazer o oposto. Atribuir 1 ao registo com o customer_seq_id mais alto, 2 ao que se encontrar logo abaixo etc.
A query ficaria assim:
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
Note que isto não eliminaria os "duplicados". Apenas classificaria os registos no conjunto resultante da query. Na verdade temos de impor uma restrição, dizendo que apenas queremos os campos cujo myrank
= 1 (estes serão os registos com o customer_seq_id mais alto, dentro do
grupo de registos com os mesmos customer_num e customer_plan).Dado que o RANK() é calculado sobre o conjunto de resultados, não pode aceitar o uso da condição "murank = 1" na cláusula WHERE.
Mas idealmente poderíamos usar a cláusula HAVING, tal como:
... HAVING myrank = 1
Infelizmente isto dispara o erro -217 (column "myrank" not found). Não podemos usar alias de colunas na cláusula HAVING (podemos usá-las nos GROUP BY e ORDER BY).E também não podemos usar a expressão completa na cláusula HAVING pois isso causa o erro:
-25856 Invalid usage of window aggregate in this context.
You can use OLAP window aggregate functions in the Projection clause
and in the ORDER BY clause of queries. They cannot be present in
the WHERE clause or in the HAVING clause (except within a subquery).
Por isso, o que temos de fazer é "embrulhar" a query acima como uma inline view numa query exterior que contenha a restrição myrank = 1:SELECT
*
FROM
(
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE my_rank = 1;
Esta query era executada com um scan completo da tabela e terminava em cerca de 50m. E é simples de escrever e entender, não dependendo de hints, queries complexas, sub-queries etc. Apenas tiramos proveito das novas funções. Este foi o plano de execução mais eficiente que consegui para esta situação.Apenas tive um problema: Os sistemas de produção do cliente ainda estão em 11.50, a qual não contém estas funções. Serve como exemplo de como os clientes perdem tempo e perdem oportunidades ao manterem-se em versões antigas....
Para lhe dar uma melhor perceção sobre este problema, criei um caso de teste com dados de teste "aleatórios", de forma a provar e demonstrar o princípio explicado acima. Poderá aplicar isto ao seu ambiente, dado que estou a testar num ambiente virtualizado, e estas queries sofrerem muito com o I/O... e é sabido que o I/O não é o ponto forte dos sistemas virtualizados.
Caso de teste
Os dados de teste foram criados com um script GAWK que tenta imitar as condições originais que descrevi acima. O script pode ser consultado no final do artigo. Gerei 2.5M de linhas:
castelo@primary:informix-> echo 2500000 | awk -f generate_data.awk > rank_test_data.unl
castelo@primary:informix->
Depois carreguei os dados numa tabela com a estrutura já mostrada acima:A query seguinte permite ter uma idea de como os dados foram gerados:castelo@primary:informix-> dbaccess -e stores test_ddl.sql Database selected. DROP TABLE IF EXISTS customer_tax_code; Table dropped. CREATE RAW TABLE customer_tax_code (
customer_num INTEGER, -- Identificação do cliente customer_plan CHAR(5), -- Cada cliente está num determinado plano (INDividual, COMPany, MULTI etc.) em cada momento, mas podem mudar customer_service SMALLINT, -- Os clientes podem ter vários serviços activos customer_seq_id SERIAL, -- Cada vez que um cliente muda de código de taxação é criado um novo registo com um número de sequência novo customer_tax_code SMALLINT -- O código de taxação que define a regra de taxação para o serviço
) IN dbs1 EXTENT SIZE 10000 NEXT SIZE 10000 LOCK MODE ROW; Table created. BEGIN WORK; Started transaction. LOCK TABLE customer_tax_code IN EXCLUSIVE MODE; Table locked. LOAD FROM rank_test_data.unl INSERT INTO customer_tax_code; 2500000 row(s) loaded. COMMIT WORK; Data committed. ALTER TABLE customer_tax_code TYPE(STANDARD); Table altered. CREATE INDEX ix_customer_tax_code_1 ON customer_tax_code(customer_num,customer_plan,customer_seq_id) IN dbs2; Index created. CREATE INDEX ix_customer_tax_code_2 ON customer_tax_code(customer_service) IN dbs2; Index created. Database closed. castelo@primary:informix->
castelo@primary:informix-> dbaccess -e stores distrib.sql
Database selected.
SELECT
COUNT(*) num, TRUNC((COUNT(*) / 2500000) * 100, 2) percent
FROM
customer_tax_code
WHERE
customer_service = 1;
num percent
1025501 41.02
1 row(s) retrieved.
SELECT
COUNT(*) num, TRUNC((COUNT(*) / 2500000) * 100, 2) percent
FROM (
SELECT
DISTINCT customer_num, customer_plan
FROM
customer_tax_code
WHERE
customer_service = 1
);
num percent
798315 31.93
1 row(s) retrieved.
Database closed.
castelo@primary:informix->
Portanto deveremos esperar um conhjunto de resultados
com 798315 linhas. E como podemos ver, os dados da tabela de teste têm
41% de registos com customer_service = 1 (tal como na situação real).Para mostrar a diferença de comportamento entre ambas as queries usei um script que criei há algum tempo atrás e que documentei neste artigo. Para o usar preparei também um script para cada query:
castelo@primary:informix-> cat test_query_1.sql
SET EXPLAIN FILE TO "query_option_1.txt";
SET EXPLAIN ON;
UNLOAD TO /dev/null
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT MAX(b.customer_seq_id) FROM customer_tax_code b WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
);
castelo@primary:informix->
castelo@primary:informix-> cat test_query_2.sql
SET EXPLAIN FILE TO "query_option_2.txt";
SET EXPLAIN ON;
UNLOAD TO /dev/null
SELECT
*
FROM (
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE myrank = 1;
castelo@primary:informix->
Repare que activei o EXPLAIN e que estou a fazer o
UNLOAD para /dev/null. Isto é apenas um truque para evitar desperdiçar
tempo a escrever os dados em ficheiro, e ao fazê-lo estou a minimizar as
variáveis que afectarão o tempo de execução. Gostaria também de referir
que o problema que estou a tentar explicar traduz-se num problema de
I/O. E dado que acabei de carregar os dados, criar os índices e correr
as queries acima, no ponto em que nos encontramos a memória do
servidor de base de dados conterá uma parte significativa dos dados da
tabela. Para evitar que isto influencie os tempos, re-iniciei a
instância (onmode -ky;oninit).Depois executei os referidos scripts:
castelo@primary:informix-> onmode -ky;oninit
castelo@primary:informix-> /usr/informix/bin/ixprofiling -w -i stores test_query_1.sql
Database selected.
Engine statistics RESETed. Query results:
Query start time: 09:51:29.634374000
SET EXPLAIN FILE TO "query_option_1.txt";
Explain set.
SET EXPLAIN ON;
Explain set.
UNLOAD TO /dev/null
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT MAX(b.customer_seq_id) FROM customer_tax_code b WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
);
798315 row(s) unloaded.
Query stop time: 09:56:16.976229000
Thread profiles (SID: 5)
LkReq LkWai DLks TOuts LgRec IsRd IsWrt IsRWr IsDel BfRd BfWrt LgUse LgMax SeqSc Srts DskSr SrtMx Sched CPU Time Name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----------- ------------
496e4 0 0 0 0 831e3 0 0 0 493e4 0 0 0 2 0 0 0 86801 26.71790603 sqlexec
LkWs IOWs nIOW IdxBR Name
------------ ------------ ------------ ------------ -----------------------------------
0.0 251.79098566 16543 0 sqlexec
Session wait statistics:
Thread name Condition Num. waits Cum. time Max wait
----------------------------------- ----------------------------------- ------------ ------------ ------------
sqlexec mutex 1 23.611186280 23.0
sqlexec mt yield 1 6508.7673752 6508.0
sqlexec buffer 17 462525.80710 61764.0
sqlexec mt yield 0 3929 611561.28945 14680.0
sqlexec mt ready 86801 799028.07279 14682.0
sqlexec mt yield n 60 1104496.3218 58390.0
sqlexec condition 2935 2236864.5528 16698.0
sqlexec running 23461 26712075.663 53739.0
sqlexec aio 16517 251627122.20 3787185.0
sqlexec Total time 283560206.29
Partitions profiles (Database: stores)
LkReq LkWai DLks TOuts DskRd DskWr IsRd IsWrt IsRWr IsDel BfRd BfWrt SeqSc Object name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ------------------------------------------------------
[... SOME systables stats were removed for clarity ...]
332e4 0 0 0 26049 0 17641 0 0 0 909e3 0 1 customer_tax_code
164e4 0 0 0 30852 0 163e4 0 0 0 413e4 0 0 customer_tax_code#ix_customer_tax_code_1
Database closed.
real 4m49.367s
user 0m0.390s
sys 0m1.390s
castelo@primary:informix-> cat query_option_1.txt
QUERY: (OPTIMIZATION TIMESTAMP: 03-31-2014 09:51:29)
------
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code
FROM
customer_tax_code a
WHERE
a.customer_service = 1 AND
a.customer_seq_id = (SELECT MAX(b.customer_seq_id) FROM customer_tax_code b WHERE
b.customer_num = a.customer_num AND
b.customer_plan = a.customer_plan AND
b.customer_service = 1
)
Estimated Cost: 2418832
Estimated # of Rows Returned: 512750
1) informix.a: SEQUENTIAL SCAN
Filters: (informix.a.customer_service = 1 AND informix.a.customer_seq_id = <subquery> )
Subquery:
---------
Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.b: INDEX PATH
Filters: informix.b.customer_service = 1
(1) Index Name: informix.ix_customer_tax_code_1
Index Keys: customer_num customer_plan customer_seq_id (Reverse) (Aggregate) (Serial, fragments: ALL)
Lower Index Filter: (informix.b.customer_num = informix.a.customer_num AND informix.b.customer_plan = informix.a.customer_plan )
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 a
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 798315 512750 2500000 04:38.22 2418832
Subquery statistics:
--------------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 b
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 806330 1 821121 04:30.70 5
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 806330 1 806330 04:31.89
castelo@primary:informix->
Alguns factos sobre esta execução:- Demorou cerca de 5m
- Fez um scan completo à tabela e usou o índice mais complexo para resolver a sub-query (este fornece acesso fáacil ao MAX(customer_seq_id) por customer_num / customer_plan)
- Passou a maioria do tempo à espera de I/O
castelo@primary:informix-> onmode -ky;oninit
castelo@primary:informix-> /usr/informix/bin/ixprofiling -w -i stores test_query_2.sql
Database selected.
Engine statistics RESETed. Query results:
Query start time: 09:58:23.477004000
SET EXPLAIN FILE TO "query_option_2.txt";
Explain set.
SET EXPLAIN ON;
Explain set.
UNLOAD TO /dev/null
SELECT
*
FROM (
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE myrank = 1;
798315 row(s) unloaded.
Query stop time: 09:59:59.676333000
Thread profiles (SID: 6)
LkReq LkWai DLks TOuts LgRec IsRd IsWrt IsRWr IsDel BfRd BfWrt LgUse LgMax SeqSc Srts DskSr SrtMx Sched CPU Time Name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----------- ------------
250e4 0 0 0 0 19698 102e4 0 0 206e3 3997 0 0 3 1 1 18944 108e3 9.918404867 sqlexec
LkWs IOWs nIOW IdxBR Name
------------ ------------ ------------ ------------ -----------------------------------
0.0 74.245603123 3087 0 sqlexec
Session wait statistics:
Thread name Condition Num. waits Cum. time Max wait
----------------------------------- ----------------------------------- ------------ ------------ ------------
sqlexec mutex 6 76.625269575 19.0
sqlexec sort io 280 641848.77478 94357.0
sqlexec condition 4554 2177075.9388 21014.0
sqlexec mt yield 0 9538 2845631.3908 24543.0
sqlexec mt ready 108699 2931312.6256 24544.0
sqlexec buffer 191 5480268.6835 126467.0
sqlexec running 17658 9915220.0648 47373.0
sqlexec aio 3088 74141204.005 451378.0
sqlexec Total time 98132638.109
Partitions profiles (Database: stores)
LkReq LkWai DLks TOuts DskRd DskWr IsRd IsWrt IsRWr IsDel BfRd BfWrt SeqSc Object name
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ------------------------------------------------------
[... SOME systables stats were removed for clarity ...]
250e4 0 0 0 26049 358 10154 0 0 0 179e3 3220 1 customer_tax_code
Database closed.
real 1m38.444s
user 0m0.140s
sys 0m0.750s
castelo@primary:informix->
QUERY: (OPTIMIZATION TIMESTAMP: 03-31-2014 09:58:23)
------
SELECT
*
FROM (
SELECT
customer_num, customer_plan, customer_seq_id, customer_tax_code,
RANK() OVER (PARTITION BY customer_num, customer_plan ORDER BY customer_seq_id DESC) myrank
FROM
customer_tax_code a
WHERE
a.customer_service = 1
)
WHERE myrank = 1
Estimated Cost: 42278
Estimated # of Rows Returned: 102550
1) (Temp Table For Collection Subquery): SEQUENTIAL SCAN
Filters: (Temp Table For Collection Subquery).myrank = 1
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 a
t2 (Temp Table For Collection Subquery)
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 1025501 1025501 2500000 00:07.19 101043
type rows_sort est_rows rows_cons time
-------------------------------------------------
sort 1025501 0 1025501 01:27.73
type it_count time
----------------------------
olap 1025501 01:31.31
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t2 798315 102550 1025501 00:00.47 42278
castelo@primary:informix->
Factos sobre esta execução:- A query demorou cerca de 1m40s
- Fez o mesmo scan completo que a anterior
- Não necessitou de qualquer I/O pesado sobre nenhum dos índices
- Desperdicou pouco mais que 1m em esperas pelo I/O (versus cerca de 4m na anterior)
A tabela seguinte contém o sumário das diferenças entre ambas as queries:
Query 1 (sub-query) | Query 2 (OLAP) | |
---|---|---|
Tempo de exeução | 4m49.367s | 1m38.444s |
Thread ISAM reads | 831000,000 | 19698,000 |
Thread ISAM writes | 0,000 | 1020000,000 |
Thread buffer reads | 4930000,000 | 206000,000 |
Thread buffer writes | 0,000 | 3997,000 |
Thread SortMax | 0,000 | 18944,000 |
Thread schedules | 86801,000 | 108000,000 |
Thread CPU time | 26,718 | 9,918 |
Table ISAM reads | 17641,000 | 10154,000 |
Table buffer reads | 909000,000 | 179000,000 |
Table buffer writes | 0,000 | 3220,000 |
Table disk reads | 26049,000 | 26049,000 |
Table disk writes | 0,000 | 358,000 |
Index ISAM reads | 1630000,000 | 0,000 |
Index buffer reads | 4130000,000 | 0,000 |
Index disk reads | 30852,000 | 0,000 |
I/O waits | 16543,000 | 3087,000 |
I/O wait time | 251,790 | 74,245 |
Como pode verificar por si próprio, a segunda query tem valores muito melhores no que é comum. Necessita de algum trabalho extra para ordenar e escrever a estrutura temporária. Mas mesmo assim, parece ter um impacto muito menor. Há ainda um outro aspecto que torna a primeira opção muito melhor com estes dados de teste que na situação real. Este aspecto por si só mereceria um artigo, mas de momento vou apenas referir que a forma como os dados foram criados, faz com que a tabela fique relativamente ordenada por customer_num/customer_plan. Este aspecto não é de todo irrelevante na situação real por dois motivos:
- No caso real o índice que existia tinha apenas customer_num/customer_plan e não tinha como terceiro campo o customer_seq_id que neste caso permite obter o MAX() apenas com o acesso ao índice
- Quando acedemos ao índice para resolver a sub-query, os rowids que obtemos tendem a estar próximos. Em muitos casos um único acesso a disco chegará para obter todas as linhas que validem a condição. Na situação real, obter os registos para o mesmo customer_num/customer_plan necessitaria de mais acessos a disco, e tipicamente mais aleatórios ou espalhados que neste caso. Ainda por cima, como estamos a fazer um scan completo para a query de fora, quando tentamos aceder por rowid aos registos com o mesmo customer_num/customer_plan, a probabilidade de esses dados já terem sido lidos pelo scan é grande. Se a tabela não estivesse "ordenada", isto não aconteceria e o impacto no I/O seria sensivelmente maior.
AWK script used to generate the test data:
castelo@primary:informix-> cat generate_data.awk
function myrand(i)
{
return int(rand() * i) + 1;
}
BEGIN {
srand();
my_previous_customer=5000000;
my_previous_plan="MULTI";
my_previous_service=1;
list_tax_size=6;
list_tax_code[1]=15;
list_tax_code[2]=25;
list_tax_code[3]=35;
list_tax_code[4]=45;
list_tax_code[5]=55;
list_tax_code[6]=65;
list_plans_size=4;
list_plans[1]="MULTI";
list_plans[2]="SOLO";
list_plans[3]="GROUP";
list_plans[4]="KIDS";
client_with_service_1_percent=0.41;
same_client_plan_percent=0.5;
customer_num_repetition_percent=0.33
}
{
for(a=1;a<=$1;a++)
{
r=myrand(100);
if ( r <= customer_num_repetition_percent * 100 )
{
# % will repeat previouc customer_num
client_num = my_previous_customer;
client_service = my_previous_service;
r=myrand(100);
if ( r <= same_client_plan_percent * 100 )
{
client_plan = my_previous_plan;
r=myrand(list_tax_size);
client_tax_code=list_tax_code[r]
}
else
{
r=myrand(list_plans_size);
client_plan=list_plans[r];
r=myrand(list_tax_size);
client_tax_code=list_tax_code[r]
}
}
else
{
# random customer_num
client_num = myrand(10000000);
r=myrand(list_plans_size);
client_plan=list_plans[r];
r=myrand(100);
if ( r <= client_with_service_1_percent * 100 )
client_service=1;
else
client_service=r;
r=myrand(list_tax_size);
client_tax_code=list_tax_code[r]
}
my_previous_customer=client_num;
my_previous_plan=client_plan;
my_previous_service=client_service;
printf("%s|%s|%s|%s|%s\n",client_num,client_plan,client_service,0,client_tax_code);
}
}
castelo@primary:informix->