Friday, September 30, 2016

Sysadmin during upgrades / Sysadmin durante upgrades

Curious situation related to sysadmin during an upgrade (original version here: http://informix-technology.blogspot.com/2016/09/sysadmin-during-upgrades-sysadmin.html)


English version
Recently at a customer we were testing an upgrade from 11.50.xC7 to 12.10.xC7W1 and it apparently hanged. That's not what you like to see during an upgrade, even in a test environment.
We waited a few minutes and tried to understand what was happening. There was a thread called "convert" that apparently was idle. The thread stack trace apparently didn't change.
The functions showing up in the stack suggested it was trying to connect to a remote server following up what looks to be a call to UPDATE STATISTICS. That was a bit of a surprise.... looking at the online.log it was "hanged" during the "SCHAPI" phase. This is related to sysadmin.
Considering the upgrade didn't look good and it was a test system we decided to kill the engine and use onrestorept to revert back to version 11.50.
Then we decided to run an UPDATE STATISTICS FOR PROCEDURE on sysadmin database and voilá.... hanged again.
So, an UPDATE STATISTICS FOR PROCEDURE on sysadmin got hanged, and apparently that same thing happened during the upgrade. Further investigation led to a simple conclusion. We had created procedures in sysadmin to implement some monitoring tasks. And some of them tried to do remote operations on an engine where we centralize the monitoring data. Just before the upgrade a change was made to the instance $INFORMIXSQLHOSTS so that we "isolated" it from the outside world. This was done by adding some value to the service ports. The consequence of this was that during the upgrade, when it tried to run the UPDATE STATISTICS FOR PROCEDURE it tried to access a non-reachable remote engine. As it was not possible, it hanged some time for each remote instruction. Later tests on 12.10 showed the time it waits for each connection attempt is by default 60s and can be defined by changing the relatively new ONCONFIG parameter INFORMIXCONTIME

This can be easily reproduced. On a version 11.50 engine we create a simple (dumb) procedure:


bica@primary:informix-> onstat -

IBM Informix Dynamic Server Version 11.50.FC7 -- On-Line -- Up 00:30:33 -- 269068 Kbytes


bica@primary:informix-> cat test_remote_proc_stats.sql


CREATE PROCEDURE test_updstats_remote()

DEFINE i INTEGER;

SELECT 1 INTO i FROM stores@test:systables WHERE tabid = 1;

SELECT 2 INTO i FROM somedb@test:systables WHERE tabid = 1;

END PROCEDURE;
bica@primary:informix-> 



bica@primary:informix-> dbaccess sysadmin test_remote_proc_stats.sql

Database selected.

Routine created.

Database closed.

bica@primary:informix->

Database server "test" exists in the $INFORMIXSQLHOSTS, but points to a wrong port. Let's try an upgrade to 12.10.FC7 and see what happens.

From the online.log we get:


01:30:52 IBM Informix Dynamic Server Started.
01:30:52 Requested shared memory segment size rounded from 4308KB to 4796KB

Wed Aug 31 01:30:54 2016

01:30:54 Requested shared memory segment size rounded from 221001KB to 221004KB
01:30:54 Successfully added a bufferpool of page size 2K.

01:30:54 Event alarms enabled. ALARMPROG = '/usr/informix/srvr1210fc7w1/etc/alarmprogram.sh'
01:30:54 Booting Language from module <>
01:30:54 Loading Module
01:30:54 Booting Language from module <>
01:30:54 Loading Module
01:30:59 DR: DRAUTO is 0 (Off)
01:30:59 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
01:30:59 Event notification facility epoll enabled.
01:31:00 Trusted host cache successfully built:/etc/hosts.equiv.
01:31:00 CCFLAGS2 value set to 0x200
01:31:00 SQL_FEAT_CTRL value set to 0x8008
01:31:00 SQL_DEF_CTRL value set to 0x4b0
01:31:00 IBM Informix Dynamic Server Version 12.10.FC7W1 Software Serial Number AAA#B000000
01:31:00 Conversion from version 11.50.xC6 Started

[...]

01:31:05 processing sysmaster
01:31:05 *** entity_ver=31 targetVersion=36 indexVersion= 0 ***
01:31:05 *** FromVersion=10 ToVersion=26 entity_ver=31 targetVersion=36 indexVersion=0 ***
01:31:05 *** FromVersion=26 ToVersion=28 entity_ver=31 targetVersion=36 indexVersion=1 ***
01:31:05 *** FromVersion=28 ToVersion=29 entity_ver=31 targetVersion=36 indexVersion=2 ***
01:31:05 *** FromVersion=29 ToVersion=30 entity_ver=31 targetVersion=36 indexVersion=3 ***
01:31:05 *** FromVersion=30 ToVersion=31 entity_ver=31 targetVersion=36 indexVersion=4 ***
01:31:05 *** FromVersion=31 ToVersion=32 entity_ver=31 targetVersion=36 indexVersion=5 ***
01:31:05 *** SQL 1 11.70.xC1 ***
01:31:05 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=32 ToVersion=33 entity_ver=32 targetVersion=36 indexVersion=6 ***
01:31:06 *** SQL 1 11.70.xC3 ***
01:31:06 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=33 ToVersion=34 entity_ver=33 targetVersion=36 indexVersion=7 ***
01:31:06 *** SQL 1 12.10.xC1 ***
01:31:06 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=34 ToVersion=35 entity_ver=34 targetVersion=36 indexVersion=8 ***
01:31:06 *** SQL 1 12.10.xC4 ***
01:31:06 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=35 ToVersion=36 entity_ver=35 targetVersion=36 indexVersion=9 ***
01:31:06 *** SQL 1 12.10.xC4 ***
01:31:06 processing sysutils

[...]

01:31:17 Conv/rev: calling conversion to 12.10.xC1 ...
01:31:19 Conv/rev: blade conversion to 12.10.xC1 completed
01:31:19 *** FromVersion=6 ToVersion=7 entity_ver=6 targetVersion=7 indexVersion=6 ***
01:31:19 *** BLADE 1 12.10.xC4 ***
01:31:19 Conv/rev: calling conversion to 12.10.xC4 ...
01:31:21 Conv/rev: blade conversion to 12.10.xC4 completed
01:31:21 Conv/rev: Finished conversion for component BLADE
01:31:21 Conv/rev: Started conversion for component SCHAPI
01:31:21 *** numVersions=7 targetVersion=7 ***
01:31:21 *** entity_ver=3 targetVersion=7 indexVersion= 0 ***
01:31:21 *** FromVersion=0 ToVersion=1 entity_ver=3 targetVersion=7 indexVersion=0 ***
01:31:21 *** FromVersion=1 ToVersion=2 entity_ver=3 targetVersion=7 indexVersion=1 ***
01:31:21 *** FromVersion=2 ToVersion=3 entity_ver=3 targetVersion=7 indexVersion=2 ***
01:31:21 *** FromVersion=3 ToVersion=4 entity_ver=3 targetVersion=7 indexVersion=3 ***
01:31:21 *** SCHAPI 1 11.70.xC1 ***
01:31:21 calling schapi_cvr1170_convert
01:31:21 *** FromVersion=4 ToVersion=5 entity_ver=4 targetVersion=7 indexVersion=4 ***
01:31:21 *** SCHAPI 1 11.70.xC3 ***



and it hangs...
Looking into the threads and thread stack:


IBM Informix Dynamic Server Version 12.10.FC7W1 -- Single-User -- Up 00:00:54 -- 266648 Kbytes

Threads:
tid tcb rstcb prty status vp-class name
2 4579d9e8 0 1 IO Idle 3lio* lio vp 0
3 458fd368 0 1 IO Idle 4pio* pio vp 0
4 4591e368 0 1 IO Idle 5aio* aio vp 0
5 4593f368 1d43600 1 IO Idle 6msc* msc vp 0
6 45970368 0 1 IO Idle 7fifo* fifo vp 0
7 45991608 44c1e028 3 sleeping secs: 1 1cpu main_loop()
8 459ed4c0 0 1 running 8soc* soctcppoll
9 459edc98 0 2 sleeping forever 1cpu* soctcplst
10 45a3cc88 44c1e8f0 1 sleeping secs: 1 1cpu flush_sub(0)
11 45a59028 44c1f1b8 1 sleeping secs: 1 1cpu flush_sub(1)
12 45a59368 44c1fa80 1 sleeping secs: 1 1cpu flush_sub(2)
13 45a596a8 44c20348 1 sleeping secs: 1 1cpu flush_sub(3)
14 45a599e8 44c20c10 1 sleeping secs: 1 1cpu flush_sub(4)
15 45a59d28 44c214d8 1 sleeping secs: 1 1cpu flush_sub(5)
16 45ab70e0 44c21da0 1 sleeping secs: 1 1cpu flush_sub(6)
17 45ab7420 44c22668 1 sleeping secs: 1 1cpu flush_sub(7)
18 45ab7760 0 3 IO Idle 1cpu* kaio
19 45c0c2c8 44c22f30 3 sleeping secs: 1 1cpu convert
20 45c0c9f8 44c237f8 2 sleeping secs: 1 1cpu aslogflush
21 45ccc958 44c240c0 1 sleeping secs: 8 1cpu btscanner_0
22 45ceba00 44c24988 3 cond wait ReadAhead 1cpu readahead_0
23 45d0eb50 44c25250 3 sleeping secs: 1 1cpu auto_tune
27 45d8aa80 44c25b18 1 sleeping secs: 0 1cpu sbspclean
28 45d30aa8 44c26ca8 1 sleeping secs: 1 1cpu defragclean
30 45d738f0 44c27570 3 sleeping secs: 1 1cpu* onmode_mon
31 45d89028 44c27e38 3 sleeping secs: 1 1cpu periodic
32 45d89608 44c28700 3 sleeping forever 1cpu memory
33 45d89be8 44c28fc8 3 sleeping secs: 15 1cpu session_mgr
35 45e08768 44c263e0 1 cond wait bp_cond 1cpu bf_priosweep()

IBM Informix Dynamic Server Version 12.10.FC7W1 -- Single-User -- Up 00:01:03 -- 266648 Kbytes

Stack for thread: 19 convert
base: 0x0000000045c19000
len: 69632
pc: 0x00000000013eaa6f
tos: 0x0000000045c24d50
state: sleeping
vp: 1

0x00000000013eaa6f (oninit) yield_processor_svp
0x00000000013ed5d3 (oninit) mt_yield
0x0000000001501340 (oninit) net_wait_for_connect
0x0000000001506ade (oninit) connsocket
0x000000000155f3c0 (oninit) tlConnect
0x000000000155c56a (oninit) slSQIreq
0x0000000001555343 (oninit) pfConReq
0x000000000154c628 (oninit) cmReqSync
0x000000000154d7ef (oninit) cmConReq
0x00000000014ee62e (oninit) ASF_Call
0x000000000076fd94 (oninit) dd_newconnect
0x00000000007712d0 (oninit) dd_remconnect
0x0000000000765404 (oninit) remcatalog
0x0000000000734831 (oninit) readtab
0x000000000073a372 (oninit) ddgettable_hierarchy_trace
0x00000000006bd1d5 (oninit) aud_ddgettable_trace
0x0000000000601206 (oninit) gettab_trace
0x0000000000655f15 (oninit) tlgetddtab_trace
0x0000000000656530 (oninit) tlp2tnode
0x000000000065af3b (oninit) tbl2tab
0x000000000063785d (oninit) sel2cb
0x0000000000639498 (oninit) subq2cb
0x00000000006632c1 (oninit) sq2expr
0x0000000000663373 (oninit) subq2expr
0x000000000066cca4 (oninit) expr2expr
0x0000000000682764 (oninit) ip_exprparse
0x00000000006a28fe (oninit) cmp_stmt
0x00000000006a2c1d (oninit) pr_compile
0x00000000006bf021 (oninit) aud_pr_compile
0x00000000006a3987 (oninit) updprocstat
0x00000000005cfba6 (oninit) excommand
0x0000000000f7eed7 (oninit) admin_exec_cmd
0x0000000000f7f2d5 (oninit) run_sql_script
0x0000000000f7f418 (oninit) run_script_etc
0x0000000000f9b826 (oninit) run_script_conv
0x0000000000f9b936 (oninit) ph_convrev
0x0000000000f42233 (oninit) schapi_cvr1170XC3_convert
0x0000000000f3f906 (oninit) do_each_version_work
0x0000000000f402d3 (oninit) generic_fw_main
0x0000000000f40cf6 (oninit) convertSM
0x00000000013c3720 (oninit) th_init_initgls
0x000000000142a518 (oninit) startup




The stack trace is clear. And precisely 2 minutes later it continues:


01:31:21 *** SCHAPI 1 11.70.xC3 ***
01:33:22 *** FromVersion=5 ToVersion=6 entity_ver=5 targetVersion=7 indexVersion=5 ***
01:33:22 *** SCHAPI 1 12.10.xC1 ***
01:33:22 calling schapi_cvr1210_convert
01:33:22 *** FromVersion=6 ToVersion=7 entity_ver=6 targetVersion=7 indexVersion=6 ***
01:33:22 *** SCHAPI 1 12.10.xC4 ***
01:33:22 calling schapi_cvr1210XC4_convert
01:33:23 Dynamically allocated new virtual shared memory segment (size 8192KB)
01:33:23 Memory sizes:resident:4796 KB, virtual:270044 KB, no SHMTOTAL limit
01:33:23 Conv/rev: Finished conversion for component SCHAPI




And it completed the conversion. In the customer situation, a later test showed a "hang" period of around 45m. So, a big warning.... The conversion process will run an UPDATE STATISTICS FOR PROCEDURE on sysadmin database. If you have created some procedures that do remote/distributed operations than you must consider this "problem" during the upgrade. Here are your options:

  1. Allow connections to that remote server(s)
  2. Remove the procedures and recreate them
  3. Use INFORMIXCONTIME parameter (12.10 only) to reduce the timeouts
  4. Remove the remote INFORMIXSERVER from the INFORMIXSQLHOSTS during migration. This also makes the connections attempts very quickly (as it simply can't find where to connect)


Versão Portuguesa
Há algum tempo num cliente, estando a testar uma migração de uma versão 11.50.FC7 para 12.10.FC7W1 deparámo-nos com um aparente bloqueio. Não é o que gostamos de ver, ainda que seja num ambiente de testes.
Esperámos uns minutos e tentámos perceber o que se estava a passar. Havia uma thread chamada "convert" que aparentemente estava inativa. O seu stack trace não parecia mudar
As funções que apareciam no stack sugeriam que estava a tentar contactar um servidor remoto, seguindo o que parecia ser uma chamada a um UPDATE STATISTICS. Isto foi uma grande surpresa.... Olhando para o online.log estava parado na fase "SCHAPI". Isto está relacionado com a sysadmin.
Considerando que o upgrade não estava com bom aspeto e que se tratava de um ambiente de testes decidimos abortar e usámos o onrestorept para reverter para a versão 11.50 (umas das minhas funcionalidades favoritas).
Para efeitos de teste e investigação decidimos executar um UPDATE STATISTICS FOR PROCEDURE na base de dados sysadmin. E voilá, preso de novo.
Portanto um UPDATE STATISTICS FOR PROCEDURE na sysadmin bloqueava o sistema e isso teria acontecido durante o upgrade. Após mais investigação concluímos que tínhamos criado procedimentos nesta base de dados para implementar algumas tarefas de monitorização e que esses procedimentos executavam algumas operações remotas para enviar a informação de monitorização para um sistema central. E antes do upgrade tinha sido feito uma alteração "de rotina" que consistiu na alteração do $INFORMIXSQLHOSTS de forma a "isolar" este motor. Basicamente foi adicionado um valor a todas as portas dos listeners, o que faz com que os clientes (não alterados) não consigam chegar ao ambiente. A consequência disto é que quando efetuamos o UPDATE STATISTICS este motor tenta aceder ao(s) motores referenciados nas operações remotas o que também não será possível. E como tal temos um timeout por cada operação remota encontrada nos procedimentos. Testes posteriores na 12.10 mostraram que a espera por cada tentativa de conexão é o valor por omissão (60s) ou o que estiver definido no parâmetro de $ONCONFIG INFORMIXCONTIME introduzido recentemente.

Tudo isto pode ser reproduzido facilmente. Num motor com a versão 11.50 criamos um procedimento simples (inútil):


bica@primary:informix-> onstat -

IBM Informix Dynamic Server Version 11.50.FC7 -- On-Line -- Up 00:30:33 -- 269068 Kbytes


bica@primary:informix-> cat test_remote_proc_stats.sql


CREATE PROCEDURE test_updstats_remote()

DEFINE i INTEGER;

SELECT 1 INTO i FROM stores@test:systables WHERE tabid = 1;

SELECT 2 INTO i FROM somedb@test:systables WHERE tabid = 1;

END PROCEDURE;
bica@primary:informix-> 



bica@primary:informix-> dbaccess sysadmin test_remote_proc_stats.sql

Database selected.

Routine created.

Database closed.

bica@primary:informix->

O servidor de base de dados "test" existe no $INFORMIXSQLHOSTS, mas aponta para um serviço não existente. Tentemos um upgrade para a 12.10.FC7 e vejamos o que acontece:

No online.log temos:


01:30:52 IBM Informix Dynamic Server Started.
01:30:52 Requested shared memory segment size rounded from 4308KB to 4796KB

Wed Aug 31 01:30:54 2016

01:30:54 Requested shared memory segment size rounded from 221001KB to 221004KB
01:30:54 Successfully added a bufferpool of page size 2K.

01:30:54 Event alarms enabled. ALARMPROG = '/usr/informix/srvr1210fc7w1/etc/alarmprogram.sh'
01:30:54 Booting Language from module <>
01:30:54 Loading Module
01:30:54 Booting Language from module <>
01:30:54 Loading Module
01:30:59 DR: DRAUTO is 0 (Off)
01:30:59 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
01:30:59 Event notification facility epoll enabled.
01:31:00 Trusted host cache successfully built:/etc/hosts.equiv.
01:31:00 CCFLAGS2 value set to 0x200
01:31:00 SQL_FEAT_CTRL value set to 0x8008
01:31:00 SQL_DEF_CTRL value set to 0x4b0
01:31:00 IBM Informix Dynamic Server Version 12.10.FC7W1 Software Serial Number AAA#B000000
01:31:00 Conversion from version 11.50.xC6 Started

[...]

01:31:05 processing sysmaster
01:31:05 *** entity_ver=31 targetVersion=36 indexVersion= 0 ***
01:31:05 *** FromVersion=10 ToVersion=26 entity_ver=31 targetVersion=36 indexVersion=0 ***
01:31:05 *** FromVersion=26 ToVersion=28 entity_ver=31 targetVersion=36 indexVersion=1 ***
01:31:05 *** FromVersion=28 ToVersion=29 entity_ver=31 targetVersion=36 indexVersion=2 ***
01:31:05 *** FromVersion=29 ToVersion=30 entity_ver=31 targetVersion=36 indexVersion=3 ***
01:31:05 *** FromVersion=30 ToVersion=31 entity_ver=31 targetVersion=36 indexVersion=4 ***
01:31:05 *** FromVersion=31 ToVersion=32 entity_ver=31 targetVersion=36 indexVersion=5 ***
01:31:05 *** SQL 1 11.70.xC1 ***
01:31:05 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=32 ToVersion=33 entity_ver=32 targetVersion=36 indexVersion=6 ***
01:31:06 *** SQL 1 11.70.xC3 ***
01:31:06 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=33 ToVersion=34 entity_ver=33 targetVersion=36 indexVersion=7 ***
01:31:06 *** SQL 1 12.10.xC1 ***
01:31:06 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=34 ToVersion=35 entity_ver=34 targetVersion=36 indexVersion=8 ***
01:31:06 *** SQL 1 12.10.xC4 ***
01:31:06 Converting database sysmaster ...
01:31:06 The database sysmaster has been converted successfully.
01:31:06 *** FromVersion=35 ToVersion=36 entity_ver=35 targetVersion=36 indexVersion=9 ***
01:31:06 *** SQL 1 12.10.xC4 ***
01:31:06 processing sysutils

[...]

01:31:17 Conv/rev: calling conversion to 12.10.xC1 ...
01:31:19 Conv/rev: blade conversion to 12.10.xC1 completed
01:31:19 *** FromVersion=6 ToVersion=7 entity_ver=6 targetVersion=7 indexVersion=6 ***
01:31:19 *** BLADE 1 12.10.xC4 ***
01:31:19 Conv/rev: calling conversion to 12.10.xC4 ...
01:31:21 Conv/rev: blade conversion to 12.10.xC4 completed
01:31:21 Conv/rev: Finished conversion for component BLADE
01:31:21 Conv/rev: Started conversion for component SCHAPI
01:31:21 *** numVersions=7 targetVersion=7 ***
01:31:21 *** entity_ver=3 targetVersion=7 indexVersion= 0 ***
01:31:21 *** FromVersion=0 ToVersion=1 entity_ver=3 targetVersion=7 indexVersion=0 ***
01:31:21 *** FromVersion=1 ToVersion=2 entity_ver=3 targetVersion=7 indexVersion=1 ***
01:31:21 *** FromVersion=2 ToVersion=3 entity_ver=3 targetVersion=7 indexVersion=2 ***
01:31:21 *** FromVersion=3 ToVersion=4 entity_ver=3 targetVersion=7 indexVersion=3 ***
01:31:21 *** SCHAPI 1 11.70.xC1 ***
01:31:21 calling schapi_cvr1170_convert
01:31:21 *** FromVersion=4 ToVersion=5 entity_ver=4 targetVersion=7 indexVersion=4 ***
01:31:21 *** SCHAPI 1 11.70.xC3 ***



e bloqueia...
Vejamos as threads e o respetivo stack:


IBM Informix Dynamic Server Version 12.10.FC7W1 -- Single-User -- Up 00:00:54 -- 266648 Kbytes

Threads:
tid tcb rstcb prty status vp-class name
2 4579d9e8 0 1 IO Idle 3lio* lio vp 0
3 458fd368 0 1 IO Idle 4pio* pio vp 0
4 4591e368 0 1 IO Idle 5aio* aio vp 0
5 4593f368 1d43600 1 IO Idle 6msc* msc vp 0
6 45970368 0 1 IO Idle 7fifo* fifo vp 0
7 45991608 44c1e028 3 sleeping secs: 1 1cpu main_loop()
8 459ed4c0 0 1 running 8soc* soctcppoll
9 459edc98 0 2 sleeping forever 1cpu* soctcplst
10 45a3cc88 44c1e8f0 1 sleeping secs: 1 1cpu flush_sub(0)
11 45a59028 44c1f1b8 1 sleeping secs: 1 1cpu flush_sub(1)
12 45a59368 44c1fa80 1 sleeping secs: 1 1cpu flush_sub(2)
13 45a596a8 44c20348 1 sleeping secs: 1 1cpu flush_sub(3)
14 45a599e8 44c20c10 1 sleeping secs: 1 1cpu flush_sub(4)
15 45a59d28 44c214d8 1 sleeping secs: 1 1cpu flush_sub(5)
16 45ab70e0 44c21da0 1 sleeping secs: 1 1cpu flush_sub(6)
17 45ab7420 44c22668 1 sleeping secs: 1 1cpu flush_sub(7)
18 45ab7760 0 3 IO Idle 1cpu* kaio
19 45c0c2c8 44c22f30 3 sleeping secs: 1 1cpu convert
20 45c0c9f8 44c237f8 2 sleeping secs: 1 1cpu aslogflush
21 45ccc958 44c240c0 1 sleeping secs: 8 1cpu btscanner_0
22 45ceba00 44c24988 3 cond wait ReadAhead 1cpu readahead_0
23 45d0eb50 44c25250 3 sleeping secs: 1 1cpu auto_tune
27 45d8aa80 44c25b18 1 sleeping secs: 0 1cpu sbspclean
28 45d30aa8 44c26ca8 1 sleeping secs: 1 1cpu defragclean
30 45d738f0 44c27570 3 sleeping secs: 1 1cpu* onmode_mon
31 45d89028 44c27e38 3 sleeping secs: 1 1cpu periodic
32 45d89608 44c28700 3 sleeping forever 1cpu memory
33 45d89be8 44c28fc8 3 sleeping secs: 15 1cpu session_mgr
35 45e08768 44c263e0 1 cond wait bp_cond 1cpu bf_priosweep()

IBM Informix Dynamic Server Version 12.10.FC7W1 -- Single-User -- Up 00:01:03 -- 266648 Kbytes

Stack for thread: 19 convert
base: 0x0000000045c19000
len: 69632
pc: 0x00000000013eaa6f
tos: 0x0000000045c24d50
state: sleeping
vp: 1

0x00000000013eaa6f (oninit) yield_processor_svp
0x00000000013ed5d3 (oninit) mt_yield
0x0000000001501340 (oninit) net_wait_for_connect
0x0000000001506ade (oninit) connsocket
0x000000000155f3c0 (oninit) tlConnect
0x000000000155c56a (oninit) slSQIreq
0x0000000001555343 (oninit) pfConReq
0x000000000154c628 (oninit) cmReqSync
0x000000000154d7ef (oninit) cmConReq
0x00000000014ee62e (oninit) ASF_Call
0x000000000076fd94 (oninit) dd_newconnect
0x00000000007712d0 (oninit) dd_remconnect
0x0000000000765404 (oninit) remcatalog
0x0000000000734831 (oninit) readtab
0x000000000073a372 (oninit) ddgettable_hierarchy_trace
0x00000000006bd1d5 (oninit) aud_ddgettable_trace
0x0000000000601206 (oninit) gettab_trace
0x0000000000655f15 (oninit) tlgetddtab_trace
0x0000000000656530 (oninit) tlp2tnode
0x000000000065af3b (oninit) tbl2tab
0x000000000063785d (oninit) sel2cb
0x0000000000639498 (oninit) subq2cb
0x00000000006632c1 (oninit) sq2expr
0x0000000000663373 (oninit) subq2expr
0x000000000066cca4 (oninit) expr2expr
0x0000000000682764 (oninit) ip_exprparse
0x00000000006a28fe (oninit) cmp_stmt
0x00000000006a2c1d (oninit) pr_compile
0x00000000006bf021 (oninit) aud_pr_compile
0x00000000006a3987 (oninit) updprocstat
0x00000000005cfba6 (oninit) excommand
0x0000000000f7eed7 (oninit) admin_exec_cmd
0x0000000000f7f2d5 (oninit) run_sql_script
0x0000000000f7f418 (oninit) run_script_etc
0x0000000000f9b826 (oninit) run_script_conv
0x0000000000f9b936 (oninit) ph_convrev
0x0000000000f42233 (oninit) schapi_cvr1170XC3_convert
0x0000000000f3f906 (oninit) do_each_version_work
0x0000000000f402d3 (oninit) generic_fw_main
0x0000000000f40cf6 (oninit) convertSM
0x00000000013c3720 (oninit) th_init_initgls
0x000000000142a518 (oninit) startup




O stack é claro. E precisamente 2 minutos depois a conversão continua (o procedimento tinha duas instruções remotas):


01:31:21 *** SCHAPI 1 11.70.xC3 ***
01:33:22 *** FromVersion=5 ToVersion=6 entity_ver=5 targetVersion=7 indexVersion=5 ***
01:33:22 *** SCHAPI 1 12.10.xC1 ***
01:33:22 calling schapi_cvr1210_convert
01:33:22 *** FromVersion=6 ToVersion=7 entity_ver=6 targetVersion=7 indexVersion=6 ***
01:33:22 *** SCHAPI 1 12.10.xC4 ***
01:33:22 calling schapi_cvr1210XC4_convert
01:33:23 Dynamically allocated new virtual shared memory segment (size 8192KB)
01:33:23 Memory sizes:resident:4796 KB, virtual:270044 KB, no SHMTOTAL limit
01:33:23 Conv/rev: Finished conversion for component SCHAPI




E termina a conversão. Na situação concreta do cliente, um novo teste com a mesma instância apresentou um "bloqueio" durante cerca de 45m (correspondente a sensivelmente o mesmo número de instruções remotas nos procedimentos da sysadmin). Portanto um grande alerta...: O processo de conversão irá executar um UPDATE STATISTICS FOR PROCEDURE na base de dados sysadmin (isto não está documentado, mas é natural tendo em conta o processo de conversão da mesma). Se porventura criou procedimentos que acedem a instâncias remotas, então deverá ter isto em conta As opções para lidar com isto serão::
  1. Permitir as ligações ao(s) servidor remoto
  2. Remover os procedimentos e recriá-los depois da migração
  3. Usar o parâmetro INFORMIXCONTIME (a partir da 12.10) para reduzir os tempos de espera
  4. Remover o INFORMIXSERVER remoto do $INFORMIXSQLHOSTS durante a conversão. Isto também permite acelerar as tentativas de conexão pois ao não encontrar a referência no ficheiro o motor não sabe onde se ligar, e portanto não se aplicam os timeouts

Wednesday, September 21, 2016

Temporary space usage / Uso de espaço temporário

How do you find which sessions are consuming temporary space? (original version here: http://informix-technology.blogspot.com/2016/09/temporary-space-usage-uso-de-espaco.html)


English version

Introduction

Informix has a few rocks in it's shoes, or skeletons in the closet as we prefer. In my opinion the ones I consider as such are lack of specific functionality, undesirable behavior or inability to provide information to the DBA. I believe I've opened RFEs (request for enhancements) for most if not all of them, and my favorites are probably:
  1. Inability to easily see who is consuming temporary space (RFE 77879 and RFE 43877)
  2. Inability to generate a query plan for a running query (RFE 33800)
  3. Default behavior on non-ANSI databases regarding privileges on new objects (or lack of NODEFDAC $ONCONFIG equivalent - RFE 34407 and RFE 60849)
I've opened two RFEs, for a couple of situations because the first was refused or erroneously considered delivered (in my opinion of course). There are a few others like the string truncation in non-ANSI databases without warning, inability to use PAM with OlEDB clients - password only), missing SID in the audit logs etc...

From these, the first doesn't seem to have any progress and I believe it's a serious limitation for the DBA's daily work. And even if R&D decides to implement it, it would probably not be back ported to currently existing versions, which are to be supported for several years. So it means customers would not benefit from it, even if it was implemented today. And considering IBM closed my first attempt on the basis that we were working on the session limit configurations, I doubt this will be considered a priority. As a side note, the possibility to limit the use of temporary space is a very welcomed feature. But the fact that you can configure something, doesn't mean you won't need to monitor it. I believe this is common sense.

Script solution and usage

Well, considering all this, and after learning a few more things about the engine I noticed it would be possible to create something that can help customers with existing versions regarding the first two issues above (temporary space usage and running query plan). At this moment all I have in a "sharable" state is a script to report on temporary space usage. it's not entirely new as it was possible to find at least one script with overlapping functionality in the IUG repository, but I believe it's the first one that can do all of these:
  • Show the session owning most (if not all) temporary obkjects which can be:
    • Explicit (CREATE TEMP TABLE....) and inplicit (SELECT .. INTO TEMP...) temporary tables
    • Hash structures used by hash joins
    • Sort structures used by ORDER BY and GROUP BY clauses
    • Temporary structures resulting from view materialization, inline views and some OLAP functions
  • Show the temporary table structures (columns and respective datatypes and indexes, both explicit and system created)
  • Find the objects created by the currently running query and other opened cursors
  • Ability to work on secondary read-only servers
The script is hopefully simple to use. As with most of my scripts the "-h" option shows the help:

galileo@solstice.onlinedomus.net:informix-> ixtempuse -h
ixtempuse: -V | -h | [-s <session> [-t temp_table]] [-u <user>] [-d <database>] [-D <dbspace> [-I max_iterators]
          -V shows script version
          -h shows this help
          -s <session list>       : Show information for a session or list of sessions (separated by commas)
          -t <temp_table>         : Show temporary table structure for a single session/table (ALL for all)
          -u &ltuser list>          : Show information for a user or list of users (separated by commmas)
          -d &ltdatabase list>      : Show information for sessions in a database or list of databases (separated by commas)
          -D &ltdbspace list>       : Show information for a dbspace or list of dbspaces (separated by commas - must be temp dbspaces)
          -I &ltmax_iterators list> : The script can generate an infinite recursion if something goes wrong.
                                      To prevent this there's a limit of 100 iterators that can be overriden with this parameter
                                      But be advices that in most cases if the limit is reached, it means the script as a flaw.
                                      Apart a script bug, this situation should only happen on very complex queries which are unlikely
                                      to appear frequently. Overriding the limit may cause a FORK BOMB!!!
Ex: ixtempuse -s 43,35,4
galileo@solstice.onlinedomus.net:informix->

Without arguments it shows an overview of tthe temporary space usage:

galileo@solstice.onlinedomus.net:informix-> ixtempuse
+----------+------------------+----------+-------------------+--------------------------+-------------------+----------+----------+
|Sessionid |Username          |PID       |Conn Time          |DBSpace                   |   # (H/G/S/T/V)   |DBSpace MB|Session MB|
+----------+------------------+----------+-------------------+--------------------------+-------------------+----------+----------+
|55         informix           22522      2016-09-21 12:36:58|dbtemp1                   |11 /0  /0  /0  /0  |22.5      |          |
|                                                            |dbtemp2                   |11 /0  /0  /0  /0  |22.0      |          |
|                                                            |                          |                   |          |44.5      |
+------------------------------------------------------------+--------------------------+-------------------+----------+----------+
|52         informix           22348      2016-09-21 12:28:42|dbtemp2                   |0  /0  /0  /0  /1  |10.62     |          |
|                                                            |                          |                   |          |10.62     |
+------------------------------------------------------------+--------------------------+-------------------+----------+----------+
|99         informix           29367      2016-09-21 16:17:00|dbtemp1                   |0  /0  /0  /1  /0  |0.01      |          |
|                                                            |dbtemp2                   |0  /0  /0  /1  /0  |0.01      |          |
|                                                            |rootdbs                   |0  /0  /0  /1  /0  |0.01      |          |
|                                                            |                          |                   |          |0.04      |
+------------------------------------------------------------+--------------------------+-------------------+----------+----------+
galileo@solstice.onlinedomus.net:informix-&gt

Using the "-s" option we can get the detail for one or more sessions:


galileo@solstice.onlinedomus.net:informix-> ixtempuse -s 55
-----------------------------------------------------------------------------------------------------------------------------------
SID: 55       User: informix      Client: solstice.onlinedomus.net     PID: 22522
Connected at: 2016-09-21 12:36:58
-----------------------------------------------------------------------------------------------------------------------------------
  DBSPACE: dbtemp1
    H: th_build_ffffffffffffffff                                                        (2097154/0x00200002) 2MB
    H: th_build_ffffffffffffffff                                                        (2097155/0x00200003) 2.25MB
    H: th_build_ffffffffffffffff                                                        (2097156/0x00200004) 2MB
    H: th_build_ffffffffffffffff                                                        (2097157/0x00200005) 2MB
    H: th_build_ffffffffffffffff                                                        (2097158/0x00200006) 2MB
    H: th_build_ffffffffffffffff                                                        (2097159/0x00200007) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097160/0x00200008) 2.25MB
    H: th_probe_ffffffffffffffff                                                        (2097161/0x00200009) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097162/0x0020000a) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097163/0x0020000b) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097164/0x0020000c) 2MB
    Total space for object of type H in this dbspace: 22.5MB
  Total space for DBSPACE dbtemp1 is 22.5MB:

  DBSPACE: dbtemp2
    H: th_build_ffffffffffffffff                                                        (3145730/0x00300002) 2MB
    H: th_build_ffffffffffffffff                                                        (3145731/0x00300003) 2MB
    H: th_build_ffffffffffffffff                                                        (3145732/0x00300004) 2MB
    H: th_build_ffffffffffffffff                                                        (3145733/0x00300005) 2MB
    H: th_build_ffffffffffffffff                                                        (3145734/0x00300006) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145735/0x00300007) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145736/0x00300008) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145737/0x00300009) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145738/0x0030000a) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145739/0x0030000b) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145740/0x0030000c) 2MB
    Total space for object of type H in this dbspace: 22MB
  Total space for DBSPACE dbtemp2 is 22MB:

Summary for space usage for session 55 is:
      Hash       : 44.5      MB      Groups BY  : 0MB
      Sort       : 0         MB      Temp Tables: 0MB
      Views      : 0         MB      Others     : 0MB
Total space for session 55 is 44.5MB

galileo@solstice.onlinedomus.net:informix->

and with the "-s" and "-t" option we can get the structure of the temporary tables within one session:

galileo@solstice.onlinedomus.net:informix-> ixtempuse -s 99 -t all
{ TABLE "informix"."fnunes_teste" row size = 85 number of columns = 1 index size = 31 }
CREATE TEMP TABLE "informix"."fnunes_teste"
(
        col1_decimal_14_not_null         DECIMAL(14) NOT NULL,
        col2_decimal_12_5_unique         DECIMAL(12, 5),
        col3_dt_year_to_second           DATETIME YEAR TO SECOND,
        col4_dt_year_to_month            DATETIME YEAR TO MONTH,
        col5_dt_year_to_day              DATETIME YEAR TO DAY,
        col6_dt_year_to_hour             DATETIME YEAR TO HOUR,
        col7_dt_year_to_minute           DATETIME YEAR TO MINUTE,
        col8_dt_year_to_second           DATETIME YEAR TO SECOND,
        col9_dt_day_to_day               DATETIME DAY TO DAY,
        col10_dt_houwr_to_hour           DATETIME HOUR TO HOUR,
        col11_dt_minute_to_minute        DATETIME MINUTE TO MINUTE,
        col12_dt_second_to_second        DATETIME SECOND TO SECOND,
        col13_dt_varchar_10              VARCHAR(10),
        col14_dt_varchar_10_5            VARCHAR(10)
);

CREATE INDEX ix_2 ON fnunes_teste (col14_dt_varchar_10_5, col1_decimal_14_not_null DESC);
CREATE INDEX ix_1 ON fnunes_teste (col13_dt_varchar_10);
CREATE UNIQUE INDEX 0_2 ON fnunes_teste (col2_decimal_12_5_unique);


{ TABLE "informix"."t_2" row size = 24 number of columns = 2 index size = 0 }
CREATE TEMP TABLE "informix"."t_2"
(
        col1                             INTEGER,
        col3                             CHAR(20)
);



{ TABLE "informix"."t_1" row size = 4 number of columns = 1 index size = 0 }
CREATE TEMP TABLE "informix"."t_1"
(
        col1                             INTEGER
);



galileo@solstice.onlinedomus.net:informix-> 

Getting the script

The script can be downloaded from http://github.com/domusonline/InformixScripts (browse through scripts/ix)
It's licensed under GPL 2.0, so feel free to use it and even change it if you want. I welcome any feedback, requests and bug reports.

Warnings and disclaimer

I have to be very clear on this section. First of all, this is my best effort to provide something useful and that can help customers (DBAs) in their daily routine, and hopefully help to reduce the number of PMRs that are opened around this topic. The script was tested in several combinations of OS (RHEL, HP-UX, Solaris, AIX) and different engine versions (11.50 and 12.10) and several use cases. But anyone using it will have to assume full responsibility for it's results and behavior. Having said this, I'd be very glad if users provide feedback and bug reports. Assuming I can reproduce it, I'd gladly fix it.
A few very important points:
  • The script is SLOW. There isn't much I can do about it because the way it works (and that's apparently the only way to get the info) means I must go through all the sessions and several onstat commands. I have plans to try to use a single execution of onstat (with -i) but I'm not sure if that's possible. Current workaround it to reduce the scope of the sessions it has to go through. It has options for session id (-s), user (-u) and database (-d). Naturally the slowness is proportional to the number of session and also to the complexity of what they're doing and what cursors they have opened
  • The script requires recursion and the only way I've found to make it work in KSH is to run the recursive function as a background process. As with any thing that is recursive there's always the possibility of an infinite loop. I had this during development and I created some measures to prevent it. The main one is that the script will only go up to 100 levels of recursion. This should be more than enough for most avergae day to day queries. A query that requires more than this means that it has more than 100 "iterators", where an iterator is a table scan, a nested loop join, a sort, a merge, an hash join etc. So as you can imagine, 100 should be enough. When the limit is reached the script stops. There's an option to increase the limit, but you should think twice (or more) before using it. An high value can lead to a FORK BOMB. Also, because of the recursion, the script may go over the limit of user processes you have defined. Currently I have no way around this.

Acknowledgements

I must publicly thank several people who somehow contributed to this script. A special thanks to Jacques Renaut from R&D who provided invaluable technical guidance, and allowed me to proceed when I was stuck trying to understand internal structures. Frank Arias, a technical support engineer was crucial in getting it to work on Solaris. Adelino Silva, another technical support engineer was the first beside me to test it and provided valuable input. Jonathan Leffler provided some important feedback. Finally two customers, one in Portugal who provided feedback on HP-UX and another in the US who was fundamental for solving Solaris specific issue.


Versão Portuguesa

Introducão

O Informix tem algumas pedras no sapato ou esqueletos no armário conforme preferirmos um termo ou outro.. Na minha opinião o que eu considero como tal pode ser falta de alguma funcionalidade, comportamento indesejado ou impossibilidade do DBA em obter informação necessária à sua atividade. Penso que abri RFEs (request for enhancements) para a maioria se não todos os casos, e os meus favoritos serão provavelmente:
  1. Impossibilidade de facilmente verificar quem está a consumir espaço temporário (RFE 77879 e RFE 43877)
  2. Impossibilidade de gerar um plano de execução de uma query que esteja a correr (RFE 33800)
  3. Comportamento por omissão em bases de dados não ANSI, no que diz respeito à atribuição automática de privilégios (ou a falta de um equivalente à variável NODEFDAC mas configurável no $ONCONFIG - RFE 34407 e RFE 60849)
Os problemas para os quais abri dois RFEs deve-se ao facto de o primeiro ter sido recusado ou erroneamente (na minha opinião) considerado resolvido. Existem alguns outros problemas como o corte de strings em bases de dados não ANSI sem geração de erros, impossibilidade de utilizar PAM em clientes OLEDB - ainda que só com password -, falta do SID nos logs de audit etc....

Mas destes, o primeiro não parece ter tido nenhum progresso e penso que é uma limitação séria ao trabalho diário de um DBA. E mesmo que as equipas de I&D resolvessem implementar a funcionalidade hoje, provavelmente não seria portado para as versões anteriores. que serão suportadas por alguns anos ainda. Por outras palavras, mesmo que o problema fosse resolvido hoje, os clientes com versões anteriores continuariam a sofrer com o tema por muitos anos. E considerando que a IBM fechou a minha primeira tentativa de RFE, com base no fato de estarmos a trabalhar na limitação de utilização de recursos por sessão (incluindo o espaço temporário), duvido que este pedido venha a ser considerado prioritário. Como nota acessória, a funcionalidade de limitar os recursos foi muito bem vinda. Mas o facto de podermos configurar algo não elimina a necessidade de monitorizar. Diria que isto é senso comum

Solução via script e sua utilização

Bom, considerando tudo isto e depois de aprender mais algumas coisas sobre o motor, percebi que seria possível criar algo que ajudasse os clientes das versões existentes relativamente aos primeiros dois problemas acima referidos (uso de espaço temporário e plano de execução de uma query já a correr). De momento o que tenho numa forma que possa ser partilhada é um script que reporta a utilização de espaço temporário. O conceito não é totalmente novo e é possível encontrar pelo menos um script no repositório do IIUG que tem funcionalidade algo coincidente. Mas penso que este será o primeiro  conseguir fazer tudo isto:
  • Mostrar a sessão que criou a maioria (se não todos) os tipos de estruturas temporárias, que podem ser:
    • Tabelas temporárias explícitas (CREATE TEMP TABLE....) e implícitas (SELECT .. INTO TEMP...)
    • Estruturas de Hash usadas pelos HASH joins
    • Estruturas de ordenação usadas pelos ORDER BY e GROUP BY
    • Estruturas temporárias resultantes da materialização de views, inline views, e algumas funções OLAP
  • Mostrar a estrutura das tabelas temporárias (colunas, respetivos tipos de dados e índices, quer os criados pelos utilizadores quer os criados pelo sistema para suporte de constraints)
  • Encontrar os objetos criados pela query em execução e por outros cursores abertos
  • Possibilidade de trabalhar em servidores secundários em modo de leitura apenas
O script deverá (espera-se) ser fácil de utilizar. Como na maioria dos meus scripts a opção "-h" mostra a ajuda:

galileo@solstice.onlinedomus.net:informix-> ixtempuse -h
ixtempuse: -V | -h | [-s <session> [-t temp_table]] [-u <user>] [-d <database>] [-D <dbspace> [-I max_iterators]
          -V shows script version
          -h shows this help
          -s <session list>       : Show information for a session or list of sessions (separated by commas)
          -t <temp_table>         : Show temporary table structure for a single session/table (ALL for all)
          -u &ltuser list>          : Show information for a user or list of users (separated by commmas)
          -d &ltdatabase list>      : Show information for sessions in a database or list of databases (separated by commas)
          -D &ltdbspace list>       : Show information for a dbspace or list of dbspaces (separated by commas - must be temp dbspaces)
          -I &ltmax_iterators list> : The script can generate an infinite recursion if something goes wrong.
                                      To prevent this there's a limit of 100 iterators that can be overriden with this parameter
                                      But be advices that in most cases if the limit is reached, it means the script as a flaw.
                                      Apart a script bug, this situation should only happen on very complex queries which are unlikely
                                      to appear frequently. Overriding the limit may cause a FORK BOMB!!!
Ex: ixtempuse -s 43,35,4
galileo@solstice.onlinedomus.net:informix->

Sem argumentos mostra o panorama geral da utilização de espaço temporário:

galileo@solstice.onlinedomus.net:informix-> ixtempuse
+----------+------------------+----------+-------------------+--------------------------+-------------------+----------+----------+
|Sessionid |Username          |PID       |Conn Time          |DBSpace                   |   # (H/G/S/T/V)   |DBSpace MB|Session MB|
+----------+------------------+----------+-------------------+--------------------------+-------------------+----------+----------+
|55         informix           22522      2016-09-21 12:36:58|dbtemp1                   |11 /0  /0  /0  /0  |22.5      |          |
|                                                            |dbtemp2                   |11 /0  /0  /0  /0  |22.0      |          |
|                                                            |                          |                   |          |44.5      |
+------------------------------------------------------------+--------------------------+-------------------+----------+----------+
|52         informix           22348      2016-09-21 12:28:42|dbtemp2                   |0  /0  /0  /0  /1  |10.62     |          |
|                                                            |                          |                   |          |10.62     |
+------------------------------------------------------------+--------------------------+-------------------+----------+----------+
|99         informix           29367      2016-09-21 16:17:00|dbtemp1                   |0  /0  /0  /1  /0  |0.01      |          |
|                                                            |dbtemp2                   |0  /0  /0  /1  /0  |0.01      |          |
|                                                            |rootdbs                   |0  /0  /0  /1  /0  |0.01      |          |
|                                                            |                          |                   |          |0.04      |
+------------------------------------------------------------+--------------------------+-------------------+----------+----------+
galileo@solstice.onlinedomus.net:informix-&gt

Usando a opção "-s" obtemos o detalhe de uma ou mais sessões:

galileo@solstice.onlinedomus.net:informix-> ixtempuse -s 55
-----------------------------------------------------------------------------------------------------------------------------------
SID: 55       User: informix      Client: solstice.onlinedomus.net     PID: 22522
Connected at: 2016-09-21 12:36:58
-----------------------------------------------------------------------------------------------------------------------------------
  DBSPACE: dbtemp1
    H: th_build_ffffffffffffffff                                                        (2097154/0x00200002) 2MB
    H: th_build_ffffffffffffffff                                                        (2097155/0x00200003) 2.25MB
    H: th_build_ffffffffffffffff                                                        (2097156/0x00200004) 2MB
    H: th_build_ffffffffffffffff                                                        (2097157/0x00200005) 2MB
    H: th_build_ffffffffffffffff                                                        (2097158/0x00200006) 2MB
    H: th_build_ffffffffffffffff                                                        (2097159/0x00200007) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097160/0x00200008) 2.25MB
    H: th_probe_ffffffffffffffff                                                        (2097161/0x00200009) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097162/0x0020000a) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097163/0x0020000b) 2MB
    H: th_probe_ffffffffffffffff                                                        (2097164/0x0020000c) 2MB
    Total space for object of type H in this dbspace: 22.5MB
  Total space for DBSPACE dbtemp1 is 22.5MB:

  DBSPACE: dbtemp2
    H: th_build_ffffffffffffffff                                                        (3145730/0x00300002) 2MB
    H: th_build_ffffffffffffffff                                                        (3145731/0x00300003) 2MB
    H: th_build_ffffffffffffffff                                                        (3145732/0x00300004) 2MB
    H: th_build_ffffffffffffffff                                                        (3145733/0x00300005) 2MB
    H: th_build_ffffffffffffffff                                                        (3145734/0x00300006) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145735/0x00300007) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145736/0x00300008) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145737/0x00300009) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145738/0x0030000a) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145739/0x0030000b) 2MB
    H: th_probe_ffffffffffffffff                                                        (3145740/0x0030000c) 2MB
    Total space for object of type H in this dbspace: 22MB
  Total space for DBSPACE dbtemp2 is 22MB:

Summary for space usage for session 55 is:
      Hash       : 44.5      MB      Groups BY  : 0MB
      Sort       : 0         MB      Temp Tables: 0MB
      Views      : 0         MB      Others     : 0MB
Total space for session 55 is 44.5MB

galileo@solstice.onlinedomus.net:informix->

E com as opções "-s" e "-t" podemos obter a estrutura das tabelas temporárias criadas por uma sessão:

galileo@solstice.onlinedomus.net:informix-> ixtempuse -s 99 -t all
{ TABLE "informix"."fnunes_teste" row size = 85 number of columns = 1 index size = 31 }
CREATE TEMP TABLE "informix"."fnunes_teste"
(
        col1_decimal_14_not_null         DECIMAL(14) NOT NULL,
        col2_decimal_12_5_unique         DECIMAL(12, 5),
        col3_dt_year_to_second           DATETIME YEAR TO SECOND,
        col4_dt_year_to_month            DATETIME YEAR TO MONTH,
        col5_dt_year_to_day              DATETIME YEAR TO DAY,
        col6_dt_year_to_hour             DATETIME YEAR TO HOUR,
        col7_dt_year_to_minute           DATETIME YEAR TO MINUTE,
        col8_dt_year_to_second           DATETIME YEAR TO SECOND,
        col9_dt_day_to_day               DATETIME DAY TO DAY,
        col10_dt_houwr_to_hour           DATETIME HOUR TO HOUR,
        col11_dt_minute_to_minute        DATETIME MINUTE TO MINUTE,
        col12_dt_second_to_second        DATETIME SECOND TO SECOND,
        col13_dt_varchar_10              VARCHAR(10),
        col14_dt_varchar_10_5            VARCHAR(10)
);

CREATE INDEX ix_2 ON fnunes_teste (col14_dt_varchar_10_5, col1_decimal_14_not_null DESC);
CREATE INDEX ix_1 ON fnunes_teste (col13_dt_varchar_10);
CREATE UNIQUE INDEX 0_2 ON fnunes_teste (col2_decimal_12_5_unique);


{ TABLE "informix"."t_2" row size = 24 number of columns = 2 index size = 0 }
CREATE TEMP TABLE "informix"."t_2"
(
        col1                             INTEGER,
        col3                             CHAR(20)
);



{ TABLE "informix"."t_1" row size = 4 number of columns = 1 index size = 0 }
CREATE TEMP TABLE "informix"."t_1"
(
        col1                             INTEGER
);



galileo@solstice.onlinedomus.net:informix-> 


Obtenção do script

O script pode ser descarregado de http://github.com/domusonline/InformixScripts (navegue por scripts/ix) Está licenciado como GPL 2.0, pelo que é livre de utilizá-lo e até de o modificar caso deseje. Qualquer comentário, sugestão ou reporte de bug será muito bem vindo.

Avisos e termo de não responsabilização

Tenho de ser muito claro nesta secção. Antes de mais, isto resulta do meu melhor esforço em fornecer algo que possa ser útil aos clientes (DBAs) na sua rotina, e que eventualmente ajude a reduzir ou resolver os PMRs que vão sendo abertos em torno deste problema. O script foi testado em várias combinações de sistema operativo ((RHEL, HP-UX, Solaris, AIX) e diferentes versões do Informix (11.50 e 12.10) bem como em vários casos de uso. Mas qualquer pessoa que deseje utilizá-lo terá de assumir a responsabilidade pelo seu resultado e comportamneto. Tendo deixado isto claro estarei totalmente disponível para corrigir qualquer problema que consiga reproduzir e que me seja indicado pelos seus utilizadores.
Alguns pontos adicionais muito importantes:
  • O script é LENTO. Não há muito que possa fazer, porque a forma como funciona (e penso que é a única de obter a informação) requer que varra todas as sessões e execute uma série de comandos onstat para cada uma. Tenho planos de tentar fazer apenas uma execução do onstat (com a opção -i) mas não estou seguro que seja possível. A forma de minimizar a lentidão atualmente será reduzir o número de seeões a examinar. Tem opções para indicar uma lista de sessões (-s), utilizadores (-u) e bases de dados (-d). Naturalmente a lentidão será proporcional ao número de sessões e complexidade do que estejam a executar e dos cursores que tenham aberto
  • O script requer recursividade e a única forma que encontrei de o conseguir fazer em KSH é correr a função recursiva como um processo em background. Como em tudo o que envolva recursividade existe a possibilidade de entrarmos num ciclo infinito. Aconteceu-me durante o desenvolvimento e consequentemente crirei alguns mecanismos para o evitar. O principal é que por omissão o script só entrará em 100 níveis de recursão. Isto deverá ser mais que suficiente para a maioria das nossas queries do dia a dia. Uma query que necessite mais que isto terá de ter mais de 100 "iteradores", sendo um iterador um acesso a uma tabela, um nested loop join, um hasj join, um sort, um merge etc. Por isso, como se compreenderá, 100 deverá ser suficiente para a maioria dos casos. Quando o limite é atingido o script aborta Existe uma opção para aumentar o limite, mas deverá pensar várias vezes antes de a usar. Um valor muito alto pode causar um FORK BOMB!!!Adicionalmente, devido à recursividade, o script pode exceder o limite de processos definidos para o seu ambiente. Atualmente não há forma de evitar isto.

Agradecimentos

Tenho de agradecer publicamente a várias pessoas que de alguma forma contribuiram para o script. Um agradecimento muito especial para o Jacques Renaut do I&D que forneceu preciosa ajuda técnica que me permitiu prosseguir quando pensava estar "encalhado" durante as tentativas de interpretação das estruturas internas. O Frank Arias, do suporte técnico foi crucial na solução de problemas de execução em Solaris. O Adelino Silva, também do suporte técnico foi o primeiro depois de mim a testar o script e forneceu ideias e informação importantees. O Jonathan Leffler forneceu comentários muito úteis. E finalmente dois clientes, um em Portugal que forneceu feedback e tetses em HP-UX e outro nos EUA que foi fundamental para a solução de problemas específicos de Solaris

Tuesday, September 20, 2016

Informix on Cloud / Informix na núvem

Informix on Cloud offer webcast. Learn all you need to know about this new option (original version here: http://informix-technology.blogspot.com/2016/09/informix-on-cloud-informix-na-nuvem.html)



English version
IBM will host a webcast next September 26 about a new offering of Informix on cloud environment. If Cloud deployment is an option for you, this may be very interesting. During the webcast you can learn all the details about the new offering. The webcast will last 1H and will start at 12H EDT (Eastern Daylight Time). Main speaker will be Nicholas Geib
, Solution Architect
 in IBM Analytics. Nicholas has been involved in Informix HA and replication technology. To register please access: http://bit.ly/2cf3KIq


Versão Portuguesa
A IBM está a organizar um webcast no próximo dia 26 de Setembro, sobre a nova oferta de Informix num ambiente de nuvem (cloud). Se a Cloud faz parte dos seus planos esta sessão pode ser muito interessante para si. Durante o webcast serão dados todos os pormenores deta nova oferta. A sessão terá a duração de 1H e início às 12H EDT (Eastern Daylight Time). O responsável pela sessão será o Nicholas Geib
, Solution Architect
 na IBM Analytics. O Nicholas tem estado envolvido com a alta disponibilidade e tecnologias de replicação do Informix. Para se registar aceda a: http://bit.ly/2cf3KIq

Tuesday, September 06, 2016

Issues on 12.10.FC6 x86_64 / Problemas com a 12.10.FC6 em x86_64

Significant issues with version 12.10.FC6 on Linux 64 bits (original version here: http://informix-technology.blogspot.com/2016/09/issues-on-1210fc6-x8664-problemas-com.html)


English version
IBM has recently issued an alert about a serious issue with version 12.10.FC6 on Linux 64 bits platforms.
As can be read in the alert, there are a set of situations that must happen in order for the problem to manifest. But the consequences are very serious.
Please read the alert carefully if you use that version on that platform. If you don't use it, yet it would be better to avoid migrating existing instances to it. Versions 12.10.FC7 and 12.10.FC7W1 are already available and don't suffer from these two issues that affected this version and platform (another alert has been issued recently). If you use version 12.10.FC6 on another platform it should not happen. Same if you don't meet all the pre-requisites of the issue. In case of doubt consult IBM technical support.

Versão Portuguesa
A IBM emitiu recentemente um alerta sobre um problema sério que pode afetar a versão 12.10.FC6 em Linux de 64 bits. Como pode ler-se no alerta, existem uma série de pressupostos para que o problema se manifeste. Mas as consequências do mesmo são muito sérias.
Leia por favor o alerta se usa esta versão nesta plataforma. Se não a usa ainda seria provavelmente preferível evitar migrações para esta versão de instâncias já existentes. As versões 12.10.FC7 e 12.10.FC7W1 estão disponíveis e não sofrem dos dois problemas (já tinha sido divulgado outro alerta) que afetam esta versão/plataforma. Se utiliza a versão 12.10.FC6 noutra plataforma, um dos pré-requisitos não se verifica  O mesmo se atualmente não verifica todos os requisitos para a ocorrência do problema. Em caso de dúvida não hesite em contactar o suporte da IBM.