When 'Exclusive' is not really exclusive...
One situation that can really annoy an IBM Informix DBA happens when you're trying to ALTER TABLE [...] and you get the error "Non exclusive access when executing a DDL...". You then think... "Oh... I'll lock it in exclusive mode"... And after killling some sessions, setting the lock mode to WAIT and opening a transaction you get the desired 'exclusive' lock. Then you try the ALTER TABLE again and what happens? The same error! How can it be you ask?! Read on...
First let me say that this same subject is referenced in one of the many technical support articles. The link is here. Why am I writing this then? Well, because I think there's a bit more than what the article says... But please understand that everyhing in that article is correct, and that it solves the problem in most cases.
Let's start by making a few points:
- When you set the exclusive lock on a table using "BEGIN WORK; LOCK TABLE
IN EXCLUSIVE MODE you don't prevent sessions with DIRTY READ isolation level from accessing the table - In order to do an ALTER TABLE on a table, you need exclusive access on a different level. The fact that no one (besides dirty readers) can't read data from the table is not enough. There must be no references to the table structure. This is on a lower level than the SQL layer. This means for example that there can be no open cursors, no prepared statements using the table, and there can be no sessions waiting for the lock you establish. If for example at time T0 you get the exclusive lock, and before you make the ALTER TABLE, at T1, a session in LOCK MODE WAIT tries to SELECT FROM THE TABLE, this session will have a reference to the table. At T2 when you give the ALTER command, you won't be able to execute it. This needs a very precise timming to happen, but it can be easy when you're dealing with an "hot" table
For the first situation, our friends at R&D have come up with a solution, documented in the already mentioned article. There is an otherwise undocumented variable called IFX_DIRTY_WAIT which allows us to specify a maximum number of seconds that we will wait for DIRTY READERS to release the table structure. This will also prevent new sessions to attach to the table structure.
So, two things can happen. If the ammount of time specified in IFX_DIRTY_WAIT is enough to let all dirty readers to complete their jobs, your ALTER TABLE statement will wait for a while and successfuly execute after the last one ends their reads. Otherwise, if there are still any dirty reader after the timeout, you'll get the normal error. In this case you'll have to increase the value you put in IFX_DIRTY_WAIT or use the time to kill some sessions.
But this won't solve the second situation. For this you will have to find what sessions have references to the table and close them. And if you have a "wild" environment where you can't stop sessions from popping up, and many of this sessions make references to the table, you'll also need a way to stop this new sessions from using the table. I had cases where I checked which sessions were preventing me from doing the DDL and closed them. But when I retried the ALTER, there were already other sessions with references to the table. So, how can we find which session has a reference to a table?
> ideiafix@PacMan.domus.online.pt:informix-> onstat -g opn
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 00:43:22 -- 87776 Kbytes
tid rstcb isfd op_mode op_flags partnum ucount ocount lockmode
39 0x477bf4c8 0 0x00000400 0x00000397 0x00100071 2 2 0
39 0x477bf4c8 1 0x00000002 0x00000003 0x00100071 2 2 0
39 0x477bf4c8 3 0x00000408 0x00000017 0x001000a3 1 1 0
Err... and now? Well, as you can see you have a column called "partnum", and another with the "rstcb". Knowing the tablename allows you to find out the partnum (and vice-versa) by looking at sysmaster:systabnames.
[informix@PacMan ~]$ dbaccess sysmaster -
Database selected.
> select hex(partnum), tabname, dbsname from systabnames where partnum = '0x001000a3';
(expression) 0x001000A3
tabname customer
dbsname stores_demo
1 row(s) retrieved.
And you can find the session ID from the "rstcb", using something like:
[informix@PacMan ~]$ onstat -u | grep 477bf4c8
477bf4c8 Y--P--- 21 informix 2 482d71a8 0 1 0 0
[informix@PacMan ~]$ onstat -g sql 21
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 00:48:19 -- 87776 Kbytes
Sess SQL Current Iso Lock SQL ISAM F.E.
Id Stmt type Database Lvl Mode ERR ERR Vers Explain
21 SELECT stores_demo NL Not Wait 0 0 9.03 Off
Current statement name : slctcur
Current SQL statement :
select * from customer
As long as session 21 continues to reference the table structure you won't be able to issue any ALTER TABLE on customer table, even after being able to lock it:
[informix@PacMan ~]$ dbaccess stores_demo -
Database selected.
> begin work;
Started transaction.
> lock table customer in exclusive mode;
Table locked.
> alter table customer add (dummy_col char);
242: Could not open database table (informix.customer).
106: ISAM error: non-exclusive access.
Error in line 1
Near character position 40
>
So, this are the basics to find which sessions are referencing a specific table. But I really don't want to see anybody messing around with a lot of onstat commands, greps etc. I've already wrote a script which can do all this. It's called ixtableuse and it will be available soon.
So, by now, we have ways to handle dirty readers and to find out the sessions referencing the tables. We only need a way to prevent new sessions from getting references while we clear the ones we identified. How can we do this? Well... using a dirty trick... All the sessions trying to access a table somehow will have to read it's record from the systables table. Yes... that's it... If somehow we lock that record also, the sessions in LOCK MODE WAIT will get stuck on this systables record, before they open the table in question. How can we lock a record on systables? For example by GRANTing that table some dummy privilege.
So, let's see how to do some DDL on a very "hot" table in a "wild" environment.
In session 1 (the session where you want to ALTER TABLE):
> IFX_DIRTT_WAIT=300
> export IFX_DIRTY_WAIT
> #the above will take care of dirty readers...
> [informix@PacMan ~]$ dbaccess stores_demo -
Database selected.
> set lock mode to wait;
Lockmode set.
> begin work;
Started transaction.
> grant select on customer to dummy_user;
Permission granted.
> lock table customer in exclusive mode;
Table locked.
ideiafix@PacMan.domus.online.pt:informix-> dbaccess stores_demo -
Database selected.
> select * from customer;
211: Cannot read system catalog (systables).
107: ISAM error: record is locked.
Error in line 1
Near character position 22
Note that the error refers to systables, not the table customer.
Back to session 1:
> alter table customer add (dummy_col char);
242: Could not open database table (informix.customer).
106: ISAM error: non-exclusive access.
Error in line 1
Near character position 40
Why?! Well, the initial session that ended the select * from customer statement is still referencing the table. You can see how many "users" are referencing it using another onstat:
> ideiafix@PacMan.domus.online.pt:informix-> onstat -t
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:22:31 -- 87776 Kbytes
Tblspaces
n address flgs ucnt tblnum physaddr npages nused npdata nrows nextns
4 477f8d38 0 1 100001 1:14 250 250 0 0 1
30 47de1018 0 2 100071 1:126 16 11 5 65 2
80 47e6f018 0 2 1000a3 1:176 8 3 2 28 1
3 active, 108 total
This can be cleared by an onmode -z or a kill to the client process:
onmode -z 21
And again:
ideiafix@PacMan.domus.online.pt:informix-> onstat -t
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:24:52 -- 87776 Kbytes
Tblspaces
n address flgs ucnt tblnum physaddr npages nused npdata nrows nextns
4 477f8d38 0 1 100001 1:14 250 250 0 0 1
30 47de1018 0 1 100071 1:126 16 11 5 65 2
80 47e6f018 0 1 1000a3 1:176 8 3 2 28 1
Only one ucnt for the table... We can now ALTER the table... but what happens when the user who got the error on systables retries with anothe LOCK MODE? Check it:
ideiafix@PacMan.domus.online.pt:informix-> dbaccess stores_demo -
Database selected.
> set lock mode to wait;
Lockmode set.
> select * from customer;
He gets stuck... where?:
ideiafix@PacMan.domus.online.pt:informix-> onstat -u
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:30:03 -- 87776 Kbytes
Userthreads
address flags sessid user tty wait tout locks nreads nwrites
477be018 ---P--D 1 informix - 0 0 0 30 42
477be544 ---P--F 0 informix - 0 0 0 0 682
477bea70 ---P--- 7 informix - 0 0 0 0 0
477bef9c ---P--B 8 informix - 0 0 0 0 0
477bf9f4 Y--P--D 14 informix - 4407d574 0 0 0 0
477bff20 ---P--D 11 informix - 0 0 0 0 0
477c044c Y-BP--- 43 informix 4 47f706d0 0 10 0 0
477c0978 L--PR-- 45 informix 5 440d03c4 -1 1 0 0
8 active, 128 total, 18 maximum concurrent
ideiafix@PacMan.domus.online.pt:informix-> onstat -k
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:30:06 -- 87776 Kbytes
Locks
address wtlist owner lklist type tblsnum rowid key#/bsiz
440d0314 0 477c0978 0 HDR+S 100002 204 0
440d036c 0 477c044c 440d057c HDR+IX 100074 0 0
440d03c4 477c0978 477c044c 440d04cc HDR+X 100071 80b 0
440d041c 0 477c044c 440d036c HDR+X 100074 911 0
440d0474 0 477c044c 440d041c HDR+X 100074 911 K- 1
440d04cc 0 477c044c 440d0524 HDR+IX 100071 0 0
440d0524 0 477c044c 440d0474 HDR+X 100074 911 K- 2
440d057c 0 477c044c 0 S 100002 204 0
440d05d4 0 477c044c 440d03c4 HDR+X 1000a3 0 0
440d062c 0 477c044c 440d05d4 HDR+U 100071 80c 0
440d0684 0 477c044c 440d062c HDR+U 100071 903 0
11 active, 10000 total, 16384 hash buckets, 0 lock table overflows
ideiafix@PacMan.domus.online.pt:informix-> onstat -u | grep 477c0978
477c0978 L--PR-- 45 informix 5 440d03c4 -1 1 0 0
ideiafix@PacMan.domus.online.pt:informix-> onstat -u | grep 477c044c
477c044c Y-BP--- 43 informix 4 47f706d0 0 10 0 0
ideiafix@PacMan.domus.online.pt:informix-> onstat -g sql 45
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:30:44 -- 87776 Kbytes
Sess SQL Current Iso Lock SQL ISAM F.E.
Id Stmt type Database Lvl Mode ERR ERR Vers Explain
45 SELECT stores_demo CR Wait 0 0 9.03 Off
Current SQL statement :
select * from customer
Last parsed SQL statement :
select * from customer
ideiafix@PacMan.domus.online.pt:informix-> onstat -g sql 43
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:30:47 -- 87776 Kbytes
Sess SQL Current Iso Lock SQL ISAM F.E.
Id Stmt type Database Lvl Mode ERR ERR Vers Explain
43 - stores_demo CR Wait 0 0 9.03 Off
Last parsed SQL statement :
alter table customer add (dummy_col char)
> ideiafix@PacMan.domus.online.pt:informix-> dbaccess sysmaster -
Database selected.
> select hex(partnum), tabname, dbsname from systabnames where partnum = '0x100071';
(expression) 0x00100071
tabname systables
dbsname stores_demo
1 row(s) retrieved.
He is waiting on the lock you got on systables, and this prevents his session from grabbing a reference to the table structure:
> ideiafix@PacMan.domus.online.pt:informix-> onstat -t
IBM Informix Dynamic Server Version 10.00.UC5 -- On-Line -- Up 01:34:12 -- 87776 Kbytes
Tblspaces
n address flgs ucnt tblnum physaddr npages nused npdata nrows nextns
4 477f8d38 0 1 100001 1:14 250 250 0 0 1
30 47de1018 0 2 100071 1:126 16 11 5 65 2
80 47e6f018 0 1 1000a3 1:176 8 3 2 28 1
so you can complete your work (back to session 1):
> alter table customer add (dummy_col char);
Table altered.
> revoke select on customer from dummy_user;
Permission revoked.
> commit work;
Data committed.
And after this, the waiting session will continue:
[...]
customer_num 128
fname Frank
lname Lessor
company Phoenix University
address1 Athletic Department
address2 1817 N. Thomas Road
city Phoenix
state AZ
zipcode 85008
phone 602-533-1817
dummy_col
28 row(s) retrieved.
Please, remember to include the REVOKE...
So, this ends the article. To summarize it:
When you need to alter some table structure you'll need more than exclusive lock on the table because you'll need that no more sessions are referencing the table structure. These other sessions can be of two types:
- Dirty readers - for these set IFX_DIRTY_WAIT=300 in your environment (this will give you at most 300 seconds while you can kill the sessions or simply wait for them to finish)
- Other sessions with holds on the table structure. These can be sessions in LOCK MODE WAIT [x], that are waiting for your lock (but that already have a reference to the table structure) or sessions with open cursors, prepared statements etc. referencing the table
For these use:
SET LOCK MODE TO WAIT;
BEGIN WORK;
LOCK TABLEIN EXCLUSIVE MODE;
-- dummy grant to prevent new sessions from getting holds on the table structure
GRANT SELECT ONFOR dummy_user;
-- before continuing use onstat -g opn to identify the sessions with references
-- and kill them...
ALTER TABLE;
-- clean up the dummy GRANT
REVOKE SELECT ONFROM dummy_user;
COMMIT WORK;
4 comments:
Cute trick.
Check your spelling. at one point you set IFX_DIRTT_WAIT.
This was most helpful. I've been told you can do this all via firing the correct selects at the sysmaster database too. But anyway, here's a shell script that's based on the above.
(Note: blogger has probably mangled the less than and greater than signs).
#!/usr/bin/sh
#
# This script will tell you who is accessing a particular table.
# It can be useful when attempting to alter/drop/rename a table
# and you're experiencing the 106 ISAM error - eg
# 106: ISAM error: non-exclusive access.
# Error in line 1
# Near character position 40
#
#
# Derived mainly from information from here:
# http://informix-technology.blogspot.com/2006/10/when-exclusive-is-not-really-exclusive.html
# http://publib.boulder.ibm.com/infocenter/idshelp/v115/index.jsp?topic=/com.ibm.adref.doc/ids_adr_0495.htm
#
# Warning: will only work for dba, probably only informix.
# Tested on informix 9.40.FC3, AIX 5.3
#
# Requires dbaccess and awk derivatives (awk,sed,grep)
if [ ! "$1" ]
then
echo "Usage: $0 <database> <table> [-V]"
exit 0;
fi
if [ ! "$2" ]
then
echo "Usage: $0 <database> <table> [-V]"
exit 1;
fi
VERBOSE=""
if [ "$3" == "-V" ]
then
VERBOSE="ON"
fi
dbaccess $1 - >/dev/null 2>/dev/null <<EOT1!
set lock mode to not wait;
set isolation to dirty read;
unload to /tmp/hex1.unl
select lower(hex(partnum)), tabname, dbsname
from sysmaster:systabnames
where dbsname = "${1}"
and tabname = "${2}"
;
EOT1!
if [ ! -s /tmp/hex1.unl ]
then
echo "Cannot find table $1/$2"
exit 0;
fi
hexno=`head -1 /tmp/hex1.unl | sed 's/|.*//g'`
if [ "${VERBOSE}" ]
then
echo "part number of table - ${hexno}"
fi
opensessions=`onstat -g opn | grep ${hexno} | awk '{print $2}' | sed 's/^0x//' | sed 's/^0//g' `
if [ $? -ne 0 ]
then
echo "No-one is currently accessing $1/$2";
exit 0;
fi
if [ ! "${opensessions}" ]
then
echo "No-one is currently accessing $1/$2";
exit 0;
fi
echo "User \tTerminal \tHost \tSessionID"
for sessionid in ${opensessions}
do
if [ "${VERBOSE}" ]
then
echo "Session Id: ${sessionid}"
fi
splitme=`onstat -u | grep ${sessionid} | awk '{printf("%s %s %s",$3,$4,$5);}' `
if [ ! "${splitme}" ]
then
# Has probably already disappeared
continue;
fi
sqlid=`echo ${splitme} | awk '{printf("%-10s",$1);}' `
userid=`echo ${splitme} | awk '{printf("%-10s",$2);}' `
terminal=`echo ${splitme} | awk '{printf("%-10s",$3);}' `
if [ "${VERBOSE}" ]
then
echo "SQL ID: ${sqlid}..."
onstat -g sql ${sqlid}
echo "User: ${userid}. Terminal ${terminal}"
fi
# Get hostname
orighost=`onstat -g ses | grep ${sqlid} | awk '{printf("%-10s",$5);}'`
# I can't get the host IP. For some reason, only 8 characters are
# available to me in onstat -g ses
echo "${userid}\t${terminal}\t${orighost}\t${sqlid}"
done
echo "\nYou may kill the sessions listed using: onmode -z <SessionID>"
echo "You can view the sql currently being run using: onstat -g sql <SessionID>"
Thanks for a very informative article, this answers a lot of questions that IDS help left open.
Thanks also to Bigcalm for the script. To get it to work on Linux, one needs to add the "-r" parameter to the sed commands, as the standard Linux sed doesn't run in extended mode by default. The echo lines need the "-e" parameter to process the escapes for tabs and newlines.
Thank you very much for the tips and advice on this page. Got me out of a tight spot with 11.5 on HP-UX.
Post a Comment