Thursday, June 07, 2007

Cheetah spot by spot: Derived tables in FROM clause

Here we are for a very quick note on another usability feature of IDS 11.10 (Cheetah).
This time we'll see derived tables in the FROM clause of SELECT statements. This referrers to the possibility of "creating" tables on the fly as sub-querys in the FROM clause.

If you work in an environment where the developers also work with other RDBMS software you've probably had a few situations where they ask you: How can I do it with Informix?
There were two answers for this:

  1. If using IDS 7.31, we would have to use temporary tables
  2. If using IDS 9.40+ we could use the collection derived tables: TABLE(MULTISET(SELECT ... FROM ... WHERE ...))
Usually none of this would be welcome. Sometimes we have to make the query in environments where the use of temp tables isn't acceptable or desirable (ETL tools for example where we might not have control over connection lifetime, so we cannot guarantee that the temporary tables will be there when needed). The second option looks strange to an Informix outsider, and in some situation it may force the query to be different depending on the RDBMS to use - This is not desirable to application developers that want to be RDBMS neutral.

So, once more, IBM decided to ease our lives, and implement what is also an ANSI standard: Derived tables. Let's see an example:

select
o1.*
from
orders o1,
(
select max(order_num) , c.customer_num
from orders o0, customer c where c.customer_num = o0.customer_num
group by 2
) as max_orders(order_id,customer_id)
where
o1.order_num = max_orders.order_id

This is possible in IDS 11.10. In IDS 10.00 we could use:

select
o1.*
from
orders o1,
table(multiset(
select max(order_num)::integer as order_id, c.customer_num::integer as customer_id
from orders o0, customer c
where c.customer_num = o0.customer_num group by 2
)) max_orders
where
o1.order_num = max_orders.order_id

So, it was equally possible, but the syntax would be non-standard and looked more complex.
Note: The cast to integer (::integer) is required because the fields are of type SERIAL which is not allowed in MULTISETS.

For those that use this form, when you upgrade to IDS 11 you may want to consider rewriting your queries. Accordingly to the SQL syntax guide manual, the performance of a query written in the new format is "as least as good as a that of a query that uses the Informix-extension TABLE (MULTISET ( SELECT ... )) syntax to specify an equivalent derived table in the FROM clause". This insinuates that there may be performance advantages in this syntax. This is probably related to the fact that IDS does not materialize derived tables that are considered "simple". Clauses like aggregates, order by or unions define this tables as complex.

So, this is a very simple feature with usability, ANSI compliance and application portability in mind.

2 comments:

Anonymous said...

=Start Nitpicking=

"If using IDS 7.31, we would have to use temporary tablesIf using IDS 9.40+ we could use the collection derived tables: TABLE(MULTISELECT(SELECT ... FROM ... WHERE ...))"

Shouldn't that read MULTISET ?

=End Nitpicking=

-AC

Fernando Nunes said...

Nitpicking is very useful process if it leads to better quality, which is the case... Yes, it should read MULTISET... I have just corrected it.

Thanks for your attention,

Regards.