Monday, April 27, 2009

Informix community news

International Informix User Group is hosting the annual Informix User Conference. It's taking place in Kansas City, not far from the Lenexa development labs where most of the IDS development is done.
This allows the conference participants to meet with some of the more influential people in Informix development. Most of the product architects will be at the conference.

It should be starting, and it will end on Wednesday. You can check the schedule on the IIUG site.
The conference is packaged with a lot of interesting sessions, workshops and you can even take the exams for IDS v11 certification. If you're not going to the conference you can also take a free online test just to get some feedback on your knowledge level. The exams are available on the Prometric website and you can get a promotional code which makes the test exam free (usually it costs $10). Please check http://www.ibm.com/software/data/education/cert-assessment.html?ca=09iiugoffer
and go to the Prometric Website. Use the promotional code IIUG2009.

Taking advantage of being at the conference, Cathy Elliot, Informix Software Marketing Manager, has just created a new blob, called Informix Voices ( https://www.ibm.com/developerworks/mydeveloperworks/blogs/InformixVoices/ ) where she intends to post some interviews with relevant people from the Informix community. Should be worth to keep an eye on it.

Finally, we are all waiting for IDS 11.50.xC4 which we know by now that will bring table compression to the Informix world. I will post about it when the version is available. Meanwhile if you're going to the IIUG conference, enjoy it. If not, keep an eye on the Informix blogs for updates.

Saturday, April 25, 2009

Returning to null...

In a recent article I talked about the way to return a NULL from a query. That remembered me how "picky" Informix is with NULL. Did this sound like me saying Informix does something wrong?! That would be strange in an Informix dedicated blog...
Well... I said "picky" because that's the feeling I get from people who face the way Informix deals with NULL. Usually those persons have a background in other RDBMS which may times behave differently. Let me show some situations common in Informix:

  1. The query below returns NULL. Most of the times people expect it to return "SomeString"
    SELECT
    "SomeString"||NULL
    FROM
    systables
    WHERE
    tabid = 1;

  2. The query below return no rows. People would expect it to return some rows
    SELECT
    "SomeString"
    FROM
    systables
    WHERE
    "A" != NULL::CHAR

  3. The query below also returns NULL. People usually expect it to return 10
    SELECT
    10 + NULL::INTEGER
    FROM
    systables
    WHERE
    tabid = 1

Query 1) and 3) are similar. Above I wrote that Informix is "picky". I could say it in another way: Informix is very strict on the way it treats NULL. By definition (SQL standards), NULL is the absence of value. That means that you can't do nothing with it... Any operation (concatenation in query 1 and addition in query 3 for example) will return NULL. Any comparison with NULL returns FALSE. In order to understand it we must avoid looking at the question with a practical approach and instead use a formal, conceptual approach. In other words, it would be intuitive that if we concatenate a string with "nothing" we would still get the string. But on the other hand, conceptually, we cannot concatenate a string with the absence of string... And we cannot compare anything with NULL, because it does not exist. So, by definition any such comparison should return FALSE. "A" is not equal to something that does not exists, but conceptually you can't compare these two operands, so the comparison fails (FALSE).

Also note that in Informix NULL cannot be represented by "" for example. "" is an empty string. But at least it has a defined value. NULL hasn't.

Many developers argue that the way Informix deals with NULL does not make sense and causes incoveniences. I argue that this makes a lot of sense (mathematically) and I concede that it can be inconvenient sometimes from a practical perspective. But in reality, what those developers or I think is pretty irrelevant. What is relevant is that it's the way the SQL standard defines NULL and Informix follows it.

In order to overcome the inconveniences we can use the NVL() function.

Friday, April 24, 2009

Compress and free....

Guy Bowerman post's about two new Informix white papers. One of them talks about the new compression feature of IDS 11.50.xC4. Be sure to check it, if you feel your data is growing beyond acceptable limits.
The other paper gives an extensive and detailed overview of managing IDS on Windows platform.

Here's the article: http://www.ibm.com/developerworks/blogs/page/gbowerman?entry=new_informix_white_papers

Saturday, April 18, 2009

Informix authentication and connections

The idea for this post comes from a recent discussion on c.d.i and also from checking the URL referrers that bring people here.
Most of this blog visits comes from Google or other search engines. And apparently there are a lot of questions about how Informix authenticates and establishes user connections.

I will try go give a generic overview on these subjects. I'll point out a few less known aspects of Informix authentication.
This article in no way tries to substitute the official documentation, but I hope this can be used as a concentrated information resource about the subject.


Authentication versus privileges

First, we have to create a distinction between two steps involved in a connection establishment.
As we all know, IDS and other databases have several privileges required for doing any action in the database.
At the first level we have Database level privileges. These are CONNECT, RESOURCE and DBA. Then we have object level privileges. For tables we have INSERT, DELETE, ALTER, REFERENCE and for column level we have SELECT and UPDATE
The list above is not exhaustive. The privileges relevant for this article are the database level ones. In particular the CONNECT privilege. Without it we will not be able to connect.
Privileges can be given using the SQL stament GRANT and removed with the REVOKE. Object level privileges can be GRANT'ed to individual users or to ROLEs. You can imagine ROLEs as groups and you can GRANT a role to individual users.

But let's get back on track. The purpose here is not to explain the privilege infra-structure, but to explain how do we connect to the database. I just explained the privileges because we need to go through two steps for establishing a connection:

  1. First Informix has to make sure we are who we pretend to be. This is called the authentication phase
  2. Second, Informix will check if the user we are defining in the connection has the required privileges to establish the connection.
    It cannot do the second step without first completing the authentication.
    To be able to connect we need at least CONNECT privilege on the database level. This privilege can be granted specifically to the user or to the special role PUBLIC which means "anyone"

User repositories

One very important aspect of Informix is that it doesn't use the concept of database only users. Any user in an Informix database must be recognized by the underlying OS (up to the current version at the time of writing which is IDS 11.50.xC3).
It's essential to know this in order to understand how it works. For example, if you want to create a user for Informix Dynamic Server usage, you have to create it in the OS or other repository, but you must make the OS aware of that user. And for users created in the OS, if you need to change their password you must do it using the OS tools.

Some people may consider this a big limitation, but the subject is a bit more complex than what it may look at first.
There are reasons for this requirement:
  • In an Informix stored procedure, you can call an OS command or script. This is done using the SYSTEM() SPL statement. This command will be run with the user identity that was used to establish the database connection.
    Other databases execute these commands as a specifically configured user or as the user running the database software. The way Informix works can give you a lot of flexibility, but the price for that is the need to have the user recognized at the OS level
  • When we ask the database server to print out the explain of a query using the SET EXPLAIN SQL instruction, Informix will create a file on the database server. The information is written by the database instance processes, so it has to be written on the server running the database.
    This files are written with the user ID used for the connection. The file is written on the current client directory for connections local to the database server and on the users $HOME for remote connections. Again, this requires a user created in the OS
  • The SET DEBUG FILE/TRACE ON SPL instructions have the same behavior and requirements

The above doesn't necessarily means we need to have the users in /etc/passwd (Unix/Linux). It means that getpwnam() must be able to retrieve the user info. This will happen transparently if you configure your system to use LDAP or NIS authentication.
So you can have a central user management infra-structure like LDAP, MS Active Directory or NIS. Informix also doesn't require the user to be able to establish a session (ssh or telnet for example) on the database server at the OS level. You can for example set the users Shell to /bin/false or use other means to prevent the users to connect to the OS.
To complicate things a little bit more, we should not confuse the need to have the user id recognized in the OS with the authentication mechanism. Since 9.40 Informix can use PAM for authentication. This gives you complete freedom to implement complex and sophisticated authentication methods. You can use whatever PAM modules you desire to implement the authentication, which means your Informix authentication doesn't have to be the same as your OS authentication.
But the user has to be known by the OS due to the reasons presented before. As a quick example, you can create your users with random passwords in the OS (not known to anyone) and configure IDS to authenticate your users based on files, or any remote process using PAM.

The fact that the user has to be known in the OS is generally accepted as an inconvenience, and as such I believe it's safe to assume that in a future version Informix will relax these requirements. It could probably accept connections from users not recognized by the OS (either authenticated internally or with PAM or both) and have some parameter to define a default user for the situations where it needs a user id.


Types of connections

Informix implements the following types of connections:
  • Trusted connections
    These connections are based on trusted relations between hosts. Basically the database server host is configured to trust the user if he connects from a specific host or list of hosts.
    As such, this connections can be established without providing a password or any other authentication mechanism. The way it works is exactly the same as the "r" services (rshell, rcmd, rlogin).
    The configuration is done using the network connectivity files, /etc/hosts.equiv and ~/.rhosts

  • Non trusted connections
    These connections are the standard type of connections. Typically we provide a user and a password that is used to check the identity. The password is checked against the user's stored password


  • Challenge/response based authentication using PAM modules
    Connections made to PAM configured ports (DBSERVERALIAS). PAM stands for Plugin Authentication Modules and can be used in any Unix/Linux system (AIX, HP-UX, Linux and Solaris)

  • Distributed query connections
    These connections can be considered trusted connections. But they're established implicitly when a client connected to instance "A" sends a query referencing instance "B".
    In these situation the IDS server "A" will make an implicit connection on behalf of the user, to IDS server "B". The connection authentication will follow the rules for implicit connections, or will use a different authentication mechanism if server "B" is setup with PAM authentication (more on this later)

Non trusted connections

We use non-trusted connections when we give a user id and a token (typically a password) that guarantees our identity (only the user should know it's personal authentication secret or password) . Meaning we have not only the user id, but also it's secret key.
These connections are used mostly in applications that use JDBC, ODBC, .NET, PHP (PDO), Perl (DBI) etc. These APIs require a connection string or URL, or use specific API connection functions. So we define the password in the string, or we provide is as an argument for the API connection function.
Here's an example of a JDBC URL to connect to an Informix instance:

jdbc:informix-sqli://ids_server_machine:9088/my_database;user=username;password=secret

In ESQL/C (embedded SQL/C - which is basically C language with SQL commands) or Informix 4GL we can use the CONNECT instruction:

CONNECT TO database@ids_instance USER <"username"|variable_user> USING <variable_password>


One important note, that allows me to show the two steps in a connection establishment: authentication and database open:
Typically in 4GL we use a "database test_auth" instruction. This, by default, makes the two steps by trying to make a trusted connection using the owner of the process.
But let's create a simple 4GL program to show the difference between authentication and the database opening phase. Here's the code (don't worry if you don't know 4GL, because these lines are self explanatory):

DEFINE username,password char(20)
DEFINE c CHAR

MAIN
PROMPT "Insert your username: " FOR username
PROMPT "Insert your password: " FOR password ATTRIBUTE ( INVISIBLE )
CONNECT TO "@cheetah2" USER username USING password
PROMPT "We are authenticated, but not yet connected to a database..." FOR CHAR c
DATABASE test_auth

PROMPT "Now we have an opened database!" FOR CHAR c
END MAIN


So, we're asking the user's name and password. Than we're CONNECTing to the server. Note that I didn't use a database name. This is a supported syntax, although not very frequently used.
After the connect, we then issue the usual "DATABASE" instruction. Let's see what happens when we run the program above. First the status of the IDS instance showing the existing sessions:

cheetah2@PacMan.onlinedomus.net:informix-> onstat -u

IBM Informix Dynamic Server Version 11.50.UC2 -- On-Line -- Up 09:52:57 -- 88064 Kbytes

Userthreads
address flags sessid user tty wait tout locks nreads nwrites
4780c018 ---P--D 1 informix - 0 0 0 69 559
4780c5f0 ---P--F 0 informix - 0 0 0 0 1032
4780cbc8 ---P--F 0 informix - 0 0 0 0 596
4780d1a0 ---P--- 5 informix - 0 0 0 0 0
4780d778 ---P--B 6 informix - 0 0 0 48 0
4780e328 ---P--- 17 informix - 0 0 1 315 224
4780e900 ---P--D 9 informix - 0 0 0 3 0
4780eed8 ---P--- 16 informix - 0 0 1 1286 1093
4780f4b0 ---P--- 15 informix - 0 0 1 189 3
47810060 Y--P--D 21 informix - 440cfe28 0 0 0 0
10 active, 128 total, 18 maximum concurrent


So... Only informix user system sessions. Now let's run the program (I'll do it as root):

cheetah2@pacman.onlinedomus.net:root-> ./test.4ge
Insert your username: fnunes

Insert your password: [......][ENTER]

We are authenticated, but not yet connected to a database...


The program is now holding at the first "PROMPT" instruction after the connect. Let's see the list of sessions:

cheetah2@PacMan.onlinedomus.net:informix-> onstat -u

IBM Informix Dynamic Server Version 11.50.UC2 -- On-Line -- Up 09:57:30 -- 88064 Kbytes

Userthreads
address flags sessid user tty wait tout locks nreads nwrites
4780c018 ---P--D 1 informix - 0 0 0 69 559
4780c5f0 ---P--F 0 informix - 0 0 0 0 1032
4780cbc8 ---P--F 0 informix - 0 0 0 0 596
4780d1a0 ---P--- 5 informix - 0 0 0 0 0
4780d778 ---P--B 6 informix - 0 0 0 48 0
4780e328 ---P--- 17 informix - 0 0 1 315 224
4780e900 ---P--D 9 informix - 0 0 0 3 0
4780eed8 ---P--- 16 informix - 0 0 1 1286 1093
4780f4b0 ---P--- 15 informix - 0 0 1 189 3
4780fa88 Y--P--- 61 fnunes 4 485bff38 0 0 0 0
47810060 Y--P--D 21 informix - 440cfe28 0 0 0 0
11 active, 128 total, 18 maximum concurrent

cheetah2@PacMan.onlinedomus.net:informix-> onstat -g ses 61

IBM Informix Dynamic Server Version 11.50.UC2 -- On-Line -- Up 09:57:39 -- 88064 Kbytes

session effective #RSAM total used dynamic
id user user tty pid hostname threads memory memory explain
61 fnunes - 4 15267 pacman.o 1 40960 37280 off

tid name rstcb flags curstk status
85 sqlexec 4780fa88 Y--P--- 5728 cond wait netnorm -

Memory pools count 1
name class addr totalsize freesize #allocfrag #freefrag
61 V 483c3028 40960 3680 68 7

name free used name free used
overhead 0 1672 scb 0 96
opentable 0 784 filetable 0 192
misc 0 64 log 0 16512
temprec 0 1608 gentcb 0 1232
ostcb 0 2632 sqscb 0 8192
sql 0 40 hashfiletab 0 280
osenv 0 1720 sqtcb 0 2208
fragman 0 48

sqscb info
scb sqscb optofc pdqpriority sqlstats optcompind directives
488a18a0 48558018 0 0 0 2 1

Sess SQL Current Iso Lock SQL ISAM F.E.
Id Stmt type Database Lvl Mode ERR ERR Vers Explain
61 - - - Not Wait 0 0 9.29 Off


So what do we see? We have a session, on behalf of user "fnunes" but the "Current Database" is not defined. So we've just gone through the authentication process with success.
Now, if I press ENTER on the program I'll force the program to execute the DATABASE statement and wait on the second PROMPT:

cheetah2@pacman.onlinedomus.net:root-> ./test.4ge
Insert your username: fnunes

Insert your password: [...] [ENTER]

We are authenticated, but not yet connected to a database... [ENTER]
Now we have an opened database!

ok... now let's look at the database session again:

IBM Informix Dynamic Server Version 11.50.UC2     -- On-Line -- Up 10:09:32 -- 88064 Kbytes

session effective #RSAM total used dynamic
id user user tty pid hostname threads memory memory explain
61 fnunes - 4 15267 pacman.o 1 49152 45544 off

tid name rstcb flags curstk status
85 sqlexec 4780fa88 Y--P--- 5728 cond wait netnorm -

Memory pools count 1
name class addr totalsize freesize #allocfrag #freefrag
61 V 483c3028 49152 3608 83 6

name free used name free used
overhead 0 1672 scb 0 96
opentable 0 1792 filetable 0 352
misc 0 64 log 0 16512
temprec 0 1608 gentcb 0 1232
ostcb 0 2632 sqscb 0 13712
sql 0 40 rdahead 0 832
hashfiletab 0 280 osenv 0 1720
sqtcb 0 2792 fragman 0 208

sqscb info
scb sqscb optofc pdqpriority sqlstats optcompind directives
488a18a0 48558018 0 0 0 2 1

Sess SQL Current Iso Lock SQL ISAM F.E.
Id Stmt type Database Lvl Mode ERR ERR Vers Explain
61 - test_auth CR Not Wait 0 0 9.29 Off


So, now we have a current database, in the same session.
This ends the description of non trusted connections using username and passwords. Pretty simple, just like in any other database server.


Challenge/response connections using PAM

Another kind of non trusted connections are connections established through PAM configured DBSERVERALIAS (specific ports where the instance will listen for connections).
I will not go into details about this, because I already wrote an article about it. In http://informix-technology.blogspot.com/2007/11/informix-user-authentication-pam-for.html I explain how to setup and give examples of challenge/response scenarios.

In summary, a non-trusted connection in Informix is similar to connections in other RDBMS. You supply a username and a token that certifies you're entitled to connect as that user.
Typically this token is a password. But with PAM this can be a much more complex item.
In IDS 11.50 we can also use single sign on with kerberos. Due to the complexity of this scenario I've left it out in this article.
Non trusted connections are used mostly in more recent applications, like Web based applications, Windows client applications, or script languages (PHP, Perl, Ruby etc.) languages.

Trusted connections

These connections are made without any token to confirm the user identity. It's assumed that the user is already identified on the client system and we configure the database server to trust that identification. Trusted connections are mostly used in applications created using Informix 4GL and Informix ESQL/C.
The definition of the trusts are done by configuration of the so called network security files. These files are /etc/hosts.equiv and the ~/.rhosts (.rhosts created in each users's home dir).
Before we dig into these files content, and what it means I have to make a parenthesis to deal with something I'd almost call a myth surrounding Informix trusted connections. If you look around on the Internet you'll easy find a lot of information about how insecure is to use these files. In particular, you may found explicit instructions to not use the ~/.rhosts file.
The main reasons for these are two facts:
  1. These files are not specific for Informix, and in fact they were created to configure a group of services, sometimes called the "r" services. These services are rexec (allows remote execution of a command) on TCP port 512, rlogin (allows remote login) on TCP port 513 and rcmd (all remote shell execution) on TCP port 514.
  2. Many of these services are considered insecure and have a long history of security flaws. The most famous worm also took advantage of some of these security flaws, and also from very relaxed security setup on many of the networks connected to the Internet at the time.

Why did I referrer before the "myth"? Well, many customers have the idea that since IDS uses the same files for trusted connections authentication that you need to be running the above services for it to work. This is completely false. I cannot stress this enough. Even in the official documentation, the administrator guide, is written:

To determine whether a client is trusted, execute the following statement on the client computer: rlogin hostname If you log in successfully without receiving a password prompt, the client is a trusted computer.


Although this is not wrong, it misses the point, and gives the idea that the rlogin service should be running. The above sentence, saying that if rlogin works the user is trusted, is true. But you can create a trust for an Informix connection without having the service(s) running. In this case you'll get a "can't connect error" instead of the password prompt when you run rlogin. And it doesn't mean it won't work with Informix.
I strongly believe Informix should not use these same files for defining it's trust relationships. But the concern that this usage raises is not really an issue. If you're concerned about security, you probably will not use the "r" services. Most of the sites which take security seriously have replaced them with ssh or something similar. So the truth is that these files should be useless in a modern system. And in this scenario, the fact that they're used by Informix should not be considered too critical, since nothing else should depend on them.


I hope this contributes to the disappearance of the "myth". Now let's see how we have to change the network security files in order to create the trusts.
The contents of the files should be similar to how you would configure "r" services trusts. Let's see the two files, because there some slight but very important differences.
  • /etc/hosts.equiv
    Each line in this file defines a trust relation. You can specify a host only, or a host and a user. This file is a system file. No user other than system administrator should have permission to change it.
    If you specify only the host name, you're saying that any user from the remote host is trusted to connect as the same user in the local host. If you specify a "+" (plus sign) this means "any host" and you should never do that...
    There is a significant difference between how Informix and "r" services interpret the file if you specify a hostname followed by a user name. For "r" services, this means that you trust the remote user on the remote system to connect as ANY user in the local system (except root). This is obviously a very serious issue... And by the way, "+ +" means any user from anywhere can login locally as any user! (don't try this even at home :) )
    Informix on the other hand interprets this as a way to specify user by user and not all the users. The remote user cannot connect as another local user without providing a username and password.
  • ~/.rhosts
    This file also includes a trust relation in each non comment line. This file, being in the user home directory, means that you allow the user on the local host to define who he trusts. Many systems verify that the file has no write permissions to the "world". In this case it ignores the settings.
    A line with a host and a username means that the specified username on the specified remote host can connect as the local user owning the .rhosts file. So, for "r" services, if you specify a host and user in /etc/hosts.equiv it means that user can connect as any local user. If you do the same in ~/.rhosts you're reducing the scope of local users that the remote user can authenticate as, to only the file owner
    For Informix, the interpretation is the same: The remote user can authenticate on the local machine, but without changing it's identity. It means that for example, if the local user is luser and it's ~/.rhosts contains "remotehost ruser", then this entry is useless because Informix doesn't have a way to specify the identity change in a connection attempt.

Some additional notes on these files:
  • You can specify negative entries. For example you could use the following in /etc/hosts.equiv:

    pacman -informix
    pacman

    This would allow all users from host "pacman" to connect except informix
  • An entry with a host name means "trust all users except root". A trust relation for root user has to be defined in it's .rhosts file.
  • The order of the entries is relevant. If we exchange the order of the entries in the example above, even informix will be allowed to connect. That's because a positive entry will be found first


Let's see some examples. I have two hosts (pacman and ids1150srvr). I'm running IDS on the ids1150srvr host. If I want to allow the user informix to connect to this instance from the pacman host I can do it in the following ways (these file entries are on the ids1150srvr host):

  • /etc/hosts.equiv:

    pacman
    pacman informix

    The first entry allows all users (except root) to connect from pacman. The second will allow only informix. But be aware that if you're running "r" services you're saying that informix on pacman can connect as any user (again, except root) on ids1150srvr.

  • ~informix/.rhosts

    pacman

Distributed query connections

We've seen a client/server trusted connection. But we have another situation where we must establish trusts between systems. That's the case when you need to join data from two Informix instances. The SQL syntax to do this will look like this:

SELECT
local_table.column, remote_table.column
FROM
table1 local_table, remote_database@remote_ids_instance:table2 remote_table
WHERE
local_table.join_column = remote_table.join_column;

Assuming we are connected to a "local_ids_instance" on ServerA, and that "remote_ids_instance" is running on ServerB, this query will establish a connection on "remote_ids_instance" from the "local_ids_instance". Informix will establish this connection on behalf of our user on "local_ids_instance". As such, our user must be trusted on ServerB when connecting from ServerA. Note that the trust is not done between our client and the ServerB. It's between ServerA and ServerB. If the "remote_ids_instance" is accessed through a normal Informix port, the authentication mechanism used is the same as for trusted connections.

The situation will be different if the "remote_ids_instance" port is configured with PAM. In this situation the trust configuration is done entirely through SQL instructions. IDS versions that support PAM also have a new system database. It's called sysuser. In it we have a table called sysauth with the following schema:


{ TABLE "informix".sysauth row size = 322 number of columns = 4 index size = 295 }
create table "informix".sysauth
(
username char(32) not null ,
groupname char(32),
servers varchar(128) not null ,
hosts varchar(128) not null ,
check (groupname IS NULL )
);
revoke all on "informix".sysauth from "public" as "informix";


create unique index "informix".sysauth_idx on "informix".sysauth (username,servers,hosts) using btree ;



This table is used to define trusts for distributed queries when the destination Informix instance is configured for PAM authentication.
We must specify the username, the hostname and the IDS server from where the connection is made. Note that this not only avoids the usage of the network security files but also provides more control. With the network security files we cannot specify the Informix instance originating the connection.

So, picking up the query above, and assuming the username is "fnunes" we would need the following configuration on ServerB for it to work:
  • Using /etc/hosts.equiv one of the following lines:

    ServerA
    # this would create a trust for all users from ServerA
    ServerA fnunes
    # This would restrict the trust in Informix for user fnunes.
    # But for "r" services it would allow fnunes to login as any user except root on ServerB

  • Using ~fnunes/.rhosts
    ServerA

  • Using the sysuser:sysauth table (if the port was configured for PAM):
    INSERT INTO sysauth ( username, groupname, servers, hosts) VALUES ('fnunes', NULL, 'local_ids_instance', 'ServerA');
Additional notes about sysauth:
  • Although the field names "servers" and "hosts" suggest you could use a list of servers or hosts the documentation says we should create one entry per user/server/host
  • The field "groupname" is not used currently as expected from the check constraint




Additional notes

There are a few less none configuration details about connections in Informix. Let's check them.
  • There are some options that we can specify in the sqlhosts file that control the way IDS deals with trusted connection attempts (for the specific ALIAS of the sqlhosts line).The options I'm referring to are "s=X". "s" stands for security on the server side. The following values are supported:
    • s=0
      Disables both hosts.equiv and rhosts lookup from the database server side (only incoming connections with passwords are accepted).
    • s=1
      Enables only the hosts.equiv lookup from the database server side.
    • s=2
      Enables only the rhosts lookup from the database server side.
    • s=3
      Enables both hosts.equiv and rhosts lookup on the database server side (default setting for the database server side).
    • s = 6
      Marks the port/ALIAS as a replication only (HDR or ER) port
      If you use the option s=6 on sqlhosts options field, than you can create a file called hosts.equiv in $INFORMIXDIR/etc, owned by informix:informix and only modifiable by user informix. This makes the informix user independent of the system administrator for configuring the trusts, and additionally doesn't interfere with "r" services if they're in use
  • There is another security option that defines if the client uses or not the .netrc file. I haven't yet mentioned the .netrc file... You can use this file so specify a user and password, per host that you want to connect. If you setup the file and try to make a trusted connection you'll instead make a non-trusted connection. An example of this file content would be:

    ids1150srvr login fnunes password mysecret

    This would connect you as user "fnunes" using password "mysecret" if you attempt to make a trusted connection to an Informix instance running on host "ids1150srvr". You can have several lines, one for each host you connect to.
    There are obvious security issues derived from having a password in a file...
    The option that specifies if the client libraries will look and use this file is the "r=X" settings on the sqlhosts options (client side):
    • r=0 it ignores the .netrc file
    • r=1 it looks at the .netrc file when a client program tries to make a connection without using username and password

Windows specific information

In MS Windows, the network security files are used in a similar way, but they're in different locations. /etc/hosts.equiv becomes %SYSTEMROOT%\system32\drivers\etc\hosts.equiv.
The .rhosts and .netrc should exist in the users HOME

Common error messages

The following is not an exhaustive list of connection errors, but a small list of the more common ones with a very brief comment. For more information about each one consult the documentation or use "finderr errorcode":
  • -908 Attempt to connect to database server (servername) failed.
    Usually means some problem on the TCP level. Either your server is not running or you have your client SQLHOSTS points to the wrong server or port
  • -930 Cannot connect to database server servername.
    Typically the client cannot obtain the server IP address. Check your DNS configuration.
  • -951 Incorrect password or user user@host is not known on the database
    You can get this error on the server side (online.log) and on the client side.
    On the server side it means the user does not exist.
    On the client side you'll see these error in three situations.. The same error is sent to the client side on different situations in case someone is trying to guess a user or password. The database system administrator can see the real problem. The situations where the error is sent to the client are:
    • The user does not exists on the database server
    • The user's password cannot be validated, although the user exists
    • The user exists, but you're trying to make a trusted connection and the trust is not setup correctly
  • -952 User ()'s password is not correct for the database server.
    This is the error given on the server side when the user password did not validate successfully
  • -956 Client client-name or user is not trusted by the database server.
    Server side error when a trusted connection fails because it's not properly setup
  • -1809 Server rejected the connection.
    This is the only error you'll see when a connection to a PAM configured server happens. In order to understand what went wrong you have to check the PAM stack modules messages
  • -25596 The INFORMIXSERVER value is not listed in the sqlhosts file or the Reg
    The INFORMIXSERVER you're trying to connect is not properly defined in you configured SQLHOSTS
Summary

I really hope this article covers most practical aspects of connection establishment in Informix. There are several aspects I left out for simplicity. These include:

  • Single sign-on (introduced in IDS 11.50)
  • Password encryption
  • SSL connections
These topics may be covered in future articles. You can find more information about Informix connection setup in the Administrator Guide and the Security Guide.

Thursday, April 16, 2009

Returning null....

Once in a while I'm asked on how to return NULL in a SELECT. This can be a strange request, and usually relates to two also strange situations:

  1. UNLOAD TO file.unl SELECT "" FROM table
    This will create the unload file with "|\ |" (note that there's a space following the backslash)
  2. SELECT NULL FROM table
    This simple SQL instruction raises a 201 (syntax error) exception
Let's explain the first issue. Unload files (typically pipe separated) generated by the UNLOAD instruction were created as a way to export/import data into an Informix database table. This means Informix has to understand, by looking at what's in the middle of two consecutive pipes ("|"), what was the original value. Given this: "||" (pipe followed immediately by pipe), and assuming we're dealing with a VARCHAR value, what would be the original value? I could give you two options:
  1. NULL
  2. "" (empty string)
Having two options is not really an option. When we LOAD the file, we want to be sure that the inserted value is exactly the original value. Given the two options what would we INSERT? A NULL value or an empty string? This is why we need to identify the NULL differently. And this is done by using "|\ |" (pipe, backslash, space, pipe).
This is fine for Informix, but people tend to use UNLOAD files to move data from Informix into other applications. And other applications don't usually like this notation. So how can we create an unload file with a "NULL" column, meaning "||" (pipe, pipe). The first obvious answer leads to problem 2):

Why does Informix raise an error if we use NULL in the projection clause (SELECT list)? It's seems an error. Specially if you're used to other databases where this works.
Well, the answer is simple. Informix is very strict with the concept of NULL. NULL is the absence of value. So, if we're instructing the SELECT to return an absence of value it raises an unsolvable problem: What is the data type of an absence of value?! Of course Informix could assume some data type, but again, the engine is very strict. So it won't take decisions of this kind for you and consequently the use of NULL in this context is not acceptable (and it raises a syntax error) So, what is the solution? Simple:

SELECT NULL::CHAR FROM table

or
SELECT CAST(NULL AS CHAR) FROM table


This simply defines the data type of the result set to return to the client.

Another option would be to create a stored procedure which return a CHAR and inside return the NULL value. Something like:

CREATE PROCEDURE ret_null() RETURNING CHAR;
RETURN NULL;
END PROCEDURE;

SELECT ret_null() FROM table


But this is more complex and less efficient.

There are much more things to say about NULL, but I'll leave that for future articles.

Sunday, April 05, 2009

Informix in virtualized environments

I recall that when I was around my 16 to 19 years old I was completely amazed by the possibility of running a different operating system inside a window on my system. At the time I was using a Commodore Amiga, and I had software to emulate Atari, Apple and MS-Dos systems. The first two used the same CPU as my native system, and the later was a complete emulation off an Intel x86 CPU. Because of this, performance was really awful, but nevertheless it was very interesting to use.

At that time we called that emulation. And the purpose was a bit different than what we currently call virtualization. The similarity lies in the fact that in both situations we create a virtual hardware environment in which we run a operating system and applications. Today, virtualization is a widespread technology, used in high-end systems as well as in plain simple laptops. Some examples of virtualization technologies and uses include:

  • IBM's system Z (mainframes)
    These systems have virtualization technology for ages. We can run different operating systems on "partitions" which are groups of resources (CPU, memory, storage) allocated from the base machine. These OS include Linux for example
  • IBM's system P (Power processors)
    It incorporates some of the System Z concepts. The partitions can be "physical" and "logical". Can have a fixed or dynamic resources capacity. Can run AIX and Linux on same base equipment in different "partitions"
  • SUN's Solaris Domains and containers
    On SUN's boxes you can create different partitions running different copies of your operating system, or create "containers" which are logic groups of resources which share the same copy of the operating system. IBM provides the Workload Manager for AIX for this.
  • HP-UX npars, vpars, Integrity VM and Secure Resource Partitions
    HP provides physical partitions, virtual partitions, virtual machines and also virtual resources environments sharing the same copy of the operating system
  • VMware
    Probably the most well known virtualization technology. It can run on our desktop systems (Windows and Linux) or be directly installed on the base hardware.
  • XEN
    An open source virtualization technology. It is used by several other environments like Amazon EC2 (more on this later)
  • SUN's VirtualBox
    It's another x86 virtualization product which runs on Windows, Linux, Mac OSX and OpenSolaris
For performance reasons, usually, the virtualization technologies just create virtual machines of the same architecture as the base system. This means the CPU type is generally the same. Emulating other kinds of CPUs, although technically possible, imposes a serious performance overhead. Also, current CPU technologies include support for virtualization directly on the chips. It's perfectly possible to do it without hardware support, but it's slower. The main issue is that any machine code instruction that tries to access the hardware directly has to be intercepted. If the virtualization system (hypervisor) didn't do it, you'd have conflicts between the different virtual machines running on the same host.

So... Why would we want to virtualize? Well, several reasons for several uses:
  • Many hardware resources are used below it's capacities. Virtualization allows the sharing of the same resources (CPU, memory, network and storage) for usage in different (and isolated) machines. This leads to cost optimization
  • It's much easier to create a virtual machine on top of an existing hardware box, than to physically purchase, connect, install and manage a real machine
  • Due to the two reasons above, a virtual machine can be a great environment to support several activities like testing, learning and training, developing, demoing etc.
  • It's relatively easy to "shutdown" a virtual machine on one host, and "turn it on" on another hardware box. Latest versions of virtualization products sometimes even support "live" migration of virtual machines between different hardware boxes. This can become a real advantage in terms of system availability (without extra cost, like clustering, redundancy etc.)
  • It's possible to dynamically balance the physical resources (CPU capacity, disk and even memory) of the physical host between the virtual environments it supports. This means that different virtual machines with distinct usage cycles can co-exist on the same hardware box, and you can configure the resources to move between the virtualized hosts whenever their needs change
Ok. The above can give you an overview of the virtualization technologies and why you would want to use them. Now let's dig into the Informix related stuff. The first questions would be: Should you use Informix in virtualized environments? Does it work well? Does IBM support it? Does IBM provide flexible pricing to match the flexibility in these environments?

Well, the answer to all these questions could be a simple "yes". Let's see:

  • Informix architecture, usually referred to as Dynamic Scalable Architecture (DSA) is a perfet fit for virtualized environments. Informix implements the concept of virtual CPU in a operating system process. These CPUs then run user and system threads. This explains why it's so light. These virtual CPUs (CPU VP in informix jargon), can be added and removed dynamically. So, from the begining of IDS (when DSA was introduced) you can effectively dynamically adjust the CPU resources of your intance. Memory can also grow, and shrink. But I have to grant that it would be nice to see some improvments here. In practice it's very difficult to be able to shrink the memory once it grew.
    But the small footprint (both of installation and running resources) and dynamic resource adjustment are nice features for virtualized systems.
  • Regarding support, you can be confident that IDS is supported in these environments. There are obvious questions regarding performance issues, but you will not get the dreadful answer of "your setup is not supported" in case you need help from tech support.
  • Finally, IBM pricing is well aware of the virtualization needs (assuming a CPU based license policy). You will only pay for the resources you attribute to your virtual host. Accordingly to a recent announcement your license fees will depend on your virtual hosts environment and not the underlying hardware (which is usually much bigger, as as such would be more expensive).
    IBM calls this license scheme for virtualized environments "sub-capacity", alluding to the fact that you're running a virtual host with less capacity than the base hardware.
    If you want to license for concurrent session, than this is just like in any other (non virtualized) environment
Virtual appliance with IDS developer edition

IBM announced some time ago the availability of an IDS Developer Edition based virtual appliance. This is pre-installed and pre-configured VMWare image, running SUSE Linux Enterprise Server V11 and IDS 11.50. Everything is configured so you can easily deploy it and use it for testing, learning or developing purposes. Scripts are provided to create a full MACH-11 cluster and intructions are included to lead you through some demos. You just need a free product from VMWare to run it on your laptop. The appliance is available in 32 and 64 bit versions. You can access this virtual image in two ways:
When you first run the appliance, you'll go through some screens that allow you to make some configurations and also will prompt you for license acceptance. This process is fairly simple and will only take a couple of minutes. After that you'll see a normal Linux desktop with some shortcuts that will allow you to explore the power and simplicity of IDS.
This appliance is being constantly improved and updated by IBM. Current IDS version is 11.50.xC3, but you should expect 11.50.xC4 when available. I strongly recommend this appliance to anyone who wants to get familiar with Informix.


Amazon EC2 cloud


Cloud computing has become another buzz word of the IT industry. Large companies have large computing infra-structures. You can imagine that companies like IBM, SUN, Microsoft, Google, Yahoo, Amazon and so, on have large datacenters spread around the world. Like any other computer in the world, these datacenters are not always using it's full capacity. So, more, and more companies are trying to take advantage of some of their computing power, by making it available to customers as services. This resources are "somewhere" on the Internet. That's why the term "cloud" is used. Customers only have to know how to use these resources. They don't need to know how they're implemented or where they are located. You as a customer, pay a certain fee to use a determined amount of computing resources.
Amazon was one of the first companies to sell cloud computing type of services.. It started around 2006 selling an infra-structure where customers could implement web services. Later it introduced the EC2 (Elastic Computing) concept. The idea here is to rent virtual machines (Linux or Windows) to anybody who needs them. And you pay only what you use at the rate of $0.1 / hour for what Amazon calls a "small instance". This is "equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor" and has 1.7GB of RAM. Not a big server, but perfectly enough for some tests or studying. You can also rent bigger instances, and you can rent several of them.
So, the term "elastic" means you can rent the resources you need and grow them as your needs grow. And you won't have to pay for physical allocation and equipment.

Now, why am I talking about this? Simply because IBM made the same virtual appliance I wrote about above, available as an AMI (Amazon Machine Image). This means you can rent an Amazon instance running IDS 11.50.UC3 (32 bit only for now) on top of SUSE Enterprise Linux.
To be honest, I was a bit lost with all these concepts, so I decided to test this myself. I've followed the following steps:
  1. I went to Guy Bowerman's blog to search for info
  2. I got hold of the IBM Informix Server Amazon Machine Image (AMI) Get Starting Guide
  3. I went to http://aws.amazon.com/ec2/ and sign up. After login you'll have the access keys and an X.509 certificate (private and public key). These are used to identify you when calling Amazon web services (which implement the Amazon managing API). So you should download them into your local system (as explained in the Getting Started Guide)
  4. The next step is to "buy" the AMI of the IDS Developer Edition. I put "buy" between quotes, because although you have to put on a purchase order, in reality you will not have to pay any licensing fees. You'll just pay the use of it, at the standard Amazon small instance rate of $0.1/hour. This step and the URLs are perfectly documented in the guide
  5. The next step involves downloading and setting up an installation of the Amazon EC2 API (command line) tools. These are implemented in Java, which means two things: You'll need a Java (JRE) environment on you system, and you can run them on Windows, Unix and Linux. During the setup process it is suggested that you create another key pair that will be used to authenticate your logons to the instance.
  6. Then, instructions are provided in order to launch an Amazon instance based on the IDS Developer AMI that you "purchased" earlier. Detailed instructions are included so that you can access the running instance using an SSH connection. Remember that the authentication will be done through a pair of keys you generated a few steps ago.
  7. After you login to the instance you'll get through a similar process that the IDS virtual appliance also provides. Besides the common licencing acceptance, in this environment you'll also be prompted for:
    1. The keypair you generated (it's suggested that you copy the files and just point to them)
    2. The user's passwords (root, informix and developer)
    3. The configuration of a persistent storage.
      I should have wrote about this earlier... The AMI instances are volatile. This means that once they're stoped all their "local" storage is gone. So, you should allocate a permanent storage from Amazon EBS service (extra charge of around $0.1/GB/Month). This storage volume can be mounted in /data by the IDS Developer instance. I'll get back to this topic below.

So, after this steps I got a SUSE Enteprise Linux, running IDS Developer Edition, with a MACH 11 cluster already configured, running somewhere, in the Amazon Cloud, available for me (and anyone I want) to connect to. How much did it cost? Around $0.35, including an EBS storage volume.
Please note that IBM didn't just made an IDS Developer AMI available. IBM also established a policy for licensing Informix (and other IBM software) on the Amazon Cloud Computing platform. The relevant announcements are here; http://www-03.ibm.com/press/us/en/pressrelease/26673.wss and here; http://www-01.ibm.com/software/lotus/passportadvantage/pvu_for_Amazon_Elastic_compute_cloud.html (Processor Value Units - PVUs - for Amazon EC2 )

So, isn't this a perfect way to test software, or to create temporary machines for propotype developing, or for the purpose of distance teaching etc. ? Yes... But I feel there's a small issue:
As stated above, you pay what you use. This means that you pay for as long as your instances are running. Obviously, for saving money, you'll want to stop them when they're not used. But the instances are volatile. Meaning that it's not exactly like a VMWare image. When you restart them you'll get the AMI initial image, and not the machine's state when you shut it down. That's why Amazon provides the EBS volumes. These are permanent, non-volatile storage volumes. As mentioned in the getting started guide, you should keep you database files in these volumes. But even so, if you restart the instance, you'll have to go through the setup screens again. This is not convenient. But there is a simple solution for this: Private AMIs.

When you're running an instance, you can decide to make an AMI from it. The process is called "bundle" it. You can get the details on how to do it here: http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/bundling-an-ami.html After you create a bundle from a running instance, you can upload it. This will make a new AMI available for you. It's called a private AMI. You can also make it available to the public.
After this you can launch an instance from your own AMI. So theoritically you could customize the IDS Developer AMI, bundle it, upload it as a private AMI and use it to launch your customized instances. You'd have to check the licenses though...

So, in short, in which scenarios could we use Amazon EC2, and more specifically the IDS Developer AMI?
  • You need some machine for a team of developers to work on a new project during a short period of time
    It's easy to setup and use. And you'll know how much it will cost you. And you don't have to depend on your own resources
  • You need to make a customer demo for an application you developed. You just install it, and use it at your customer site. Better yet, your customer can make it's own testing even after you leave
  • You want to provide some application training remotely (or long distance). Again, just install it, give the access details to your students, and there you go...
  • You want to learn about IDS and you don't want to install the virtual appliance locally (you don't have the necessary resources for running it)
  • And of course, you have a startup company, and you don't want to own your own datacenter. So you just rent it... In this scenario you would need payed IDS licensees of course....
Summary
In this long post I've gone through the following points:
  • Why IDS is a perfect match for virtualized environments
  • IBM Informix virtual appliance. A pre-configured VMWare image with IDS Developer Edition already installed. Everything ready for your experiments
  • IBM Developer Edition AMI (Amazon Machine Image). The machine image in Amazon EC2 format that IBM made available for use in Amazon EC2 environment
I haven't gone into details of the virtual appliance contents. But I recommend that if you're interested in IBM Informix Dynamic Server, you should really test it. It probably has everything you'll need to learn and test IDS.

Glossary
  • Amazon EC2
    A cloud computing environment run by Amazon
  • AMI
    Amazon Machine Image - A pre-built virtual machine that you can use to start an Amazon EC2 instance
  • Amazon EC2 instance
    A running virtual machine in the Amazon EC2 environment
  • Amazon S3
    Amazon's Simple Storage Service
    This is a non-volatile storage service provided by Amazon. It costs around $0.1/GB/Month
  • IDS Developer Edition
    A version of IBM Informix Dynamic Server, that you can use for application developing.
    It's freely available at and you can use it for learning, test and application developing. Please check the license for details
  • VMWare Appliances
    Pre-configured virtual images ready to run in one of the VWWare products (IDS Developer image is available)
References