Sunday, April 05, 2009

Informix in virtualized environments

I recall that when I was around 16 to 19 years old I was completely amazed by the possibility of running a different operating system inside a window on my system. At the time I was using a Commodore Amiga, and I had software to emulate Atari, Apple and MS-DOS systems. The first two used the same CPU as my native system, while the latter required a complete emulation of an Intel x86 CPU. Because of this, its performance was really awful, but nevertheless it was very interesting to use.

At that time we called this emulation, and its purpose was a bit different from what we currently call virtualization. The similarity lies in the fact that in both situations we create a virtual hardware environment in which we run an operating system and applications. Today, virtualization is a widespread technology, used in high-end systems as well as in plain laptops. Some examples of virtualization technologies and uses include:

  • IBM's System Z (mainframes)
    These systems have had virtualization technology for ages. We can run different operating systems on "partitions", which are groups of resources (CPU, memory, storage) allocated from the base machine. These operating systems include Linux, for example.
  • IBM's System P (Power processors)
    It incorporates some of the System Z concepts. The partitions can be "physical" or "logical", and can have a fixed or dynamic resource capacity. AIX and Linux can run on the same base equipment in different "partitions".
  • SUN's Solaris Domains and Containers
    On SUN's boxes you can create different partitions running different copies of your operating system, or create "containers", which are logical groups of resources that share the same copy of the operating system. IBM provides the Workload Manager for AIX for the same purpose.
  • HP-UX npars, vpars, Integrity VM and Secure Resource Partitions
    HP provides physical partitions, virtual partitions, virtual machines and also virtual resource environments sharing the same copy of the operating system.
  • VMware
    Probably the best-known virtualization technology. It can run on our desktop systems (Windows and Linux) or be installed directly on the base hardware.
  • XEN
    An open source virtualization technology. It is used by several other environments, like Amazon EC2 (more on this later).
  • SUN's VirtualBox
    Another x86 virtualization product, which runs on Windows, Linux, Mac OS X and OpenSolaris.
For performance reasons, virtualization technologies usually create virtual machines of the same architecture as the base system, which means the CPU type is generally the same. Emulating other kinds of CPUs, although technically possible, imposes a serious performance overhead. Current CPUs also include direct hardware support for virtualization. It's perfectly possible to virtualize without it, but it's slower. The main issue is that any machine code instruction that tries to access the hardware directly has to be intercepted. If the virtualization system (the hypervisor) didn't do this, you'd have conflicts between the different virtual machines running on the same host.
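
To make the interception idea more concrete, here is a toy sketch in Python. It is not how any real hypervisor is written: the instruction names (MOV, ADD, OUT, HLT) and the classes are invented for this illustration. It only shows the principle that ordinary instructions run directly, while instructions that touch the hardware are trapped and applied to a per-VM virtual device, so two guests never step on each other.

```python
from dataclasses import dataclass, field

# Hypothetical instruction names, chosen only for this illustration.
PRIVILEGED = {"OUT", "HLT"}


@dataclass
class VirtualMachine:
    name: str
    registers: dict = field(default_factory=dict)
    virtual_device: list = field(default_factory=list)  # per-VM "hardware"
    halted: bool = False


class Hypervisor:
    """Runs guest instructions and traps the privileged ones."""

    def __init__(self):
        self.trap_count = 0

    def run(self, vm, program):
        for op, *args in program:
            if vm.halted:
                break
            if op in PRIVILEGED:
                self._trap(vm, op, args)
            else:
                self._execute_directly(vm, op, args)

    def _execute_directly(self, vm, op, args):
        # Unprivileged instructions only touch the guest's own registers.
        if op == "MOV":
            reg, value = args
            vm.registers[reg] = value
        elif op == "ADD":
            reg, value = args
            vm.registers[reg] = vm.registers.get(reg, 0) + value
        else:
            raise ValueError(f"unknown instruction: {op}")

    def _trap(self, vm, op, args):
        # Privileged instructions are redirected to the VM's virtual device
        # instead of reaching the (shared) physical hardware.
        self.trap_count += 1
        if op == "OUT":
            (reg,) = args
            vm.virtual_device.append(vm.registers.get(reg, 0))
        elif op == "HLT":
            vm.halted = True


hv = Hypervisor()
vm_a = VirtualMachine("a")
vm_b = VirtualMachine("b")
hv.run(vm_a, [("MOV", "r0", 1), ("ADD", "r0", 2), ("OUT", "r0"), ("HLT",)])
hv.run(vm_b, [("MOV", "r0", 10), ("OUT", "r0")])
print(vm_a.virtual_device, vm_b.virtual_device, hv.trap_count)
```

Each guest ends up with its own output in its own virtual device, and every hardware access has gone through the hypervisor. Hardware support for virtualization essentially makes that trap step cheaper.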

So... Why would we want to virtualize? Well, several reasons for several uses:
  • Many hardware resources are used below their capacity. Virtualization allows the same resources (CPU, memory, network and storage) to be shared among different (and isolated) machines. This leads to cost optimization
  • It's much easier to create a virtual machine on top of an existing hardware box than to physically purchase, connect, install and manage a real machine
  • Due to the two reasons above, a virtual machine can be a great environment to support several activities like testing, learning and training, developing, demoing etc.
  • It's relatively easy to "shut down" a virtual machine on one host and "turn it on" on another hardware box. The latest versions of virtualization products sometimes even support "live" migration of virtual machines between different hardware boxes. This can become a real advantage in terms of system availability (without the extra cost of clustering, redundancy etc.)
  • It's possible to dynamically balance the physical resources (CPU capacity, disk and even memory) of the physical host between the virtual environments it supports. This means that different virtual machines with distinct usage cycles can co-exist on the same hardware box, and you can configure the resources to move between the virtualized hosts whenever their needs change
Ok. The above can give you an overview of the virtualization technologies and why you would want to use them. Now let's dig into the Informix related stuff. The first questions would be: Should you use Informix in virtualized environments? Does it work well? Does IBM support it? Does IBM provide flexible pricing to match the flexibility in these environments?

Well, the answer to all these questions could be a simple "yes". Let's see:

  • The Informix architecture, usually referred to as Dynamic Scalable Architecture (DSA), is a perfect fit for virtualized environments. Informix implements the concept of a virtual CPU as an operating system process. These virtual CPUs then run user and system threads, which explains why it's so light. Virtual CPUs (CPU VPs in Informix jargon) can be added and removed dynamically. So, from the beginning of IDS (when DSA was introduced) you have been able to dynamically adjust the CPU resources of your instance. Memory can also grow and shrink, but I have to grant that it would be nice to see some improvements here. In practice it's very difficult to shrink the memory once it has grown.
    But the small footprint (both of installation and running resources) and dynamic resource adjustment are nice features for virtualized systems.
  • Regarding support, you can be confident that IDS is supported in these environments. There are obvious questions regarding performance issues, but you will not get the dreadful answer of "your setup is not supported" in case you need help from tech support.
  • Finally, IBM pricing is well aware of virtualization needs (assuming a CPU based license policy). You will only pay for the resources you assign to your virtual host. According to a recent announcement, your license fees will depend on your virtual host's environment and not on the underlying hardware (which is usually much bigger, and as such would be more expensive).
    IBM calls this license scheme for virtualized environments "sub-capacity", alluding to the fact that you're running a virtual host with less capacity than the base hardware.
    If you want to license by concurrent sessions, then this is just like in any other (non-virtualized) environment
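
As an illustration of the dynamic resource adjustment mentioned in the first point above, the sketch below builds the onmode calls I would expect to use: -p to add or drop virtual processors of a class, -a to add a shared memory segment, and -F to free unused segments. The helper names are mine, and the script only prints the commands instead of running them. Check the onmode documentation for your IDS version before relying on any of this.

```python
import shlex


def cpu_vp_command(delta):
    """Build the onmode call that adds (delta > 0) or drops (delta < 0) CPU VPs."""
    if delta == 0:
        raise ValueError("delta must be non-zero")
    # The explicit sign is part of the argument, e.g. "+2" or "-1".
    return ["onmode", "-p", f"{delta:+d}", "cpu"]


def add_memory_command(kilobytes):
    """Build the onmode call that adds a shared memory segment of the given size."""
    if kilobytes <= 0:
        raise ValueError("kilobytes must be positive")
    return ["onmode", "-a", str(kilobytes)]


def free_memory_command():
    """Build the onmode call that asks the engine to free unused memory segments."""
    return ["onmode", "-F"]


# Print the commands rather than executing them (no instance is assumed here).
for command in (cpu_vp_command(2), cpu_vp_command(-1),
                add_memory_command(65536), free_memory_command()):
    print(shlex.join(command))
```

On a virtual host whose CPU allocation has just been increased, the first command would be the way to let the instance use the extra capacity without a restart.
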
Virtual appliance with IDS developer edition

IBM announced some time ago the availability of a virtual appliance based on IDS Developer Edition. This is a pre-installed and pre-configured VMware image, running SUSE Linux Enterprise Server V11 and IDS 11.50. Everything is configured so you can easily deploy it and use it for testing, learning or development purposes. Scripts are provided to create a full MACH-11 cluster, and instructions are included to lead you through some demos. You just need a free product from VMware to run it on your laptop. The appliance is available in 32 and 64 bit versions. You can access this virtual image in two ways:
When you first run the appliance, you'll go through some screens that allow you to make some configurations and also will prompt you for license acceptance. This process is fairly simple and will only take a couple of minutes. After that you'll see a normal Linux desktop with some shortcuts that will allow you to explore the power and simplicity of IDS.
This appliance is being constantly improved and updated by IBM. Current IDS version is 11.50.xC3, but you should expect 11.50.xC4 when available. I strongly recommend this appliance to anyone who wants to get familiar with Informix.


Amazon EC2 cloud


Cloud computing has become another buzzword of the IT industry. Large companies have large computing infrastructures. You can imagine that companies like IBM, SUN, Microsoft, Google, Yahoo, Amazon and so on have large datacenters spread around the world. Like any other computer in the world, these datacenters are not always using their full capacity. So, more and more companies are trying to take advantage of some of their computing power by making it available to customers as services. These resources are "somewhere" on the Internet. That's why the term "cloud" is used. Customers only have to know how to use these resources. They don't need to know how they're implemented or where they are located. You, as a customer, pay a certain fee to use a given amount of computing resources.
Amazon was one of the first companies to sell cloud computing services. It started around 2006, selling an infrastructure where customers could implement web services. Later it introduced the EC2 (Elastic Compute Cloud) concept. The idea here is to rent virtual machines (Linux or Windows) to anybody who needs them. And you pay only for what you use, at the rate of $0.10/hour for what Amazon calls a "small instance". This is the "equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor" and has 1.7GB of RAM. Not a big server, but perfectly adequate for some tests or studying. You can also rent bigger instances, and you can rent several of them.
So, the term "elastic" means you can rent the resources you need and grow them as your needs grow. And you won't have to pay for physical allocation and equipment.

Now, why am I talking about this? Simply because IBM made the same virtual appliance I wrote about above, available as an AMI (Amazon Machine Image). This means you can rent an Amazon instance running IDS 11.50.UC3 (32 bit only for now) on top of SUSE Enterprise Linux.
To be honest, I was a bit lost with all these concepts, so I decided to test this myself. I took the following steps:
  1. I went to Guy Bowerman's blog to search for info
  2. I got hold of the IBM Informix Server Amazon Machine Image (AMI) Getting Started Guide
  3. I went to http://aws.amazon.com/ec2/ and signed up. After logging in you'll have the access keys and an X.509 certificate (private and public key). These are used to identify you when calling Amazon web services (which implement the Amazon management API). So you should download them to your local system (as explained in the Getting Started Guide)
  4. The next step is to "buy" the AMI of the IDS Developer Edition. I put "buy" between quotes because, although you have to place a purchase order, in reality you will not have to pay any licensing fees. You'll just pay for the use of it, at the standard Amazon small instance rate of $0.10/hour. This step and the URLs are perfectly documented in the guide
  5. The next step involves downloading and setting up an installation of the Amazon EC2 API (command line) tools. These are implemented in Java, which means two things: you'll need a Java (JRE) environment on your system, and you can run them on Windows, Unix and Linux. During the setup process it is suggested that you create another key pair that will be used to authenticate your logons to the instance.
  6. Then, instructions are provided in order to launch an Amazon instance based on the IDS Developer AMI that you "purchased" earlier. Detailed instructions are included so that you can access the running instance using an SSH connection. Remember that the authentication will be done through a pair of keys you generated a few steps ago.
  7. After you log in to the instance you'll go through a setup process similar to the one in the IDS virtual appliance. Besides the usual license acceptance, in this environment you'll also be prompted for:
    1. The keypair you generated (it's suggested that you copy the files and just point to them)
    2. The user's passwords (root, informix and developer)
    3. The configuration of a persistent storage.
      I should have written about this earlier... The AMI instances are volatile. This means that once they're stopped, all their "local" storage is gone. So, you should allocate permanent storage from the Amazon EBS service (extra charge of around $0.10/GB/month). This storage volume can be mounted in /data by the IDS Developer instance. I'll get back to this topic below.
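
The launch part of the steps above can be sketched as a short command sequence for the EC2 API tools. This is only an outline under assumptions: the AMI id, key pair name, key file and host name below are placeholders I made up, and the Getting Started Guide remains the authoritative source for the exact commands and values. The script just prints the commands so you can compare them with the guide.

```python
import shlex

# Placeholders for illustration only; take the real values from the guide
# and from the output of the EC2 tools.
AMI_ID = "ami-XXXXXXXX"
KEYPAIR = "ids-demo-keypair"
KEY_FILE = "id_ids-demo-keypair"
HOST = "ec2-host.example.com"


def launch_commands(ami_id, keypair, key_file, host, user="root"):
    """Return the command lines for creating a key pair, launching and logging in."""
    return [
        # Creates the key pair; the private key it prints must be saved to key_file.
        ["ec2-add-keypair", keypair],
        # Starts an instance from the AMI, associated with that key pair.
        ["ec2-run-instances", ami_id, "-k", keypair],
        # Shows the instance state and, once running, its public host name.
        ["ec2-describe-instances"],
        # Logs in with the private key saved in the first step.
        ["ssh", "-i", key_file, f"{user}@{host}"],
    ]


for command in launch_commands(AMI_ID, KEYPAIR, KEY_FILE, HOST):
    print(shlex.join(command))
```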

So, after these steps I got a SUSE Enterprise Linux machine, running IDS Developer Edition, with a MACH-11 cluster already configured, running somewhere in the Amazon cloud, available for me (and anyone I want) to connect to. How much did it cost? Around $0.35, including an EBS storage volume.
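
To get a feeling for the numbers, here is a small cost estimate sketch using the two rates mentioned in this post ($0.10/hour for a small instance and around $0.10/GB/month for EBS). The function name is mine, and I'm assuming that a partial instance hour is billed as a full hour. It ignores other charges, such as data transfer, so treat it as a rough estimate only.

```python
import math

# Rates quoted in this post, expressed in cents to keep the arithmetic exact.
INSTANCE_CENTS_PER_HOUR = 10   # small instance, $0.10/hour
EBS_CENTS_PER_GB_MONTH = 10    # EBS volume, around $0.10/GB/month


def estimate_cost_dollars(hours_running, ebs_gb=0, ebs_months=0.0):
    """Rough cost of one small instance plus an optional EBS volume."""
    # Assumption: any started hour is billed as a full hour.
    billed_hours = math.ceil(hours_running)
    instance_cents = billed_hours * INSTANCE_CENTS_PER_HOUR
    ebs_cents = ebs_gb * ebs_months * EBS_CENTS_PER_GB_MONTH
    return round((instance_cents + ebs_cents) / 100, 2)


# A two and a half hour test session with a 1 GB volume kept for a month.
print(estimate_cost_dollars(2.5, ebs_gb=1, ebs_months=1))
# The same instance left running for a 30 day month, without EBS.
print(estimate_cost_dollars(24 * 30))
```

The second call shows why you'll want to stop instances you're not using: the hourly rate looks tiny, but it adds up when the instance runs around the clock.
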
Please note that IBM didn't just make an IDS Developer AMI available. IBM also established a policy for licensing Informix (and other IBM software) on the Amazon cloud computing platform. The relevant announcements are here: http://www-03.ibm.com/press/us/en/pressrelease/26673.wss and here: http://www-01.ibm.com/software/lotus/passportadvantage/pvu_for_Amazon_Elastic_compute_cloud.html (Processor Value Units - PVUs - for Amazon EC2)

So, isn't this a perfect way to test software, to create temporary machines for prototype development, or to teach at a distance? Yes... But I feel there's a small issue:
As stated above, you pay for what you use. This means that you pay for as long as your instances are running. Obviously, to save money, you'll want to stop them when they're not being used. But the instances are volatile, meaning that it's not exactly like a VMware image. When you restart them you'll get the initial AMI image, and not the machine's state when you shut it down. That's why Amazon provides the EBS volumes. These are permanent, non-volatile storage volumes. As mentioned in the Getting Started Guide, you should keep your database files in these volumes. But even so, if you restart the instance, you'll have to go through the setup screens again. This is not convenient. But there is a simple solution for this: private AMIs.

When you're running an instance, you can decide to make an AMI from it. The process is called "bundling". You can get the details on how to do it here: http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/bundling-an-ami.html After you create a bundle from a running instance, you can upload it. This will make a new AMI available to you. It's called a private AMI. You can also make it available to the public.
After this you can launch an instance from your own AMI. So theoretically you could customize the IDS Developer AMI, bundle it, upload it as a private AMI and use it to launch your customized instances. You'd have to check the licenses though...
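
The bundle, upload and register sequence could look roughly like the sketch below. I didn't go through this process myself in this post, so take it as an outline based on the Amazon AMI and API tools, with every file name, bucket name and credential being a placeholder. As far as I understand it, the first two commands run inside the instance and the last one on your local system. The Amazon documentation linked above is the reference.

```python
import shlex


def bundle_commands(private_key, cert, account_id, bucket,
                    access_key, secret_key, dest_dir="/mnt"):
    """Return the command lines to bundle, upload and register a private AMI."""
    # The bundle tool writes <prefix>.manifest.xml; "image" is the default prefix.
    manifest_name = "image.manifest.xml"
    return [
        # Bundle the running instance's root volume into dest_dir.
        ["ec2-bundle-vol", "-d", dest_dir, "-k", private_key,
         "-c", cert, "-u", account_id],
        # Upload the bundle parts and the manifest to the storage bucket.
        ["ec2-upload-bundle", "-b", bucket, "-m", f"{dest_dir}/{manifest_name}",
         "-a", access_key, "-s", secret_key],
        # Register the uploaded manifest; this returns the new AMI id.
        ["ec2-register", f"{bucket}/{manifest_name}"],
    ]


# All values below are placeholders for illustration.
for command in bundle_commands("pk-EXAMPLE.pem", "cert-EXAMPLE.pem", "111122223333",
                               "my-ids-ami-bucket", "ACCESS_KEY", "SECRET_KEY"):
    print(shlex.join(command))
```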

So, in short, in which scenarios could we use Amazon EC2, and more specifically the IDS Developer AMI?
  • You need some machine for a team of developers to work on a new project during a short period of time
    It's easy to setup and use. And you'll know how much it will cost you. And you don't have to depend on your own resources
  • You need to make a customer demo for an application you developed. You just install it, and use it at your customer's site. Better yet, your customer can do their own testing even after you leave
  • You want to provide some application training remotely (or long distance). Again, just install it, give the access details to your students, and there you go...
  • You want to learn about IDS and you don't want to install the virtual appliance locally (you don't have the necessary resources for running it)
  • And of course, you have a startup company, and you don't want to own your own datacenter. So you just rent it... In this scenario you would need paid IDS licenses, of course...
Summary
In this long post I've gone through the following points:
  • Why IDS is a perfect match for virtualized environments
  • IBM Informix virtual appliance. A pre-configured VMWare image with IDS Developer Edition already installed. Everything ready for your experiments
  • IBM Developer Edition AMI (Amazon Machine Image). The machine image in Amazon EC2 format that IBM made available for use in Amazon EC2 environment
I haven't gone into details of the virtual appliance contents. But I recommend that if you're interested in IBM Informix Dynamic Server, you should really test it. It probably has everything you'll need to learn and test IDS.

Glossary
  • Amazon EC2
    A cloud computing environment run by Amazon
  • AMI
    Amazon Machine Image - A pre-built virtual machine that you can use to start an Amazon EC2 instance
  • Amazon EC2 instance
    A running virtual machine in the Amazon EC2 environment
  • Amazon S3
    Amazon's Simple Storage Service
    This is a non-volatile storage service provided by Amazon. It costs around $0.1/GB/Month
  • IDS Developer Edition
    A version of IBM Informix Dynamic Server, that you can use for application developing.
    It's freely available, and you can use it for learning, testing and application development. Please check the license for details
  • VMWare Appliances
    Pre-configured virtual images ready to run in one of the VMware products (an IDS Developer image is available)

1 comment:

Ernesto Pineros said...

Hello, I'm planning to virtualize 3 SUN SPARC servers running Informix in a VMware environment. I did a capacity plan and all 3 servers can be consolidated. What I need is a best practices document for virtualizing Informix with vSphere.

Thanks in advance