Monday, May 11, 2009

Isolation level in WebSphere

This is probably the first post I'll write following a customer facing situation. Although there may be good reasons to don't write about some customer facing situation, I feel this may bring some value to the blog, and we never know when someone having the same problem finds the blog on Google... So this may be the first of several small and direct posts.

Some time ago I had to go into a customer site who was having "major performance issues". After some examination and some talks with the development team I was able to identify several sessions running in Repeatable Read isolation level. The application has several components and one of them is an instance of WebSphere Application Server (WAS) v6.1.

Further investigation allowed us to understand that the application was not using EJBs, nor Session Beans (these allow the isolation level to be specified in the deployment descriptor XML file). As such the database connections were using the WAS default isolation level which is repeatable read.
For those less familiar with repeatable read it's equivalent to ANSI Serializable mode. Basically any record read in order to find the result set is locked and remains locked in shared mode for the duration of the transaction. So an instance which was configured for 20000 locks could reach about one million of them (thanks to the lock table automatic expansion). Obviously this caused a lot of contention between sessions and a lot more internal work for the engine. This was causing their "performance" problems.

The solution was simply to redefine the WAS isolation level for the datasource used by the application. This can be done by using a custom property called webSphereDefaultIsolationLevel which as the name implies can be used to change the database connection default isolation level. Complete information about that can be found in the following documentation:

http://www-01.ibm.com/support/docview.wss?rs=180&uid=swg21224492

In there you can find the property description with the explanation of the values it accepts as well as other ways to change the default isolation level.
After changing this the application behaved properly an most of the performance issues went away. The were some other issues like lack of indexes, and some minor configuration changes on the database side, but those were clearly not the cause of the problems.

Friday, May 08, 2009

11.50.xC4: Another mark in Informix history

IBM Informix Dynamic Server 11.50xC4 is available for customers on Passport Advantage.
It's another fixpack that comes with significant improvements. Traditionally we had to wait for full releases to have some significant features but we're getting used to see great improvements in fixpacks.
This one is no exception and from the release notes we can see an overview of the new functionality:


  • High Availability
    • Availability Enabled on Windows 64-bit Operating System
      Some of the HA features were not available on MS Windows 64 bit versions

  • Administration
    • Save Disk Space by Compressing Data
      Data compression for tables. This deserves a few more words... Check the end of the article
    • Improved Options for Specifying Processor Affinity
      New options for providing binding between virtual CPUs and physical CPUs. We can now specify a list of physical CPUs, one ore more ranges, and ranges with increments
    • Disable IPv6 Support
      If you prefer to disable the IPv6 protocol you can create and empty file named $INFORMIXDIR/etc/IFX_DISABLE_IPV6 (readable by informix) or set and enrionment variable: IFX_DISABLE_IPV6=yes
    • Enhancements to the OpenAdmin Tool for IDS
      These include: Control of table compression, query by example for all the usual table operations (query, insert, delete and update), SQL explorer history, managing of external directives and export/import of OAT connection definitions
    • Generating a Customized Database Server Configuration File
      A new utility called genoncfg was introduced. It takes a default onconfig.std and a text file with some directives (number of CPUs, memory, rootdbs location) and adjusts the instance configuration. The result is saved in a file called onconfig on the local directory. This can be the beginning of a great tool

  • Enterprise Replication
    • Generate ATS and RIS Files in XML Format for Custom Repair Script
      The Aborted Transaction Spooling and Row Information Spooling files can be generated in XML format to facilitate the implementation of recovery processes
    • Prevent Upserts During Conflict Resolution
      An Upsert happens if an Update it send to target and the row does not exists. This may not be desirable if the row was previously deleted. Now we can make the DELETE "win"
    • New Event Alarms for Enterprise Replication
      Three new situations will trigger an alarm: When a server connection is dropped, when a server, replicate or replicate set changes state (stopping, starting, modifying, deleting, or changing the connection status) and when a transaction fails (generating ATS/RIS files)
    • Monitor the Progress of a Consistency Check or a Direct synchronization
      New cdr commands to see the status of check and sync processes (cdr stats check and cdr stats sync)
    • Specify the Duration of a Consistency Check
      By default, inconsistent rows are rechecked once after two seconds, which might not be enough time for replicated transactions to be applied on the target server. You can now specify the number of seconds to spend on rechecking the consistency of inconsistent rows. Rechecking prevents transactions that are in progress from being listed as inconsistent in the consistency report. You can use the --inprogress option of the cdr check replicate and cdr check replicateset commands to specify the maximum number of seconds to perform rechecking.
    • Schedule Synchronization or Consistency Checking Operations
      cdr check and cdr sync allow for a new flag ( --background ) that will use the scheduler to make periodic runs of those processes
    • Improve Consistency Checking Performance
      Several options allow the execution of parallel threads for check operations. You can specify which data or intervals to check
    • Improve the Performance of Checking or Synchronizing Replicate sets
      A new option ( --process ) define how many threads to use in paralell for check and synchronization jobs
    • Role separation for Enterprise Replication (UNIX)
      If you're using role separations your DBSAs can control enterprise replication. Previously only user informix could
    • Enhancements to the Enterprise Replication plug-in for the OpenAdmin Tool for IDS
      A lot of new features in the ER plugin for Open Admin Tool (OAT)
  • Performance
    • Enable Concurrent I/O to Improve Performance on AIX Operating Systems
      IDS supports DIRECT I/O since version 11.10. Now it also supports Concurrent I/O (CIO) on AIX. It's a needed step to really get to the performance of raw devices. I have plans for an article on this subject...
    • IFX_LARGE_PAGES Environment Variable (AIX, Solaris)
      Allows IDS to take advantage of a pool of large memory pages configured by the system administrator. The shared memory virtual segments must be set to RESIDENT by the database system administrator

  • Application Development
    • DataBlade Module Registration through SQL
      Datablades can now be registered using plain old SQL. A new function, SYSBldPrepare is used for the registration process. So you can do it from any SQL tool or interface
    • Control the Results of a Fuzzy Search with the Basic Text Search DataBlade Module
      You can now specify the degree of similarity of search results in fuzzy searches when using the Basic Text Search DataBlade module. Specify a number between 0 and 1, where a higher value results in a higher degree of similarity. To limit results, specify a higher number. To maximize results, specify a lower number. The default degree of similarity is 0.5.
    • Map Characters for Indexing with the Basic Text Search DataBlade Module
      This allows us to replace special characters with other characters for indexing purposes
    • Default Boolean Operator in Basic Text Search Queries
      When we query a BTS index, the space between words is considered a default operator. Now we can define that default operator at index creation time
    • Storage for Temporary Basic Text Search Files
      It's now possible to define a separate sbspace for the temporary files used in BTS searches. This can increase performance. Up to now they were created in the same space as the index.
    • Track Basic Text Search Query Trends
      The possibility to save the queries made using a BTS index. This information can be used to provide hints to the end users in order to improve their own queries. It's like using past experiences to improve future queries
    • Fragment bts Indexes by Expressions
      It's now possible to split a BTS index into several sbspaces by using fragmentation
    • Monitor Change Data Capture Sessions
      A new onstat option ( onstat -g cdc ) that allows monitoring on data capture sessions. Change Data Capture API was introduced in 11.50.xC3 and can be used to capture data change and propagate it into other databases, web services etc. It's also one basic component to use SolidDB as a frontend cache for IDS.
    • View Change Data Capture Error Message Text
      A new function ( cdc_errortext() ) that retrieves an error message from an error generated by CDC

  • Security
    • Automatic Check for Installation Path Security (UNIX, Linux, Mac OS X)
      The installation process checks that the directory structure is secured
    • Utility to Enhance the Security of the Dynamic Server Installation Path (UNIX, Linux, Mac OS X)
      A new utility ( onsecurity ) that can check the integrity of the product installation. It can detect if the installation was compromised and in that case forbids the use of the product
As we've seen in previous fixpacks, IBM introduces another major feature in this one. Compression. There is a lot to say about IDS compression, but it would be a waste of time spending too much time here. The subject is already covered in several places:
I'd like to highlight a few aspects of IDS compression:
  • It's a dictionary based compression, similar to DB2 deep compression. This means you don't have to rely on pattern repetition on each page
  • Since the dictionary is created by sampling the table it's very fast to get a compression estimate
  • It works for normal INSERTs and UPDATEs. It doesn't depend on load operations
  • It works online (without blocking the tables or rows)
  • It comes with two other features: repack and shrink. Repack regroups the data rows sequentially on the table, leaving the free space at the end. Shrink can return this free space to the dbspace/chunk where the table resides
But if compression is the brightest feature we should not let the others go by without notice:
  • From the list above we can see that Basic Text Search datablade has seen a lot of nice improvements. I'd bet that if IBM keeps improving it like this it will have to change it's name to something like "Not so Basic Text Search Datablade" :)
  • And again, Enterprise Replication (ER) sees a lot of improvements
  • A new version of Open Admin Tool (OAT) with several major improvements
  • The new utility to generate a configuration file (genoncfg). It's still a bit simple, but I believe it can be the start of a great utility to help new Informix users to get a basic working configuration
  • The introduction of CIO for AIX. This is a very important step after the introduction of DIRECT_IO. I would love to write a post about this. Stay tuned.
And that's it for now. If you're an IBM customer you can download 11.50.xC4 from the usual sites. If you're not an IDS user, you can try it by downloading the IBM Informix Dynamic Server Developer Edition from:

http://www.ibm.com/developerworks/downloads/im/ids/dev.html