SSI Clusters

 

The main disadvantage of clusters based on grid computing methods (MPL, Message Passing Layers, etc.) is that application development and porting become very difficult and costly. This is especially true for financial applications, where application costs exceed hardware costs.

 

To solve this problem, the next generation of clusters has evolved (or is still evolving): SSI (Single System Image) clusters. SSI is a form of distributed computing in which, through a common interface, multiple networks, distributed databases, or servers appear to the user as one system. In other words, the operating system environment is shared by all nodes in the system.

 

This means all the nodes see the same operating system environment, with their resources added together. Ideally, this implies -

 

•  Single point of entry
•  Single file namespace
•  Single I/O space
•  Single point of management control
•  Single virtual network
•  Single memory space
•  Single process space and a single init!
•  Single /proc

 

This means your application needs no modifications; the operating system takes care of migrating application threads between nodes.

 

As long as the application is threaded and can use threads effectively, it will work without modification on an SSI cluster.

 

Very few open source projects are able to achieve this properly.

 

Distributions/Projects

 

  1. OpenMosix: This still uses the 2.4 kernel (based on FC2).

It manages to create an SSI to quite an extent but fails to recover when a node faults. The upcoming 2.6 kernel release will likely take care of this.

It somewhat fails to live up to expectations when 2.6 kernel features such as network and storage drivers are required.

  2. OpenSSI: More recent than OpenMosix, but more mature.

It meets almost all the SSI requirements listed above and is fault tolerant to quite an extent.

OpenSSI-1.9 (2.0 beta) comes with all the 2.6 kernel features and network drivers. It also supports PVFS and GFS to create a single device namespace.

  3. ClusterKnoppix (http://clusterknoppix.sw.be/)

This is a ready-to-use, OpenMosix-based distribution which can help you set up a small cluster in a very short time. This can be used

 

OpenSSI and OpenMosix are just projects: each provides a couple of RPMs containing a patched kernel and some basic cluster monitoring software.

 

OpenSSI takes advantage of two more projects, Open Cluster Infrastructure (OpenCI) and Linux Virtual Server (LVS), to manage the cluster and make it more fault tolerant.

 

 

Technical information on SSI

 

The SSI model uses different techniques to create a single image. With Linux kernel 2.6, many of these projects got a big boost from new features introduced in the kernel.

 

Here is a brief overview -

 

Network

Almost all the projects use LVS (Linux Virtual Server) or similar functionality, which is built into the 2.6 kernel. It load-balances network connections by making every node listen on a particular IP configured on its “lo” (loopback) network interface. This means the nodes never reply to ARP requests for that IP, but they still serve any incoming connections.

Only the master answers the ARP requests, and it passes each connection to a different server depending on the load.
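
The dispatch step described above can be sketched as a toy model. This is only an illustration, assuming a least-connections policy (one of the scheduling policies LVS offers); the Director and Node classes, the node names, the IP address and the connection counts are all hypothetical and are not part of LVS or OpenSSI.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A real server that holds the virtual IP on its loopback interface."""
    name: str
    active_connections: int = 0


@dataclass
class Director:
    """Toy stand-in for the master: the only host that answers ARP for the VIP."""
    virtual_ip: str
    nodes: list = field(default_factory=list)

    def dispatch(self) -> Node:
        # Pick the node currently serving the fewest connections
        # (hypothetical policy choice for this sketch).
        target = min(self.nodes, key=lambda node: node.active_connections)
        target.active_connections += 1
        return target


# Hypothetical cluster state: node2 is the least loaded to begin with.
director = Director(
    "10.0.0.100",
    [Node("node1", 3), Node("node2", 1), Node("node3", 2)],
)
chosen = [director.dispatch().name for _ in range(3)]
print(chosen)
```

The clients only ever talk to the virtual IP; which node actually serves a connection is decided by the master at dispatch time.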

 

 

Process

If the migrating process/thread has open file descriptors or shared inter-process memory in use, the parent (the one that starts the migration) starts a ghost process locally, which keeps track of whatever resources the thread needs on the original/parent node.

This technique also helps with fault tolerance.
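
To make the ghost process idea more concrete, here is a minimal single-machine sketch. It assumes the ghost simply keeps the open file on the home node and answers read requests on behalf of the migrated process; the class names, the integer handle and the direct method call (standing in for a network channel) are hypothetical and do not reflect the actual OpenSSI or OpenMosix kernel code.

```python
import tempfile


class GhostProcess:
    """Stays on the home node and owns resources the migrated process left behind."""

    def __init__(self):
        self._files = {}

    def adopt(self, handle, file_obj):
        # Keep the already-open file on the home node under a small integer handle.
        self._files[handle] = file_obj

    def read(self, handle, size):
        # Executed on the home node on behalf of the migrated process.
        return self._files[handle].read(size)


class MigratedProcess:
    """Runs on a remote node; file access is forwarded back to the ghost."""

    def __init__(self, ghost):
        # In a real cluster this would be a network channel, not an object reference.
        self._ghost = ghost

    def read(self, handle, size):
        return self._ghost.read(handle, size)


# A file that was open on the home node before the (simulated) migration.
home_file = tempfile.TemporaryFile(mode="w+")
home_file.write("data on the home node")
home_file.seek(0)

ghost = GhostProcess()
ghost.adopt(3, home_file)

worker = MigratedProcess(ghost)
first_chunk = worker.read(3, 4)
print(first_chunk)
home_file.close()
```

Every forwarded call in a real cluster is a round trip over the interconnect, which is one reason migrated processes with many open resources pay a performance cost.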

 

 

Device and File-system

The uniformity of the file system and storage devices is handled by PVFS or GFS. Both have I/O fencing and global lock managers, which help the nodes synchronize locking and resolve concurrency issues while writing to disk.
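
As a rough illustration of what a global lock manager and fencing do together, the following toy model keeps one lock table for the whole cluster. It is a sketch under stated assumptions, not the PVFS or GFS implementation: the class, its method names, and the resource and node names are hypothetical, and fencing is reduced to "drop the node's locks and refuse it further access".

```python
import threading


class GlobalLockManager:
    """Toy cluster-wide lock table: one owner per resource at a time."""

    def __init__(self):
        self._owners = {}      # resource name -> owning node
        self._fenced = set()   # nodes cut off from shared storage
        self._guard = threading.Lock()

    def acquire(self, resource, node):
        with self._guard:
            if node in self._fenced:
                return False   # a fenced node may not touch shared storage
            owner = self._owners.get(resource)
            if owner is not None and owner != node:
                return False   # another node is writing this resource
            self._owners[resource] = node
            return True

    def release(self, resource, node):
        with self._guard:
            if self._owners.get(resource) == node:
                del self._owners[resource]
                return True
            return False

    def fence(self, node):
        # Treat the node as failed: block it and free everything it held.
        with self._guard:
            self._fenced.add(node)
            held = [res for res, owner in self._owners.items() if owner == node]
            for res in held:
                del self._owners[res]


glm = GlobalLockManager()
results = [
    glm.acquire("/data/block42", "node1"),  # node1 takes the lock
    glm.acquire("/data/block42", "node2"),  # node2 must wait
]
glm.fence("node1")                          # node1 is declared failed
results.append(glm.acquire("/data/block42", "node2"))
results.append(glm.acquire("/data/block42", "node1"))
print(results)
```

The point of combining the two is that a node which stops responding cannot keep a lock forever, and cannot corrupt the disk if it wakes up later.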

 

 

Other Changes

The proc and process management hooks have been added. A detailed description can be obtained from the site (and is beyond the scope of this document).

 

 

OpenSSI Vs OpenMosix

 

You must be wondering by now what the difference between OpenSSI and OpenMosix is. Here is a slightly old but still very much valid summary -

 

http://sourceforge.net/mailarchive/forum.php?thread_id=5860473&forum_id=21441

 

But after you read this, also read the rebuttal by the OpenMosix guys -

http://openmosix.sourceforge.net/openssi_refute.html

 

Although, the latter link is not entirely correct.

 

 

Building SSI cluster

 

The hardware requirements, interconnect and clustering basics of an SSI cluster are no different from those of master/slave clusters.

 

The installation is also much simpler: SSI doesn't require you to install the OS on all the nodes, just on the master node. However, the compute nodes must be network bootable, i.e. support PXE or Ethernet boot options.

 

OpenMosix and OpenSSI have slightly different installation procedures, but as an overview, the following steps are involved -

 

  1. Install FC3/FC2 (depending upon the distribution) on the master node
  2. Decide the network topology (as discussed previously)
  3. Install the distribution RPMs (these include a modified kernel and a few apps)
  4. Run a configuration script which -

     1. Configures the boot loader to boot from a local partition
     2. Takes the MAC addresses of the nodes and prepares a boot image for them
     3. Sets up PVFS or GFS
     4. Kicks off some clustering software configuration (for LVS and OpenCI)

  5. Reboot

 

and you are ready with a cluster. ClusterKnoppix comes with some pre-configured software and comes in very handy if you want to test SSI out.
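
The configuration scripts do the network boot setup for you, but to give a feel for the step that takes the MAC addresses of the nodes, here is a small sketch that turns a list of MAC addresses into host declarations in ISC dhcpd.conf syntax. This is an assumption about what such a step could look like, not the actual OpenSSI or OpenMosix script; the function name, the 192.168.1.x addressing, the node names and the pxelinux.0 boot file are all hypothetical.

```python
def dhcp_host_entries(macs, subnet_prefix="192.168.1.", first_host=101,
                      boot_server="192.168.1.1", boot_file="pxelinux.0"):
    """Build one dhcpd.conf host block per compute node (illustrative values)."""
    entries = []
    for index, mac in enumerate(macs):
        name = f"node{index + 1}"
        address = f"{subnet_prefix}{first_host + index}"
        entries.append(
            f"host {name} {{\n"
            f"    hardware ethernet {mac.lower()};\n"
            f"    fixed-address {address};\n"
            f"    next-server {boot_server};\n"      # master node serving the boot image
            f'    filename "{boot_file}";\n'          # file the PXE client downloads
            f"}}"
        )
    return "\n".join(entries)


# Hypothetical MAC addresses of two compute nodes.
config = dhcp_host_entries(["00:16:3E:00:00:01", "00:16:3E:00:00:02"])
print(config)
```

With entries like these in place, a compute node that boots over PXE gets a fixed address and is pointed at the master for its boot image, so nothing needs to be installed on its local disk.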

 

And this may take you by surprise, but with OpenSSI you can also use X Window applications ... and that too from any node!

 

 

Disadvantages of SSI

The only applications which can take full advantage of OpenSSI are threaded applications such as a web server. Threads and processes can be migrated to other nodes, but there is a performance cost associated with it.

 

Also, due to these restrictions, adding one more node doesn't add linearly to the performance of the application or the cluster.
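
One standard way to see why scaling is not linear is Amdahl's law, which the text above does not name but which captures the same effect. The sketch below assumes that 90% of the work can be spread across nodes (an arbitrary, illustrative figure) and ignores migration overhead, so real numbers would be lower still.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup when only part of the work can be spread across nodes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# Assumed figure: 90% of the application's work parallelizes across nodes.
speedups = {n: round(amdahl_speedup(0.9, n), 2) for n in (1, 2, 4, 8, 16)}
for nodes, speedup in speedups.items():
    print(f"{nodes:2d} nodes -> {speedup}x")
```

Under this assumption, each doubling of the node count buys less than the previous one, and the speedup can never exceed 10x no matter how many nodes are added.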

 

 

Where you can't use clustering -

 

  1. High-end databases like Oracle, for two reasons -

     1. They manage their threads on their own, for concurrency.
     2. They use direct disk I/O and bypass the file system.

 

Some distributed databases are specially designed to work with a cluster.

 

  2. Heavy I/O based applications. Clustering file systems, especially PVFS, are known to be less reliable and less available than normal file systems. They also lag in performance.