
Monday, July 21, 2014

How to find the Linux OS and version

Type uname -a for the kernel name, release and version.
Type lsb_release -a for the distribution details.

Tuesday, July 1, 2014

Java EE singleton pitfall

If you implement a singleton class with the @Singleton annotation, the container treats every method as a write by default (container-managed concurrency with @Lock(LockType.WRITE)). A write lock is therefore taken automatically on each call, which is not ideal: a method that only needs a read lock still gets a write lock, so concurrent callers are serialized. Keep this in mind when programming, and mark read-only methods with @Lock(LockType.READ).
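To make the difference between the two lock types concrete, here is a minimal plain-JDK sketch. It is only an analogy: it uses java.util.concurrent.locks.ReentrantReadWriteLock rather than the EJB container's own locking, and the class name CounterHolder and its methods are made up for illustration.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy only: an EJB container manages its own locks for @Singleton beans.
// CounterHolder is a hypothetical class, not part of any Java EE API.
final class CounterHolder {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int value;

    // Comparable to a method marked @Lock(LockType.READ):
    // several threads may hold the read lock at the same time.
    int read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Comparable to the default @Lock(LockType.WRITE):
    // only one thread at a time, and no readers while it runs.
    void increment() {
        lock.writeLock().lock();
        try {
            value++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        CounterHolder holder = new CounterHolder();
        holder.increment();
        System.out.println("value = " + holder.read());
    }
}
```

If read() took the write lock as well, which is what the singleton default amounts to, every reader would have to wait for every other reader.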

Wednesday, June 4, 2014

MySQL clustering (NDB) vs MySQL replication (InnoDB)

MySQL InnoDB

InnoDB is often deployed in a master/slave replication configuration. The drawback of this method is that writes go only to the master and the slaves serve only reads. Write load therefore cannot be spread across servers unless you use a mechanism called sharding.

What happens when a database grows beyond what a single database on a single server can handle? That is where sharding comes in. It has to be done very carefully so as not to lose performance. If you want to scale out write operations with this storage engine, sharding is the option to use.

Sharding for reads

  1. when the data set does not fit into memory and many reads hit the disk rather than being served from memory.

Sharding for writes

  1. when there are so many writes that replication lags considerably.
  2. when the frequency of writes is permanently overloading the server's disks.

However, if you go for sharding, it is always good to do it at the application level. When the application knows where the data resides, performance is better.

Sharding types

  1. application level sharding - put the busiest tables on separate servers and access them there.
  2. sharding by hash key
  3. sharding using a lookup service
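The last two types in the list can be sketched in a few lines of Java. This is a rough illustration, not a production router: the class ShardRouter, its method names, and the key and shard names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of two sharding strategies from the list above.
final class ShardRouter {
    // Sharding by hash key: the same key always maps to the same shard index
    // in the range 0 .. shardCount-1. floorMod keeps the result non-negative
    // even when hashCode() is negative.
    static int shardByHash(String key, int shardCount) {
        return Math.floorMod(key.hashCode(), shardCount);
    }

    // Sharding using a lookup service: an explicit key-to-shard directory.
    // A real lookup service would be a separate store; a map stands in for it here.
    private final Map<String, String> directory = new HashMap<>();

    void assign(String key, String shardName) {
        directory.put(key, shardName);
    }

    String lookup(String key) {
        String shard = directory.get(key);
        if (shard == null) {
            throw new IllegalArgumentException("no shard assigned for key: " + key);
        }
        return shard;
    }

    public static void main(String[] args) {
        System.out.println("hash shard: " + shardByHash("user-42", 4));

        ShardRouter router = new ShardRouter();
        router.assign("customer-7", "db-east");
        System.out.println("lookup shard: " + router.lookup("customer-7"));
    }
}
```

The hash approach needs no extra storage, but changing the shard count remaps most keys. The lookup approach lets you move a single key by updating its directory entry, at the cost of one more lookup per request.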

Why is sharding one of the last options?

  1. the developer has to write code to handle the sharding logic
  2. backups, indexing and schema changes become more difficult to maintain

MySQL clustering

MySQL clustering, however, supports concurrent writes. Data is partitioned among the data nodes, and each node's data is also copied to another node as a backup. Therefore availability is assured.

However, the problem is that even though it supports foreign key joins, the process is slow because the data is partitioned across several nodes. If a join produces a large volume of data, it could be slow. Therefore tasks such as generating reports that usually take several minutes are not well suited to this method.

Setting up MySQL clustering can be more tedious than setting up InnoDB replication. Still, it saves the developer from implementing sharding, since the partitioning happens among the nodes.

Below is a link with some hints on how to increase performance in a MySQL cluster

https://blogs.oracle.com/MySQL/entry/mysql_cluster_performance_best_practices




Wednesday, April 23, 2014

ActiveMQ message broker

Some useful links I found on ActiveMQ

http://www.slideshare.net/dejanb/advanced-messaging-with-apache-activemq#btnNext
http://working-with-activemq.blogspot.com/2012/05/performance-improvements.html
http://www.javacodegeeks.com/2014/04/activemq-network-of-brokers-explained.html

Monday, April 21, 2014

Some very important links to read on distributed computing

http://en.wikipedia.org/wiki/X/Open_XA
http://fusesource.com/documentation/fuse-esb-documentation/
http://activiti.org/components.html
http://servicemix.apache.org/
https://www.mulesoft.com/resources/esb/mule-esb-integration-platform

Tuesday, April 8, 2014

Scaling a relational database

Below are the highlights I took from the valuable article at the following link.

http://java-persistence-performance.blogspot.com/2011/05/data-partitioning-scaling-database.html

So hats off to the author who outlined these valuable points.

You can take 5 steps to scale a database

  1. optimizing the number and types of queries hitting the database: by using parametrized SQL, batch writing, and lazy, join and batch fetching, a significant load can be removed from the database.
  2. ensuring your database is configured optimally, has the correct indexes, that queries use the optimal query plan and that disk access is optimal, so its performance, and thus scalability, can be improved.
  3. caching objects and data in the mid-tier: you can offload a lot of the queries hitting the database, and improve your application's performance to boot. Most JPA providers support caching, and some such as EclipseLink offer quite advanced caching functionality including invalidation and coordinated clustered caches. JPA 2.0 defines some basic caching annotations to enable and access the cache.
  4. scaling the database by clustering it across multiple machines. This could be a real clustered database, such as Oracle RAC, or just multiple regular database instances. Clustered databases are good, and can improve your scalability without much work, but depending on your application you may also have to partition your data across the database nodes for optimal scalability. Without partitioning, if you write a row on one node and then access it on another, the second node must request the latest copy of the data from the first, which can potentially make performance worse.
  5. partitioning data across each of the database nodes
Data partitioning can be done in 2 major ways
  • Vertical partitioning
  • Horizontal partitioning
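
A small sketch of the difference, assuming the usual meaning of the two terms: vertical partitioning places whole tables on different nodes, while horizontal partitioning spreads the rows of one table across nodes. The class PartitionSketch and the table and node names are invented for illustration.

```java
// Hypothetical sketch contrasting the two partitioning styles.
final class PartitionSketch {
    // Vertical partitioning: each table lives entirely on one node.
    // The table-to-node placement here is an arbitrary example.
    static String nodeForTable(String table) {
        switch (table) {
            case "orders":
                return "node-a";
            case "customers":
                return "node-b";
            default:
                throw new IllegalArgumentException("unknown table: " + table);
        }
    }

    // Horizontal partitioning: rows of a single table are spread
    // across the nodes, here by row id modulo the number of nodes.
    static String nodeForRow(long id, String[] nodes) {
        return nodes[(int) Math.floorMod(id, (long) nodes.length)];
    }

    public static void main(String[] args) {
        String[] nodes = {"node-a", "node-b"};
        System.out.println("customers table -> " + nodeForTable("customers"));
        System.out.println("orders row 11   -> " + nodeForRow(11L, nodes));
    }
}
```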

Sunday, April 6, 2014

How to use log4j with your own configuration in JBoss 7.1.1?

The following link describes this problem

http://stackoverflow.com/questions/10799028/jboss-7-1-logging-is-not-working