
NextMark server upgrade completed successfully

Recently, our customers have been experiencing periodic latency problems at peak times of the day (late morning and early afternoon). The system would be chugging along just fine and then it would "go out to lunch" for 15-20 seconds. It would eventually come back and process the request, but that lag is frustrating when you are trying to get an answer for a client or just trying to get the job done. This was happening a couple of times every day.

We’ve been monitoring this problem for a while and traced it to the limited physical memory on the web servers, which were maxed out at 8 GB. What would happen (for you geeks out there) is this: the system caches frequently accessed items in memory to avoid going to the database every time. Over time, this cache grows. Eventually, items in the cache that are not being accessed get evicted. Those evicted items and other disposable items are marked as deleted, but they remain in memory until the garbage collector comes along to reclaim the space. The garbage collector runs frequently throughout the day and users would rarely notice it. However, when available physical memory becomes very low, the garbage collector throws up a red flag and takes over with a "full GC", which takes some time to complete. Meanwhile, users are locked out from getting any work done.
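For the geeks who want to see the eviction idea in code, here is a minimal sketch of a size-bounded "least recently used" cache. It is not our production code: the `LRUCache` class, the `load_from_db` callback, and the `fake_db` function are all hypothetical names made up for illustration, and the sketch shows only the eviction policy, not the garbage collector behavior described above.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (illustrative only, not production code)."""

    def __init__(self, max_items, load_from_db):
        self.max_items = max_items
        self.load_from_db = load_from_db  # hypothetical database lookup
        self.items = OrderedDict()        # oldest entry first, newest last

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)   # mark as recently used
            return self.items[key]
        value = self.load_from_db(key)    # cache miss: go to the database
        self.items[key] = value
        if len(self.items) > self.max_items:
            # Evict the least recently used entry. The cache only drops its
            # reference here; reclaiming the memory is left to the runtime.
            self.items.popitem(last=False)
        return value


db_calls = []


def fake_db(key):
    """Stand-in for a real database query (an assumption for this demo)."""
    db_calls.append(key)
    return f"record-{key}"


cache = LRUCache(max_items=2, load_from_db=fake_db)
cache.get("a")
cache.get("b")
cache.get("a")  # served from the cache, no database call
cache.get("c")  # cache is full, so "b" (least recently used) is evicted
```

Capping the number of cached items like this is one way to "use less memory": the cache can never grow past `max_items`, at the cost of more trips to the database for rarely used items.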

So, the solution is either to (1) use less memory or (2) add more memory. We are doing both. Option #2 is more expensive but quicker to implement.

So, last week, we upgraded our key web servers to a new box (HP DL380 G5) that supports up to 32 GB of RAM and loaded it with 14 GB. The 14 GB should provide plenty of headroom for the foreseeable future, and we always have the option to upgrade to 32 GB if we need to. Early tests show this has virtually eliminated the latency problem because the system can manage memory much more efficiently.

As an added bonus, these machines are far more powerful than their predecessors. They have twice the number of processor cores, faster throughput, and faster disk access. Although sheer processing power was not a problem before (we were running at low CPU utilization), we have seen some nice performance improvements on computationally heavy tasks.

As a result of these upgrades, current users will notice a quicker system with no periodic lags, and we have plenty of room to grow.
