Wednesday, May 16, 2012


20 Tips for Using Tomcat in Production
Published: 17 May 2012
I've been working with Apache Tomcat for years and always seem to stumble upon new information related to the proper setup and configuration for a production environment. I've decided to put the instructions and tips I've collected together in one place.
So here are some helpful hints for running Tomcat in a production environment:
1. If you're running on a 1.5+ JVM...
Add the following to your JAVA_OPTS in catalina.sh (or catalina.bat for Windows): -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/j2ee/heapdumps Then use a tool such as YourKit to analyze the heapdump file.
2. When using Jasper 2 in a production Tomcat server you should consider...
Straight from the Tomcat documentation on Jasper 2...
When using Jasper 2 in a production Tomcat server you should consider making the following changes from the default configuration:
development: To disable on-access compilation checks for JSP pages, set this to false.
genStringAsCharArray: To generate slightly more efficient char arrays, set this to true.
modificationTestInterval: If development has to be set to true for any reason (such as dynamic generation of JSPs), setting this to a high value will improve performance a lot.
trimSpaces: To remove useless bytes from the response, set this to true.
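These options are init-params on Jasper's JspServlet in CATALINA_HOME/conf/web.xml. A sketch of the relevant fragment (the load-on-startup value of 3 is Tomcat's default; adjust the params to taste):

```xml
<servlet>
   <servlet-name>jsp</servlet-name>
   <servlet-class>org.apache.jasper.servlet.JspServlet</servlet-class>
   <init-param>
      <param-name>development</param-name>
      <param-value>false</param-value>
   </init-param>
   <init-param>
      <param-name>genStringAsCharArray</param-name>
      <param-value>true</param-value>
   </init-param>
   <init-param>
      <param-name>trimSpaces</param-name>
      <param-value>true</param-value>
   </init-param>
   <load-on-startup>3</load-on-startup>
</servlet>
```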
3. Use Tomcat's clustering/session replication capability.
Use Tomcat's clustering/session replication capability to minimize application user impact during maintenance periods.
4. Implement custom error pages
Implement custom error pages to hide raw exception messages from users. To do this, simply add something like the following to your web.xml:
<error-page>
   <error-code>404</error-code>
   <location>/error/404.html</location>
</error-page>
5. Use a logging toolkit.
Eliminate System.out and System.err statements from application code and use a logging toolkit such as Log4J for application logging.
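The tip names Log4J; to keep this sketch self-contained it uses the JDK's built-in java.util.logging as a stand-in (the OrderService class and its methods are hypothetical):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class OrderService {
    private static final Logger log = Logger.getLogger(OrderService.class.getName());

    public void placeOrder(String id) {
        // Instead of System.out.println("placing order " + id):
        log.info("placing order " + id);
        try {
            validate(id);
        } catch (RuntimeException e) {
            // Instead of e.printStackTrace() to System.err,
            // log the message and the stack trace at SEVERE:
            log.log(Level.SEVERE, "order failed: " + id, e);
        }
    }

    private void validate(String id) {
        if (id == null || id.length() == 0) {
            throw new IllegalArgumentException("empty order id");
        }
    }
}
```

Unlike System.out, a logging toolkit lets you change levels and destinations at deployment time without touching code.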
6. Leverage Tomcat's shared library directory.
If you're loading several applications with several of the same library dependencies, consider moving them from the applications' WEB-INF/lib directories to Tomcat's shared library directory, {catalina.home}/shared/lib. This will reduce the memory used by each application and result in smaller WAR files.
Update (comments from the user@tomcat.apache.org mailing list): The following should be considered when using the shared library directory: 1. The shared classloader is searched in last resort when looking for classes, according to http://tomcat.apache.org/tomcat-5.5-doc/class-loader-howto.html. 2. Because the classes are shared, they share configuration and singletons and if they store objects statically they will prevent your application from unloading.
This is turning out to be a more controversial suggestion...
Starting with Servlet Spec 2.3 (I think) there has been an emphasis on putting everything a web app needs to run into its war file.
Shared classloaders are evil, but not as evil as the invoker servlet. With a shared loader you can easily get Singleton assumptions being wrong, class cast exceptions, versioning woes, and other issues. Saving a little perm memory just doesn't justify it.
7. Tweak memory parameters.
Most of the time you will want to make a change to the default settings. The best advice here is to create a development environment that matches your production environment and load test the application. While you do this you can also use a profiler to identify bottlenecks, etc.
8. Remove unnecessary applications.
9. Secure the Manager application.
By default there are no users with the manager role. To make use of the manager webapp you need to add a new role and user into the CATALINA_HOME/conf/tomcat-users.xml file.
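A minimal tomcat-users.xml fragment might look like the following (the username and password are placeholders; the role was named "manager" in the Tomcat 5.5/6 era, while Tomcat 7+ splits it into roles such as "manager-gui"):

```xml
<tomcat-users>
   <role rolename="manager"/>
   <user username="admin" password="s3cret" roles="manager"/>
</tomcat-users>
```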
10. Use a valve to filter by IP or hostname to only allow a subset of machines to connect.
This can be configured at the Engine, Host, or Context level in the conf/server.xml by adding something like the following:
<Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="192.168.1.*"/>
11. Strip down server.xml.
Remove comments to make server.xml easier to read, and remove connectors that you don't need. An easy way to do this is the following: rename CATALINA_HOME/conf/server.xml to CATALINA_HOME/conf/server-original.xml and rename CATALINA_HOME/conf/server-minimal.xml to CATALINA_HOME/conf/server.xml. The minimal configuration provides the same basic configuration, but without the nested comments it is much easier to maintain and understand. Do not delete the original file, as its comments make it useful for reference if you ever need to make changes. Unless you are using Tomcat with the Apache web server, comment out this line in CATALINA_HOME/conf/server.xml:
<Connector port="8009" enableLookups="false" redirectPort="8443" protocol="AJP/1.3"/>
12. Split your Tomcat installation for added flexibility when it comes time to upgrade Tomcat.
See the "Advanced Configuration - Multiple Tomcat Instances" section in the RUNNING.txt file of the Tomcat distribution.
14. Do NOT run Tomcat as root.
Look to my previous post, "3 Ways to Run a Servlet Container on Port 80 as Non-Root", for tips.
15. Precompile JSPs (at build time).
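Tomcat ships an Ant task for Jasper's JspC compiler; a build fragment might look like the following sketch (the ${tomcat.home} and path properties are placeholders for illustration):

```xml
<!-- Precompile all JSPs in the web app at build time. -->
<taskdef classname="org.apache.jasper.JspC" name="jasper2">
   <classpath>
      <fileset dir="${tomcat.home}/lib">
         <include name="*.jar"/>
      </fileset>
   </classpath>
</taskdef>
<jasper2 validateXml="false"
         uriroot="${basedir}/web"
         webXmlFragment="${basedir}/build/generated_web.xml"
         outputDir="${basedir}/build/jsp-src"/>
```

The generated servlet mappings in generated_web.xml are then merged into the application's web.xml before packaging the WAR.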
16. Secure directory listings.
In CATALINA_HOME/conf/web.xml:
<servlet>
   <servlet-name>default</servlet-name>
   <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
   <init-param>
      <param-name>debug</param-name>
      <param-value>0</param-value>
   </init-param>
   <init-param>
      <param-name>listings</param-name>
      <param-value>false</param-value>
   </init-param>
   <load-on-startup>1</load-on-startup>
</servlet>
17. If you have multi-core CPUs or more than one CPU on your server...
It might be beneficial to increase the thread pool beyond the default 250. On the other hand, if you have a slow server, decreasing the thread pool will decrease the overhead on the server.
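The pool size is set via the maxThreads attribute on the HTTP connector in conf/server.xml; for example (the value of 400 is illustrative, not a recommendation — load test to find yours):

```xml
<Connector port="8080"
           protocol="HTTP/1.1"
           maxThreads="400"
           acceptCount="100"
           connectionTimeout="20000"/>
```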
18. Monitor applications via Tomcat MBeans.
This article provides some great insight on how to do this.
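As a minimal sketch, the same MBeanServer interface Tomcat uses to register its own MBeans (such as Catalina:type=ThreadPool,name=...) can be queried in plain Java; this example reads the standard JVM memory MBean, which is available in any JVM:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class MBeanPeek {
    // Reads the current heap usage through the platform MBean server.
    public static long usedHeapBytes() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName mem = new ObjectName("java.lang:type=Memory");
        CompositeData usage = (CompositeData) server.getAttribute(mem, "HeapMemoryUsage");
        return (Long) usage.get("used");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("used heap: " + usedHeapBytes() + " bytes");
    }
}
```

Against a running Tomcat you would connect remotely via JMX (e.g. with jconsole) and query the Catalina:* MBeans the same way.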
Consider JDK 1.5, or even better JDK 1.6, to take advantage of performance improvements.

Update (comments from users@tomcat.apache.org mailing list):
Note that you can gain even more performance if you recompile your "string concatenation hungry" (d="aaaa"+b+"ccc") support libraries for JDK 5+ on a multi-CPU system. This is because JDK 5 uses the non-synchronized StringBuilder instead of the JDK 4- synchronized StringBuffer. And synchronization over multiple CPUs takes a few more cycles than on single CPU machines.
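To make the difference concrete, here is a sketch of what the two compilers effectively emit for that concatenation (the Concat class is hypothetical):

```java
public class Concat {
    // What javac emits for d = "aaaa" + b + "ccc" when compiling for JDK 5+:
    // StringBuilder's append() methods are not synchronized.
    static String jdk5Style(String b) {
        return new StringBuilder().append("aaaa").append(b).append("ccc").toString();
    }

    // The JDK 1.4-and-earlier equivalent: StringBuffer's append() methods are
    // synchronized, which costs extra cycles, especially on multi-CPU machines.
    static String jdk4Style(String b) {
        return new StringBuffer().append("aaaa").append(b).append("ccc").toString();
    }

    public static void main(String[] args) {
        System.out.println(jdk5Style("b")); // aaaabccc
    }
}
```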
19. Use the -server JVM option.
This enables the server JVM, which JIT compiles bytecode much earlier, and with stronger optimizations. Startup and first calls will be slower due to JIT compilation taking more time, but subsequent ones will be faster.
20. Use GZIP compression.
Look for the service connector you wish to configure for compression and add two attributes, compression and compressableMimeType. For example:
<Connector
   port="80"
   maxHttpHeaderSize="8192"
   URIEncoding="UTF-8"
   maxThreads="150"
   minSpareThreads="25"
   maxSpareThreads="75"
   enableLookups="false"
   redirectPort="8443"
   acceptCount="100"
   connectionTimeout="20000"
   disableUploadTimeout="true"
   compression="on"
   compressableMimeType="text/html,text/xml,text/plain,application/xml"/>
For more information, read the Tomcat HTTP Connector documentation.
The default Tomcat configuration provides good protection for most requirements, but does not prevent a malicious application from compromising the security of other applications running in the same instance. To prevent this sort of attack, Tomcat can be run with a Security Manager enabled which strictly controls access to server resources. Tomcat documentation has a good section on enabling the Security Manager. 

Wednesday, May 9, 2012

J2EE Performance Tips



Sometimes the performance aspects of a system are left to the end of the project. Developers just don't have time to consider response times when speed of delivery and functionality are the pressing goals. Performance bottlenecks must be systematically identified and solved, which is no light task. The later in the development life cycle you address them, the more expensive they become to fix.
Small optimizations (like turning logging down) can yield huge gains, whereas others can have minuscule effects. The key to tackling bottlenecks is to prioritize and focus on the areas of your code that will yield the biggest return for your time investment. Focus on frequently used parts of the code, the parts critical to your business. Gather accurate performance metrics and use repeatable test cases to validate against your performance targets.


Performance Patterns and Techniques
You can scale your system to satisfy an increased load by using vertical and horizontal scaling techniques (software and hardware), but not always. Sometimes you can't use these techniques because either the budget is tight, time is limited, or your system has inherent architectural limitations. In lieu of adding more hardware, distributing your code, or adding more servers, you can use the following server-side strategies to improve the performance of a single J2EE/EJB application.
Database I/O Optimization
·  Poor DBMS performance can arise from a mismatch between the data-model, entity-bean design, and its usage. If your most frequent user request joins 10 different tables and returns five entities of which you use only one, then this is a design flaw. Align your DBMS access to support your usage patterns. Write a query to return only the one piece of data that you need.
·  Search queries that use finders can take a long time to run. Not only do they execute the finder query, but they also make subsequent calls to the DBMS to load entity beans. It's more efficient to call the DBMS with one bulk query than it is to issue a series of smaller requests.
Consolidate your search queries into a single JDBC call using a DataAccessObject (DAO) or FastLaneReader (Marinescu/J2EE Blueprints).
·  If your query results are large, however, you may encounter problems when you try loading them all into memory in one hit. If the client paginates the results, consider restricting the number of rows returned to a total closer to your page size. You can always fetch more rows on demand.
·  Use built-in database features that may help reduce query times, such as stored procedures, indexes, views, and table caches.
Caching
·  Within a given user transaction, the same data may be requested multiple times. Try to reduce the number of redundant reads. Either pass the information around as method parameters or consider caching it on the session or thread context. Data that changes infrequently, such as meta data or configuration data, is good for caching. You can load the cache once at startup or on demand.
·  Place your server caches strategically inside your facades to make them transparent to the caller.
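A minimal sketch of an on-demand cache for infrequently changing data (the MetaDataCache class and its loadFromStore method are hypothetical stand-ins for a real DBMS call; the load counter exists only to show redundant reads being avoided):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetaDataCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<String, Object>();
    private int loads = 0; // illustration only: counts trips to the backing store

    // Stand-in for the expensive DBMS or configuration read.
    protected Object loadFromStore(String key) {
        loads++;
        return "value-for-" + key;
    }

    // First call loads on demand; subsequent calls hit the cache.
    public Object get(String key) {
        Object value = cache.get(key);
        if (value == null) {
            value = loadFromStore(key);
            cache.put(key, value);
        }
        return value;
    }

    public int loadCount() { return loads; }
}
```

Placed inside a facade, a cache like this is transparent to callers; remember it only suits data that changes rarely, such as meta data or configuration.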
Transaction Control
·  Long-lived transactions can lead to connection timeouts, sharp memory increases, and DBMS lock contention. Your session beans control transactions. Break up extremely long-lived transactions into multiple shorter, more-reliable chunks.
·  Chaining together many short-lived transactions can also be slow. Widen your transaction boundaries. Increase throughput and efficiency by manipulating the length of your transaction so it performs many operations at once rather than one at a time.
·  Establish general-purpose session bean controllers to control your transactions. These can be independent of your business logic. Use your beans to adjust the number of records processed in one transaction so that you get optimum throughput for your request.
Threading
·  Batch processes written with EJBs can run very slowly because developers often build single-threaded batch jobs without considering transactions. A process that executes inside one very long-lived transaction (that could take hours or days) is fragile. If something goes wrong, the entire job gets rolled back and you have to start again from scratch. If your job executes as many small transactions chained together inside a loop, then your job will be more reliable. The throughput will be low, however.
·  The container controls server-side threads. Application code cannot create its own threads inside the EJB container, so it cannot take advantage of threading directly. Typically, a container-managed thread pool dispatches and services incoming server requests. This model works well for session-oriented usage. The container load balances across competing requests.
However, for batch processes, the goal is to achieve high throughput so the job finishes as quickly as possible. Parallel processing can aid this. Spawn threads outside the container and divvy up the load into small units of work that are capable of being executed concurrently. Use threads to spread the load. Don't spawn more threads than the server can handle. Keep the number of client threads below the container's thread pool limit.
·  Aggressive threading can reveal locking issues and non-thread-safe code inside the server. Structure your components so that they are thread-safe and can work concurrently, avoiding DBMS contention. Divide your units of work up so that they operate over different data sets.
Deferred Processing
·  How real-time are your requirements? Not every component has to respond in real-time. If your on-line response time is slow, reduce the amount of work performed during the request and defer expensive or non-essential functions to later. You can store work temporarily in holding tables or execute it asynchronously using JMS.
·  Components that don't have strict real-time requirements, such as data-extraction programs, collation and sorting routines, file import/export, etc., can be performed off-line as batch processes.
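The article suggests JMS or holding tables for deferred work; as a self-contained illustration of the same idea, this sketch uses an in-process BlockingQueue drained by a background thread (the DeferredWorker class is hypothetical, and an in-process queue lacks the durability a JMS destination or holding table provides):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DeferredWorker {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
    private final Thread worker;

    public DeferredWorker() {
        worker = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        // Block until deferred work arrives, then run it off-line.
                        queue.take().run();
                    }
                } catch (InterruptedException done) {
                    // interrupted: stop draining
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // The request thread returns immediately; the expensive task runs later.
    public void defer(Runnable task) {
        queue.offer(task);
    }
}
```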

Database I/O Optimization Implementation—Pure JDBC for Read-only Access
Entity beans provide an abstraction layer between the higher-level business components and the database. When a developer makes a call to a method on a bean, he or she doesn't see what happens under the covers because it's transparent. The developer can assemble large objects or data structures using many entity calls. He or she can be oblivious to the expensive low-level database I/O taking place. One way to alleviate this bottleneck is to replace your critical sections with specialized pieces of pure JDBC code that are optimized for the task. DataAccessObjects (DAOs) and FastLaneReaders (Marinescu/J2EE Blueprints) are common techniques for accelerating reads. These patterns give you fine-grained control over your queries for efficient read access. A DAO can be used to consolidate many DBMS requests made by an entity bean (finder + loads) into a single JDBC call. DAOs are useful for supporting search screens or requests over data that span multiple beans and/or database tables.
The following are recommended practices for DAO:
  • Use your DAO to query for data directly from the DBMS rather than via your entity beans. Session beans and other components can call your DAO directly.
  • Design your DAO so that it can be tailored to return only what you need.
  • Design your DAO so that it can limit the number of rows returned.
  • Use finally clauses to ensure that all JDBC resources are closed when you're finished with them. This ensures that you don't hog or consume connections and cursors.
  • Query against DBMS views to make your DAOs more reusable and portable. If you need to make a change to a query, make a change to the view instead.
With a DAO you can take advantage directly of the features provided by the JDBC API, including:
  • PreparedStatements to pre-compile frequently executed SQL statements
  • Batch methods on the Statement class (addBatch() and executeBatch()) to batch up multiple SQL calls into one hit to the database
  • Row limiting (The Statement and ResultSet classes provide methods (setMaxRows() and setFetchSize()) to limit the number of rows returned and set hints for ResultSet fetch sizes.)
See Listing 1: A Simple DAO for executing SQL against a view and limiting the number of rows returned in a ResultSet.
As part of your DAO design, you must decide in which form the data should be returned and how it should be converted. In some cases, you may want to return loosely typed collections such as Lists or Maps, and in other cases you may want to immediately convert ResultSet data into strongly typed Java objects relevant to your system. One way to handle this at the DAO level is to use a helper ResultSetConverter interface. This interface is responsible for converting ResultSets into strongly typed application object types or collections. The DAO uses it to automatically convert ResultSet data into your target object(s):
 
public interface ResultSetConverter {
       public Object toObject(ResultSet rs) throws Exception;
}
Create a simple ResultSet to Map Converter class by implementing the ResultSetConverter interface. Inside the toObject() method, pull the data from the result set and place it into a map of column name/value pairs. The map is returned to the DAO to be passed back to the DAO caller:
 
class MapConverter implements ResultSetConverter {
 
public Object toObject(ResultSet rs) throws Exception {
       Map map = new HashMap();
       ResultSetMetaData meta = rs.getMetaData();
       // Load ResultSet into map by column name
       int numberOfColumns = meta.getColumnCount();
       for (int i = 1; i <= numberOfColumns; ++i) {
              String name = meta.getColumnName(i);
              Object value = rs.getObject(i);
              // place into map
              map.put(name, value);
       }
       return map;
}
}
To use your DAO and converter, acquire a DAO instance and invoke the query() method to execute your SQL. Use the rowLimit parameter to limit the number of rows returned and pass in the converter class for the DAO to use.
 
DAO dao = DAO.get();
              
// Create our own converter for getting the first column as a String
ResultSetConverter myConverter = new ResultSetConverter() {
       public Object toObject(ResultSet rs) throws Exception  {
              return rs.getString(1);
       }
};
 
 
// Execute a query against a VIEW, limit the number of rows
// returned to 10, and use myConverter to convert the results
List data = dao.query("myView", 10, myConverter);
// Do something with the data, ship to the JSP etc.   
The J2EE Performance-tuning Trade-off
Experienced practitioners know that when addressing J2EE application performance issues, there are no silver bullets. Performance tuning is a trade-off between architecture concerns, such as flexibility and maintainability. Performance increases are won by combining different techniques, patterns, and strategies.

And if all else fails, you can hope that that extra-fast machine you ordered turns up sooner rather than later. :)

Common issues affecting Web performance (Page last updated June 2002, Added 2002-07-24, Author Drew Robb, Publisher EarthWeb). Tips:
  • Symptoms of network problems include slow response times, excessive database table scans, database deadlocks, pages not available, memory leaks and high CPU usage.
  • Causes of performance problems can include the application design, incorrect database tuning, internal and external network bottlenecks, undersized or non-performing hardware or Web and application server configuration errors.
  • Root causes of performance problems come equally from four main areas: databases, Web servers, application servers and the network, with each area typically causing about a quarter of the problems.
  • The most common database problems are insufficient indexing, fragmented databases, out-of-date statistics and faulty application design. Solutions include tuning the index, compacting the database, updating the database and rewriting the application so that the database server controls the query process.
  • The most common network problems are undersized, misconfigured or incompatible routers, switches, firewalls and load balancers, and inadequate bandwidth somewhere along the communication route.
  • The most common application server problems are poor cache management, unoptimized database queries, incorrect software configuration and poor concurrent handling of client requests.
  • The most common web server problems are poor design algorithms, incorrect configurations, poorly written code, memory problems and overloaded CPUs.
  • Having a testing environment that mirrors the expected real-world environment is very important in achieving good performance.
  • The deployed system needs to be tested and continually monitored.
Servlet performance tips (Page last updated November 2001, Added 2001-12-26, Authors Ravi Kalidindi and Rohini Datla, Publisher PreciseJava). Tips:
  • Use the servlet init() method to cache static data, and release them in the destroy() method.
  • Use StringBuffer rather than using + operator when you concatenate multiple strings.
  • Use the print() method rather than the println() method.
  • Use a ServletOutputStream rather than a PrintWriter to send binary data.
  • Initialize the PrintWriter with the optimal size for pages you write.
  • Flush the data in sections so that the user can see partial pages more quickly.
  • Minimize the synchronized block in the service method.
  • Implement the getLastModified() method to use the browser cache and the server cache.
  • Use the application server's caching facility.
  • Session mechanisms from fastest to slowest are: HttpSession, Hidden fields, Cookies, URL rewriting, the persistency mechanism.
  • Remove HttpSession objects explicitly in your program whenever you finish the session.
  • Set the session time-out value as low as possible.
  • Use transient variables to reduce serialization overheads.
  • Disable the servlet auto reloading feature.
  • Tune the thread pool size.


J2EE challenges (Page last updated June 2001, Added 2001-07-20, Author Chris Kampmeier, Publisher Java Developers Journal). Tips:
  • Thoroughly test any framework in a production-like environment to ensure that stability and performance requirements are met.
  • Each component should be thoroughly reviewed and tested for its performance and security characteristics.
  • Using the underlying EJB container to manage complex aspects such as transactions, security, and remote communication comes with the price of additional processing overhead.
  • To ensure good performance use experienced J2EE builders and use proven design patterns.
  • Consider the impact of session size on performance.
  • Avoid the following common mistakes: Failure to close JDBC result sets, statements, and connections; Failure to remove unused stateful session beans; Failure to invalidate HttpSession.
  • Performance requirements include: the required response times for end users; the perceived steady state and peak user loads; the average and peak amount of data transferred per Web request; the expected growth in user load over the next 12 months.
  • Note that peak user loads are the number of concurrent sessions being managed by the application server, not the number of possible users using the system.
  • Applications that perform very little work can typically handle many users for a given amount of hardware, but can scale poorly as they spend a large percentage of time waiting for shared resources.
  • Applications that perform a great number of computations tend to require much more hardware per user, but can scale much better than those performing a small number of computations.
J2EE Application servers (Page last updated April 2001, Added 2001-04-20, Authors Christopher G. Chelliah and Sudhakar Ramakrishnan, Publisher Java Developers Journal). Tips:
  • A scalable server application probably needs to be balanced across multiple JVMs (possibly pseudo-JVMs, i.e. multiple logical JVMs running in the same process).
  • Performance of an application server hinges on caching, load balancing, fault tolerance, and clustering.
  • Application server caching should include web-page caches and data access caches. Other caches include caching servers which "guard" the application server, intercepting requests and either returning those that do not need to go to the server, or rejecting or delaying those that may overload the app server.
  • Application servers should use connection pooling and database caching to minimize connection overheads and round-trips.
  • Using one thread per user can become a bottleneck if there are a large number of concurrent users.


Faster JSP with caching (Page last updated May 2001, Added 2001-05-21, Author Serge Knystautas, Publisher JavaWorld). Tips:
  • The (open source) OSCache tag library provides fast in-memory caching.
  • Cache pages or page sections for a set length of time, rather than update the page (section) with each request.
  • Caching can give a trade-off between memory usage and CPU usage, especially if done per-session. This trade-off must be balanced correctly for optimal performance.

Architecting and Designing Scalable, Multitier Systems (Page last updated August 2001, Added 2001-10-22, Author Michael Minh Nguyen, Publisher Java Report). Tips:
  • Separate the UI controller logic from the servlet business logic, and let the controllers be mobile so they can execute on the client if possible.
  • Validate data as close to the data entry point as possible, preferably on the client. This reduces the network and server load. Business workflow rules should be on the server (or further back than the front-end).
  • You can use invisible applets in a browser to validate data on the client.

Wednesday, July 28, 2010

Project Management Case Studies

Case Studies

The following case studies show the use of project management in practice. Studying real-life situations will help you see how others have been successful.

If you have a case study you think would be of interest to people managing projects, please let us know and we'll be happy to consider it for publication.

Every Beginning is Difficult

New undertakings or experiences are always challenging at first. This was no different when Schenker Singapore (Pte) Ltd, a transport and logistics company, decided to embark on something new - a Lean Six Sigma programme. It might seem even more demanding at the outset, since the number of 3rd party logistics providers rising to this challenge is limited. Best practices in this industry are not widespread and are hard to come by. This is the story of what happened.

Project Management Approach for Business Process Improvement

Business process improvement initiatives are frequently key projects within an organisation, regardless of the size of the organisation or, frankly, the size of the business process improvement initiative. Even if a business process improvement initiative is targeted at an individual department, the impact of the change will be organisation-wide.

The Best Project Managers are Emotion-driven Leaders

Charles J. Pellerin tells his own ill-fated story as project director for the launch of the Hubble telescope, and his journey to the discovery of true leadership. This journey not only saw him redeem himself through an officially 'unauthorised' US$60M fix mission that sent astronauts to repair the telescope, but also led him to better understand the root of true leadership and to design a system to make it happen.

Using ROI to Evaluate Project Management Training

Return on Investment (ROI) is a monetary measurement that is used to evaluate the efficiency and effectiveness of an investment made by an organisation. Investments take many forms, financial, human capital, equipment, and training programmes, to name just a few. This article will focus on the use of ROI and the Phillips ROI Methodology to measure the effectiveness of a project management training programme completed within XYZ Law Firm.

The Hidden Costs and Dangers of the Shortcut

We live in a world where we are often pressured to take shortcuts to save time and cut costs as much as possible. However, if you're not a skilled and experienced project manager, the wrong shortcut could end up costing you a lot more. Here's an anecdote to think about.

Corporate Social Responsibility (CSR) and Project Management

Corporations are more sensitive to social issues and image than ever before. This sensitivity has given rise to CSR initiatives, but the question is: "How do I rationalise the organisation's demands for CSR with my project's objectives?" While there are no easy answers to this question, this article uses actual examples to point out what to avoid and offers tips and tricks on how to rationalise CSR and project objectives.

Green Projects

More and more emphasis is being placed on projects that help our environment, or are at least compatible with the environment. These projects are commonly referred to as "green" projects. Whether "greening" is an adjunct to the project or a project objective, more and more projects are being initiated that can be called "green." Green projects place new demands on the project manager. This article describes one such project and some of these new demands.

Project Failures From the Top Down: Can Marchionne Save Chrysler?

On the surface the merger between Fiat and Chrysler is very promising, but a bit of history on Chrysler and Marchionne's management style suggests that the sustainability of the merger might be in trouble. Will Chrysler be revived? Can they initiate the kind of projects that will return it to profitability, or is Chrysler headed for a fatal crash?

Communication is Key: Getting Everyone in the Loop

Are you finding that the communication among your staff, across different departments, and with your vendors is often inefficient and even quite redundant? How many times have you answered the same question either by e-mail or with a phone call? Do you find that inaccurate information is being passed on to customers because sales or services people are referring to outdated e-mails or an implementation schedule that has changed? Does each one of your teams have its own file system and database and use many interfaces to organise its information?

A Tale of Two Projects

A business tale of what it takes to turn around troubled projects. The year is 2005 and times are good. The business environment is vibrant and the economy is strong. Large businesses are committing large amounts of capital and resources to implement new strategies, establish new capabilities, and open new markets. It was no different at PintCo, where Jack works as a Director of Customer Relationship Management.

How Gantt Charts Can Help Avoid Disaster

A short case study about the importance of using appropriate tools, such as Gantt charts, when managing time sensitive projects. Having run 15 months late on completion of a construction project, a building company incurred extensive penalty charges, which eventually led to its closure. Not having any project Gantt charts indirectly led to the company's failure.

NASA Project Management Challenge 2008

One of the first major uses of project management as we know it today was to manage the United States space programme. It started with the inauguration speech in 1961 of John F. Kennedy when he said, "I believe that this nation should commit itself to achieving the goal, before this decade is out, of landing a man on the moon and returning him safely to the earth." In 1986 the Challenger space shuttle disaster focused attention on risk management, group dynamics and quality management. Today NASA continues to focus on project management best practice to deliver major aerospace projects costing many billions of dollars.

Rescuing a Small Project

Recently I was asked to jump in and rescue a small infrastructure project that was headed for disaster. What did I do?