Logging Apache's *_cache modules efficiency
Recently I had to setup several Apache web servers with an in-RAM cache for frequently accessed content with the help of Apache's mem_cache module. All the servers where front-end servers to application servers that sits behind.
Although in some cases the content expiration time was set as low as 30 seconds, it worked great because it reduced request concurrency for the most costly requests, providing an important performance improvement and reducing heavily the requests to the application servers. The problem comes when you want to log cache hits and misses in order to analyze cache efficiency.
The *_cache modules doesn't provide any information to apache through environment variables, e.g., neither provides any logging facility.
Googling about it I even found some people that modified mem_cache module source code to add some logging facility to it. However, that's not the way in my opinion. If you are working with separate servers (separate Apache instances, not necessarily two different machines althought it's better), there is a better solution to it:
1. Enable mod_headers in the application server and add an HTTP header to register the "cost" to process the request (the time taken by the server since it received the request until the response is sent to the wire). We need to add the following header:
Header set CacheMiss "%D %t"
2. In the front-end server we will add a new header (so we ned mod_headers again) and create a specific log file for the cache performance data.
Header set CacheHit "%D %t"
LogFormat "%t \"%r\" %X \"%{CacheMiss}o\" \"%{CacheHit}o\" %>s" performance
CustomLog /var/log/apache2/performance.log performance
The LogFormat drective tells Apache to define a new log entry type containing:
Provided this log, is quite easy to analyze how many cache hits and misses we have over the total number of requests, the time to process a cache hit vs. a cache miss,... I highly recommend to enable this kind of log some days or weeks before enabling the cache, so you can analyze your cache performance better by comparing with historical data. This way you can know better the impact of every configuration change. This method is valid also to measure the request processing cost between front-end and backend servers, no matter if you are using a cache or not.
Enjoy with mem_cache and logging!
PS: I encourage you to read apache documentation for mod_log_config, mod_cache, mod_mem_cache and mod_headers. You may need to take a look at mod_proxy and mod_http_proxy too.

Comments
Post new comment