Optimizing Server Performance, Part 3

April 3, 2008

Part 3 of 3: Part 1 | Part 2 | Part 3

This is part 3 in a series of posts covering server performance optimization. You should try to follow these tips in order; i.e. start at the top of post one and work your way through each post. If you’re coming here from anywhere other than Part 1 of this series, head there first before continuing on.

Most of the optimization tips presented herein will provide little incremental value. We’ve now reached the point of diminishing returns on our server tweaks.

Install Post Query Accelerator

Important: if you are using WordPress v2.1 or above, skip this step. This tweak is already contained in newer versions of WordPress.
The Post Query Accelerator plug-in improves your server’s performance by ensuring that the MySQL query cache is able to cache query requests for posts.

  1. Download the Post Query Accelerator plug-in.
  2. Upload and extract it in your plug-ins folder.
  3. Activate it in your WordPress administrative panel.

Edit Your Theme or Plug-ins

  • Move comments to a separate page from your posts.
  • Paginate comments.
  • Optimize your plug-ins. Some are poorly written, unfortunately.
  • Optimize your themes: many themes query for information that is static in nature. If it’s static, it should be treated as such, and be hard-coded. Note that this tip would have a huge impact if we hadn’t already taken care of caching from the start.

Compress JavaScript and CSS

Use a CSS compressor to remove white space and comments. Try this one at Arantius, or the one at CSS Drive.

Use a JavaScript obfuscator to remove white space and comments, and shorten method and variable names. Or minify your code with jsMin. The space saved is a bit less than with obfuscation, but it could lead to fewer debugging problems down the road.
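To give a feel for what a CSS compressor does, here is a minimal Python sketch. It is not the code behind either of the tools mentioned above; the function name minify_css is made up for illustration, and the sketch assumes simple stylesheets (it does not protect quoted strings or CSS hacks that rely on comments).

```python
import re


def minify_css(css: str) -> str:
    """Strip comments and unneeded white space from a simple stylesheet.

    Illustrative sketch only: quoted strings and comment-based CSS hacks
    are not handled specially.
    """
    # Remove /* ... */ comments, including multi-line ones.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    # Collapse every run of white space into a single space.
    css = re.sub(r"\s+", " ", css)
    # Drop spaces around braces, semicolons, and commas.
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)
    # Drop spaces after colons ("color: red" becomes "color:red").
    css = re.sub(r":\s+", ":", css)
    return css.strip()
```

Run against a small rule such as a body block with a comment above it, this returns the rule on one line with the comment gone. A dedicated compressor will handle far more edge cases, so treat this as a way to understand the savings rather than a replacement.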

Place JavaScript at the Bottom of the Page

While a script is downloading, it will block downloading of other page components, even if those components are on different servers. Moving scripts to the bottom of the page alleviates this problem, but obviously not every script can or should be moved to the page bottom.

Additional Security

This is not an optimization technique. In fact, using this technique will hurt performance on your server (not for everyone; just for you when you are using your administrative panel). However, if you are beginning to receive high traffic, it might be a wise idea to harden the security on your WordPress installation.
The Admin SSL plug-in for WordPress will secure admin and login pages via SSL. You will need access to your own or a shared SSL certificate. Installation and usage instructions can be found on the developer’s site.

Use eAccelerator

“eAccelerator is a free open-source PHP accelerator, optimizer, and dynamic content cache. It increases the performance of PHP scripts by caching them in their compiled state, so that the overhead of compiling is almost completely eliminated. It also optimizes scripts to speed up their execution. eAccelerator typically reduces server load and increases the speed of your PHP code by 1-10 times.”

Tune your MySQL Database

MySQL performance tuning is a bit of a mystical art, and conflicting advice is common. Use these settings at your own risk, and only use them on a capable machine:

  • connect_timeout = 5
  • join_buffer_size = 1M
  • key_buffer_size = 64M
  • max_allowed_packet = 16M
  • max_connect_errors = 20
  • max_connections = 500
  • max_heap_table_size = 128M
  • myisam_sort_buffer_size = 64M
  • read_buffer_size = 1M
  • read_rnd_buffer_size = 2M
  • query_cache_limit = 8M
  • query_cache_size = 128M
  • query_cache_type = 1
  • sort_buffer_size = 16M
  • table_cache = 512
  • thread_cache_size = 256
  • tmp_table_size = 64M
  • wait_timeout = 14400
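Why "only on a capable machine"? A common rule of thumb (an approximation, not an official MySQL formula) is that worst-case memory use is the global buffers plus max_connections times the per-connection buffers. The Python sketch below applies that rule to the values listed above. Which buffers count as "global" and "per-connection" here is my simplification, and the function name is made up for illustration.

```python
MB = 1024 * 1024

# Buffers allocated once for the whole server (simplified selection).
GLOBAL_BUFFERS = {
    "key_buffer_size": 64 * MB,
    "query_cache_size": 128 * MB,
}

# Buffers that each connection may allocate when it needs them.
PER_CONNECTION_BUFFERS = {
    "sort_buffer_size": 16 * MB,
    "read_buffer_size": 1 * MB,
    "read_rnd_buffer_size": 2 * MB,
    "join_buffer_size": 1 * MB,
}

MAX_CONNECTIONS = 500


def worst_case_memory_bytes(global_buffers, per_connection_buffers, max_connections):
    """Rough upper bound: every connection uses every buffer at once."""
    return sum(global_buffers.values()) + max_connections * sum(
        per_connection_buffers.values()
    )


total = worst_case_memory_bytes(GLOBAL_BUFFERS, PER_CONNECTION_BUFFERS, MAX_CONNECTIONS)
print(f"Worst case: {total / MB:.0f} MB")
```

With these settings the bound comes to 192 MB of global buffers plus 500 x 20 MB of per-connection buffers, or roughly 10 GB. Real usage is normally far lower because connections rarely use every buffer at the same time, but on a small server you would want to scale max_connections or sort_buffer_size down first.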

Tune Apache

IBM’s developerWorks is always a great source of information, including this article on Apache tuning.

Conclusion

Preparing for an expected traffic spike or responding to a traffic spike is a great problem to have, and you’re in good company. I hope these tips get you out of a jam, or keep you from getting into one. If you have any additional tips, please post them in the comments. I’m all ears.


Optimizing Server Performance, Part 2

April 2, 2008

Part 2 of 3: Part 1 | Part 2 | Part 3

This is part 2 in a series of posts covering server performance optimization. You should try to follow these tips in order; i.e. start at the top of post one and work your way through each post. If you’re coming here from anywhere other than Part 1 of this series, head there first before continuing on.

Most of the tips in this post could be classified as incremental improvements. Taken by themselves, they each have a small impact on overall performance, whereas the tips in Part 1 all have rather significant effects.

Use Feedburner

Your RSS feeds are a silent killer that could quietly be eating 20 to 40 percent of your bandwidth. RSS readers automatically pull the latest posts from your site and aggregate them for your subscribers. They do this often and, many times, inefficiently.

You can use a service like Feedburner to handle this traffic for you. A word of warning for established sites: you’ll need to redirect your existing subscribers (preferably, automatically and seamlessly) to your new feeds.

If you don’t want to entrust your RSS feed to a third party, another option is to serve your feeds from another server or hosting account.

Profile your Pages

If you use a profiler to look at what is required to be downloaded to run each of your blog pages, you might be surprised. You’ll see a mixture of scripts, stylesheets, images, and maybe Flash in the page. Yahoo! determined that 80 percent of the end-user response time is spent downloading these front-end components.

If you’re a Firefox user, download the Firebug extension and examine each of your blog pages. This is a great way to examine the effect of your plug-ins, images, styles, scripts, ads, and anything else you might be serving to the public.

If you don’t use Firefox, you can get a lite profile by using a web-based profiler.

Trim the fat!

Any static file can be offloaded to another server. Aside from images (mentioned above), you can also offload JavaScript, CSS, Flash, and other static files. Amazon S3, Steady Offload, and even another hosting account or server are all options.

YSlow

Yahoo! also has a Firefox extension called YSlow. Its analysis of your page is based on 13 strategies that they use to optimize their pages for download. To use it you must first have Firebug installed. Read more about it on Yahoo’s Developer Network.

Reduce the Number of HTTP Requests

Reducing the number of HTTP requests required to render a page reduces server load.

Use Image Maps

Combine multiple, contiguous images into one image, and use an image map to link it. The file size will be the same, but you eliminate separate requests for each image. This is not a feasible strategy in all situations, but it should work for some.

Use CSS Sprites

You can combine all of the images in your page into a single image and then use CSS properties to display the desired image segment. Dave Shea has an article on A List Apart and there’s a good sprite generator from Ed Eliot and Stuart Colville (authors of High Performance Web Site Techniques).

Combine Files

If you use more than one stylesheet or more than one script, consider combining like types together. The weight of the page won’t change, but you’ll reduce the number of requests required to build the page.

Use External Files

To allow for caching, store your CSS and JavaScript in external files, and link or import them into the page being downloaded.

Optimize Images

Most bloggers don’t use image editing tools to resize or polish their images. They should. By using a high-quality web graphics program like Adobe Photoshop or Adobe Fireworks (formerly Macromedia), you can ensure that you’re using the appropriate file format (e.g. JPEGs for photos, GIFs for line art), and that your images are optimized for web viewing.

I often see JPEGs decrease in size from 40 KB to 20 KB with a negligible impact on image quality.

Turn off or Slow Down Comments

If your blog posts are receiving a large number of comments, you can do one of two things to temporarily ease the load on your server:

Temporarily turn off comments

Find which of your posts is receiving heavy comment traffic, and disable commenting on it until traffic has died down a bit.

  1. Open the WordPress administrative panel.
  2. Go to Manage | Posts.
  3. Deselect the “Allow comments” checkbox for each high-traffic post.

After a few hours or days, turn them back on again.

Update the Cache Less Frequently

Every time a new comment is posted, WP-Cache will try to update the static version of the page. Tell it not to. Comments might appear stale throughout the day, but it’s a better alternative than a server crash.

  1. Open the wp-content | plugins | wp-cache plugin folder.
  2. Open wp-cache-phase2.php for editing
  3. Comment out the line: add_action('comment_post', 'wp_cache_get_postid_from_comment', 0);

Pages will now not be regenerated every time a new comment is entered. After a few hours or days, uncomment the line. Thanks to Simple Thoughts blog for this tip.

Monitor your Site

This is not really an optimization tip, but monitoring your server will at least tell you when you’re in trouble. Think of it as a first line of defense or an early warning system. You know the call from the host will be coming!

I use iWeb for dedicated servers, and they have a free monitoring tool that’s simple to set up. Create an account, add an address to monitor, and then specify ports to watch and query intervals. Common ports include 80 (web), 8080 (Tomcat), 3306 (MySQL), and 110 (mail).
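If your host doesn’t offer a monitoring tool, the basic idea of a port check is easy to sketch yourself. The Python below is a minimal illustration, not how any particular monitoring service works; the function names are made up, and www.example.com in the usage comment is a placeholder for your own server.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port number to True (reachable) or False (not reachable)."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Example (placeholder host name):
# print(check_ports("www.example.com", [80, 8080, 3306, 110]))
```

Run it from a machine other than the server itself, on a schedule (cron, for example), and send yourself a message when a port that should be open comes back False. A reachable port only tells you something is listening, not that your pages are rendering correctly.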

That’s it for part 2 of this series. Tune in tomorrow for the third and final installment.


Optimizing Server Performance, Part 1

April 1, 2008

Part 1 of 3: Part 1 | Part 2 | Part 3

I recently compiled a list of tips for people interested in easy ways to optimize server performance. The tips were directed at WordPress users, but many carry over to traditional web applications as well. After all, a blog is a specific kind of web site, so performance improvement tips for a blog will also improve the performance of a web site. That said, you will certainly get more value out of these posts if you’re running WordPress and / or a MySQL database server. There are quite a few tips, so I’ve broken this post into multiple parts (yes, another one of those!).

My goal with this series of posts is to give you some fast and easy ways to prepare for traffic spikes. Of course, a crash due to a traffic spike is a good problem to have, but it would be better still if we could avoid it altogether. So to get you started quickly, try to follow these tips in order; i.e. start at the top of post one and work your way through each post. The tips near the top will give you the most bang for your buck, according to my determination of the trade-offs among tip priority, tip complexity, and time required for implementation. And keep tuning until you feel you’ve done enough to ensure that your server is able to handle the traffic. As we move lower down in this list, we’ll reach a point of diminishing returns. Each additional tweak could take more and more time for less and less improvement.

These posts contain the tips that I’ve picked up over the past several years. Some are from experimentation. Some were learned during consulting engagements. More are from administrators who have kindly shared their experiences with others. Many are from RTFM (“delving into the product documentation”). I’m not going to provide an overly deep explanation for each tip. You’re busy. You’ve got a site to administer.

I hope you find these tips useful.

Knowing When to Upgrade

Although it might be a hard decision to make, we must each make a determination about the viability of our current hardware and network. Not all servers, networks, or hosting accounts are created equally.

Server: your server must be physically able to handle high traffic loads. This means an adequate processor, plenty of RAM, and the often over-looked network card.

Network: hosts will have network limiters on their servers because they must parcel out their limited bandwidth to many servers at the data center.

Hosting account: your hosting account will have stated provisions for bandwidth. Larger hosts are usually much more generous with bandwidth. Some hosts will also make bursting provisions, which permit periodic spikes in traffic that are not maintained over long periods.

Upgrade Path

So if you’re using a shared hosting account (good ones are Lunar Pages and BlueHost) and your host is forgiving, you might need to be moved to a newer, more powerful server. You might also be forced to upgrade to a VPS plan. I am not a fan of entry-level VPS plans, so if you’re forced to go this route, don’t choose the least expensive plan and hope to maintain the response times you have with a shared account. A good VPS provider is Spry. Another option is to try moving to a host that uses grid service or cloud hosting, like Mosso or Media Temple.

If you’re using a VPS plan, you might need to consider upgrading to a more expensive plan that gives you a larger share of the processor and RAM. Or at least negotiate for an increased burst rate. If you’ve outgrown VPS entirely, you’ll need to move to a dedicated server. Good providers are iWeb and Liquid Web.

If you’re already using a dedicated server, you might need to look into upgrading your network card, adding RAM, using more and multi-core processors, or upgrading your hard drives. Or, you might need to add additional servers and configure a load-balancing solution. Or move MySQL to a server or servers on its / their own.

Hard decisions, especially when each step up the ladder involves increasing amounts of money.

Enough with this, let’s get started with the tips!

Install WP-Cache

WP-Cache is a WordPress plug-in that can have a dramatic impact on the performance of your blog. By dramatic, I mean anywhere from a 1,000 to 10,000 percent performance improvement, reducing response times by a few tenths of a second in many cases.

By default, all pages requested from WordPress are built dynamically; with WP-Cache, when a visitor requests a page or post, a stored, static version of the requested page is presented instead.

  1. Download the WP-Cache plug-in.
  2. Upload and extract it in your plug-ins folder.
  3. Activate it in your WordPress administrative panel.
  4. In Options | Reading, make sure that gzip compression is not selected.
  5. Open the Options | WP-Cache subtab, and it will attempt to configure itself.

Deactivate Plug-ins

It is possible to use too many plug-ins. Each plug-in requires additional server resources and processing time, and some require the use of additional software.

In all situations, a good practice is to only use plug-ins that you need to help you administer your blog, or that enhance the experience of your reader. In high-traffic situations, you need to be ruthless about trimming the fat.

One of WordPress’s nice features is being able to quickly activate and deactivate plug-ins. Go through your list and see if any plug-ins could be removed temporarily while you meet a spike in traffic.

Use Your Visitors’ Browser Caches

First-time visitors to your page need to download all of the page components (HTML, CSS, JavaScript, Flash, images, etc.) before the page can be viewed, and this can require a fair amount of bandwidth.

But you can use the “Expires” or “Cache-Control” headers to make sure that their browsers cache those components. This won’t help with the first page request, but all additional page requests will avoid downloading those same resources again.

There are quite a few considerations to look at here, but many can be avoided if we just keep things simple. It won’t be optimal, but it will give you the most bang for your buck.

  1. Go to Presentation | Theme Editor in your WordPress administrative panel.
  2. Click the “Header” link on the right.
  3. At the very top (this must come before any other output), enter:
   <?php
       // Let browsers cache this page for two days (172,800 seconds).
       header("Cache-Control: max-age=172800, must-revalidate");
       $strExpires = "Expires: " . gmdate("D, d M Y H:i:s", time() + 172800) . " GMT";
       header($strExpires);
   ?>

You can also set an expiration date far into the future, but you run the risk that the user never sees the content that you’ve updated unless the file name changes. Setting expiration to two days (172,800 seconds) is a good tradeoff to get you past any traffic surges. Adjust as necessary.
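If you want to see what those two headers look like for a given lifetime before touching your theme, the short Python sketch below builds the same pair of values. It is only a way to preview the output; the function name cache_headers is made up for illustration and none of this is part of WordPress.

```python
import time
from email.utils import formatdate


def cache_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires values for the given lifetime."""
    if now is None:
        now = time.time()
    return {
        "Cache-Control": f"max-age={max_age_seconds}, must-revalidate",
        # formatdate(..., usegmt=True) yields an HTTP-style date ending in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


for name, value in cache_headers(172800).items():
    print(f"{name}: {value}")
```

Change 172800 to try other lifetimes; 86400 is one day and 604800 is one week.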

Optimize your Database

The larger your database, the more advantageous this tip will be.

  1. Open phpMyAdmin
  2. Open your WordPress database (not information_schema)
  3. Perform a backup just in case
    1. Click the “Export” tab
    2. Check the “Add DROP TABLE / DROP VIEW” checkbox
    3. Check the “Complete inserts” checkbox
    4. Check the “Save as File” checkbox
    5. Click the “Go” button and download your backup
  4. Optimize your tables
    1. Click the “Structure” tab
    2. Click the “Check all” link below the table list to select all tables
    3. In the “With selected” drop-down box, select the Optimize tables option.

Serve your Images from Somewhere Else

This tip is often a tough sell, because it does require more time than the others to implement; however, its effect can be large.

Images eat a lot of your bandwidth, especially in blogs, and result in additional requests to your server. If you find that you use images in your posts, consider serving them from another server.

A user’s browser can only open two connections to one address at any one time. Offload your images to a second address, and your viewers’ browsers can open four simultaneous connections.

So your server is using less bandwidth and is being hit with fewer requests for resources, and your viewers’ browsers are receiving twice as much content simultaneously. Not bad.

A service I would wholeheartedly recommend is Amazon S3; to a lesser extent, Steady Offload and Flickr.

Amazon opens up their server farms for several uses, one being their S3 storage service. You can access Amazon’s rock solid reliability for a very small price: $0.15 per GB per month for storage, and $0.20 per GB of bandwidth.

Assuming your average blog page contains 100 KB worth of images, you currently have 100 posts in your blog, and you post at a rate of 100 posts per year, your image storage cost is only a few cents per year. Your bandwidth costs would work out to $2 per 100,000 visits.
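To plug in your own numbers, here is the arithmetic as a small Python sketch. The prices are the ones quoted above; the function names are made up, and the sketch assumes decimal units (1 GB = 1,000,000 KB) and that every visit downloads one page’s worth of images with no browser caching.

```python
KB_PER_GB = 1_000_000  # decimal units, an assumption for simplicity


def monthly_storage_cost(posts, kb_per_post, price_per_gb_month=0.15):
    """Dollars per month to store the images for the given number of posts."""
    return posts * kb_per_post / KB_PER_GB * price_per_gb_month


def bandwidth_cost(visits, kb_per_visit, price_per_gb=0.20):
    """Dollars of transfer for the given number of page visits."""
    return visits * kb_per_visit / KB_PER_GB * price_per_gb


print(f"Storage, 100 posts: ${monthly_storage_cost(100, 100):.4f} per month")
print(f"Bandwidth, 100,000 visits: ${bandwidth_cost(100_000, 100):.2f}")
```

For the first 100 posts, storage comes to a fraction of a cent per month (0.01 GB at $0.15), and 100,000 visits transfer about 10 GB for $2. Bandwidth, not storage, is what you will actually notice on the bill.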

Steady Offload is an interesting service that mirrors your static content on their site. It’s effortless to set up, and you only pay for bandwidth you use. For high-traffic situations, I still would prefer Amazon’s reliability.

Flickr is a great option if you have a personal or non-commercial blog. They have restrictions against professional and corporate use, which can limit its usefulness for some. Also, it’s an image-sharing site, which might be a concern. On one hand, your images get more traffic, on the other hand, your images could be “involuntarily shared” with others.

I’d recommend that you stay away from ImageShack and PhotoBucket, although both are popular. ImageShack’s hourly bandwidth limit makes it useless during traffic spikes, and both services are frequently blocked by corporate firewalls. It’s never a good idea to alienate portions of your audience.

Don’t Touch your MyISAM Tables!

This is not a to-do tip, but rather a not-to-do tip. It had to be mentioned, as it’s a very common source of questions with folks who are just digging into MySQL. Many people contend that converting your MySQL MyISAM tables to InnoDB tables will have a large impact on performance, and this is true in certain situations.

InnoDB tables feature row-level locking, meaning that when a row in a table is being edited, only that row must be locked. In MyISAM tables, the entire table must be locked to edit a row.

In blogs, the only time you would require table locking is when content is being added, e.g. a comment or a blog post. As readers outnumber commenters and posters by an overwhelming majority, using InnoDB tables for a WordPress blog would actually hurt performance, not improve it.

If you were considering it, don’t do it!

That’s it for today. Check back in tomorrow for part 2 of this post.


Dash Express: Next Generation GPS Navigation

March 28, 2008

If you’re a fan of GPS navigation systems, this is the most compelling reason to upgrade that I’ve yet seen; if you’ve been on the fence about whether to try one, this just might push you over the edge.

Dash Express is a navigation system with built-in WiFi, better routing capabilities, traffic data, and a boatload of minor feature improvements, combined with an online control panel to expand your reach outside of the car.

The old way to navigate: at home, find the address of where you want to visit, write it down on a sticky pad, and head out to your car. Punch in that address and off you go. If they’re closed, you’re out of luck. Hit a bad patch of traffic, too bad.

Dash Express navigation: hop in your car, search Yahoo! Local for a destination, and get routed. Scan ahead for any traffic backups and reroute as needed. Change your mind as needed. Alert your friends (who are also Dash users) of where you’re heading to, or get updates in your unit of locations sent to you by your friends. Or rewind to the beginning. Find your destination address on your home PC, right-click it in your web browser, then select “Send 2 Car.” When you fire up Dash, the address will be waiting for you.

Dash is to standard GPS navigation systems what Google Maps was to MapQuest. One of the things I always thought was “broken” in my nav system is its routing. To reroute, I’d have to pull up the turn-by-turn directions and delete some of the waypoints, hoping I deleted the correct ones to force a reroute. Dash displays multiple routes and lets you choose from them. The UI has some significant improvements as well; at a glance it just looks more organized and intuitive.

Dash isn’t cheap at $399, but it sure is pretty! And maybe you can make up for some of that cost with one of Dash’s built-in features: find the cheapest gas prices in town, and get routed there. Plus, if it ever gets stolen, you can remotely disable it so the thieves can’t enjoy your new toy.

Find it at Amazon

Nifty hack: turn text into images with CSS

February 18, 2008

This is a pretty slick CSS effect — hiding images in plain sight. If you’re using Safari or Firefox, this should work for you.

This post looks like plain text, right? It is, except that many styles have been applied to the letters. See the styles and the overall effect by selecting all of the text in this paragraph.

Interesting, no? Try it for yourself at http://metaatem.net/highlite/. You can select the text width and color to control resolution. BTW, this is the image that is being used as a source:

wyatt-beach1.jpg

Swirl Connect: location-based mobile social software

January 28, 2008

We have quietly released Swirl Connect, software for your mobile phone that helps you stay connected with your friends and their latest activity, as well as find new people and places, whether you’re mobile or at home on your PC.

The current release is slightly hobbled, as $$$$ is tight, and we had to turn off the messaging features, but there’s still plenty to do and see. Here are a few features:

  • Find your friends and get alerted to their latest activity
  • Explore nearby places of interest
  • View and share photos, notes, and places on your PC or mobile phone
  • Mobile instant message or group message with your friends
  • Meet new people while you’re on the move
  • Get location-based alerts
  • Interact with both PC and mobile users in real time

Try it out! It’s free, supports popular Nokia, Sony-Ericsson, and Motorola phones (with plenty more coming), and is a lot of fun.

swirl-shot-11.png swirl-shot-21.png

I Can’t Imagine a Worse Suggestion

January 3, 2008

I was searching for relevant keywords for Swirl Connect. I don’t recall
exactly what I entered, but I was using keywords similar to “mobile
social software, friend finder, geotag photos, mobile instant
messaging, location based, share photos, share place.” I don’t know
what kind of leap in logic resulted in this suggestion:

interesting-keyword-selection1.gif

Credit: this is posted on the Swirl Connect site, as I was generating keywords for it when it appeared. Swirl Connect is location-based mobile social software that helps you stay connected with friends and meet new people while you’re on the move.

Geocoding based on an IP Address

November 13, 2007

Okay, so I’m on a bit of a geocoding kick here. Previous posts have discussed geocoding when you have a physical street address. But obtaining an address can be obtrusive, and the dataset used is North America-centric. This post focuses on very quickly geocoding a user’s location based on their originating IP address with MaxMind’s GeoLite City database and Java API.

There are a number of reasons why you might want to determine a user’s location based on their IP address:

  • Center a map mashup on the user’s location
  • Serve localized content, e.g. language, currency, time
  • Reduce credit card fraud (this seems to be the most commercial use at present)
  • Target marketing and ads

The biggest problem with geocoding by IP address is that it can be inaccurate for many IP addresses. This is because the coordinates for a given IP address are for the organization that owns the IP address block, and not necessarily the location of the end user of that IP address. Complicating this further are those users who connect via a proxy, e.g. AOL users. So private IPs, VPNs, proxied browsers, internal network blocks, and so on are difficult to geocode.
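One cheap safeguard is to skip the lookup entirely for addresses that can never be geocoded, such as private or loopback ranges. The Python sketch below shows the idea using the standard ipaddress module; it is independent of MaxMind’s API, the function name is made up for illustration, and it cannot detect proxies or VPNs.

```python
import ipaddress


def is_geocodable(ip_string):
    """Return False for addresses that have no meaningful public location."""
    try:
        addr = ipaddress.ip_address(ip_string)
    except ValueError:
        # Not a valid IPv4 or IPv6 address at all.
        return False
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )
```

An internal address like 192.168.1.10 or 10.1.2.3 is rejected, while a public address passes and can be handed on to the geocoder. When the check fails, fall back to a sensible default such as a country-level map or a prompt for the user’s location.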

Using the GeoLite City Database on Your Server

You have two download options: CSV and binary. If your project requires that you import data into MySQL, you can use the CSV option, but it is much slower and requires more effort to set up. Binary is your best bet, and is what I used.

I installed the GeoLite City binary on both a Windows development PC and a Linux server. On Windows, download to your PC, and extract using WinZip or similar tool. On Linux:

$ wget http://www.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
$ gunzip GeoLiteCity.dat.gz
$ mv GeoLiteCity.dat /path/to/database/location/GeoLiteCity.dat

Import the Java API into your current project in your IDE of choice. I’m using JBuilder (boo, hiss). There are also APIs for C, Perl, PHP, C#, Ruby, Python, VB.NET, Pascal, and JavaScript.

Using the Database (IP Address to Latitude and Longitude)

1) In your class file, create a LookupService object, specifying the location of the database you extracted. Then create a Location object for the IP address you want to geocode:

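// At the top of your class file:
import java.io.IOException;
import com.maxmind.geoip.Location;
import com.maxmind.geoip.LookupService;

LookupService lookup = null;
try {
    // Load the binary database and cache it in memory
    lookup = new LookupService(PATH_TO_DATA, LookupService.GEOIP_MEMORY_CACHE);
} catch (IOException e) {
    System.out.println(e.getMessage());
    e.printStackTrace(System.err);
}

Location location = lookup.getLocation("62.75.185.174");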

2) Then you can extract location information from the Location object, including country, region, city, postal code, latitude, longitude, area code, and timezone.

String city = location.city;
float latitude = location.latitude;
float longitude = location.longitude;

If you create two Location objects from LookupService, you can calculate distance between them with:

double distance = location1.distance(location2);
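If you are curious what a distance calculation between two latitude/longitude pairs involves, the haversine formula is the usual starting point. The Python sketch below is my own illustration of that formula; I have not checked how the MaxMind API computes its result or which units it returns, and the 6,371 km Earth radius is an assumed mean value.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Two points a quarter of the way around the equator, (0, 0) and (0, 90), come out at one quarter of the Earth’s circumference, about 10,008 km. For IP-based locations the geocoding error is usually much larger than anything the formula itself introduces.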

3) Remember to close the database connection. Data access is thread safe, by the way.

lookup.close();

Note that to use GeoLite City on a public web site, you must include the line “This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/” in any documentation or promotional materials.


Apache and Tomcat via Cpanel - Servlet Display Problems

November 7, 2007

Problem

Apache Web Server was not passing servlet requests to Apache Tomcat. Instead it served 404 errors, even though the Apache Tomcat Connector (JK 1.2, mod_jk) was auto-configured by the WHM / Cpanel installation.

Other Possible Descriptions of the Problem

  • JSPs work in Tomcat, but servlets do not
  • Apache HTTP Server won’t pass servlet requests to Tomcat
  • Tomcat problems using the Cpanel plugin
  • Virtual host configuration problem with Cpanel Tomcat
  • Apache not recognizing servlets
  • Servlets can’t be accessed through Apache

In the latest Cpanel Release (11.15.0-RELEASE 17853), Tomcat support has been integrated. Prior to this (I’m not sure for how long), Tomcat was available via a beta plug-in. I experienced this problem with both the beta plug-in and the integrated support.

What’s Happening

Apache HTTP Server accepts all web requests and determines which are requests for static content, and which requests should be forwarded to Tomcat.

Apache correctly serves static content, and correctly passes all requests for .jsp pages to Tomcat. But when a servlet is requested, e.g. www.myserver.com/myapp/myservlet, Apache looks for the “/myapp/myservlet” directory, and finding none, spits out a 404 error.

How to Resolve the Problem

I tried several things that I thought should work but did not, though I don’t know if it was due to my specific configuration or because they were just the wrong things to do. What finally solved the problem was just adding an .htaccess file to the root of the web application with the following lines:

SetHandler jakarta-servlet
SetEnv JK_WORKER_NAME ajp13

This forces Apache to forward all requests to resources within this context to Tomcat for processing, specifically to worker ajp13. Ajp13 is one of the default workers set up, and is defined (on my system) in /usr/local/jakarta/tomcat/conf/workers.properties.

Other things that I thought should work but didn’t (your mileage may vary):

1) In /etc/httpd/conf/jk.conf (if your httpd.conf file includes jk.conf), add or edit the switch “+ForwardDirectories”. Normally, if Apache runs across a directory it doesn’t recognize, it will spit out a 404. This switch says to forward those requests to Tomcat, and let Tomcat spit out a 404 if it can’t fulfill the request.

2) In /etc/httpd/conf/jk.conf (if your httpd.conf file includes jk.conf), specifically mount each context, and unmount static content. Mounting tells Apache to pass requests to Tomcat, and unmounting tells Apache to serve the content itself. Newer versions of Tomcat are faster than Apache at serving static content, but apparently, using Apache to serve static content is safer from a security perspective.

JkMount /mywebapp/* ajp13
JkUnMount /*.gif ajp13
JkUnMount /*.jpg ajp13
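As a mental model of how the mount and unmount rules above combine, here is a small Python sketch. It is a simplification for illustration only: it uses Python’s fnmatch wildcards, which may not match mod_jk’s real matching rules in every case, and the function name handled_by is made up.

```python
from fnmatch import fnmatchcase

# Patterns taken from the JkMount / JkUnMount lines above.
MOUNTS = ["/mywebapp/*"]
UNMOUNTS = ["/*.gif", "/*.jpg"]


def handled_by(path, mounts=MOUNTS, unmounts=UNMOUNTS):
    """Return 'tomcat' if the path is mounted and not unmounted, else 'apache'."""
    mounted = any(fnmatchcase(path, pattern) for pattern in mounts)
    excluded = any(fnmatchcase(path, pattern) for pattern in unmounts)
    return "tomcat" if mounted and not excluded else "apache"
```

Under this model, /mywebapp/myservlet goes to Tomcat, /mywebapp/images/logo.gif stays with Apache because of the unmount rule, and anything outside /mywebapp/ is never forwarded at all.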

A separate issue that is outside the scope of this post is whether you should use Apache Web Server to front Tomcat requests, or whether you should just have Tomcat accept requests over port 80. If you’re using a recent Tomcat version (5.5+), Tomcat can serve both static and dynamic content faster than Apache.

Use Tomcat alone if: 1) you’re only dealing with a single server; and 2) you’re not using any other software that requires Apache (e.g. forums or wikis written in PHP).

Use Apache to front Tomcat requests if: 1) you want to load balance across multiple servers; or 2) you want different web applications or virtual hosts to be served by different processes.

Disclaimer

I know embarrassingly little about hardware, networking, or server setup. This solution might be a hack or it might be obvious to some more familiar with the components mentioned, but it couldn’t be resolved through a dedicated server help desk or through the Cpanel help desk, so I assume there are others out there that this could help. Everything in this post is based on my limited experience with the aforementioned software and my own research. If you know of a better, cleaner way to do this, or if you know how better to describe this problem or solution, please forward to me and I’ll amend this post.

Rationale Behind this Post

I recently (as in yesterday) resolved a difficult-to-diagnose problem involving Cpanel, Apache Web Server, Apache Tomcat, and the Apache Tomcat Connector (JK 1.2, mod_jk). My googling skills tend to be above average, but I could find no reference to this specific problem anywhere, and the sole purpose of this post is to hopefully save someone else the aggravation. So please disregard the keyword-heavy text; it’s altruistic in nature, I assure you.

Geocoding with the Google Geocoder

November 5, 2007

Before using the Google Geocoder, you must have a Google Maps API key. It will not work without one. If you don’t yet have one, get yours via the Google Maps API page. Also, to get this out of the way, Google has provided a fantastic service free of charge for non-commercial purposes. Please respect their terms of service.

Download the free Google Geocoder

The Google Geocoder is very similar to the Quick ‘n Dirty Geocoder. It is free software, and can be used and distributed however you like. It installs in a servlet container, and accesses the Google Maps web service to translate the names and addresses you supply (in a text file) into geographic coordinates, which it then writes back to your PC and/or a database.

For details on how to install and use this software, frequently asked questions, configuration, etc., refer to the wiki article. The short version: just drop goog.war into Tomcat’s webapps directory, start it, then follow the instructions at http://localhost:8080/goog/code. This post describes some of the Java code used to implement the communication between the Google Geocoder and the Google Maps web service.

