Free Geocoders / Geocoding Posts

December 28, 2007

Several weeks ago I was making changes to an as-yet-unreleased wiki, and to protect it from prying eyes I locked it up from all IP addresses except my own. My IP address was just updated, and looky looky, I can no longer access the wiki, which brought that permission change to the forefront.

Problem is, I posted several articles on some free geocoders I wrote. Just simple tools that you can use to geocode addresses via Google’s Map API or the Geocoder.us service. Those tools are only available on the wiki, so I’m afraid that everyone was met with a ‘403 access denied’ error when attempting to access them.

Everything is open again; please accept my sincerest apologies. There’s nothing I hate more than wasting my time, so I hope I didn’t waste too much of yours.

Geocoding based on an IP Address

November 13, 2007

Okay, so I’m on a bit of a geocoding kick here. Previous posts have discussed geocoding when you have a physical street address. But obtaining an address can be obtrusive, and the dataset used is North America-centric. This post focuses on very quickly geocoding a user’s location based on their originating IP address with MaxMind’s GeoLite City database and Java API.

There are a number of reasons why you might want to determine a user’s location based on their IP address:

  • Center a map mashup on the user’s location
  • Serve localized content, e.g. language, currency, time
  • Reduce credit card fraud (this seems to be the most commercial use at present)
  • Target marketing and ads

The biggest problem with geocoding by IP address is that it can be inaccurate for many IP addresses. This is because the coordinates for a given IP address are for the organization that owns the IP address block, and not necessarily the location of the end user of that IP address. Complicating this further are those users who connect via a proxy (AOL users, for example). So private IPs, VPNs, proxied browsers, internal network blocks, and so on are difficult to geocode.
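One cheap safeguard is to skip the lookup entirely for addresses that cannot map to a real-world location. The following is a minimal sketch using only the JDK's java.net.InetAddress; the class and method names (IpFilter, isUnroutable) are made up for illustration, and the check says nothing about proxies or VPNs, which look like ordinary public addresses.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of the MaxMind API.
final class IpFilter {

    /** Returns true for addresses that have no meaningful geographic location. */
    static boolean isUnroutable(String ip) {
        final InetAddress addr;
        try {
            // For a literal IP address, getByName only validates the format (no DNS lookup).
            addr = InetAddress.getByName(ip);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP address: " + ip, e);
        }
        return addr.isAnyLocalAddress()    // wildcard address, e.g. 0.0.0.0
            || addr.isLoopbackAddress()    // e.g. 127.0.0.1
            || addr.isLinkLocalAddress()   // e.g. 169.254.x.x
            || addr.isSiteLocalAddress();  // private ranges, e.g. 10.x.x.x, 192.168.x.x
    }
}
```

Calling IpFilter.isUnroutable(ip) before the database lookup lets you fall back to a default location instead of trusting a meaningless result.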

Using the GeoLite City Database on Your Server

You have two download options: CSV and binary. If your project requires that you import data into MySQL, you can use the CSV option, but it is much slower and requires more effort to set up. Binary is your best bet, and is what I used.

I installed the GeoLite City binary on both a Windows development PC and a Linux server. On Windows, download the file to your PC and extract it using WinZip or a similar tool. On Linux:

$ wget http://www.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
$ gunzip GeoLiteCity.dat.gz
$ mv GeoLiteCity.dat /path/to/database/location/GeoLiteCity.dat

Import the Java API into your current project in your IDE of choice. I’m using JBuilder (boo, hiss). There are also APIs for C, Perl, PHP, C#, Ruby, Python, VB.NET, Pascal, and JavaScript.

Using the Database (IP Address to Latitude and Longitude)

1) In your class file, create a LookupService object, specifying the location of the database you extracted. Then create a Location object for the IP address you want to geocode:

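// Imports needed at the top of the class file:
//   java.io.IOException
//   com.maxmind.geoip.Location
//   com.maxmind.geoip.LookupService
LookupService lookup = null;
try {
    // GEOIP_MEMORY_CACHE loads the database into memory for faster lookups
    lookup = new LookupService(PATH_TO_DATA, LookupService.GEOIP_MEMORY_CACHE);
} catch (IOException e) {
    System.out.println(e.getMessage());
    e.printStackTrace(System.err);
}

Location location = lookup.getLocation("62.75.185.174");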

2) Then you can extract location information from the Location object, including country, region, city, postal code, latitude, longitude, area code, and timezone.

String city = location.city;
float latitude = location.latitude;
float longitude = location.longitude;

If you create two Location objects from LookupService, you can calculate distance between them with:

double distance = location1.distance(location2);
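If you only have raw coordinates rather than two Location objects, you can compute a great-circle distance yourself. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; it is an independent illustration, not the formula or units that the MaxMind distance method is known to use, and the class name GreatCircle is made up.

```java
// Hypothetical helper; independent of the MaxMind API.
final class GreatCircle {

    // Mean Earth radius in kilometers (an approximation; the Earth is not a perfect sphere).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Haversine distance in kilometers between two points given in decimal degrees. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }
}
```

One degree of longitude along the equator comes out at roughly 111 km, which is a handy sanity check.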

3) Remember to close the database connection. Data access is thread safe, by the way.

lookup.close();

Note that to use GeoLite City on a public web site, you must include the line “This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/” in any documentation or promotional materials.

[Read more]

Apache and Tomcat via Cpanel - Servlet Display Problems

November 7, 2007

Problem

Apache Web Server was not passing servlet requests to Apache Tomcat. Instead it served 404 errors, even though the Apache Tomcat Connector (JK 1.2, mod_jk) was auto-configured by the WHM / Cpanel installation.

Other Possible Descriptions of the Problem

  • JSPs work in Tomcat, but servlets do not
  • Apache HTTP Server won’t pass servlet requests to Tomcat
  • Tomcat problems using the Cpanel plugin
  • Virtual host configuration problem with Cpanel Tomcat
  • Apache not recognizing servlets
  • Servlets can’t be accessed through Apache

In the latest Cpanel Release (11.15.0-RELEASE 17853), Tomcat support has been integrated. Prior to this (I’m not sure for how long), Tomcat was available via a beta plug-in. I experienced this problem with both the beta plug-in and the integrated support.

What’s Happening

Apache HTTP Server accepts all web requests and determines which are requests for static content, and which requests should be forwarded to Tomcat.

Apache correctly serves static content, and correctly passes all requests for .jsp pages to Tomcat. But when a servlet is requested, e.g. www.myserver.com/myapp/myservlet, Apache looks for the “/myapp/myservlet” directory, and finding none, spits out a 404 error.

How to Resolve the Problem

I tried several things that I thought should work but did not, though I don’t know if it was due to my specific configuration or because they were just the wrong things to do. What finally solved the problem was just adding an .htaccess file to the root of the web application with the following lines:

SetHandler jakarta-servlet
SetEnv JK_WORKER_NAME ajp13

This forces Apache to forward all requests to resources within this context to Tomcat for processing, specifically to worker ajp13. Ajp13 is one of the default workers set up, and is defined (on my system) in /usr/local/jakarta/tomcat/conf/workers.properties.

Other things that I thought should work but didn’t (your mileage may vary):

1) In /etc/httpd/conf/jk.conf (if your httpd.conf file includes jk.conf), add/edit the switch “+ForwardDirectories”. Normally, if Apache runs across a directory it doesn’t recognize, it will spit out a 404. This switch says to forward those requests to Tomcat, and let Tomcat spit out a 404 if it can’t fulfill the request.

2) In /etc/httpd/conf/jk.conf (if your httpd.conf file includes jk.conf), specifically mount each context, and unmount static content. Mounting tells Apache to pass requests to Tomcat, and unmounting tells Apache to serve the content itself. Newer versions of Tomcat are faster than Apache at serving static content, but apparently, using Apache to serve static content is safer from a security perspective.

JkMount /mywebapp/* ajp13
JkUnMount /*.gif ajp13
JkUnMount /*.jpg ajp13

A separate issue that is outside the scope of this post is whether you should use Apache Web Server to front Tomcat requests, or whether you should just have Tomcat accept requests over port 80. If you’re using a recent Tomcat version (5.5+), Tomcat can serve both static and dynamic content faster than Apache.

Use Tomcat if: 1) you’re only dealing with a single server; and 2) you’re not using any other software that requires Apache (e.g. forums or wikis written in PHP).

Use Apache to front Tomcat requests if: 1) you want to load balance across multiple servers; or 2) you want different web applications or virtual hosts to be served by different processes.

Disclaimer

I know embarrassingly little about hardware, networking, or server setup. This solution might be a hack or it might be obvious to some more familiar with the components mentioned, but it couldn’t be resolved through a dedicated server help desk or through the Cpanel help desk, so I assume there are others out there that this could help. Everything in this post is based on my limited experience with the aforementioned software and my own research. If you know of a better, cleaner way to do this, or if you know how better to describe this problem or solution, please forward to me and I’ll amend this post.

Rationale Behind this Post

I recently (as in yesterday) resolved a difficult-to-diagnose problem involving Cpanel, Apache Web Server, Apache Tomcat, and the Apache Tomcat Connector (JK 1.2, mod_jk). My googling skills tend to be above average, but I could find no reference to this specific problem anywhere, and the sole purpose of this post is to hopefully save someone else the aggravation. So please disregard the keyword-heavy text; it’s altruistic in nature, I assure you.

Geocoding with the Google Geocoder

November 5, 2007

Before using the Google Geocoder, you must have a Google Maps API key. It will not work without one. If you don’t yet have one, get yours via the Google Maps API page. Also, to get this out of the way, Google has provided a fantastic service free of charge for non-commercial purposes. Please respect their terms of service.

Download the free Google Geocoder

The Google Geocoder is very similar to the Quick ‘n Dirty Geocoder. It is free software, and can be used and distributed however you like. It installs in a servlet container, and accesses the Google Maps web service to translate the names and addresses you supply (in a text file) into geographic coordinates, which it then writes back to your PC and/or a database.

For details on how to install and use this software, frequently asked questions, configuration, etc., refer to the wiki article. The short version is just drop goog.war into Tomcat’s webapps directory, start it, then follow the instructions at http://localhost:8080/goog/code. This post describes some of the Java code used to implement the communication between the Google geocoder and the Google Maps web service.

[Read more]

Geocoding with the Quick ‘n Dirty Geocoder

November 1, 2007

The geocoder described in this post is free software, and can be used and distributed however you like.

It installs in a servlet container like Tomcat, and accesses the geocoder.us web service to translate the names and addresses you supply (in a text file) into geographic coordinates, which it then writes back to your PC and/or a database.

Download the free geocoder

For details on how to install and use this software, frequently asked questions, configuration, etc., refer to the wiki article. The short version is just drop geo.war into Tomcat’s webapps directory, start it, then follow the instructions at http://localhost:8080/geo/code. This post describes some of the Java code used to implement the communication between the Q&D geocoder and the geocoder.us web service.

Most lines of code in this application handle IO on your own PC, rather than communication with the web service. The GeoServlet receives requests, and uses the GeoImporter to read in and parse addresses from the text file on your PC. It then uses the GeoTransport class to communicate with the geocoder.us service, and finally writes the results back to a file and/or database with the GeoExporter class.

[Read more]

Geocoding with Geocoder.us and Google Maps

October 30, 2007

Geocoding is the process of assigning geographic identifiers to map features; a specific example is assigning a latitude and longitude to a given street address. A common technique uses address interpolation. Using this method, if we know the range of house numbers along a street segment and the coordinates of that segment’s endpoints, we can interpolate the approximate location of a specific address.
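To make the interpolation idea concrete, here is a minimal sketch that places a house number along a straight street segment by linear interpolation. The class name, method signature, and sample numbers are all made up for illustration; real geocoders also deal with odd/even sides of the street and curved segments, which this ignores.

```java
// Hypothetical helper illustrating address interpolation on one straight segment.
final class AddressInterpolator {

    /**
     * Returns {latitude, longitude} for houseNumber, given the house numbers
     * and coordinates at the two endpoints of the segment.
     */
    static double[] interpolate(int houseNumber,
                                int fromNumber, double fromLat, double fromLon,
                                int toNumber, double toLat, double toLon) {
        if (fromNumber == toNumber) {
            // Degenerate segment: nothing to interpolate
            return new double[] { fromLat, fromLon };
        }
        // Fraction of the way along the segment, clamped to [0, 1]
        double t = (double) (houseNumber - fromNumber) / (toNumber - fromNumber);
        t = Math.max(0.0, Math.min(1.0, t));
        return new double[] {
            fromLat + t * (toLat - fromLat),
            fromLon + t * (toLon - fromLon)
        };
    }
}
```

For a segment running from number 100 at (40.000, -75.000) to number 200 at (40.002, -75.004), number 150 lands halfway along, at approximately (40.001, -75.002).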

The address information comes from the TIGER/Line files, which are extracts of selected geographic and cartographic information from the US Census Bureau’s TIGER (Topologically Integrated Geographic Encoding and Referencing) database.

So the task of a geocoder is to parse an address for street numbers, names, cities, states, and zip codes, and then interpolate the coordinates of that address by finding its endpoints in the dataset. I recently used two geocoders, Google Maps and Geocoder.us, and thought I’d share the results of my work along with free software that you can use to geocode your own addresses.
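The parsing half of that task can be sketched with a single regular expression. This is a deliberately naive illustration that assumes a tidy "number street, city, ST zip" layout; the class name AddressParser is made up, and it is not how Google Maps or Geocoder.us parse addresses, which have to cope with far messier input.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses only well-formed US addresses of one fixed shape.
final class AddressParser {

    // number, street, city, two-letter state, 5-digit zip (optional +4 suffix is ignored)
    private static final Pattern ADDRESS = Pattern.compile(
        "^\\s*(\\d+)\\s+([^,]+),\\s*([^,]+),\\s*([A-Za-z]{2})\\s+(\\d{5})(?:-\\d{4})?\\s*$");

    /** Returns {number, street, city, state, zip}, or empty if the input does not match. */
    static Optional<String[]> parse(String address) {
        Matcher m = ADDRESS.matcher(address);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {
            m.group(1),
            m.group(2).trim(),
            m.group(3).trim(),
            m.group(4).toUpperCase(),
            m.group(5)
        });
    }
}
```

Anything that does not fit the pattern comes back empty, which is the point at which a real geocoder would fall back to fuzzier matching.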

[Read more]

Job Scheduling with Quartz

September 1, 2006

I’ve recently become enamored with the Quartz job scheduling framework from OpenSymphony. While Quartz can be used in many types of applications, I ran across it looking for a solution to fire scheduled events in web applications. I wanted a Cron replacement that allowed me to instantiate and run any Java class in my application.

Before Quartz, I relied primarily on Caucho Resin’s runat tag to fire events. By adding a runat tag to the application’s web.xml file, you can call a servlet extending GenericServlet. Using Timers or application server-specific solutions is another option, but for the current project, I really wanted to start out with Tomcat, and I wanted the option to easily migrate to an application server if the need arose.

Quartz just works, and integrating it with an existing project was a snap. I approach most new tools and frameworks cautiously. The trifecta of feature-rich, easy to configure, and mature doesn’t come along as often as one would like, and I’ve been continually impressed with how carefully executed Quartz is.

There are three pieces that make a Quartz application: jobs, triggers, and the scheduler.

  • Jobs: contain the logic or processing that you want to perform.
  • Triggers: determine when jobs are run.
  • Scheduler: the conductor that coordinates jobs and triggers.
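The division of labor can be illustrated without Quartz itself. The sketch below is not the Quartz API; it swaps in the JDK's ScheduledExecutorService so it runs with no extra libraries, with a Runnable standing in for the job, a fixed interval standing in for the trigger, and the executor standing in for the scheduler. The class and method names are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the job / trigger / scheduler roles, using only the JDK.
final class SchedulerSketch {

    /** Runs job a fixed number of times at the given interval; returns how many runs completed. */
    static int runRepeatedly(Runnable job, int times, long intervalMillis) {
        // "Scheduler": a single worker thread, so runs never overlap
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(times);

        // "Trigger": fire immediately, then once every intervalMillis
        scheduler.scheduleAtFixedRate(() -> {
            if (remaining.getCount() > 0) {
                job.run();              // "Job": the work itself
                remaining.countDown();
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);

        try {
            // Wait for all runs, with a generous timeout so we never hang forever
            remaining.await(times * intervalMillis + 2000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return times - (int) remaining.getCount();
    }
}
```

Quartz adds what this sketch lacks, such as cron-style triggers, listeners, and persistent job stores.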

Since I’ve started using Quartz, I’ve offloaded much of the on-demand processing into jobs of all sorts. Some jobs are used to perform routine maintenance, in a Ronco set it and forget it fashion, e.g. backups. Some jobs are used to generate pseudo-dynamic content, by running frequently in the background and generating static pages. This takes load off of the server, and conveniently aids in search engine spidering. Some jobs are traditional batch processing. For example, in the current project, a lot of geocoding is performed. Batching these requests, and running them every hour is much more efficient than firing them off on demand.
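The batching pattern mentioned above can be sketched as a queue that request handlers add to and a scheduled job drains. This is a simplified illustration with made-up names (GeocodeBatcher, submit, drain), not code from the project described here; the actual call to the geocoding service is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical batcher: collects addresses on demand, hands them out in batches.
final class GeocodeBatcher {

    // Thread-safe queue, so web requests can submit while a job drains
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    /** Called on demand, e.g. from a servlet: just records the address. */
    void submit(String address) {
        pending.add(address);
    }

    /** Called by the scheduled job: removes and returns up to maxBatchSize addresses, oldest first. */
    List<String> drain(int maxBatchSize) {
        List<String> batch = new ArrayList<>();
        String next;
        while (batch.size() < maxBatchSize && (next = pending.poll()) != null) {
            batch.add(next);
        }
        return batch;
    }

    /** Number of addresses still waiting to be geocoded. */
    int pendingCount() {
        return pending.size();
    }
}
```

An hourly job would call drain with a batch size that respects the geocoding service's rate limits and then geocode the returned addresses in one pass.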

Adding to the flexibility of Quartz is the ability to schedule jobs both programmatically and declaratively, the ability to use listeners, the ability to persist jobs in a JobStore, and very versatile triggers.

Quartz is most definitely worth a look-see. The latest stable release is 1.5.2, and 1.6.0 alpha was recently released. Both can be downloaded from OpenSymphony.

Call for Entries: JBoss Innovation Award

March 1, 2006

‘Tis the season, I suppose.

“We want to hear how developers have used JEMS products to improve existing processes, overcome technology challenges, and enhance their company’s bottom line. Winning projects across several categories will be promoted and recognized at the largest worldwide JBoss community event, JBoss World in Las Vegas, June 12-15, 2006.”

Major award categories are Partner Innovation Awards, Best Practices Innovation Award, and Technology Innovation Award. Entries are judged on creativity, impact, and presentation.

Tons of publicity and a free pass to JBoss World. Let’s face it, we all need a little Vegas now and again (and again).

Award Details | JBoss World Vegas 2006