Thursday, July 30, 2009

I can see my house.. - Part 2 : Open Source Geocoding

The drawback of relying on Google APIs is that they've got all these 'Terms of Use'. Unless you pay, you can only call the geocoder so often and can't use it for applications which aren't available freely to the public, and you can't store results except for caching.

What you could really do with is your own geocoding service and that might not be as far-fetched as it seems. If you have enough monkeys and typewriters, they'll eventually write Hamlet. In a more realistic timescale, if you have enough people driving around with GPS devices, they'll come up with a map. They have and it is called OpenStreetMap. It is still a work-in-progress (and always will be, as roads get built or redeveloped). You'll have to see if the data is sufficient for your area of interest. If not, buy a GPS and fix it.

You can get raw data from them, in an XML format called OSM. I had a look at that route and it is not ideal. I'll blog on that later. In the meantime, here's one some-else made earlier. Cloudmade have done a nice chunk of processing for us, and generated Shapefiles similar to those I loaded up in the prior post. From here, I can get a zip file containing shapefiles, one of which is 'australia_highway'. I load that the same way I did the others.

They do other countries too and boundaries as well as highways, so if the Australian postcodes/suburbs from the last post weren't useful to you, this resource may help out.

The local_geocode.sql is an example of how I've used the data to do some basic geocoding. To get that code to work in XE, you'd have to get rid of the call to "sdo_geom.sdo_closest_points" as it is new in 11g. Lets hope that OpenWorld will give us a hint on when we may see 11gR2 XE.

The data isn't at the level of Google. It doesn't do street numbers, for example. Some streets in the source OSM don't have names, and the street next to the one I live in is missing entirely. But if you are trying to determine the closest store for your customer based on their address, it could be good enough. Even if coverage isn't perfect, you may still be able to reduce the amount of calls you make to a "paid for" geocoder (which could reduce the amount you need to pay for it).

Warning: You may be tempted to switch to use sdo_geom.sdo_intersection to find where two streets cross. This is actually part of Oracle Spatial and isn't licensed for use with the basic Locator functionality that comes free with the database. In 10gR2, Locator only included the SGO_GEOM functions SDO_DISTANCE , VALIDATE_GEOMETRY_WITH_CONTEXT, VALIDATE_LAYER_WITH_CONTEXT. In 11g, it includes all subprograms EXCEPT for the following: SDO_GEOM.RELATE, SDO_GEOM.SDO_DIFFERENCE, SDO_GEOM.SDO_INTERSECTION, SDO_GEOM.SDO_UNION, SDO_GEOM.SDO_VOLUME, SDO_GEOM.SDO_XOR.

With XE, the licensing is less-defined. According to this, Spatial is not included with XE. Logically that would mean sdo_intersection can't be part of Spatial as it is included. And even if it is part of Spatial, then are the license restrictions in either the 10gR2 or 11g 'Spatial' documents applicable. Since though either would preclude sdo_intersection, the last question is probably academic.

A discussion of Spatial vs Locator licensing can be found here. As reported here, the cost of Spatial has risen from an arm and a leg to boths legs, one arm and an ear (or by 52% in dollar terms).

1 comment:

sydoracle said...

Code now located at http://www.sydoracle.com/Codespace