Archive for the ‘Geograph’ Category

Piclens viewer for Geograph

Thursday, June 19th, 2008

Piclens is an amazing browser plugin for viewing photos full screen in a fluid interface, minimising distractions so you can focus on the photos. You really have to try it to experience it, but below are a few screenshots. I highly recommend downloading it; it works on Internet Explorer and Firefox 2 and 3, amongst others. Even more interesting, thanks to the team at Piclens we have a special version that incorporates a Geograph search, so download Piclens using the special link below:

If you already have it installed, it might ask to reinstall.

Piclens screenshots
(click to enlarge – my graphics skills don’t do it justice)

Also of note: the search box uses Geograph’s new experimental full-text engine, so it should be quick and intuitive to use… And when viewing the search results, the display continues rightwards in a continuous 3D wall, so no paging is required.

As expected the piclens display page on Geograph still works, which can be used to visualise any standard results. Choose the ‘piclens’ option using the dropdown on the search page.

Reusing Geograph Images

Wednesday, April 23rd, 2008

The Creative Commons licence used for ALL (768,000+) Geograph Images was chosen for a very specific reason: it allows monetisation, which helps preserve the archive forever, but it also makes the archive freely available to all, at least as long as reuse is attributed.

As evidenced by the number of images posted in forums and blogs, and en masse on sites like Wikipedia, there is clearly interest in reusing our images, but it is also clear there is some confusion over how to credit photos, and many people also hotlink images directly from the Geograph servers.

To make it easier we have recently introduced a new page that helps explain the requirements. As it's linked directly from each photo, it is aware of the photo in question, so it can also provide snippets of code ready for copy/pasting.

We currently have HTML, code for BB-compatible forums, Creative Commons RDF metadata, and even Wikipedia templates. Example here; look for the link under the big image on the main photo page.

This last entry is particularly new and interesting – this outputs three Wikipedia templates, making sure the image has the maximum amount of the data we have available – this includes the information box (title, links and licence), geotagging, as well as the specific Geograph Template.

Suggestions for other snippets welcome….
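As a rough illustration of the kind of HTML credit snippet described above, here is a minimal Python sketch. It is not the code behind the Geograph page; the function name `attribution_html`, its parameters, and the markup layout are all assumptions, and the example values are placeholders.

```python
from html import escape


def attribution_html(title, photographer, photo_url, licence_url, licence_name):
    """Build a small HTML credit line for a reused photo.

    Every value is supplied by the caller; nothing is fetched from
    Geograph, and the markup is only one plausible layout.
    """
    return (
        '<a href="{url}">{title}</a> &copy; {who}, licensed under '
        '<a rel="license" href="{lic_url}">{lic}</a>'
    ).format(
        url=escape(photo_url, quote=True),
        title=escape(title),
        who=escape(photographer),
        lic_url=escape(licence_url, quote=True),
        lic=escape(licence_name),
    )


# Placeholder values, purely for illustration.
print(attribution_html(
    "Church & green",
    "A. Photographer",
    "http://example.org/photo/1",
    "http://example.org/licence",
    "Example Licence",
))
```

Escaping the title and name matters because photo titles can contain characters such as `&` or `<` that would otherwise break the pasted markup.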

Also, while here, I will mention that we have created a Google Gadget, which makes it easy to embed a small selection of Geograph images in any webpage, or in iGoogle itself. To change the selection, run a suitable search on Geograph and copy the i number from the URL into the Gadget preferences.

Geograph … three years in the making!

Thursday, March 6th, 2008

Geograph is 3


Another Whole Myriad :: TG

Sunday, February 10th, 2008

Preview of TG Myriad Mosaic

The pace of new photos on Geograph is not relenting, so much so that we now have another whole Myriad! This time it's TG, which covers East Anglia and the Norwich area and has 1991 land squares. There is more on the overall progress here, which shows the coverage by Myriad (which, by the way, is a Geographism for a 100x100km square on the National Grid).
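For anyone wondering how a position maps to a Myriad, here is a small Python sketch of the usual letter-pair calculation for the Great Britain National Grid. The function name `myriad_letters` is made up for this example, it takes eastings and northings in metres, and it does not cover the Irish Grid (which uses a different lettering scheme). The sample coordinates are illustrative points chosen to fall inside TG and SP.

```python
def myriad_letters(easting, northing):
    """Return the two-letter 100 km square for a GB National Grid position.

    easting/northing are metres from the grid's false origin.
    """
    if not (0 <= easting < 700_000 and 0 <= northing < 1_300_000):
        raise ValueError("position is outside the GB National Grid")
    e100k = int(easting // 100_000)
    n100k = int(northing // 100_000)
    # Letters are laid out in 5x5 blocks counted from the top-left:
    # the first letter picks the 500 km block, the second the 100 km
    # square inside that block.
    first = (19 - n100k) - (19 - n100k) % 5 + (e100k + 10) // 5
    second = (19 - n100k) * 5 % 25 + e100k % 5
    # The grid alphabet has no letter I, so step over it.
    if first > 7:
        first += 1
    if second > 7:
        second += 1
    return chr(ord("A") + first) + chr(ord("A") + second)


print(myriad_letters(623_000, 308_000))  # a point inside TG
print(myriad_letters(450_000, 250_000))  # a point inside SP
```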

To try to showcase these I have been creating some zoomable viewers to really see the coverage:

Geograph Mosaic Collection

Unfortunately they require too much manual work to be part of the real site (creating a ‘printable page’ at the appropriate scale, stitching all the images, and then running it through Zoomify's utility), but it would be really good to get a Flash programmer to create a viewer like this that runs directly off Geograph tiles!

Playing with (geo-enabled) Full-Text Searches

Thursday, December 20th, 2007

Recently I have been playing a lot with the Sphinx full-text search engine, in particular with regard to indexing the Geograph archive. (A bit of background: Geograph has a fairly good homegrown site text search, but it's not full text, so many queries will not return that many results, not to mention being based on MySQL ‘like’, so it is pretty slow. A full-text search is the next level.) And I have to say I am liking it a LOT, in fact I would say I am a fanboy :)

So to that end I've created a whole bunch of demos based around the flexible indexing it provides; location-based searching is even possible!

The most basic is a simple text-based search. One point of note: there is no pagination; simply add more keywords (including negative ones) or grid references to refine the selection.

Next is an ‘auto-complete’ style image finder. This is designed to find ‘that image’ quickly, in a similar way to the above, but shows the results in an autocomplete box immediately!

A refinement of the first is search with location, which allows you to limit the search to near a particular Grid Reference. This is particularly cool in that there is Sphinx-powered auto-complete for place names for finding GRs (a real auto-complete, not like the search in the previous one pretending to be one).

This is all building towards the Illustrator demo, which attempts to find relevant images from a block of text. The idea is that a (geolocated) news article, walking route, place description and such could automatically have relevant(ish) images shown. (an example demo here)

(a few more ‘toys’ can be found in GeographTools!)…. Try them out and let me know how you get on…

I have learnt a lot about search indexing from this, including how to perform location searches in the index (I know latest versions of sphinx include a lat/long based geosearch – but I think this r-tree method in text has better scalability), and how to create an autocomplete function with sphinx. If anybody is interested in these, they will eventually make it into the geograph codebase, or let me know and I might make a separate post.
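The post does not spell out how the location search is stored in the text index, so the following Python sketch is only one guess at the general idea: give each image a few extra "words" naming the grid cells that contain it at several scales, then search for the words of the cells that touch the area of interest. The token format and the names `cell_tokens` and `query_tokens` are invented for this illustration and are not taken from the Geograph codebase.

```python
def cell_tokens(easting, northing, sizes=(100_000, 10_000, 1_000)):
    """Tokens naming the grid cell containing a point, one per cell size.

    These would be indexed alongside the image's ordinary text.
    """
    return [
        "cell{0}_{1}_{2}".format(size // 1000, easting // size, northing // size)
        for size in sizes
    ]


def query_tokens(easting, northing, radius, size=1_000):
    """Tokens for every cell of one size that touches the search box.

    A query would OR these together and AND the result with the keywords.
    """
    lo_e, hi_e = (easting - radius) // size, (easting + radius) // size
    lo_n, hi_n = (northing - radius) // size, (northing + radius) // size
    return [
        "cell{0}_{1}_{2}".format(size // 1000, e, n)
        for e in range(lo_e, hi_e + 1)
        for n in range(lo_n, hi_n + 1)
    ]


print(cell_tokens(623_456, 308_765))
print(query_tokens(623_456, 308_765, radius=1_000))
```

Picking a cell size close to the search radius keeps the query to a handful of tokens; a wide search would use the 10 km or 100 km tokens instead of thousands of 1 km ones. The result is a square of cells, so an exact distance check would still be needed afterwards if a true radius matters.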

Interestingly (huh?), it was creating an ‘autocomplete’ textbox for finding trigpoints (which included the forerunner to the Sphinx location search, but implemented in MySQL) that inspired me to actually go to the trouble of figuring out how to install Sphinx on Linux, which I have been interested in for a long time! That textbox is also now Sphinx powered for text searches :)

As a side note, I have now reached the ‘linux sysadmin’ level where I can compile it on Geograph's servers, yay! But I do worry for the sanity of others due to this (a little knowledge is a dangerous thing!)

The British Isles is burning!

Sunday, November 4th, 2007

It has to be said that when we started the Geograph British Isles project we certainly didn’t think it would grow quite as quickly as it has, nor that we could get such a submission rate, particularly in adding depth – adding photos to a square once it has turned red on the map. Well, to this end we recently introduced depth maps, which colour the map based on the number of photos in the square. A preview is available here, but click the image for the normal Geograph map viewer in this new mode. It just goes to show how well photographed some areas are, but also that a lot of the country only has the first geograph in the square, so much more depth to go!

Update: View as a time-based animation.
(includes raw frames, as I would like someone to make a better presentation of this!)

btw, as I am colour-blind and unable to come up with a colour scheme myself, I can only thank whoever created the colour scheme for CIS. (A very interesting package – worth playing with just for the data it has – I can provide a file to load Geograph depth into it if there is interest!)

(yes the title is a tribute to this)

More inc KML2RSS

Tuesday, October 16th, 2007

The people at (Microsoft!) have been making some huge improvements to their map system of late, adding lots of data in both 2D and 3D. It has long been good in the UK as it uses imagery from getmapping which is at least 2m resolution, but 25cm in many places. It’s now been backed up by lots of Birds Eye imagery, which are static photos taken at an angle, great for visualising cities. Now these are available in the 3D view (which is currently downloading so I haven’t tried it…)! Also now supported is visualising KML, GPX and GeoRSS files.
Anyway that was kinda rambly; the point of this post was to highlight a small feature I found by accident and am not sure has been highlighted on other blogs, namely that you can subscribe to an RSS feed for many collections, including a KML or GPX file directly, effectively giving a KML to (Geo)RSS converter!

Example: Recent Geographs KML file as GeoRSS

(looks like we can input any KML/KMZ url in there!)
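To give a feel for what a KML to GeoRSS conversion involves, here is a minimal Python sketch using only the standard library. It has nothing to do with how the Microsoft service does it; the function name `kml_to_georss` is made up, it assumes the KML 2.2 namespace (older files may declare a different one), and it only handles Point placemarks.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"      # assumed KML version
GEORSS_NS = "http://www.georss.org/georss"     # GeoRSS Simple namespace


def kml_to_georss(kml_text, title="Converted KML", link="http://example.org/"):
    """Turn KML Point placemarks into a minimal RSS 2.0 feed with georss:point."""
    ET.register_namespace("georss", GEORSS_NS)
    root = ET.fromstring(kml_text)
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for placemark in root.iter("{%s}Placemark" % KML_NS):
        coords = placemark.findtext("{%s}Point/{%s}coordinates" % (KML_NS, KML_NS))
        if coords is None:
            continue  # skip placemarks that are not simple points
        lon, lat = coords.strip().split(",")[:2]
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = placemark.findtext(
            "{%s}name" % KML_NS, default="")
        # KML stores "lon,lat[,alt]"; GeoRSS Simple wants "lat lon".
        point = ET.SubElement(item, "{%s}point" % GEORSS_NS)
        point.text = "%s %s" % (lat.strip(), lon.strip())
    return ET.tostring(rss, encoding="unicode")
```

The one detail that is easy to get wrong is the coordinate order, which is swapped between the two formats.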

Geograph completes its first whole Myriad

Monday, August 27th, 2007

SP Myriad - Geograph Coverage Aug 2007 - (c) Geograph Creative Commons Licenced

Yesterday Geograph reached a significant milestone, getting photographs for a whole myriad. That’s a whole 100x100km square, or 10,000 squares.

Ok there have been a few smaller myriads complete for a while, but SP is a fully landlocked square so represents a significant achievement.

To celebrate: here is a zoomable flash wotsit to showcase all those glorious squares

This month I have mostly been scaling

Wednesday, August 15th, 2007

… a website for more traffic that is. This is something a little off-topic perhaps for this blog, but it might be of interest to a few, so I will document a few tricks I have learnt in tweaking Geograph to cope with more traffic as its daily visitors and hits continue to climb. If you are not familiar with Geograph, or not a sysadmin (or a budding one, like me!), then you can probably stop reading now!

First a little background: Geograph's code started out very humble, and was coded to work off a single server. Later, with OS sponsorship, we upgraded to multiple servers to cope with increasing traffic. This was done with a single larger server for database and photo storage, and then multiple commodity webservers (with a front-end load-balancer) More. This worked well for a number of months, but the DB/NAS server simply couldn’t cope with the increasing DB load, and the bandwidth for serving all the hundreds of thousands of photos.

  • Split the database. A small quick win was to split off PHP sessions and gazetteer queries to a second database. Sessions of course have lots of writes, so were tending to saturate the main DB; this perhaps reduced its load by a third!
  • Cache images on the separate servers. The servers aren’t big enough to house a copy of the full archive, but thumbnails are certainly more manageable. Seeing as thumbnails actually account for about 60-70% of the raw hits to the site, this is a potential win, as previously each server would have to separately fetch individual images off the NAS. We use Apache as the webserver, so could easily create another simple VirtualHost to serve thumbnails: a DocumentRoot that is empty save for images, with a simple 404 handler to fetch and store images not already copied. This greatly reduces load on the NAS as it's not having 3x servers fetching random thumbnails. (This also paves the way to move away from full-blown Apache simply for static content.)
  • Cache stuff in memory, with Memcache. Related to the above point, quite a bit of load is actually random disk IO to determine image sizes, as this requires reading the jpeg data. Caching this all in memory is good. Memcache can easily distribute its cache across multiple machines, so even losing a server means only part of the cache needs rebuilding. We also use ADODB as a database abstraction layer, and the latest version has support for using Memcache for its caching, great! Last up is to do lots of application-level caching in key places. Of course sessions and also the templating system (smarty) could benefit from Memcache, but one step at a time!
  • Optimise the HTTP headers. There are lots of tweaks and stuff here that can be done to lower the bandwidth and improve external cacheability of objects. This post is getting quite long so I think that might be a separate post…
  • Optimise the slow queries. And last but not least, learn to love going through the log of slow database queries, and really stepping on the slow ones. This of course is an ongoing project. I found a script to summarise these, but it seems that MySQL 5 at least comes with its own equally good one!
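The thumbnail caching idea from the list above can be sketched roughly as follows. This is not Geograph's actual 404 handler (that runs under Apache); it is a Python illustration of the same fetch-on-miss pattern, and `cached_thumbnail`, `fetch_from_origin` and the example path are made-up names.

```python
import os
import tempfile


def cached_thumbnail(rel_path, cache_root, fetch_from_origin):
    """Return thumbnail bytes, filling the local cache on a miss.

    fetch_from_origin is a caller-supplied function (hypothetical here)
    that reads the file from the central store.
    """
    root = os.path.abspath(cache_root)
    local = os.path.abspath(os.path.join(root, rel_path))
    if not local.startswith(root + os.sep):
        raise ValueError("path escapes the cache directory")
    if os.path.exists(local):
        # Hit: in the real setup the web server serves this file directly.
        with open(local, "rb") as fh:
            return fh.read()
    # Miss: this is the case the 404 handler deals with.
    data = fetch_from_origin(rel_path)
    os.makedirs(os.path.dirname(local), exist_ok=True)
    # Write to a temp file and rename it into place, so a concurrent
    # request never sees a half-written image.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(local))
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    os.replace(tmp, local)
    return data
```

After the first request for a given thumbnail, the central store is no longer touched for it, which is where the reduction in NAS load comes from.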

This list isn’t exhaustive, and of course this is an ongoing project; more can always be done…
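For the slow query point, this is roughly what a summarising script does: group queries that differ only in their literal values and total up the time spent on each pattern. The Python below is a simplified sketch, not the script mentioned above; it assumes the usual `# Query_time:` header lines of the MySQL slow query log, its handling of quoted strings is deliberately crude, and the table name in the example is hypothetical.

```python
import re
from collections import defaultdict

QUERY_TIME = re.compile(r"^# Query_time:\s*([\d.]+)")


def normalise(sql):
    """Collapse literals so queries differing only in values group together."""
    sql = re.sub(r"'[^']*'", "'S'", sql)   # quoted strings (no escape handling)
    sql = re.sub(r"\b\d+\b", "N", sql)     # bare numbers
    return re.sub(r"\s+", " ", sql).strip()


def summarise_slow_log(lines):
    """Return [(total_seconds, count, query_pattern)], slowest first."""
    totals = defaultdict(lambda: [0.0, 0])
    seconds, statement = None, []

    def flush():
        if seconds is not None and statement:
            entry = totals[normalise(" ".join(statement))]
            entry[0] += seconds
            entry[1] += 1

    for line in lines:
        match = QUERY_TIME.match(line)
        if match:
            flush()
            seconds, statement = float(match.group(1)), []
        elif line.startswith("#"):
            continue  # other header lines such as "# Time:" or "# User@Host:"
        elif seconds is not None and not line.lower().startswith(
                ("use ", "set timestamp")):
            statement.append(line.strip())
    flush()
    return sorted(((t, c, q) for q, (t, c) in totals.items()), reverse=True)


example = [
    "# Query_time: 4  Lock_time: 0  Rows_sent: 1  Rows_examined: 500000",
    "SELECT * FROM gridimage WHERE user_id = 12;",   # hypothetical table
    "# Query_time: 6  Lock_time: 0  Rows_sent: 1  Rows_examined: 500000",
    "SELECT * FROM gridimage WHERE user_id = 345;",
]
for total, count, pattern in summarise_slow_log(example):
    print(total, count, pattern)
```

Sorting by total time rather than by the single worst query tends to surface the cheap-looking queries that are run thousands of times.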

The State of the Map…

Wednesday, July 18th, 2007

Over the weekend I attended OpenStreetMap’s first conference, “State of the Map”. All in all a very enjoyable time: great to listen to all the talks, chat with various mappers, and meet up with various people I’ve only met before in cyberspace.

Hopefully it will inspire me to actually contribute, especially as I frequent two ‘holes’ in the current data…

An interesting little snippet from Ed Parsons' talk is this slide, which shows KML/GeoRSS publishing as indexed by Google. Somehow I think I recognise the British Isles hotspot: Geograph, which publishes many KML feeds (about 600k, counting the Superlayer and also a file per photo), of which about 300k are reported to be indexed in Google’s main index, so they show up well in ‘User Generated Content’ in Google Maps!

Everyone (nearly) – me third from left