User:Fæ/dezoomify

From Wikimedia Commons, the free media repository

What is it?

Over the last couple of years a few puzzled people have asked me how to get the right information to unzoom an image which has been "hidden" behind a Zoomify web page. Often the Ophir Dezoomify open source tool will just work with the URL you give it. However, some galleries and archives use custom web pages which obscure that information, so the standard tool fails. To solve this you need to locate the image information file which the gallery page pulls in and then passes to its own tile painting program.

The steps below use Chrome due to its good inspector tools, though something similar can be done in Firefox.

Some detailed notes

Sometimes the file wanted is not a "dzi" file but something like "info.json" or "info.xml"; the Ophir tool accepts these too. For some IIPImage sites where you cannot find an info file, the website fetches the relevant image data with a general call to the server that looks like this:

<domain>/iipsrv/iipsrv.fcgi?FIF=<image file>&obj=IIP,1.0&obj=max-size&obj=tile-size&obj=resolution-number&obj=bits-per-channel&obj=min-max-sample-values&obj=subject

You can see this call in the network file list when you refresh the page, and pasting that link into the Ophir tool will work. For Ophir's own guide read their FAQ.
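As a rough illustration of the IIPImage metadata call, the sketch below assembles the query string and parses a reply. The server address and image path are placeholders, and the reply layout (one `Name:value` pair per line, such as `Max-size:4000 3000`) is an assumption based on the IIP protocol documentation, so check it against what your server actually returns.

```python
from urllib.parse import urlencode

# The obj fields used in the metadata query, in the same order.
IIP_OBJECTS = [
    "IIP,1.0", "max-size", "tile-size", "resolution-number",
    "bits-per-channel", "min-max-sample-values", "subject",
]

def iip_metadata_url(server, image_path):
    """Build the metadata query for an IIPImage server endpoint."""
    query = [("FIF", image_path)] + [("obj", name) for name in IIP_OBJECTS]
    # Keep "/" and "," unescaped so the link matches what the Network tab shows.
    return server + "?" + urlencode(query, safe="/,")

def parse_iip_reply(text):
    """Turn 'Name:value' lines into a dict (assumed reply layout)."""
    info = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            info[name.strip().lower()] = value.strip()
    return info

# Placeholder server and file, for illustration only.
url = iip_metadata_url("https://example.org/iipsrv/iipsrv.fcgi", "/images/map.tif")
print(url)
```

The printed link is the kind of address you would paste into the dezoomify tool.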

If you are content to download a high resolution jpeg, which may be limited to a given maximum size depending on server configuration, the IIPImage server can take a "CVT=jpeg" parameter and you can avoid using any special tools. Refer to http://iipimage.sourceforge.net/documentation/protocol/
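A minimal sketch of building such a "CVT=jpeg" link is shown here. The server address and image path are placeholders; `WID` is the IIP protocol's width parameter, and whether the server honours a large value depends on its configuration.

```python
from urllib.parse import urlencode

def iip_jpeg_url(server, image_path, width=None):
    """Build a CVT=jpeg export link; WID optionally requests a width in pixels."""
    query = [("FIF", image_path)]
    if width is not None:
        query.append(("WID", str(width)))
    # FIF is placed first and CVT last, the order used in the protocol examples.
    query.append(("CVT", "jpeg"))
    return server + "?" + urlencode(query, safe="/")

# Placeholder server and file, for illustration only.
print(iip_jpeg_url("https://example.org/iipsrv/iipsrv.fcgi", "/images/map.tif", 2000))
```

Opening the resulting link in a browser should return a single jpeg, scaled down if the server enforces a maximum size.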

The version of Chrome I'm using to take these images is "55". You should probably consider this guide to be suspect if not updated for more than 2 years. -- (talk) 12:49, 22 August 2017 (UTC)

Guide to using dezoomify

1 Open a Chrome browser and go to the gallery page with a thumbnail of the image you want to unzoom.

2 Right-click on the page and select the Inspect option from the menu to open up the inspection sub-window.

3 In the Inspection sub-window, go to the Network tab by clicking on it.

4 Zoom the image for the first time by clicking on the thumbnail version. If necessary, refresh the page; the Inspector window should stay open.

5 As the image loads up its tiles, you will see the Network tab fill up with various files being accessed.

6 You can now filter the list of files by typing "dzi" into the filter bar area. See the pink circled text entry area in the image. This will instantly filter the list so that only files with "dzi" in the name are left. A dzi file is a Deep Zoom instruction file, telling zooming tools how to find image tiles.

7 Copy the full link address for the dzi file by right-clicking on it and selecting the copy function that copies the link address, rather than the response or the request headers.

8 Paste the link into Ophir's dezoomify tool, http://ophir.alwaysdata.net/dezoomify/, and click to unzoom. Chrome may have a problem with allowing a right-click download-image action; if that fails on your version of the browser, just use a different browser, like Firefox, to run Dezoomify in.

The outcome is a large PNG file which you will need to rename. It is recommended that you do not convert this to a jpeg before uploading to Commons, as jpeg conversion always loses some image quality. Sometimes you may want to upload a jpeg as well as the lossless PNG file, for example when the PNG is hundreds of megabytes in size and some users will have problems viewing it.
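To show what the dzi file found in step 6 contains, here is a small sketch that reads a Deep Zoom descriptor and works out the tile grid. The sample XML and its numbers are invented for illustration; real descriptors follow the same layout but some sites use variants.

```python
import math
import xml.etree.ElementTree as ET

def read_dzi(xml_text):
    """Extract size and tiling parameters from a Deep Zoom descriptor."""
    root = ET.fromstring(xml_text)
    # Tags carry an XML namespace prefix, so match on the end of the tag name.
    size = next(el for el in root if el.tag.endswith("Size"))
    width, height = int(size.get("Width")), int(size.get("Height"))
    tile_size = int(root.get("TileSize"))
    return {
        "width": width,
        "height": height,
        "tile_size": tile_size,
        "overlap": int(root.get("Overlap")),
        "format": root.get("Format"),
        # The deepest zoom level holds the full-resolution image.
        "max_level": math.ceil(math.log2(max(width, height))),
        "columns": math.ceil(width / tile_size),
        "rows": math.ceil(height / tile_size),
    }

# Invented sample descriptor, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<Image xmlns="http://schemas.microsoft.com/deepzoom/2008"
       TileSize="254" Overlap="1" Format="jpg">
  <Size Width="7026" Height="9221"/>
</Image>"""

info = read_dzi(SAMPLE)
print(info)
```

For this sample the full-resolution image sits at level 14 and is cut into 28 columns by 37 rows of tiles.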

Automation

It is possible to automate the use of dezoomify for batches of images, but you should first think about writing your own dezooming module in your favourite language. Browser automation is darn tricky and near impossible to generalize, because getting an implementation to work properly can be platform specific. It will probably be about twice as fast to dezoom locally by reading every tile in your own program, rather than using Ophir's dezoomer as a type of poor remote server.
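As a starting point for a local dezoomer, the sketch below lists every full-resolution tile of a Deep Zoom image with the pixel position where it belongs. It assumes the standard Deep Zoom layout (tiles under `<name>_files/<level>/<column>_<row>.<format>`), a dzi link with no query string, and parameter values you have already read from the dzi file; the example URL is a placeholder.

```python
import math

def dzi_tile_plan(dzi_url, width, height, tile_size, overlap, fmt):
    """Yield (tile_url, x, y) for every full-resolution tile of a DZI image."""
    # Tiles live in a sibling folder: "name.dzi" -> "name_files/<level>/<col>_<row>.<fmt>".
    base = dzi_url.rsplit(".", 1)[0] + "_files"
    level = math.ceil(math.log2(max(width, height)))
    for row in range(math.ceil(height / tile_size)):
        for col in range(math.ceil(width / tile_size)):
            # Tiles away from the top and left edges start `overlap` pixels early.
            x = col * tile_size - (overlap if col else 0)
            y = row * tile_size - (overlap if row else 0)
            yield f"{base}/{level}/{col}_{row}.{fmt}", x, y

# Placeholder image: 600 x 300 pixels, 256 pixel tiles, 1 pixel overlap.
plan = list(dzi_tile_plan("https://example.org/img/map.dzi", 600, 300, 256, 1, "jpg"))
for tile_url, x, y in plan:
    print(tile_url, x, y)
```

Each tile would then be downloaded and pasted at its (x, y) position onto a blank image of the full width and height, for instance with an imaging library such as Pillow; that part is left out here.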

Here's my noddy workflow for how I did it using Python:

  • Install Selenium, which is used to automate a browser
  • Install a Chrome executable for Selenium to use
  • Spend ages working out how the options work and trying to get Selenium to run, possibly give up at this point
  • After successful automation for other tasks, discover more settings you did not know about to enable the browser to run "insecurely" so that dezoomify's use of a canvas can be handled as a blob, and converted from memory to a saveable PNG file
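The workflow above might look roughly like the following sketch. It is not the script actually used: the `--disable-web-security` flag is only a guess at the kind of "insecure" setting meant, the page is assumed to draw its result into the first `<canvas>` element, and the fixed wait is a crude stand-in for proper load detection. The function names are hypothetical.

```python
import base64

def data_url_to_bytes(data_url):
    """Decode a 'data:image/png;base64,...' string into raw bytes."""
    header, _, payload = data_url.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    return base64.b64decode(payload)

def grab_canvas_png(page_url, out_path, wait_seconds=60):
    """Load a page in headless Chrome and save its first <canvas> as a PNG."""
    import time
    from selenium import webdriver  # third-party: pip install selenium

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Assumption: relaxing cross-origin checks lets the canvas be read back.
    options.add_argument("--disable-web-security")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        time.sleep(wait_seconds)  # crude wait for the tiles to finish loading
        data_url = driver.execute_script(
            "return document.querySelector('canvas').toDataURL('image/png');"
        )
    finally:
        driver.quit()
    with open(out_path, "wb") as f:
        f.write(data_url_to_bytes(data_url))
```

Only `data_url_to_bytes` runs without a browser; `grab_canvas_png` needs Selenium and a matching Chrome installation, and may need further options on your platform.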

Some references:

  1. Headless browser with Selenium
  2. RealPython guide

Domain advice

The images are zoomed using IIIF calls. Ophir fails to work this out, but if you use the browser inspector to look at the Network calls that supply the image tiles, then copy and paste one of those into Ophir, it will create the full size PNG.
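If you want to work with such tile links yourself, the sketch below derives the IIIF `info.json` address and a full-image request from one tile URL. It assumes the site follows the standard IIIF Image API path layout (`.../<identifier>/<region>/<size>/<rotation>/<quality>.<format>`); the example URL is a placeholder, and servers may cap or refuse full-size requests.

```python
from urllib.parse import urlsplit, urlunsplit

def iiif_base(tile_url):
    """Strip the region/size/rotation/quality.format parts from a IIIF image URL."""
    parts = urlsplit(tile_url)
    segments = parts.path.rstrip("/").split("/")
    if len(segments) < 5:
        raise ValueError("not enough path segments for a IIIF image request")
    base_path = "/".join(segments[:-4])
    return urlunsplit((parts.scheme, parts.netloc, base_path, "", ""))

def iiif_info_url(tile_url):
    """Address of the image information document for the same image."""
    return iiif_base(tile_url) + "/info.json"

def iiif_full_url(tile_url, size="full"):
    """Request for the whole image; "full" is the Image API 2.x size keyword, 3.0 uses "max"."""
    return f"{iiif_base(tile_url)}/full/{size}/0/default.jpg"

# Placeholder tile link, as it might appear in the Network tab.
tile = "https://example.org/iiif/abc123/0,0,1024,1024/512,/0/default.jpg"
print(iiif_info_url(tile))
print(iiif_full_url(tile))
```

The `info.json` link is another address worth trying in the dezoomify tool when a single tile link does not work.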