Linux Format

Photos: Tidy up your library

Alexander Tolstoy explores image optimisation, and uncovers new ways to further compress what’s already been squeezed into a small space


Despite the popularity of cloud storage, many Linux users store their image libraries on their hard drives, then wonder how they can optimise the footprint of their files. In this tutorial we tackle a massive photo library that has been incrementally growing for years, an ideal job for server admins who maintain large amounts of user data, or anyone who needs to store lots of images.

We’ll show how to get more free space without compromising image quality, and reveal how to browse your new optimised image library with comfort and convenience. We’ll look at ways of removing duplicate files with simple and reliable tools, then we’ll squeeze extra bits out of existing JPEG and PNG files, and finally we’ll bravely convert our images to a couple of next-gen file formats.

Before we dive into the exciting world of novel technologies, let’s mention a more brutal, yet very effective, way to make your images take up less space. Let’s say you have a bunch of PNG images that don’t really need to be that big. You can slightly downscale them and recompress them using lossy JPEG, and they’ll still look good enough on-screen:

$ mogrify -path . -resize 70x70% -quality 75 -format jpg *.png

This example uses mogrify from the ImageMagick package, which is available across nearly all Linux distributions. You instruct it to resize all PNG files in the current directory to 70 per cent of their original dimensions and convert them to JPEG at quality 75. This approach is basic and it works only if you can afford to lose a little image quality. If this doesn’t appeal, read on for great tips on lossless magic!
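
If you’d prefer to keep the converted copies out of the way of your originals, mogrify can write its output to a separate directory; a small variation (the optimised directory name is our own choice) looks like this:

$ mkdir -p optimised
$ mogrify -path optimised -resize 70x70% -quality 75 -format jpg *.png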

Duplicate or similar

Duplicate image files are often the result of copying your photos from removable media to the hard drive, when you try to sort your images manually using date-based directories. Such directories will eventually contain partially overlapping sets of images, which isn’t ideal.

Excessive backups and copies of resized images also contribute to the growing amount of redundant files. Most Linux distros have the fdupes application in their repositories, which is supposed to find identical files, but we often need something more powerful for images.
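
That said, fdupes is still handy as a quick first pass that weeds out exact byte-for-byte copies before you reach for anything smarter:

$ fdupes -r ~/Pictures
$ fdupes -rd ~/Pictures

The first command lists groups of identical files under ~/Pictures, searching subdirectories; adding -d prompts you to choose which copy in each group to keep.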

Image duplicates can have different names and sizes, and we may also want to detect non-identical but very similar images, like edited ones, or those taken in a burst of continuous shots. Obviously, we could come up with some neural network for solving this problem, but that’s akin to using a sledgehammer to crack a hazelnut.

There’s a much more lightweight, if oddly named, solution called Findimagedupes ( www.github.com/opennota/findimagedupes ). This is a Go program that analyses the contents of a given directory and stores image fingerprints in its own database. Findimagedupes studies similarity factors and presents a list of identical or visually similar sets of images that you may want to weed out. You can adjust the threshold of similarity, skip checking for identical images, and use recursive search. Here’s an example:

$ ./findimagedupes -R ~/Pictures

This command will search inside ~/Pictures, including all subdirectories. The output is a plain list where each line consists of the full paths to matching images, separated by spaces. To adjust similarity, use the -t parameter followed by a value from 0 to 63, where 0 means that Findimagedupes will detect only identical images and 63 means it will treat all images as similar. In the following example, we’ve used a sensible, realistic value between the two extremes:

$ ./findimagedupes -R -t 30 ~/Pictures

Right away, it’s not evident what to do next, but luckily Findimagedupes enables you to open each set of images using an external application, like this:

$ ./findimagedupes -R -t 30 -p feh ~/Pictures

We used Feh in this example, because it’s the most uncomplicated tool around for viewing images, but the choice of viewer is up to you. This non-automated method will give the most accurate results, as long as you’re happy to delete duplicates after careful inspection.
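
If you’d rather drive the clean-up from a script, here’s a minimal sketch that walks through Findimagedupes’ output group by group (it assumes your filenames contain no spaces, since each output line is space-separated):

$ ./findimagedupes -R -t 30 ~/Pictures | while read -r group; do
      feh $group                        # unquoted on purpose: split the line into separate paths
      printf 'Path to delete (press Return to keep all): '
      read -r victim </dev/tty          # read the answer from the terminal, not the pipe
      [ -n "$victim" ] && rm -i -- "$victim"
  done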

There’s also another way to estimate image similarity, which is based on computing a scalar score for each pair. We’ll be using Butteraugli ( www.github.com/google/butteraugli ), a simple, user-friendly tool for measuring perceived differences between images based on a scientific approach. Butteraugli is easy to compile (just run $ make ) and offers a simple command-line syntax:

$ ./butteraugli file1 file2

Butteraugli accepts both JPEG and PNG files and requires that both files in a pair have equal pixel dimensions. The output is a difference score, where 0 means that both images are identical, while higher positive values reflect a greater perceived difference. A great extra feature of Butteraugli is its ability to draw a ‘heat map’ of differences between images. All you need to do is specify an output file:

$ ./butteraugli file1 file2 heat.ppm

The resulting PPM file highlights the areas where your two images differ. The practical purpose of Butteraugli is that it helps detect changes that are barely visible to the naked eye. So, while Findimagedupes helps you detect similar images, Butteraugli will be more specific in pointing out exactly what differs.
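
As a worked example (the directory names here are hypothetical), you could score a whole batch of recompressed JPEGs against their originals and sort them by how visible the difference is, as long as each pair shares the same pixel dimensions:

$ for f in originals/*.jpg; do
      score=$(./butteraugli "$f" "optimised/$(basename "$f")")
      echo "$score $(basename "$f")"
  done | sort -n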

We’ll now assume that all images that survived our Great Purge are precious, but we still want to reduce their data footprint. Google’s engineers have created a tool that slims down JPEG files without compromising on quality. Guetzli claims a 20-30 per cent reduction in file size over the usual way that JPEGs are written in Linux using libjpeg. According to the Guetzli developers, it strikes a balance between minimal loss and file size by employing a search algorithm that tries to overcome the difference between the psychovisual modelling of the JPEG format and Guetzli’s own psychovisual model.

Guetzli works at a leisurely pace. According to Google’s documentation, it takes about one minute to encode one megapixel of bitmap data, so a typical 12MP photo can take around 12 minutes. This means Guetzli will process each of your photos painfully slowly, but on the plus side the program delivers the best JPEG file size optimisation for images saved with a quality setting between 84 and 100. In other words, Guetzli will be useful if you need to maintain very high image quality without increasing the amount of JPEG compression. It’s great for a one-time optimisation pass that will free some space on your file server or any other storage that you use.

Grab the code from www.github.com/google/guetzli, ensure you have the libpng and libjpeg development files, and run $ make in the project’s directory. You’ll see your binary in bin/Release, and so it’s time to put Guetzli through its paces:

$ ./guetzli --quality 84 in.jpg out.jpg

The quality setting of 84 is the best (and the lowest possible) for Guetzli. When tested across a set of images, it became clear that no other method of compressing JPEGs beats Guetzli at that setting in terms of file size. Here’s a sample command for batch processing several files at once:

$ for file in *.jpg; do guetzli --quality 84 "$file" "${file%.jpg}-out.jpg"; done
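
If Guetzli is leaving most of your CPU cores idle, you can claw back some time by running several encodes in parallel. Here’s a rough sketch using xargs (it assumes filenames without spaces, and the .out.jpg suffix is just our convention):

$ find . -maxdepth 1 -name '*.jpg' | xargs -P 4 -I {} guetzli --quality 84 {} {}.out.jpg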

When working with a single file, you can skip the command line routines by turning to a third-party Guetzli graphical tool that you can get from www.github.com/till213/GuetzliImageIOPlugin. It offers a shared library and a Qt5 image plug-in together with a beginner-friendly sample app where you can load and preview your JPEGs, set some options and finally ‘bake’ your Guetzli. Although baking is incredibly CPU-intensive, it’s the only state-of-the-art JPEG optimisation tool out there at the moment. The quality of the resulting images is superb: you won’t be able to tell the difference between your originals and the outputs.

Lepton drops it like it’s hot

We’re moving on to another novel technique for shrinking JPEG files. Lepton is an open source encoder from Dropbox. It claims to squeeze an extra 22 per cent out of your regular JPEGs, so we were keen to put this figure to the test.

Getting Lepton to run is easy: clone the project’s code from www.github.com/dropbox/lepton and run:

$ ./autogen.sh && ./configure && make && sudo make install

The syntax of Lepton commands is also straightforward:

$ lepton in.jpg out.lep

As you can see, Lepton compresses your file and produces the output in its own format. From this moment on you can no longer open, edit or otherwise work with your file until you decompress it back:

$ lepton in.lep out.jpg

Obviously, because there’s no third-party integration of Lepton in popular image viewers for Linux, it simply acts as an archive manager. This is still useful for certain applications, such as storing massive datasets on a back-up drive, for example. It’s important to note that Lepton delivers lossless encoding, so the original file and the JPEG-LEP-JPEG processed file are identical.
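
Because the round trip is supposed to be bit-exact, it’s easy to verify Lepton’s claim yourself before trusting it with a whole archive (holiday.jpg here is just a placeholder name):

$ lepton holiday.jpg holiday.lep
$ lepton holiday.lep restored.jpg
$ cmp holiday.jpg restored.jpg && echo "Round trip is bit-exact"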

Lepton works very quickly and delivers a better compression ratio than Guetzli, but bear in mind that you’ll be unable to access your files on a system without Lepton, whereas Guetzli maintains perfect backward compatibility for its JPEG files. Its high working speed makes Lepton ideal for tackling large images.

The easiest way to run a real-world test is to obtain a file from, for example, the Wikimedia public storage ( http://bit.ly/wiki-big-images) and try to encode it with Lepton. Note that Lepton can only tackle JPEGs that are less than 128MB, so bear this in mind before selecting large files to run through it. Furthermore, you’ll definitely need to pass some extra arguments to Lepton in order to process an image of that size, as follows:

$ lepton -memory=4096M -threadmemory=128M in.jpg out.lep

We tried to compress the 0-cynefin-ORIGINEEL.jpg file (93.9MB) and produced the 0-cynefin-ORIGINEEL.lep file, which is only 64MB in size. That’s a 31 per cent reduction! Actual results vary depending on what kind of picture you’re working with. For instance, if the image contains areas of solid fill, line art or drawings, the encoder will perform much better than it does on a photograph of a real-world object.

Given that Guetzli and Lepton both work with JPEGs, we were curious to compress our test files sequentially using both encoders. When applied to a set of typical 11-15MP photos, the first stage gave us between 14 and 20 per cent reduction, and the second one resulted in another 20 to 25 per cent reduction. The overall saving from this combination was between 30 and 40 per cent, which is a mind-blowing figure for nearly lossless compression (they were JPEGs, after all).
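
If you want to try the same two-stage pass yourself, a minimal sketch looks like this (the -g suffix for the intermediate file is our own convention, and expect it to run slowly because of Guetzli):

$ for file in *.jpg; do
      guetzli --quality 84 "$file" "${file%.jpg}-g.jpg"
      lepton "${file%.jpg}-g.jpg" "${file%.jpg}.lep"
      rm "${file%.jpg}-g.jpg"        # keep only the final .lep archive
  done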

Go wild with FLIF

We’ve already set foot on the barely charted land of alternative image file formats, and it’s now time to move on from JPEG to PNG.

There are many instances when PNG is preferable to JPEG, such as for screenshots, web images and pretty much everything else apart from photographic images. FLIF stands for Free Lossless Image Format, and it’s based on MANIAC compression. MANIAC (Meta-Adaptive Near-zero Integer Arithmetic Coding) is an algorithm for entropy coding using context-adaptive binary arithmetic.

In our experience, FLIF is a nearly perfect replacement for PNG, lossless WebP, lossless BPG, lossless JPEG 2000 and lossless JPEG XR in terms of compression ratio. The big advantage of FLIF is that it’s a universal format: you can use it to encode any kind of image, be it a photograph, a piece of line art, a map or whatever else, and you’ll still save extra kilobytes in each case.

Start by cloning the code from www.github.com/FLIFhub/FLIF and then go to the src directory inside the root tree. FLIF doesn’t have many dependencies other than the libpng development package, but in order to build the software you’ll need to specify the make targets manually:

$ make flif libflif.so libflif_dec.so viewflif

The command above is pretty much self-explanatory: you get an encoder executable, two shared libraries for encoding and decoding, and a simple image viewer for FLIF files. Copy flif and viewflif to /usr/bin (or any other place in $PATH), copy the libraries to something like /usr/lib64 (this may vary across Linux distros) and you’re ready for a ride. The FLIF encoder has many command line options (see the $ flif --help output) but if you don’t use any, it assumes that you want lossless compression with interlacing, as in the following example:

$ flif in.png out.flif

You can only feed PNG or PNM files to FLIF, and for convenience you should always use the .flif extension for output files. The result is a file that’s more than 40 per cent smaller than a typical PNG (such as those compressed with the default GIMP settings) and nearly 15 per cent smaller than lossless WebP. FLIF delivers incredible file optimisations and beat every other format in our tests. The encoder is also reasonably fast; in terms of CPU load it sits somewhere between Guetzli and Lepton.
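
To convert a whole directory of PNGs and see how much space you’ve reclaimed, a small loop like this does the job (stat -c%s is the GNU coreutils way of reading a file’s size in bytes):

$ for f in *.png; do
      flif "$f" "${f%.png}.flif"
      old=$(stat -c%s "$f"); new=$(stat -c%s "${f%.png}.flif")
      echo "$f: saved $(( (old - new) * 100 / old )) per cent"
  done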

FLIF has been around for a while, and we already have software that supports FLIF out of the box. The most intriguing is QtFLIFPlugin ( www.github.com/spillerrec/qt-flif-plugin ), which does a splendid job in bringing FLIF to the wider world. The plug-in enables all Qt-based programs to support FLIF natively, as if it was just another basic bitmap format in the line of PNG, TIFF, JPEG and others.

Getting the plug-in to work requires some accuracy in placing its files. After compiling the code you’ll get the libflif.so shared library, which has exactly the same name as the library that comes with FLIF itself. So, while you already have /usr/lib64/libflif.so, you should copy the plug-in’s libflif.so to the default destination of your system-wide image plug-ins, like /usr/lib64/qt5/plugins/imageformats/. The *.desktop files go to /usr/share/kservices5/qimageioplugins/, while the x-flif.xml should settle down in /usr/share/mime/packages/.
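
On a distro that keeps Qt5 plug-ins under /usr/lib64, the whole installation boils down to a few copies (the plug-in path is an assumption; check yours with $ qmake -query QT_INSTALL_PLUGINS if in doubt):

$ sudo cp libflif.so /usr/lib64/qt5/plugins/imageformats/
$ sudo cp *.desktop /usr/share/kservices5/qimageioplugins/
$ sudo cp x-flif.xml /usr/share/mime/packages/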

If you don’t use KDE Plasma you can still make your life easier thanks to a standalone app known as Imgviewer ( www.github.com/spillerrec/imgviewer). Associate .flif with Imgviewer in your file manager and you’ll be able to browse FLIF images on any desktop of your choice.

Thumbnailing like a boss

When browsing images with a file manager, you expect to see small previews of them. Generating thumbnails for JPEG or PNG images is straightforward for all major file managers, like Dolphin, Nautilus or Nemo. However, once you start using alternative file formats, things get a bit more interesting.

Previously, we mentioned QtFLIFPlugin, which solves the problem of FLIF thumbnails in Dolphin, but you can also obtain FLIF previews in Nautilus or Nemo using a different technique. Create an executable /usr/local/bin/flif-thumbnailer file with the following:

#!/bin/bash
temp=$(mktemp).png
flif -d "$1" "$temp"
convert "$temp" -resize "$3"x"$3" "$2"
rm "$temp"
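
Don’t forget to make the script executable, otherwise the thumbnailer entry below won’t be able to call it:

$ sudo chmod +x /usr/local/bin/flif-thumbnailer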

Then create another file, /usr/share/thumbnailers/flif.thumbnailer, and fill it with the following code:

[Thumbnailer Entry]
TryExec=/usr/local/bin/flif-thumbnailer
Exec=/usr/local/bin/flif-thumbnailer %i %o %s
MimeType=image/flif;

Finally, register a new MIME file type by creating the /usr/share/mime/packages/flif.xml file. Populate it with the following lines:

<?xml version="1.0" encoding="UTF-8"?>
<mime-info xmlns='http://www.freedesktop.org/standards/shared-mime-info'>
  <mime-type type="image/flif">
    <comment>FLIF image</comment>
    <glob pattern="*.flif"/>
  </mime-type>
</mime-info>
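
On most systems you’ll also want to refresh the shared MIME database after dropping in the new XML file, so that the image/flif type is actually picked up:

$ sudo update-mime-database /usr/share/mime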

And you’re all set. The above method is slightly ugly, as you’re asking the thumbnailer to convert each FLIF to PNG in order to draw its preview, but it works reliably and delivers a decent level of performance.

The same can be done for any other file format, if you know the decoding command. In the case of Lepton, just replace flif -d "$1" "$temp" with lepton "$1" "$temp", adjust the other files accordingly, and it should work smoothly. Lepton’s decoding process is a lot faster than FLIF’s, so you’ll experience even snappier thumbnail building for .lep files.
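
As a sketch (it follows exactly the same pattern as the FLIF script above), a /usr/local/bin/lepton-thumbnailer could look like this:

#!/bin/bash
# Decode the .lep file to a temporary JPEG, then scale it to the requested thumbnail size
temp=$(mktemp).jpg
lepton "$1" "$temp"
convert "$temp" -resize "$3"x"$3" "$2"
rm "$temp"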

Lepton’s compression works remarkably fast, resulting in significantly smaller files and a helpful figure that reveals the amount of saved data as a percentage.

For those taking baby steps in optimising, Guetzli includes a basic GUI app that gives you the option to save your existing images in a different format.

Native support in Qt reveals more possibilities for FLIF. Here’s the thumbnail support of FLIF in Dolphin.

The lower-left image shows the difference between compressed and uncompressed versions of the same image. The lower right shows all the barely visible intrusions of the clone brush after we slightly retouched the image in Gimp.
