Linux Format


Version: GIT Web:


We’re constantly looking for more applications that could help us optimise a huge collection of images gathered over recent years. We have already tried the promising Lepton and FLIF file formats and enjoyed their splendid compression ratio (see LXF215 and LXF205), but frankly, even perfect data compression doesn’t solve the problem of redundant files and general image overload.

The basic problem is that we have tons of photos, some of which are duplicates, while others are series of nearly identical shots, for example several takes of the same group photo. So we need an easy way to detect and delete unneeded files. A search through GitHub repositories turns up many such projects built on the OpenCV computer vision library, but most of them were designed for Windows and throw up unresolvable problems when you try to compile the code on Linux. Luckily there is a lighter yet very effective tool called Findimagedupes. This is a small utility that hashes files and can instantly detect either identical or similar images. You control the degree of similarity with a command-line argument.
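To see why hashing can catch similar images and not just identical ones, here is a minimal Python sketch of one common perceptual hash, the so-called average hash. This is an illustration only: the article doesn't say which fingerprint Findimagedupes computes, and the function name, the 8x8 grid and the sample data are our own assumptions.

```python
def average_hash(pixels):
    """Return a 64-bit fingerprint for an 8x8 grid of greyscale values (0-255).

    Illustrative stand-in only, not necessarily what Findimagedupes does.
    """
    flat = [value for row in pixels for value in row]
    if len(flat) != 64:
        raise ValueError("expected an 8x8 grid of pixel values")
    mean = sum(flat) / 64
    bits = 0
    for value in flat:
        # One bit per pixel: 1 if brighter than the average, else 0.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


# Hypothetical thumbnails: a gradient and a slightly brighter copy of it.
original = [[row * 32 + col * 4 for col in range(8)] for row in range(8)]
brighter = [[min(255, value + 10) for value in row] for row in original]

print(average_hash(original) == average_hash(brighter))
```

Because every bit only records whether a pixel is above or below the image's own average, a uniform brightness change barely disturbs the fingerprint, whereas a byte-for-byte checksum would change completely.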

The application is a command-line utility written in Go, which is good news, because Go already has nice package management options that can fetch the code and compile it without any hassle. Once you have installed Go using your distribution's package manager, set the $GOPATH variable: $ export GOPATH=/path/you/choose

Select the destination, then install Findimagedupes with: $ go get findimagedupes

After a minute or two you’ll find the findimagedupes executable in the bin subdirectory of your $GOPATH. Simply pass the path to your images as an argument and the application will analyse the files and list any duplicates it finds. Use the -t <int> option to control similarity, where <int> should be in the range 0 to 64, like this: $ findimagedupes -t 22 ~/Images
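The 0 to 64 range suggests that the threshold counts how many bits of a 64-bit fingerprint are allowed to differ (a Hamming distance). Assuming that reading is right, the comparison step can be sketched in Python as follows; the file names and fingerprint values are invented for the example.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_similar(fingerprints, threshold):
    """Return pairs of names whose fingerprints differ by at most `threshold` bits."""
    if not 0 <= threshold <= 64:
        raise ValueError("threshold must be between 0 and 64")
    pairs = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(sorted(fingerprints.items()), 2):
        if hamming_distance(hash_a, hash_b) <= threshold:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical 64-bit fingerprints, made up for illustration.
fingerprints = {
    "beach.jpg": 0xFFFF0000FFFF0000,
    "beach_copy.jpg": 0xFFFF0000FFFF0000,    # identical to beach.jpg
    "beach_retake.jpg": 0xFFFF0000FFFF00FF,  # 8 bits away from beach.jpg
    "forest.jpg": 0x0F0F0F0F0F0F0F0F,        # 32 bits away from beach.jpg
}

print(find_similar(fingerprints, 0))   # exact matches only
print(find_similar(fingerprints, 22))  # also catches the near-identical retake
```

With a threshold of 0 only the exact copy is reported; at 22 the retake joins the group while the unrelated forest shot stays out. Raise the number and more loosely related images are flagged, at the price of more false positives.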

Findimagedupes remembers all hashes, so it works very fast after chewing through your files for the first time.
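How the tool stores its hashes isn't covered here, but the general idea of a fingerprint cache is easy to sketch. The Python below is a generic illustration under our own assumptions (a dictionary keyed by file path, invalidated by modification time and size, saved as JSON), not a description of Findimagedupes' internals.

```python
import json
import os
import tempfile


def load_cache(cache_path):
    """Read the cache from disk, or start empty if it doesn't exist yet."""
    if os.path.exists(cache_path):
        with open(cache_path) as handle:
            return json.load(handle)
    return {}


def save_cache(cache, cache_path):
    with open(cache_path, "w") as handle:
        json.dump(cache, handle)


def cached_fingerprint(cache, image_path, compute):
    """Return the stored hash for image_path, calling compute() only if the file changed."""
    info = os.stat(image_path)
    stamp = [info.st_mtime_ns, info.st_size]
    key = os.path.abspath(image_path)
    entry = cache.get(key)
    if entry is not None and entry["stamp"] == stamp:
        return entry["hash"]
    value = compute(image_path)
    cache[key] = {"stamp": stamp, "hash": value}
    return value


# Demo with a placeholder file and a stand-in for the slow hashing step.
with tempfile.TemporaryDirectory() as folder:
    photo = os.path.join(folder, "photo.jpg")
    with open(photo, "wb") as handle:
        handle.write(b"placeholder bytes, not a real image")

    calls = []

    def slow_hash(path):
        calls.append(path)
        return 42  # hypothetical fingerprint

    cache = {}
    cached_fingerprint(cache, photo, slow_hash)
    cached_fingerprint(cache, photo, slow_hash)
    print(len(calls))  # the slow step ran only once
```

Reading a file's timestamp is far cheaper than decoding and hashing the image, which is why a second pass over an unchanged collection finishes so quickly.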


Eliminate redundant similar files and keep your image collections tidy.
