ROTTEN RESTRICTIONS
Copyright is a funny old thing, and despite the compulsory modules on media law we were forced to take during our journalism qualification, we’re still not 100 per cent sure that we understand the copyright implications of using FreshRSS.
Copyright belongs to the person who created a work and remains with them until it’s assigned to some other entity. When this article was written, copyright belonged to the author. It was later assigned to Future Publishing for a financial consideration and, according to the terms of the contract, may not be published elsewhere without its consent.
If you put a PDF of this article on your own site, Future has a right to be annoyed and to send a pack of bloodthirsty lawyers to follow the reek of copyright infringement to your doorstep.
However, copyright can be a grey area in practice. No one is going to sue you because your browser makes a local copy of the HTML of a web page. Similarly, no one is going to sue you for using their published RSS feed to retrieve information.
It’s highly unlikely that you’ll be on the receiving end of a restraining order for using wget to pull full pages, either.
In reality, using FreshRSS to systematically strip a blog or news site of its fresh content for your own consumption is going to be overlooked. If it shows up at all in the site admins’ access logs, it will appear as just another bot. It’s insignificant.
Where you may run into difficulty is if you share your pilfered full-text articles and blog posts with the world.
To avoid the kind of legal difficulties that inevitably accompany copyright violation on a massive scale, we deleted all the feeds we had so lovingly collated and repopulated the LXF server at https://fresh.lxf.by using only feeds from the Future Publishing stable. Pop along and have a read of titles as varied as Marie Claire, Cycling News, Ideal Home and the GoCompare blog.
Additionally, in an attempt to ensure that the LXF server isn’t being scraped by search engines and competing for keywords with our SEO-optimised brethren, we checked to make sure that the robots.txt file (located in the p directory) was set to reject all.
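For reference, a “reject all” robots.txt is only two lines. Assuming your web server’s document root is FreshRSS’s p directory, the file would look something like this (well-behaved crawlers honour it; it’s a polite request, not an enforcement mechanism):

```
# Ask all crawlers to stay away from the entire site
User-agent: *
Disallow: /
```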
If you’re unwise enough to follow our example by opening up your instance to the world, we suggest you do the same.