The New Zealand Herald

Your personal data is traded by reckless idiots

Another week, a new leak by murky, careless brokers

- Juha Saarinen comment

Another week, another massive data leak. This time, someone had left some 1.2 billion personal data records in open, internet-exposed databases hosted on the Google Cloud.

Of the records, over 622 million contained an email address, including my one. I know that as I have alerts from the excellent Have I Been Pwned? service enabled; if you haven’t signed up to HIBP, do so now.

Sooner or later almost any service provider you use will leak data, and it’s good to know when that happens so you can change passwords or delete accounts on compromised hosts.

In this case, the leaking organisation isn’t known, but the data came from two companies: People Data Labs (PDL) and OxyData.

The two provide a service called data enrichment. If you know someone’s name or, better yet, email address, it’s easy to look them up in giant databases and add more information about them.

That’s the enrichment, and lots of companies collect data on you and trade it for marketing purposes. Adobe (yes, the Photoshop creator) is big on marketing and offers lots of specific lists with data on individuals, as researcher Wolfie Christl pointed out.

PDL and OxyData think the data came from one of their customers, but they don’t know who left so much personal information accessible to anyone with an internet connection. A fast one, that is, as the data trove weighed in at 4 terabytes.

That’s pretty amazingly careless, but massive data leaks like that happen at regular intervals so nobody should be surprised.

OxyData is pretty anonymous: its website gives no indication of who’s behind the company, though it brags about 380 million profiles on individuals.

PDL, on the other hand, is more open, naming its co-founder Sean Thorne on the site, and even offers free access for up to 1000 queries a month to its 1.5 billion records on individuals.

For New Zealand, PDL holds 2,930,945 records in its database.

The data includes links to LinkedIn and Facebook profiles, birth dates, phone numbers, locations, work and personal email addresses and much more in some records, much less in others.

According to PDL, the information has been gathered from customers who volunteered it, and from public data sources.

Think about that for a while: if like me you had never heard of PDL before, the next thing you wonder is who gave or sold them your data?

It’s free and simple to peek in PDL’s database with minimal coding required, so I did.

There’s plenty of detail on me in there, but not as much as I thought. One particular record looked like it had been merged with someone else’s data and was really wrong as a result, which is arguably worse as it could lead to mistakes and confusion with other people.

PDL holds far less data on Jacinda Ardern, but has almost as much on Privacy Commissioner John Edwards (yes, I notified him about this) as it does on me. For Edwards and myself, the PDL data looks similar to what’s on our LinkedIn profiles, which might be an indication of its origin. LinkedIn suffered a data breach in 2012, with 167 million account credentials being leaked.

All the records I looked at have unique identifiers and SHA-256 hash values computed for the email addresses.
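For readers wondering what a hashed email address looks like, here is a minimal Python sketch. It is an illustration only: the function name `email_sha256` is made up for this example, and the normalisation step (trimming whitespace and lowercasing) is an assumption, as the records don’t reveal how PDL prepares an address before hashing it.

```python
import hashlib


def email_sha256(email: str) -> str:
    """Return the hex SHA-256 digest of a normalised email address."""
    # Assumption: trim whitespace and lowercase before hashing.
    # How PDL actually normalises addresses is not known.
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Hypothetical address, used purely for illustration.
digest = email_sha256("  Someone@Example.com ")
print(digest)  # a 64-character hexadecimal string
```

Note that a hash like this hides nothing from anyone who already knows the address: they can compute the same digest and use it to match records, so it works as a lookup key rather than as anonymisation.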

I am going to go out on a limb here and say that neither Ardern nor Edwards has given permission for PDL to aggregate their data for profile enrichment or other purposes.

I most definitely have not done that, and am waiting to hear back from PDL with an explanation as to how they got the data, and an assurance that it will be deleted.

Even if the data is deleted from PDL’s database, it will still be stored elsewhere. By whom? By mysterious, irresponsible organisations that are completely clueless about basic information security, I wager. I don’t want that to happen, for obvious reasons.

Sure, the cat’s out of the bag on this one, but the potential for misuse of data is massive, and not just by surveillance capitalists and hypertargeting marketroids. Spammers and phishers love data that make their scammy offerings look less bogus as well, and scaling them up to billion-target levels must seem like a dream to criminals.

We need to think about what it means to have an unregulated personal information broking industry selling personal data on just about anyone cheaply and efficiently, and ask ourselves if this is really what we want, or need.

At the very minimum, we should be told who holds our data and why, and given a chance to opt out.

That’s for organisations operating above the line of course; the others need to be reported to the authorities and closed down.

Photo / 123rf: It’s not known who leaked the 1.2 billion personal data records, but they came from two data-enrichment sites.
