PC Pro

When it comes to heavy industry such as cranes and trains, IoT needs to be as mature as the products it’s embedded in – plus why we need smart data.

- cassidy@well.com

You have to feel for those heavy, long-term industries. They’re still out there, trying to align themselves with the trends from the flashier, faster-moving business sectors. Which is how I found myself sitting in a giant warehouse/art installation in Milan, between a double-deck train carriage and an MRI machine, trying to make sense of Hitachi’s new IoT platform.

It’s called Vantara – and that’s about all I can tell you. There were no downloads, no review copies, no discussion of when the iPad version might be out. In my normal encounters with software companies looking to impress with their IoT offerings, this would be grounds for finding a technical exec to cross-examine.

But here you need a different mindset. One that’s measured in years, rather than seeing how much the latest gizmo is going to sell for on Amazon. If we’re honest, the lifecycle of the current crop of poorly implemented, cheap IoT systems is probably a few years. That’s no good in industry. Integrate a poor-quality IoT product and you’ll shorten the lifecycle of the surrounding refrigerator.

Okay, in the domestic world we’re probably willing to replace something that’s crashed or been compromised. Not so much in the industrial or medical worlds.

I have friends who make a tidy living – if an occasionally downright dangerous one – revamping MRI machines suffering from marginal power supplies, because it’s definitely a good idea to keep an early, low-res MRI system going. You can use the low-res one on simpler cases and thereby bump up your overall capacity to scan. But the industry of rebuilding an MRI PSU is driven by the reassurance that there’s in-depth support available for the IT-driven parts of the machine long-term.

In fact, that might be the way to draw a clear line. If it’s a throwaway device then the IoT can be expected to be of the same quality. If it has nuts and bolts and parts that unscrew, and a manufacturer that sets a 20-year lifecycle, then the IoT has to participate in both longevity and modularity.

While I was sitting in the big dark shed in Milan, the UK mainstream press was obsessing about a train going into service with an expected lifespan of 27 years. It didn’t help that an aircon failure meant it delivered punters late on its first day of service.

But Hitachi is all about the long haul, and looking at its various products on show in Italy, it was clear the company wasn’t in any rush to slap some cosmetic controllers or trivial logging into its existing, highly automated tech products. Although, to be fair, thinking of a whole train – a quarter of a kilometre long – as a single “product” is a freaky idea. More so if all you’ve seen are throwaway CCTV systems and washing machines with last year’s Android tablet glued to them with a hot-melt gun.

It isn’t just about longevity, though. The jokey IoT vulnerabilities that people file under Schadenfreude are just a pale shadow of what’s to come if companies such as Hitachi nip out and hire the guys who made the CCTV camera whose footage you could see only on an iPad and IE6, or those who left internet-connected fridges open to spammers. An IoT takedown of a car is bad enough – but a train? Or how about a construction site crane? Hitachi makes those too. (At this point, don’t Google “hitachi crane” – you’ll lose about half a day just looking at the size of the things. I want the one on the Dutch secondhand crane site, built in 1991.)

IoT at this kind of scale won’t be anything like what we’ve seen so far. It’s likely to be far more passive than the CCTV/fridge early efforts. Simple devices that spit out data, with the brains responsible for doing something with that data being separated and hence upgradable.

That’s unfortunate, because smarter IoT reduces data volumes, while at the same time increasing hacking risk. It’s a damn shame that hackers make all this so difficult – because, as my lost afternoon shows far too well, once they’re dealing with this kind of equipment, people form fan clubs. If we have Eddie Stobart truck spotters then it would be cool for science classes to adopt a big crane, say, and read its data feed – rather in the way that you can track Swiss train movements or the whereabouts of The Maltese Falcon (the ship, not the movie).

The culture of the hacker now has us continually thinking about security and the kangaroo court of the data breach, rather than the benefits of sharing data. Much of this stems from people believing that public cloud servers and public data movements using open standards are the be-all and end-all. But that’s wrong: deploying IoT for heavy equipment will rely on hybrid deployments, with some data open for public access, but with the sensitive data tucked away in a special-purpose private cloud.

Which brings me back to Hitachi and the way it plays its cards close to its chest. Google “hitachi vantara” then try to stay away from crane and earthmover porn (I couldn’t find any MRI machine porn): this will give you some idea of the spread of services it has stuck under one brand. There’s the sensor bit, the cloud compute bit, and a data analytics bit that used to be an outside company.

Even at this early-days launch, Hitachi showed baffling industrial processes being controlled in real-time. There was a lot of hard-to-follow stuff about common formats for data, reusable libraries for visualisation, and cross-development between previously separate divisions.

The one interesting omission was any reference to a standards body. One expects nerd projects to breed standards bodies like naked mole rats: how else are third parties going to use their products in a hybrid deployment? Surely they’ll need some certainty around the development roadmap? Infrastructure of the type that Hitachi had dumped in this warehouse, including another amazing-looking crawler-crane-tilt-swivel thing, is only bought by very large institutions that will inevitably have a broad spread of IoT investment, right across the varied range of their asset portfolio. Having your crane unable to talk to your dumpster would be rather irritating, to say the least. But bearing in mind the lifecycles of this kind of kit, I expect a standards body will pop up pretty soon.

Smart logging, at last

I don’t know about you, but I find that coughs and colds do terrible things to my concentration. This is by way of excusing my manners while listening to the CEO of NetScout, during the recent NetEvents tour of Silicon Valley. I try to limit my exposure to chief executives’ presentations: they and marketing directors are restricted by either job role or shareholder relations in terms of what they can talk about. Plus, even in tech companies they can be quite remote from the basic details of how their products work. I like to chat with the men and women in the engine room; the soldiers on the battlefield.

So, allegedly, I may have been nodding off. When I came round a little later, something was niggling at me in the back of my mind about “smart data”. Almost all of his slides and a lot of the words pointed towards smart data being an important concept, especially in the field of internal network traffic monitoring – which NetScout has some right to consider as its home turf.

The penny didn’t really drop until I remembered a completely different conversation I’d had back in the summer with a friend who runs a heavyweight WordPress website.

He was getting into a spot of computational bother, because he wanted to keep track of what was happening in log files of anything up to a terabyte in size.

If your basic documents are always in the region of 100KB to 100MB, then it seems absurd that a website – made of parts that fit neatly into that size range – could ever generate so much log file data. When logs become that big, someone isn’t doing their job right, surely?

No, is the surprising answer. The problem is in the format. Log files for websites are in human-readable text, quite often with some nod to the trials of being read in isolation. So the line of text you’re reading can have huge labels and descriptors around values or copies of data that are comparatively tiny.
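To put a rough figure on that overhead, here is a minimal Python sketch. Everything in it is invented for illustration: the log line, the field names and the numeric codes are assumptions, not any real web server's format. It compares a labelled text line with the same values packed into a fixed binary layout.

```python
import struct

# Hypothetical event: a failed login written as a verbose, labelled text line.
text_line = (
    "2017-11-02T09:15:27Z host=web01 service=wordpress level=WARNING "
    'event="login failed" user_id=4217 status=401 retries=3\n'
)

# The same values in an assumed fixed layout (little-endian, no padding):
# 4-byte Unix timestamp, 2-byte host id, 2-byte event code,
# 4-byte user id, 2-byte HTTP status, 1-byte retry count.
RECORD = struct.Struct("<IHHIHB")
binary_record = RECORD.pack(1509614127, 1, 7, 4217, 401, 3)

text_size = len(text_line.encode("utf-8"))
binary_size = RECORD.size  # 15 bytes for this layout

print(f"text line: {text_size} bytes, packed record: {binary_size} bytes")
```

The packed record is useless to a human with a text editor, which is exactly the trade-off at stake: the labels are there for the reader, not for the data.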

This style of presentation is pervasive, across far more job functions than its genesis might indicate, mainly because you’re likely to want to use a variety of analysis tools to find those trends or prove your point. And that demand has shaped the market pretty strongly in favour of syslog as a format for just about everything.

This is a bit of a shock for folk like me. For one thing, the decision about what’s “in the log” and what’s “data” has clearly gone too far over in favour of the first option. Hence my friend’s problem, of having to track what’s going on in the most recently written entries in a log that blazes along like crazy. This makes my comparison from a couple of issues ago about the Sorcerer’s Apprentice look more like Mickey Mouse has been strapped into a rocket at Cape Canaveral.

Logging data formats underpin all the big numbers you see on the internet, about the internet. Systems that run out of space, or are quoted as “facing performance challenges”, are all trying to cope with immense volumes of data kept in human-readable logs. The Internet of Things chats away in formats that are vendor- and device-specific, yet also open, so that any analysis tool can be used on them. And because log files are raw text, they aren’t readily used by several tasks at the same time.

In my friend’s case, he couldn’t pull the real-time results off quickly enough for the web server jobs waiting to write fresh events to the log to be satisfied before they crashed.

All of which means I really should have paid more attention at the NetScout talk. The boss had the right idea: the days of verbose logging may have to come to an end.

Some of the inattention on my part was driven by his essential CEO-ness and grand strategic presentation, but think about it: this is a grand strategic matter. The last revolutionary work in this area was the birth of XML, courtesy of a working group at Sir Tim Berners-Lee’s W3C. XML was a long time coming, and it’s hard to say whether the web would have been impeded had it not been born. By contrast, logging issues have slowly become a problem for everybody, and until now there hasn’t been a handy guru waiting in the wings with the right answer.

If that’s a bit planetary-scale for you, then think of it like this. When we all started to learn about databases and how to design them, everybody took great care to understand the difference between a text field (twelve) and a numeric field (12). One, you could do maths on; the other would have to be parsed first, should you have been dumb enough to misuse it in such a way. Parsing is an expensive business when considered computationally: if you’re going to store all your maths fields as “twelve”, not 12, then the fans in your PC are going to whizz up a lot, and results will arrive slowly.
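A toy Python sketch of that distinction, with the caveat that the three-word lookup table is a stand-in I have made up (a real parser for number words would be far larger and slower): the numeric values can be summed directly, while the text values need a parsing step for every single item first.

```python
# Hypothetical lookup table, for illustration only.
WORDS = {"ten": 10, "eleven": 11, "twelve": 12}


def total_from_text(values):
    """Sum values stored as words: each one must be parsed before any maths."""
    return sum(WORDS[word] for word in values)


def total_from_numbers(values):
    """Sum values stored as numbers: no parsing step at all."""
    return sum(values)


as_text = ["twelve", "ten", "eleven"]
as_numbers = [12, 10, 11]

print(total_from_text(as_text), total_from_numbers(as_numbers))
```

Both calls give the same total; the difference is the extra lookup per value, which is the work a text log forces on every tool that reads it.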

A log, in pursuit of being readable by a person, uses text almost to the complete exclusion of all else. This brings about the ludicrous situation in which a log of database transactions can easily exceed the size of the database it’s logging.

In many ways, the notion of smart logging is a case of back to the database. Giving up on the plain-text readability is an open-sesame decision. Then, all you have to do is figure out in advance what you want to know, and set up smart data capture rules for that.

Let’s say you need to know about 404 errors, or maybe mistyped passwords – who had them, when they had them, how many times they retried. A structured database can contain fields for those items, and the trick favoured by smart data designers for at least the past three decades, where every event is just a pointer back to a field containing the full content, actually helps to reverse the bloat trend.
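As a minimal sketch of that pointer trick, assuming a made-up two-table schema (the table names, columns and sample events are all mine, not NetScout’s or anyone else’s product): each distinct message is stored once, every event carries only a small integer pointing at it, and the “who, when, how many retries” question becomes a short query. It uses Python’s built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, text TEXT UNIQUE);
    CREATE TABLE events (
        ts         INTEGER NOT NULL,   -- Unix timestamp
        user_id    INTEGER NOT NULL,
        message_id INTEGER NOT NULL REFERENCES messages(id)
    );
""")


def log_event(ts, user_id, text):
    """Store the full text once; each event only points back at it."""
    conn.execute("INSERT OR IGNORE INTO messages (text) VALUES (?)", (text,))
    (message_id,) = conn.execute(
        "SELECT id FROM messages WHERE text = ?", (text,)
    ).fetchone()
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (ts, user_id, message_id))


# Hypothetical sample events.
log_event(1509614127, 4217, "login failed: bad password")
log_event(1509614131, 4217, "login failed: bad password")
log_event(1509614140, 4217, "login failed: bad password")
log_event(1509614200, 88, "404 not found: /wp-admin/old.php")

# Who mistyped a password, and how many times did they try?
retries = conn.execute("""
    SELECT e.user_id, COUNT(*)
    FROM events e JOIN messages m ON m.id = e.message_id
    WHERE m.text LIKE 'login failed%'
    GROUP BY e.user_id
""").fetchall()

message_count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(retries, message_count)
```

Four events, but only two message strings stored: the repeated text is where a plain log bloats, and the pointer is what stops it.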

Of course, this is a considerable simplification. Nobody can do or say anything until at least 2022 without sticking “AI”, “machine learning” or “smart” into every description. In this case, quite a bit of machine learning can be put into the early days of a smart data log-building process, since both the AI and the database get to see what kind of stuff they’re dealing with.

The bit where the humans get to stop reading a 100MB log file isn’t going to happen any time soon, because very often in looking at raw log files you’ll discover things you weren’t expecting to find. So I expect “smart logs” and “smart data” to appear alongside the old full-text logging – with very little space saved – for a good few years yet.

ABOVE If Eddie Stobart can have fans, I’m sure there are people who’d love to access data from Hitachi’s array of heavy machinery
BELOW Suffice to say, Hitachi offers a wide spread of services
@stardotpro Steve is a consultant who specialises in networks, cloud, HR and upsetting the corporate apple cart
BELOW Hitachi’s new fleet of InterCity trains are designed for a 27-year life, so their IoT integration better have a similar lifespan
ABOVE Need to find out what’s causing the error? It might take a while...