Do you CVE what I CVE?

Wind River Systems’ David Reyna talks to Jonni Bidwell about keeping the embedded space safe – and also fixing robots on Mars.

2019-07-30 -



David Reyna, one of the Senior Technical Staff at Wind River Systems, is a veteran of the embedded computing world, with over 20 years of experience. He’s involved with the Yocto Project (see Tutorials, LXF251) and develops workflow and optimisation tools for Linux. His work began with digital radiography and most recently involves management tools for software vulnerabilities.

Wind River is a leader in the embedded space, being one of few companies that can claim its code is running on Mars. David gave a talk on this at the Linux Foundation’s Embedded Linux Conference in Edinburgh in October 2018, and was good enough to take the time to share some of his wisdom with our own suspected Martian-in-residence.

Linux Format: Wind River has been around for a long time – almost as long as I have, in fact (blimey – Ed). Could you tell us a bit about the early days of the company, and how your real-time OS (RTOS) called VxWorks came about?

David Reyna: Yep, we’ve been around since the ’80s. We were embedded since before embedded was a thing. We were in IoT before IoT was a thing.

I should say that I wasn’t around in the early days – I joined Wind River in 2000. But I know some of the history, so here goes… Wind River started at UC Berkeley, back when people were just starting with embedded systems. There were a few other companies doing similar work – Vertex was one of them and there were a couple of other small ones. The difficulty was that there was no operating system for embedded devices.

So Wind River decided to move into that industry and provide an OS with some key features. The main one was [being] real-time, that was a real distinguishing factor. It didn’t just need to run, it needed to have response, provability, latency, all those kind of things.

That’s why they started more or less in the real-time world. It had to be modular, and that was quite a challenge. You couldn’t just start with Windows and add stuff or take stuff away. You had to start from scratch and be as small as possible.

LXF: Today we find Linux in all kinds of embedded devices…

DR: Yes! Isn’t that amazing?

LXF: It is, but alongside that there’s a number of people saying it’s not quite the right fit, particularly in the real-time setting. So there are a number of other RTOSes either out there already, VxWorks for example, or some in the works – Zephyr (see interview LXF247), Google’s Fuchsia (see feature LXF255) – that hope to be a better fit.

DR: At Wind River we obviously have VxWorks, but we also have Linux. We find that people want hybrid solutions. First of all, once systems are powerful enough, hardware and memory-wise, Linux is often the better solution – particularly if you need networking, interaction or GUI interfaces or other complex things like that. Lots of embedded systems are powerful enough to do that now.

So even now as the traditional operating system grows down into the embedded world, we’re growing up and into the bigger world. Linux is so powerful, it can get so much done. It’s difficult in the sense that the reliability, the provability, the real-time stuff isn’t there. That’s why these hybrid environments are becoming common, that’s where the world is going. Wind River, for example, has this whole thing where we mix systems with virtualisation. We have hard code for the real-time stuff that has to get done. For example, in auto-tainment systems the braking system is done by an RTOS, so you can prove to the government that everything’s going to be safe and sound. Then the whole infotainment side of things will be handled by Linux, because it’s so much faster and you’re not reliant on it – it’s allowed to fail, to be slow or whatever. You can get Linux to market so much faster, and it’s more than ‘good enough’.

LXF: I guess being powerful enough to do virtualisation and containers opens up a lot of possibilities?

DR: Yep, that’s a solution between these two boundaries, RTOS and traditional. Containers too – we’re putting them in a lot of devices. It’s amazing what can be done these days.

LXF: Tell me a little bit about your background. What were you doing before you joined Wind River?

DR: I was in digital radiography. We made a system that was like television but for X-rays. We called it ‘reverse geometry’. The goal was to try and find better ways to do X-rays, to introduce more computer control, and just better technology in general. There were problems with backscatter messing up X-rays, so you reverse everything and do all kinds of other clever-looking stuff to fix them.

Anyway, it was all digital, so I was brought in to do the engineering and the software development side of things.

I did the embedded part and the GUI/presentation part on, oh goodness, back then it was a [Motorola] 68000 processor. Also about that time the PC came about, and it turned out it was just powerful enough to do this stuff, and since it was cheap, you couldn’t not use it. So that’s where I started, with GUI stuff and with embedded stuff, writing some documentation too.

From there I moved into the embedded world, dealing with embedded servers, like CLA servers, web servers. We were a small company and Wind River acquired us. Now I’m in VxWorks and Linux, and I do a lot of helping with customer tools, lots of Eclipse-based stuff, SRTool for security management – the subject of my talk. We use the Yocto Project (www.yoctoproject.org) tools to help manage their builds and UI, and lots of customer-experience stuff. That’s roughly where I’m focused. So there’s the hard-code engineers, but then there’s plenty of other things that need to be done, like making sure the documentation’s up to date and working with everything [see box, lower left, for more on this].

LXF: The Yocto Project is pretty important. Can you explain to our readers what it is and where it came from? I visited their stand earlier and got this furry mascot, if that helps.

DR: It came from the OpenEmbedded world. Before, everything was built with tools like Make, it being by far the most common. Make is very powerful, but it’s not designed for cross-building, it’s not designed for scalability. We have to do scalability, we’re not just making a thing. We’re making stuff that has to fit in very, very small things and also very big things. We have to cater to displays, to networking, autonomous, automotive, lots of different markets.

Traditional tools like Make just weren’t cutting it, so a number of people said “This isn’t working, we need something more powerful”. They started to make – ha ha – a replacement called BitBake, designed to do all those things better and to be modular and scalable and powerful. The challenge is when you have all that power, you have to learn how it works. If you don’t need that power that’s fine, just stay with the simple systems, like Buildroot or whatever. Those are powerful enough for a lot of stuff, and there are other build systems too.

But Yocto Project on top of OpenEmbedded on top of BitBake, that’s how we’ve scaled for large corporations like Wind River and Mentor Graphics and for people who use it directly. That’s how we maintain the complexity we need to service our customers and to do multiple releases per year and to address multiple markets. You really need that power. In fact we contribute a lot of our knowledge, drawn from the hard lessons we learned meeting the needs of our customers. So it’s not just something made up, it’s driven by competing and surviving in the market.

LXF: Wind River products are found all over the place. I believe some are even on Mars?

DR: When someone asks me “Well, what do you guys do?”, I say “You know those rovers crawling about on Mars, Curiosity – who’s alive and well – and Spirit and Opportunity, which are no longer active… that’s our operating system”. We were adopted early on by the military because we were in the embedded space, as were they, and we’re also involved with real-time and provability. Those are exactly the kinds of things that NASA needs.

So we were involved with a lot of satellites, orbiters, landers and other unmanned systems, because we filled that need. And it works. They used our radiation-hardened processors, they actually used these really old ones because that’s what they had at the time, but their lifespan is about thirty years – some are still going. One of our famous stories is about a live update – I dunno if you remember this? There was a satellite, Pathfinder 1, heading off to Mars and it kept crashing. It turns out there was something still running that shouldn’t be – it was gathering data and would fill up the system and cause faults.

So they grabbed some engineers, I actually know one of them, and said “We gotta solve this!”. So they got a simulator and figured out there was a problem with one of the filesystems. They managed to work out a solution and test it locally, and then, while the satellite was somewhere in between here and the moon, they sent it via live update, and fixed the problem. We did a live update in the middle of space [read more about this at http://bit.ly/lxf253mars].

Stuff like that, if you have to write operating systems designed for that kind of thing, you need the right partners. NASA definitely know what they’re doing, so that partnership worked. So we’re really proud of that.

LXF: On the subject of fixing things, your talk “Keeping up with the Joneses (CVEs)” was about keeping embedded OSes vulnerability-free [see the PDF at http://bit.ly/lxf253jones]. On desktop/server Linux this is generally just a matter of keeping packages up to date, and I guess trusting your distribution to downstream fixes in a timely manner. But on embedded Linux you don’t necessarily have that luxury, so how do you keep up to date?

DR: Well, there are two answers to this. The first is we’re still Linux, just like Debian and Ubuntu, but we actually have more control over our distribution, because our footprint is so much smaller than that of a desktop. So we have more control over the components involved, which is a bit of a luxury. But it’s the same problem as for desktop Linux – we still rely on open source. We rely on thousands of packages to make a working system and each of those can be vulnerable.

Here’s a strange observation: it tends to be the very old and the very new packages that have vulnerabilities – the middle guys seem to have it worked out, either because they came into the world when CVEs were bad enough that you really had to fix stuff, or they’ve already been patched. But for really new stuff you have to be vigilant, you have to have multiple tools because, like all things security, one solution doesn’t solve all the problems. Just as there are many ways to cause a vulnerability, so there are many ways to protect against it. There’s a lot of talk about the scanners – commercial scanners that scan for signatures of particular patches; those can alert you to vulnerable packages. There’s a lot of data, that’s what the CVE database is all about, trying to capture vulnerabilities on an industry level. You can scan your own systems and see how that matches with what they’ve discovered.

There are a lot of tools based around that idea. Then there’s also vigilance – running real-time tests, patches and reproducers. That’s the ideal world, having all these tools at your disposal. But like any kind of scanner, for viruses or malware or whatever, if you don’t know about it you don’t know how to look for it. And if it’s really new, you may have no information, no reproducer, not even a specific package or package version… that’s when it gets really scary. You need intelligence, you need people, you need clever, flexible, heuristic solutions. That situation also doesn’t have the awareness of others.
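To make the idea of scanning your own system against the CVE data concrete, here is a minimal Python sketch. It is not how SRTool or any commercial scanner works: the record layout, the package names and the CVE IDs are all placeholders invented for illustration, and real tools have to cope with version ranges and backported patches rather than exact version matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CveRecord:
    """Hypothetical, simplified CVE record (not the real CVE/NVD schema)."""
    cve_id: str
    package: str
    affected_versions: frozenset


def scan_manifest(manifest, records):
    """Return the sorted CVE IDs whose package and exact version are installed."""
    hits = []
    for record in records:
        installed = manifest.get(record.package)
        if installed is not None and installed in record.affected_versions:
            hits.append(record.cve_id)
    return sorted(hits)


# Placeholder image manifest: package name -> installed version.
manifest = {"examplelib": "1.0.2", "exampled": "7.4"}

# Placeholder records; the IDs are made up and do not refer to real CVEs.
records = [
    CveRecord("CVE-0000-0001", "examplelib", frozenset({"1.0.1", "1.0.2"})),
    CveRecord("CVE-0000-0002", "exampled", frozenset({"7.1"})),
    CveRecord("CVE-0000-0003", "notinstalled", frozenset({"2.0"})),
]

print(scan_manifest(manifest, records))
```

As the interview points out, a match like this only works for defects somebody has already recorded; a brand-new vulnerability with no package or version attached would never show up in such a scan.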

When you have, say, a Spectre/Meltdown scenario, people know they have to fix it, and they’ll fund it because they have to. But managing all that, scanning all these CVEs, going into detail to know whether or not it applies, that’s hard work. I gave an example in my talk where a defect was listed as an Android CVE, but it was really a kernel defect that just happened to get found in Android first, and nobody mentioned that it applied elsewhere. You have to pay for that level of vigilance, if you want to have a system that customers can rely on and not hit you with escalations and lack of trust. Without awareness though, that cost is not contained, so you need to recognise costs, and what we’re trying to do with this particular talk is to address that gap. So I mention that people will fund the defects but they won’t fund writing of tools around it or otherwise assisting it.
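The Android example shows why filtering on a CVE's product label alone can miss defects. The Python sketch below contrasts the two checks; the record fields, the product name and the component list are assumptions made up for illustration, and in practice working out the affected component is exactly the manual analysis the interview describes.

```python
def applies_by_label(record, our_product):
    """Naive check: trust the product the CVE was filed against."""
    return record["listed_product"] == our_product


def applies_by_component(record, our_components):
    """Check the component that actually contains the defect."""
    return record["component"] in our_components


# Placeholder record: filed against Android, but the flaw sits in the kernel.
record = {
    "id": "CVE-0000-0004",  # made-up ID, not a real CVE
    "listed_product": "android",
    "component": "linux-kernel",
}

# Hypothetical component list for an embedded Linux product.
our_components = {"linux-kernel", "busybox"}

print(applies_by_label(record, "our-embedded-linux"))
print(applies_by_component(record, our_components))
```

The label check says the defect is somebody else's problem, while the component check flags it, which is the gap the anecdote is about.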

LXF: A lot of this workload must come from projects fixing things in an ad hoc way, which might be okay for a small project, or might’ve been okay in the past. But projects grow and time maketh fools of us all…

DR: Yes, systems do become legacy. Say there’s a system that was running four years ago when there were two or five hundred defects here: “OK, that’s a small incremental overhead, so I’ll just do it”. They may have a system that was on that level. I’m sure a lot of systems are based on spreadsheets, they’re based on email, they’re based on documents, because that was sufficient to meet the problems. But here we are in 2018 with 14,000 CVEs so far and the year’s not done yet. So it’s finally become a big enough pain that we have to do that, to really ensure that we’re catching that stuff, and providing tools to make it faster and easier.

Because again, the cost is to the people that you most need for other stuff. It’s to the engineers who shouldn’t be scanning stuff when it could be automated and the answers got to faster. We need tools to track all this stuff and do reports. The amount of money we lost over lead engineers doing a lot of customer calls, doing a lot of reports to management, and always regarding the same information because that information changed incrementally, and needed to be constantly regathered…

So we have tools now where you can just click a button and see “This is the state of that CVE in that product”. I’ll also say part of the problem is not so much “What am I vulnerable to?”, but rather “What am I not vulnerable to?”. Let’s say 30 per cent of those 14,000 CVEs apply to us, then that means 70 per cent do not. And it’s important to know that, because customers have to have the whole picture. The existing tools will tell you what you are vulnerable to, but not what you aren’t. You might think, “Well, I don’t care, I’m not vulnerable to it”, but the customer cares because they have to know. Something in future might change and that ‘not vulnerable’ might become ‘vulnerable’ once people understand it better, or discover some new implications. So you need to be able to say “I wasn’t vulnerable, is that still the case?”. That’s management, and that’s why I got involved, to make these tools that really help people out.
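As a rough illustration of tracking "not vulnerable" as a first-class answer, here is a short Python sketch. It is not SRTool's data model: the class names, the fields and the idea of comparing against an upstream modification date are assumptions for the example, and the 30 per cent split is the hypothetical figure from the interview.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    VULNERABLE = "vulnerable"
    NOT_VULNERABLE = "not vulnerable"
    UNTRIAGED = "untriaged"


@dataclass
class Assessment:
    """Hypothetical per-product verdict on one CVE."""
    cve_id: str
    product: str
    status: Status
    assessed_on: date


def needs_review(assessment, upstream_modified):
    """A verdict is stale if the upstream record changed after we assessed it,
    even when the verdict was 'not vulnerable'."""
    return upstream_modified > assessment.assessed_on


def summarise(assessments):
    """Count verdicts per status so 'not vulnerable' is reported too."""
    counts = {status: 0 for status in Status}
    for item in assessments:
        counts[item.status] += 1
    return counts


# The interview's hypothetical split of the 2018 CVE count.
total = 14_000
applicable = total * 30 // 100       # 4,200 need fixing or mitigating
not_applicable = total - applicable  # 9,800 still need a recorded verdict

old_verdict = Assessment("CVE-0000-0005", "product-a",
                         Status.NOT_VULNERABLE, date(2018, 10, 1))
print(applicable, not_applicable)
print(needs_review(old_verdict, date(2018, 11, 5)))
```

Keeping the assessment date next to each verdict is one simple way to answer "I wasn’t vulnerable, is that still the case?" without redoing the whole analysis.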

LXF: I went to one of the Zephyr talks yesterday and they were talking about commercial code safety certifications. It’s not something I fully appreciated, but they use all kinds of crazy expensive tools, complicated tests and every line of code has to be audited. I guess VxWorks has to worry about this stuff too?

DR: Zephyr actually came from Wind River, and now it has grown into its own thing and we’re happy about that. But they’re trying to find the same solution as VxWorks is, just for even smaller embedded devices. Provability, that’s the word they use for safety-critical stuff – where you have to do those academic, line-by-line tests so you can be sure it’s going to work, every time.

You run every single code path and use every single tool against it, eg Coverity, all of those things. You can’t rely on any one solution if you really want to prove it. Wind River is actually doing a lot of that certification because VxWorks can do that, Zephyr can do that… Linux? No way, it’s impossible. There are too many contributors, it’s too complex, there’s too many millions of lines of code.

LXF: Tell me about the tool you talked about in your presentation, SRTool.

DR: That’s one of my little contributions.

“So this coder walks into a bar and says…”

“…Hey pal, can I have a double please?”

“And the bartender says, Get outta here!…”

“…This is a private function.” (Oh good lord – Ed)
