Linux Format

The ins and outs of Osquery

Intrusion detection and compliance testing are easy with SQL queries. Sounds weird? Read on and you’ll see that it’s really not…


Linux can reveal a lot about itself. As an administrator, you probably have your favourite spots in /proc or /sys. Tools such as ps or top aggregate this data to build a higher-level overview. Others, such as ip, rely on Netlink and speak to the kernel directly. And there are other places for you to explore from time to time: say, a package manager database.

Wouldn’t it be great to have a unified interface to query them all? Imagine you want to know which hosts on your network have a vulnerable software package installed, and which hosts have it running. This may seem like a straightforward task… until you realise there’s a mix of Linux distributions (and maybe even Windows) to consider. So, unification builds a solid foundation for automation.

Law of the instrument

First, you need a declarative language. You don’t tell the system how to obtain the data, you tell it which data you want. This language should also be easy to understand, yet powerful enough to express complex queries. And it should be a commodity: you don’t want to drop yet another thing into the mix. There’s (at least) one perfect fit for this: SQL.

Osquery (http://osquery.io), a free software tool created by Facebook, embraces and extends the idea. If all you have is a hammer, everything looks like a nail. With SQL, everything is a table. You use a familiar SQLite dialect (actually a superset of it) to obtain information on files, processes, sockets and pretty much everything else in your Linux, Windows or Mac OS X system.

The product comes in two major components: osqueryi and osqueryd. The latter is a daemon that runs scheduled queries in the background and pushes logs somewhere for you to analyse. Osqueryi is an interactive tool that supports just the same query language, but runs queries in real time. It’s meant to be a testbed, but it’s also a great introspection tool.

It’s important to note that osqueryi doesn’t talk to osqueryd in any way. In other words, osqueryi isn’t a client to osqueryd. They are separate yet related tools that come as one package, often called “universal” for this very reason.

Osquery may not be in your distribution’s repositories, but the project ships binary packages for all major operating systems, so you’ll hardly need to build the tool from the sources. The installation process is described in the documentation at https://osquery.readthedocs.io, and it’s fairly straightforward. You add a remote repository and import the GPG key used to sign the packages. For RPM and DEB, osquery claims to support any Linux since 2011. We can’t speak for everyone, but our Kubuntu 16.04 LTS box is covered.

Carefully selected

It’s query time! Osqueryi should work best for practice purposes. So fire up a terminal and run:

$ osqueryi

Osqueryi builds on the SQLite interactive shell, so if you’ve ever used SQLite, it should feel like home. Just keep in mind a few things. First, SELECTs only, please! Other verbs such as UPDATE or DELETE yield an error. Indeed, it would be odd to delete an open socket or a USB device, wouldn’t it? Then there are some tables that can’t be queried without a WHERE clause; osquery calls them “tables with arguments”. This makes sense for tables such as “hash” (see over the page): without a path, what do you expect osquery to hash? Wherever a column is required, you’ll see a pin next to it on the Schema page (https://osquery.io/schema). Note that you may also obtain a table’s schema with the “.schema” command from within the interactive shell, as you would do in SQLite. Similarly, “.tables” lists all the tables osquery supports on your system.

We’re good to go now. For starters, this is the equivalent of the ps ax command:

select * from processes;

It spits out many rows, so narrowing the query is a good idea. On the command line, perhaps you would use pgrep. With osquery, you would carry out the following:

select pid from processes where name like '%bash%';

Pgrep matches against process names by default, yet you can use switches to search command lines or process IDs instead. You can do the same with osquery as well: just specify the appropriate column. Note, however, that pgrep accepts regular expressions and there appears to be no easy way to do something similar with osquery. This is even more surprising given the fact that osquery provides the regex_split() function to split arbitrary data using a regex pattern.
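One possible workaround is to let osquery fetch the rows and apply the regular expression afterwards. The sketch below assumes you have captured the output of osqueryi run with its --json switch; the sample rows and the filter_rows() helper are hypothetical and only illustrate the idea.

```python
import json
import re


def filter_rows(osquery_json, column, pattern):
    """Keep the rows whose `column` value matches the regex `pattern`."""
    regex = re.compile(pattern)
    rows = json.loads(osquery_json)
    return [row for row in rows if regex.search(row.get(column, ""))]


# Hypothetical sample of what
#   osqueryi --json "select pid, name from processes"
# might print; osquery emits every value as a string.
sample = (
    '[{"pid": "1", "name": "systemd"},'
    ' {"pid": "812", "name": "bash"},'
    ' {"pid": "990", "name": "dash"}]'
)

matches = filter_rows(sample, "name", r"^(ba|da)sh$")
print([row["pid"] for row in matches])
```

This trades one query for a query plus a post-processing step, which is roughly what you would do with a pipe in the shell anyway.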

Okay, those two were easy ones. How about looking for SUID binaries in your filesystem? You can achieve this with a reasonably sophisticated find command; I can’t easily think of what you can’t do with find! But for osquery, it’s just another SELECT:

select count(*) from suid_bin;

This yields 31 binaries in our system.

In the shell, you use pipes to glue commands together. In SQL, you have JOINs. We find it particularly annoying to grep netstat for connections that a particular process created. Here’s an alternative:

select process_open_sockets.* from process_open_sockets join processes using (pid) where name = 'dnsmasq';

Remember that osqueryi operates on live OS data. While the command doesn’t require root privileges to run, it needs them to fill the “pid” and “fd” columns from /proc. Otherwise, you get a humble -1 in them, making JOINs impossible.
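To see why a -1 in “pid” defeats the JOIN, here is a small sketch that replays the same query against stand-in tables in Python’s built-in SQLite. The tables, PIDs and addresses are made up for illustration; real osquery tables have many more columns.

```python
import sqlite3

# In-memory stand-ins for two osquery tables (columns heavily simplified).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table processes (pid integer, name text);
    create table process_open_sockets (pid integer, fd integer, remote_address text);
    insert into processes values (1042, 'dnsmasq'), (2210, 'bash');
""")


def dnsmasq_sockets(socket_rows):
    """Load the given socket rows, then run the JOIN from the article."""
    conn.execute("delete from process_open_sockets")
    conn.executemany("insert into process_open_sockets values (?, ?, ?)", socket_rows)
    return conn.execute(
        "select process_open_sockets.* from process_open_sockets "
        "join processes using (pid) where name = 'dnsmasq'"
    ).fetchall()


# With root privileges the pid and fd columns are filled in...
as_root = dnsmasq_sockets([(1042, 5, "192.0.2.1"), (2210, 3, "192.0.2.7")])
# ...without them both columns read -1, so no row finds a matching process.
as_user = dnsmasq_sockets([(-1, -1, "192.0.2.1"), (-1, -1, "192.0.2.7")])

print(as_root)
print(as_user)
```

The second result is empty because no process has a PID of -1, so the JOIN has nothing to match against.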

Another typical use case for JOIN is a checksum calculation. The “hash” table does just this: you SELECT a row with the given path and get md5, sha1 and sha256 for the filesystem object:

select suid_bin.path, hash.sha256 from hash join suid_bin using (path);

Save the output somewhere sensible, and you’ll now have a good indicator if anything sensitive in your filesystem changes unexpectedly.
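Comparing a saved baseline with a fresh run could be sketched as follows. This assumes both runs were captured as JSON (for instance with osqueryi’s --json switch) and carry the “path” and “sha256” columns from the query above; the helper names and the shortened digests are hypothetical.

```python
import json


def digest_map(osquery_json):
    """Turn JSON rows with 'path' and 'sha256' keys into a {path: sha256} dict."""
    return {row["path"]: row["sha256"] for row in json.loads(osquery_json)}


def diff_digests(baseline, current):
    """Report paths that were added, removed or modified since the baseline."""
    report = {}
    for path in sorted(set(baseline) | set(current)):
        if path not in current:
            report[path] = "removed"
        elif path not in baseline:
            report[path] = "added"
        elif baseline[path] != current[path]:
            report[path] = "modified"
    return report


# Hypothetical, shortened digests standing in for two saved query outputs.
saved = digest_map(
    '[{"path": "/bin/su", "sha256": "aa11"}, {"path": "/bin/mount", "sha256": "bb22"}]'
)
fresh = digest_map(
    '[{"path": "/bin/su", "sha256": "ff99"}, {"path": "/bin/mount", "sha256": "bb22"},'
    ' {"path": "/tmp/sh", "sha256": "cc33"}]'
)

report = diff_digests(saved, fresh)
print(report)
```

Unchanged files are left out of the report, so an empty dictionary means nothing moved.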

Getting trickier

Now, imagine you wanted to check whether the wpasupplicant package on your 16.04 LTS box has received a patch against Krack. Here you go:

select * from deb_packages where name in ('wpasupplicant', 'hostapd') and version >= '2.4-0ubuntu6.2';

RPM and DEB packages may have different names, so there’s no generic “packages” table. Moreover, from osqueryi’s perspective, version is just a string: it doesn’t know anything about the Debian Policy Manual and its version numbering. Keep this in mind when comparing versions.
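The pitfall is easy to reproduce. The sketch below contrasts a plain string comparison, which is what the query above performs, with a simplified numeric-aware one. The natural_key() helper is hypothetical and deliberately ignores epochs, tildes and the other rules of real Debian version ordering; dpkg --compare-versions remains the authoritative check.

```python
import re


def natural_key(version):
    """Split a version into digit and non-digit runs so numbers compare as numbers.

    A simplification for illustration only: it does not implement the
    Debian version-ordering algorithm (no epochs, no '~' handling).
    """
    parts = re.findall(r"\d+|\D+", version)
    return [(0, int(part)) if part.isdigit() else (1, part) for part in parts]


patched = "2.4-0ubuntu6.2"
newer = "2.10-0ubuntu1"  # hypothetical later release

# As strings, "2.10..." sorts before "2.4..." because '1' < '4'.
as_strings = newer >= patched
# With numeric runs compared as integers, 10 > 4 as expected.
as_numbers = natural_key(newer) >= natural_key(patched)

print(as_strings, as_numbers)
```

In other words, the string comparison would wrongly flag the hypothetical 2.10 release as older than the patched 2.4 one.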

Packages aside, you can also query browsers for installed add-ons. This may seem surprising, but since malware sometimes disguises itself as an add-on, it makes sense. Mozilla Firefox, Google Chrome and Opera are all supported on Linux, but you need to supply the UID for the user whose profile you want to inspect. Typically you do it via a JOIN with the “users” table:

select * from firefox_addons join users using (uid) where username = 'val';

Even things as ephemeral as events are tables in osquery! Linux provides many event sources: udev, inotify, syslog and the auditing subsystem, to name a few. Given their dynamic nature, some preparations are needed to attach them to osquery.

Osquery disables event sources by default: check this with the “.features” command. You’ll need to respawn the tool with --disable_events=false to fix that. Moreover, for filesystem events (inotify) you’ll need to tell osquery which locations to monitor. There is no way to do it on the command line, so create a configuration file (/etc/osquery/osquery.conf) and make it look like this:

{"file_paths": {"home": ["/home/*/"]}}

home is just a marker, so it can be anything meaningful. Both shell-style (*) and SQL-style globs (%) are recognised. Now, start osqueryi like this:

$ osqueryi --disable_events=false --config_path=/etc/osquery/osquery.conf

Do something in your home directory, then SELECT from the “file_events” table. You should see some events under the “home” category you configured. Note that due to the way inotify works, files in directories below /home won’t be monitored. If this is not what you really meant, use a recursive pattern, such as /home/%%, instead.
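If you maintain several monitored categories, generating the file can save you from JSON typos. This is a minimal sketch: the file_paths_config() helper and the extra “etc” category are hypothetical, and only the “file_paths” key and the /home/%% pattern come from the text above.

```python
import json


def file_paths_config(categories):
    """Build the text of an osquery config holding only a 'file_paths' section."""
    for name, patterns in categories.items():
        # Each category maps to a list of glob patterns (strings).
        if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
            raise ValueError(f"category {name!r} must map to a list of strings")
    return json.dumps({"file_paths": categories}, indent=2)


# "home" uses the recursive pattern from the article; "etc" is a made-up extra.
config_text = file_paths_config({"home": ["/home/%%"], "etc": ["/etc/%%"]})
print(config_text)
```

Write the printed text to the file you pass via --config_path and restart osqueryi to pick it up.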

Adding support for audit events is a bit trickier. First, you’ll need to stop the auditd daemon if you have it running. After that you can start osqueryi with:

$ sudo osqueryi --disable_events=false --disable_audit=false --audit_allow_config=true --verbose --audit_debug

First, you’ll need sudo, because talking to the audit framework generally requires root privileges. Then you enable both eventing and audit, and tell osqueryi that it can change the audit rules. This is required so the tool can install its own rules to listen to the events of interest.

The two final flags help you to see what goes on behind the curtain, including the raw events osqueryi receives. You may notice it hooks the execve() syscall to fill the “process_events” table. User events such as authentication attempts go to the “user_events” table. Keep osqueryi running for some time, and you’ll be surprised how many activities occur in your Linux box unnoticed.

You may think that configuring event sources in osqueryi is not straightforward, and you’re probably right. The reason is they were meant to be used with another osquery component: osqueryd. Let’s cover it briefly.

Daemon stuff

Osqueryd is a daemon. It sits in the background, executing your scheduled queries and then sending the results somewhere. The idea is that you deploy it on the hosts, be they servers or desktops, and gather the information network-wide. If a careless colleague opens a malicious email attachment, you’ll notice a suspicious connection, hopefully before it does much harm. Sure, running something that phones home on end users’ laptops raises privacy concerns, but that’s outside the osquery scope.

Remember that osqueryd is installed along with osqueryi. However, it’s not enabled by default. Before you enable it, you’ll need to create a configuration file. /usr/share/osquery/osquery.example.conf is a good starting point. It’s just JSON, so copy it to /etc/osquery/osquery.conf, open it in your favourite editor (vim) and alter it to suit.

The configuration file conveys daemon options such as where to get the configuration beyond the initial one and where to store the results. By default, osqueryd reads /etc/osquery/osquery.conf and everything under /etc/osquery/osquery.conf.d/, but it can also fetch remote JSON over HTTPS. This is how you can make a query schedule, which is the bulk of the osqueryd configuration.

In a nutshell, a schedule item is just an SQL query and an interval telling how often you want it to execute. There are also query packs that act as higher-level aggregates. Osquery comes with several packs bundled, and you can also make your own if needed. Packs are smart enough to detect if they should run on a given host using discovery queries, and their output goes to the results log, as with usual queries.
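A schedule along these lines can be sketched by building the JSON from a dictionary. The make_schedule() helper and the item names are hypothetical; the “schedule”, “query” and “interval” keys follow the layout used by the example configuration, with the interval given in seconds.

```python
import json


def make_schedule(items):
    """Build a config dict with a 'schedule' section from {name: (query, seconds)}."""
    schedule = {}
    for name, (query, interval) in items.items():
        if interval <= 0:
            raise ValueError(f"interval for {name!r} must be positive")
        schedule[name] = {"query": query, "interval": interval}
    return {"schedule": schedule}


# Item names are made up; the queries are the ones used earlier in the article.
config = make_schedule({
    "suid_binaries": ("select suid_bin.path, hash.sha256 from hash join suid_bin using (path);", 3600),
    "bash_processes": ("select pid from processes where name like '%bash%';", 60),
})
print(json.dumps(config, indent=2))
```

Merge the printed section into your osquery.conf rather than replacing the file, since the daemon options live alongside the schedule.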

The default location for the results log is /var/log/osquery/osqueryd.results.log, but results can also go to an HTTP handler, syslog or a Kafka topic. Log entries are also JSON, which makes it easier to extract structured data from them. Typically, you don’t do it yourself but rather feed osquery logs to Kibana or something similar. Result logs are differential: they record changes only. There are also snapshot logs that contain the whole thing: they are larger and typically stored separately.
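Should you want a quick look without a full log pipeline, a few lines of Python will do. The sketch below assumes one JSON object per line with “name” and “action” fields, which is how differential entries are commonly laid out; the sample lines are invented, so check the field names against your own log before relying on it.

```python
import json
from collections import Counter


def summarise(log_lines):
    """Count differential log entries per (query name, action) pair."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        counts[(entry.get("name"), entry.get("action"))] += 1
    return counts


# Invented sample lines; real entries carry more fields (host, timestamps...).
sample_log = [
    '{"name": "suid_binaries", "action": "added", "columns": {"path": "/tmp/sh"}}',
    '{"name": "suid_binaries", "action": "removed", "columns": {"path": "/bin/old"}}',
    '{"name": "bash_processes", "action": "added", "columns": {"pid": "812"}}',
    '{"name": "suid_binaries", "action": "added", "columns": {"path": "/opt/x"}}',
]

summary = summarise(sample_log)
print(summary)
```

To run it on the real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.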

Osquery is somewhat similar to Gobolinux (www.gobolinux.org) in that it takes a fresh look at well-known objects. You may like it, you may not, but it’s still useful to know that such an option exists. Someday it may save you from having to reinvent the wheel.

The Osquery table schema is documented on the web. Icons reveal if the table needs a WHERE clause or comes with a performance penalty.
Osqueryi is ideal for testing your queries before you schedule them. With some tweaks you can even ensure events are supported.
A picture is worth a thousand words. That’s why many people feed logs to visualisation platforms such as Kibana.
