Linux Format

Build a software analysis pipeline

In the second part of our web application security series, Tim Armstrong takes us through the essentials of software composition analysis.


Software developers the world over have a hard enough time maintaining and securing their own code, so it’s fairly common for the libraries and Docker containers used, especially in large projects, to be a few versions behind. When was the last time you actually audited 100 per cent of the code for all of the software used in any of your projects? Never, right? You don’t have time, you’re not an expert in every language, and by the time you were done you’d need to do it all again. Software composition analysis (SCA) solves this problem by effectively doing this for you.

In this tutorial, you’ll learn how to use a number of SCA tools to protect your code by extending the CI/CD pipeline created in the first part of this series, where we learned about static analysis and setting up a pipeline in GitLab CI. You can get a copy of where we left off by forking the repository at https://gitlab.com/plaintextnerds/web-app-security-tutorial1-lxf279, but we highly recommend picking up a copy of the previous issue and following that first if you can.

SCA tools such as Snyk, WhiteSource, Gemnasium and Dependabot scan your dependencies and containers for vulnerable versions, with the goal of either updating them for you via a pull request (PR) or notifying you of the issue. Each works in a slightly different way, uses different databases, and presents the data differently, so finding the right one for you takes some exploration. To this end, this tutorial looks at Dependabot, which is open source, and Snyk, which is reasonably priced and offers a free option for individuals and open source projects.

Snyk it to them!

Snyk is a hosted solution, so to get started you’re going to need to create an account. You can do this by going to https://app.snyk.io/login and selecting the identity provider of your choice. There isn’t a direct registration option, which could be an issue for people who don’t trust any of the providers listed, but the selection is pretty big so it shouldn’t be a problem.

Next, you’ll be presented with the option to select the location of your source code. On this landing page the choices are GitHub and Bitbucket, but because this tutorial is using GitLab you’ll need to click the full “list of integrations” link. From here you can select GitLab.

Because of the nature of what Snyk is doing, you need to give it a personal access token with API scope privileges in order to get it working. This is the highest level of privilege that you can grant a token in GitLab. So if you’re working with confidential code (in other words, it’s not an open source project) it’s best to set up a dedicated account for it so that Snyk is acting as its own user. This is best practice when dealing with any third-party integrations, because if the API key is leaked somehow it lets you identify unauthorised modifications easily.

To create a personal access token in GitLab go to https://gitlab.com/-/profile/personal_access_tokens (GitLab account required) in a new tab. Give the token a name in the Token Name field, check the box to grant it API scope, and click ‘Create personal access token’. Confusingly, this inserts an element (that contains the token) into the page just above the section where you entered the details. This only shows up once, so if you refresh or leave the page before you’ve copied it you’ll need to delete the token and recreate it.

Now that you have your token, head back over to the Snyk tab, paste it in the box and hit Save. Now that it’s linked you’ll need to add the project to Snyk, so go ahead and hit the button, which will take you to a page where you can select any of the projects the GitLab user has access to. Select your fork of the Web App Security tutorial code and click the ‘Add selected repositories’ button. This should find the requirements.txt file and start scanning.

Looking at the results from the scan you can see that the version of Django used has a known vulnerability, specifically a SQL injection pathway known as CVE-2021-35042. This vulnerability was disclosed between the writing of the previous tutorial and this one, which exemplifies the importance of having good SCA tools in your pipeline!
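To get a feel for what a scanner is doing at this step, here is a minimal Python sketch of a version-range check. It is not Snyk's implementation. The affected ranges are the ones given in the public CVE-2021-35042 description (3.1.x before 3.1.13 and 3.2.x before 3.2.5), so verify them against the advisory before relying on them, and the sample requirements text and helper names are hypothetical.

```python
# Affected ranges as given in the public CVE-2021-35042 description
# (assumption: check the advisory before relying on these numbers).
AFFECTED_RANGES = [
    ((3, 1, 0), (3, 1, 13)),  # 3.1.x before 3.1.13
    ((3, 2, 0), (3, 2, 5)),   # 3.2.x before 3.2.5
]

# Hypothetical requirements.txt content, for illustration only.
SAMPLE_REQUIREMENTS = "Django==3.2.4  # web framework\nrequests==2.25.1\n"


def parse_version(text):
    """Turn '3.2.4' into (3, 2, 4); missing parts default to 0."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_affected(version, ranges=AFFECTED_RANGES):
    """True if the version falls in any half-open [start, fixed) range."""
    v = parse_version(version)
    return any(start <= v < fixed for start, fixed in ranges)


def pinned_version(requirements_text, package):
    """Return the version from a 'package==x.y.z' line, or None."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, version = line.partition("==")
        if sep and name.strip().lower() == package.lower():
            return version.strip()
    return None


if __name__ == "__main__":
    version = pinned_version(SAMPLE_REQUIREMENTS, "django")
    print(version, "affected" if is_affected(version) else "not affected")
```

A real scanner also has to cope with unpinned requirements, pre-release tags and a constantly updated vulnerability database, which is exactly the work you are outsourcing to an SCA tool.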

Plug the vulnerability gap

Hitting the ‘Fix this vulnerability’ button finds the smallest upgrade that resolves the vulnerability. If it’s a major version bump (assuming semantic versioning: major.minor.patch) then Snyk lets you know that there’s a potential breaking change, so you’d better check the release notes.
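The idea of "smallest upgrade that fixes it" plus a major-version warning can be sketched in a few lines of Python. This only illustrates the logic described here, assuming plain numeric semantic versions. How Snyk actually picks a version isn't covered in this tutorial, and the function names and version lists are hypothetical.

```python
def parse_version(text):
    """Turn '3.2.5' into (3, 2, 5) so versions compare numerically."""
    return tuple(int(p) for p in text.split("."))


def smallest_fix(current, fixed_versions):
    """Pick the lowest fixed release that is newer than the current one."""
    cur = parse_version(current)
    candidates = [v for v in fixed_versions if parse_version(v) > cur]
    return min(candidates, key=parse_version) if candidates else None


def is_major_bump(current, target):
    """True when the first (major) component increases."""
    return parse_version(target)[0] > parse_version(current)[0]


if __name__ == "__main__":
    fixed = ["3.1.13", "3.2.5"]  # hypothetical list of patched releases
    target = smallest_fix("3.2.4", fixed)
    print(target, "(check release notes)" if is_major_bump("3.2.4", target) else "")
```

Comparing tuples of integers rather than raw strings matters: as strings, "3.10.0" would sort before "3.9.0".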

Alternatively, you could wait for Snyk to create the Merge Request for you when it next does its scheduled scan. Every time it does a scheduled scan Snyk automatically creates a Merge Request (if one doesn’t exist already) for any problems found.

By default, Snyk scans your repository once a day (and once per Merge Request), which is pretty helpful because it means that, unlike a pipeline workflow, the Merge Requests are still being created even if you haven’t worked on a project in a while.

Hopping over to the Dependencies tab and hitting All Dependencies shows that Snyk detects not only the dependencies that you’ve defined, but also the dependencies of those dependencies (so-called transitive dependencies) by constructing a graph of each library’s requirements. This ensures that you’re also protected against issues that are deep in the tree. When using pip and requirements.txt this is more of a best-guess solution, though, so using a locking dependency manager like Poetry or Pipenv can improve the reliability here by providing Snyk with all the information it needs to know exactly which versions you’re using.
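As a rough illustration of what "constructing a graph" means, the following Python sketch walks a requirement graph breadth-first and records how deep each package sits. The graph is a hand-written, hypothetical example rather than real package metadata, and a real tool would build it from a lock file or the package index.

```python
from collections import deque

# Hypothetical requirement graph: each package maps to what it requires.
REQUIRES = {
    "django": ["asgiref", "sqlparse"],
    "asgiref": ["typing-extensions"],
    "requests": ["urllib3", "idna"],
    "sqlparse": [],
    "typing-extensions": [],
    "urllib3": [],
    "idna": [],
    "gunicorn": [],
}


def transitive_dependencies(direct, requires):
    """Breadth-first walk returning {package: depth} for everything reachable."""
    depth = {name: 0 for name in direct}
    queue = deque(direct)
    while queue:
        package = queue.popleft()
        for child in requires.get(package, []):
            if child not in depth:  # already seen, which also guards against cycles
                depth[child] = depth[package] + 1
                queue.append(child)
    return depth


if __name__ == "__main__":
    for name, level in transitive_dependencies(["django", "requests"], REQUIRES).items():
        print("  " * level + name)
```

Depth 0 is what you wrote in requirements.txt. Everything at depth 1 or more is code you never explicitly asked for but still ship.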

That old Dependabot

So Snyk is certainly useful, but perhaps you work at an organisation where giving a third party complete access to your source code isn’t acceptable for whatever reason. This is where Dependabot comes in, because it can run standalone in your CI/CD pipeline where you can keep everything isolated.

A short while after the original Dependabot was bought out by Microsoft’s GitHub, a GitLab-flavoured fork (https://gitlab.com/dependabot-gitlab/dependabot) was created by Andrejs Cunskis, which has since been sponsored by JetBrains. There are a number of supported ways to get Dependabot up and running, but in this case you’ll need the standalone mode because it has to be built into the CI/CD pipeline. To do this, first create a folder called .gitlab in the project directory. In that folder create a file called dependabot.yml.

For this project, the minimum that you’ll need to define in the dependabot.yml file is:

    version: 2
    updates:
      - package-ecosystem: "pip"
        directory: "/src"
        schedule:
          interval: "daily"

While required by the file spec, the schedule directive isn’t going to limit the run to once a day when using Dependabot in the CI/CD pipeline (despite being set to “daily”).

Next, you’ll need to provide Dependabot with a personal access token with API scope, which is exactly the same process as with Snyk. Go to https://gitlab.com/-/profile/personal_access_tokens in a new tab. Give the token a name in the Token Name field, check the box to grant it API scope, and click ‘Create personal access token’. This will insert an element (that contains the token) into the page just above the section where you entered the details.

Copy this token and head back over to the project, then select Settings > CI/CD and under the Variables section click the ‘Add variable’ button. Paste the token into the Value field, set the key to SETTINGS__GITLAB_ACCESS_TOKEN and ensure that both the ‘Protect variable’ and ‘Mask variable’ boxes are checked. Then press the ‘Add variable’ button.

These checkboxes tell GitLab to redact the variable if it’s detected in CI/CD logs (‘Mask variable’) and to only provide it when the CI/CD pipeline is running on a protected branch (‘Protect variable’). This is important because failing to protect the variable in this way would mean that anyone who can push to your project (such as external contributors) could get hold of your token and use it to do whatever they wanted through the GitLab API as if they were you.
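If it helps to picture what the two checkboxes do, here is a small conceptual model in Python. It is emphatically not GitLab's code: the data layout, function names and token value are all made up for illustration, and the only behaviour it models is the one described above (protected variables are withheld from unprotected branches, masked values are redacted from logs).

```python
# Hypothetical project variables; the token value is a made-up placeholder.
PROJECT_VARIABLES = {
    "SETTINGS__GITLAB_ACCESS_TOKEN": {
        "value": "glpat-example-not-real",
        "protected": True,
        "masked": True,
    },
    "DJANGO_DEBUG": {"value": "0"},
}


def variables_for_job(variables, branch_is_protected):
    """Return the name -> value mapping a job on this branch would receive."""
    return {
        name: var["value"]
        for name, var in variables.items()
        if branch_is_protected or not var.get("protected", False)
    }


def mask_log(line, variables):
    """Replace the value of every masked variable with a placeholder."""
    for var in variables.values():
        if var.get("masked", False) and var["value"]:
            line = line.replace(var["value"], "[MASKED]")
    return line


if __name__ == "__main__":
    print(sorted(variables_for_job(PROJECT_VARIABLES, branch_is_protected=False)))
    print(mask_log("using token glpat-example-not-real", PROJECT_VARIABLES))
```

The two protections are complementary. Masking only hides the value in log output, so without protection a contributor's branch could still receive the token and send it somewhere else.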

Finally, you’ll need to update the .gitlab-ci.yml. To the stages section add the line - composition-analysis and then below that you’ll need to add the .dependabot-gitlab template, which is as follows:

    .dependabot-gitlab:
      stage: composition-analysis
      image:
        name: docker.io/andrcuns/dependabot-gitlab:0.4.4
        entrypoint: [""]
      variables:
        GIT_STRATEGY: none
        RAILS_ENV: production
        SETTINGS__STANDALONE: "true"
        SETTINGS__GITLAB_URL: $CI_SERVER_URL
      only:
        - main
        - merge_requests
      before_script:
        - cd /home/dependabot/app
      script:
        - bundle exec rake "dependabot:update[$CI_PROJECT_PATH,$PACKAGE_MANAGER,$DIRECTORY]"

Then add the dependabot-pip job as follows:

    dependabot-pip:
      extends: .dependabot-gitlab
      variables:
        PACKAGE_MANAGER: pip
        DIRECTORY: /src

When you’re all done the file should look something like this:

    stages:
      - static-analysis
      - composition-analysis

    .dependabot-gitlab:
      stage: composition-analysis
      image:
        name: docker.io/andrcuns/dependabot-gitlab:0.4.4
        entrypoint: [""]
      variables:
        GIT_STRATEGY: none
        RAILS_ENV: production
        SETTINGS__STANDALONE: "true"
        SETTINGS__GITLAB_URL: $CI_SERVER_URL
      only:
        - main
        - merge_requests
      before_script:
        - cd /home/dependabot/app
      script:
        - bundle exec rake "dependabot:update[$CI_PROJECT_PATH,$PACKAGE_MANAGER,$DIRECTORY]"

    dependabot-pip:
      extends: .dependabot-gitlab
      variables:
        PACKAGE_MANAGER: pip
        DIRECTORY: /src

    ...

(Where the ... stands for the static analysis jobs from the previous tutorial.)

Finally, git add the .gitlab-ci.yml and the .gitlab/dependabot.yml files, then git commit -m "Added Dependabot SCA stage" and git push the changes up to GitLab.

What’s interesting to note here is that the variables in the .gitlab-ci.yml file that point to the same values as the ones in the dependabot.yml are selectors for that configuration. This means that if you wanted to extend this to support scanning a Docker container you would need to add a directive to the dependabot.yml like in the following code:

      - package-ecosystem: "docker"
        directory: "/"
        schedule:
          interval: "daily"

and a selector to the .gitlab-ci.yml:

    dependabot-docker:
      extends: .dependabot-gitlab
      variables:
        PACKAGE_MANAGER: docker
        DIRECTORY: /

which would then ensure that your Dockerfiles are kept up to date with the latest security patches.
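The selector idea can be pictured as a simple lookup. The Python sketch below is a conceptual illustration only, not how dependabot-gitlab is written: it takes the PACKAGE_MANAGER and DIRECTORY values from a CI job and finds the matching entry in a list that mirrors the updates section of dependabot.yml. The function name and the dictionary representation are assumptions.

```python
# Entries mirroring the "updates" list of the tutorial's dependabot.yml,
# written out as plain dictionaries for illustration.
UPDATES = [
    {"package-ecosystem": "pip", "directory": "/src",
     "schedule": {"interval": "daily"}},
    {"package-ecosystem": "docker", "directory": "/",
     "schedule": {"interval": "daily"}},
]


def select_update(updates, package_manager, directory):
    """Return the entry whose ecosystem and directory both match, else None."""
    for entry in updates:
        if (entry["package-ecosystem"] == package_manager
                and entry["directory"] == directory):
            return entry
    return None


if __name__ == "__main__":
    # Values a dependabot-pip style job would pass in via its variables.
    print(select_update(UPDATES, "pip", "/src"))
    # A mismatch (wrong directory) selects nothing.
    print(select_update(UPDATES, "docker", "/src"))
```

The practical upshot is that the job variables and the dependabot.yml entry have to agree on both values, so a typo in either file means the job has no configuration to act on.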

Dependabot or depend-on-Snyk?

Out of the box, this Dependabot implementation is going to check for dependencies with known vulnerabilities on every commit to the ‘main’ branch and any commit to a branch referenced in a Merge Request. If a patch is required and there isn’t already an open Merge Request created by it to fix the issue, then it’ll create one containing the minimum possible change.

It’s also possible to configure a scheduled pipeline run to trigger the Dependabot scan jobs periodically. This means that, just like Snyk, Merge Requests will still be opened to keep you up to date even if you’re not actively working on the project at the time.

No doubt as the original Dependabot gets more integrated with GitHub, it will diverge from the GitLab fork, which is likely to remain an outside project with low integration despite the maintainer now working at GitLab. This is because GitLab has been working on its own fully integrated paid solution, Gemnasium, since 2018. Unfortunately, at the time of writing GitLab doesn’t offer a free tier of Gemnasium, so if you want to use it for your open source projects then you need to build everything from source and set up a similar pipeline to the Dependabot one that’s shown in this article.

Signs of the Dependabot divergence are already starting to be visible, with the GitHub integration now being a single-click operation and the inline reports in its Pull Requests providing information about the vulnerability along with a “Compatibility score”. GitHub’s version of Dependabot and the tooling being built around it are becoming a significant threat to GitLab’s claim of being the “leading integrated product for the entire DevOps lifecycle”. GitLab’s failure to offer free access to the whole stack for open source projects could cost it a lot of marketing power.

It will be interesting to see whether GitLab allows work on the Dependabot fork to continue now that it has hired the maintainer. Will this become a viable option for open source projects that want to stay out of Microsoft’s ecosystem, or is it going to wither away?

DependaNOT!

While the net result is the same for both of the solutions covered in this tutorial (a new PR created automatically that upgrades the Django version), Dependabot (both this open source GitLab version and the GitHub version) lacks the full feature set and depth of user experience offered by Snyk. Key features such as reporting, licence checking and issue tracking that are found in Snyk (and in WhiteSource and, to an extent, GitLab’s “Ultimate” package) are currently not available in the GitLab version of Dependabot and are still not up to a competitive level in the GitHub version.

If you’re working in the financial sector (or anywhere that handles payment card details for that matter), then it’s a no-brainer. You’re actually required to have the reporting capabilities and maintain a “vulnerability management program” in order to comply with payment card industry standards such as PCI DSS. So getting hold of a pre-built compliant solution will save you a lot of time and money. Snyk’s offering really makes a lot of sense when you consider the time it takes to set up, the features it provides, and its support for the open source community. Not to mention it has some fantastic plugins for popular IDEs to help you catch problems early on in the development life cycle.

However, if you’re working in a small company that doesn’t directly handle customer-centric elements such as credit cards or personally identifiable information, or are busy developing some kind of super-secret project that doesn’t need compliance reporting, then you can probably get by with something like the Dependabot solution provided in this tutorial.

Setting up dedicated accounts for bots like Snyk or Dependabot makes auditing changes easier. Because you know that the bot should only ever change specific files, it becomes easy to identify malicious activity.
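One way to act on this tip is a small audit script. The Python sketch below assumes you have already exported commit data into simple dictionaries (for example from git log or the GitLab API, neither of which is shown). The bot username, allow-list patterns and data layout are all hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical allow-list: files a dependency bot is expected to touch.
ALLOWED_PATTERNS = ["requirements.txt", "*/requirements.txt", "Dockerfile", "*/Dockerfile"]


def suspicious_changes(commits, bot_author, allowed=ALLOWED_PATTERNS):
    """Return (commit id, path) pairs where the bot touched unexpected files."""
    findings = []
    for commit in commits:
        if commit["author"] != bot_author:
            continue  # only audit the bot's own commits
        for path in commit["files"]:
            if not any(fnmatch(path, pattern) for pattern in allowed):
                findings.append((commit["id"], path))
    return findings


if __name__ == "__main__":
    # Made-up commit history for illustration.
    history = [
        {"id": "a1", "author": "dependency-bot", "files": ["src/requirements.txt"]},
        {"id": "b2", "author": "dependency-bot",
         "files": ["src/requirements.txt", "src/app/settings.py"]},
        {"id": "c3", "author": "alice", "files": ["src/app/views.py"]},
    ]
    for commit_id, path in suspicious_changes(history, "dependency-bot"):
        print(f"unexpected change by bot in {commit_id}: {path}")
```

This only works because the bot has its own account. If it shared yours, its commits would be indistinguishable from your own.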

Setting up Snyk is pretty much a point-and-click process. Just give it some credentials and it gets on with it.

OUR EXPERT
Tim Armstrong is a former lead engineer turned developer advocate specialising in networking, software development and security. Find him on Twitter as @omatachyru.

Creating a personal access token in GitLab is equal parts simple and scary. Most programs need API access, but with that comes risks.

Snyk’s Vulnerability view gets straight to the point and shows you everything you need to know.

When all is said and done, Dependabot and Snyk resulted in identical merge requests patching the vulnerability.

Snyk maintains a history of dependencies and vulnerabilities so that you can quickly see when a vulnerability was discovered and when you were exposed.
