Epoch

To reduce the amount of data processed and stored, VulnIQ allows administrators to define an epoch time. Data that was created before this time will be ignored. For example, storing vulnerability data created in 2002 is probably useless, so you can choose to ignore stale data and use those resources for fresh data.
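
The epoch rule can be pictured with a minimal Python sketch. The epoch value, the record layout and the function name below are assumptions made for illustration and do not describe VulnIQ internals.

```python
from datetime import datetime, timezone

# Hypothetical epoch chosen by an administrator; not a VulnIQ default.
EPOCH = datetime(2010, 1, 1, tzinfo=timezone.utc)


def is_after_epoch(created, epoch=EPOCH):
    """Return True when a record is new enough to be processed and stored."""
    return created >= epoch


# Made-up records; only the creation time matters for this check.
records = [
    {"id": "old-advisory", "created": datetime(2002, 6, 1, tzinfo=timezone.utc)},
    {"id": "new-advisory", "created": datetime(2021, 3, 15, tzinfo=timezone.utc)},
]

kept = [r["id"] for r in records if is_after_epoch(r["created"])]
print(kept)  # ['new-advisory']
```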

Filtering

VulnIQ filters out irrelevant data. For example, a git repository may contain 10,000 commits, of which 9,990 are not related to security. VulnIQ discards the 9,990 irrelevant commits and stores only the remaining 10 relevant commits.

This may confuse users at first, but without this filtering the signal-to-noise ratio becomes unbearable.

It is possible to fine-tune filtering rules when necessary, but that is an advanced configuration that should be properly planned and tested.
For example, you may want to include any data that contains the name of your company, even if it would be considered irrelevant otherwise.
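
As a rough illustration of how a custom include rule can sit on top of the default filtering, here is a small Python sketch. The keyword list, the company name ExampleCorp and the function name are all hypothetical; the actual VulnIQ rules and their configuration format are not shown in this document.

```python
import re

# Hypothetical keyword list standing in for the default relevance rules.
SECURITY_PATTERN = re.compile(
    r"\b(cve-\d{4}-\d+|vulnerab\w*|overflow|xss|injection)\b", re.IGNORECASE
)

# Hypothetical custom rule: always keep data that mentions this name.
ALWAYS_INCLUDE = re.compile(r"\bExampleCorp\b", re.IGNORECASE)


def is_relevant(text):
    """Keep text that looks security related or matches the custom rule."""
    return bool(SECURITY_PATTERN.search(text) or ALWAYS_INCLUDE.search(text))


# Made-up commit messages.
commits = [
    "Fix typo in README",
    "Fix buffer overflow in header parser",
    "Update ExampleCorp integration docs",
    "Bump version to 3.0.2",
]

kept = [c for c in commits if is_relevant(c)]
print(kept)
```

In this sketch the third message is kept only because of the custom rule, which is the kind of override described above.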

Supported Data Sources And Types

VulnIQ supports the following data sources and types:
  • NVD CVE Feeds
  • CVRF Vendor Advisories
  • CWE and CAPEC data from MITRE
  • OVAL Data from CIS (or other sources)
  • RSS/Atom Feeds
  • Any web page
  • Tweets
  • Emails (only Gmail is supported at the moment)
  • Git data. VulnIQ can also optionally generate the following information from Git data:
    • Commit diffs
    • Change logs between successive versions
    • Tag/release lists
    • Commit-tag mappings
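
To make "successive versions" concrete, the following Python sketch orders tag names numerically and pairs each version with the next one. It assumes tags are plain dotted version numbers; how VulnIQ actually orders tags is not described here, and the function names are illustrative.

```python
import re


def version_key(tag):
    """Turn a tag such as '3.0.10' into a tuple of ints for numeric sorting."""
    return tuple(int(part) for part in re.findall(r"\d+", tag))


def successive_pairs(tags):
    """Pair each version with the one right after it, in version order."""
    ordered = sorted(tags, key=version_key)
    return list(zip(ordered, ordered[1:]))


pairs = successive_pairs(["3.0.2", "3.0.10", "3.0.1"])
print(pairs)  # [('3.0.1', '3.0.2'), ('3.0.2', '3.0.10')]
```

Sorting numerically rather than alphabetically matters here: as plain strings, "3.0.10" would sort before "3.0.2".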

Data Source Configuration

Authorized users can create or modify data sources using the admin UI (/datasource/new page) or the APIs (/api/datasource/new endpoint). For example, to add a new GitHub repository:
  1. Click the Datasources link on the menu (/datasource/list page).
  2. Click the Add New link at the top of the page.
  3. The /datasource/new page will be loaded.
  4. Select Git Repository from the Data Source Type dropdown.
  5. Enter a unique guid for this data source. This field MUST be a short, descriptive alphanumeric value (it may also contain - or _). This value cannot be changed once created. Choose it carefully!
  6. Enter a user-friendly name into the Data Source Name field.
  7. Select the data source Status.
  8. Enter a user-friendly description into the Description field.
  9. Enter an Update Interval. Make sure that the interval is not smaller than required. For example, if it is known that the data source is updated every 2 hours, running the data update process every 2 minutes is pointless.
  10. Enter a reasonable value into Max Processing Time field. Make sure that this value is large enough to give the update process enough time to complete the update. Data processing will be killed if it cannot be completed in Max Processing Time seconds.
  11. Enter the GitHub URL of the project into the Primary Url field.
  12. Click Fetch and Process URLs From This Source if you want URLs discovered in data from this data source to be processed.
  13. Data Type Specific Configuration section contains data source specific configuration and varies based on selected data source type. For example, for a git data source the following options are available:
    1. Commit Link Format String : For example, for Apache Cassandra, https://github.com/apache/cassandra/commit/%s (%s will be replaced with the commit id).
    2. Tag Link Format String : For example, for Apache Cassandra, https://github.com/apache/cassandra/releases/tag/%s (%s will be replaced with the tag name).
    3. Do Not Store Content : When true, git commits will be processed, but their contents will not be saved. This is useful, for example, when the project license does not allow storing source code.
    4. Process Commit Diffs : By default only commit messages will be processed. When enabled, the backend process will create and process diffs for commits.
    5. Save Commit Diffs : When enabled, commit diffs will be stored. Only used when processCommitDiffs is also enabled.
    6. Create Change Logs : When enabled, change logs for tags will be created. Only the changes between successive versions (e.g. between 3.0.1 and 3.0.2) will be included in the generated change log.
    7. Add Diffs To Search : When enabled, diffs will be indexed and will be available for full-text search.
    8. Max Processable Diff Length : When processing diffs, to prevent potential performance issues caused by large diffs, you can set maxProcessableDiffLength to a reasonable value. Only the first maxProcessableDiffLength characters in diffs will be processed.
    9. Max Parent Count For Diffs : Defaults to 1. In git, merge commits may have multiple parents. By default the first parent is the most significant, and GitHub and the git command line client generate diffs only with respect to the first parent. Setting this value too high may increase CPU and disk usage.
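
A few of the rules above can be sketched in Python: the guid character rule from step 5, the %s substitution in the link format strings, and the truncation applied by maxProcessableDiffLength. This is only an illustration. The 32-character limit, the function names and the sample commit id and tag name are assumptions, and the request format of the /api/datasource/new endpoint is not shown because it is not described here.

```python
import re

# Rule from step 5: letters, digits, '-' and '_' only.
# The upper bound of 32 characters is an illustrative guess at "short".
GUID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,32}")


def is_valid_guid(guid):
    """Check a candidate guid against the assumed character rule."""
    return GUID_PATTERN.fullmatch(guid) is not None


def build_link(format_string, value):
    """Replace the %s placeholder with a commit id or tag name."""
    return format_string.replace("%s", value, 1)


def truncate_diff(diff, max_processable_diff_length):
    """Keep only the first max_processable_diff_length characters."""
    return diff[:max_processable_diff_length]


# "abc123" and "cassandra-3.0.2" are placeholder values.
commit_link = build_link("https://github.com/apache/cassandra/commit/%s", "abc123")
tag_link = build_link(
    "https://github.com/apache/cassandra/releases/tag/%s", "cassandra-3.0.2"
)

print(is_valid_guid("apache-cassandra"))  # True
print(is_valid_guid("apache cassandra!"))  # False
print(commit_link)
print(tag_link)
print(truncate_diff("+added line\n-removed line\n", 11))  # +added line
```

The sketch uses a plain string replacement instead of Python's % operator so that any other percent signs in a URL are left untouched.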

Original Copies

Do you have an existing solution using NVD JSON feeds, OVAL XML files or similar?
You don't have to dump your existing solution; you can simply feed it with original copies of data from VulnIQ. For certain data types, such as NVD or OVAL, VulnIQ stores original copies of the data so that you can download and use them.
VulnIQ will update the data from the source as configured and will always provide you with the latest data.

Data Processing

At the end of each data update cycle, VulnIQ records the last update timestamp, and on the next cycle it processes only data newer than that timestamp. For example, if the last update of a git repository was completed 3 hours ago, VulnIQ will only process changes that took place in the last 3 hours.

Sometimes you may want to re-process some data, for example after a configuration change. In that case, you can update the last process end timestamp, and on the next cycle data processing will start from the updated timestamp.
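
The incremental behaviour, and the effect of moving the timestamp back, can be illustrated with a short Python sketch. The item layout and the function name are assumptions; the sketch models only the timestamp comparison, not how VulnIQ stores or updates the timestamp.

```python
from datetime import datetime, timedelta, timezone


def select_new_items(items, last_update):
    """Return only the items created after the last completed update."""
    return [item for item in items if item["created"] > last_update]


# Fixed reference time and made-up items so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
items = [
    {"id": "a", "created": now - timedelta(hours=5)},
    {"id": "b", "created": now - timedelta(hours=2)},
    {"id": "c", "created": now - timedelta(minutes=30)},
]

# Normal cycle: the last update finished 3 hours ago.
last_update = now - timedelta(hours=3)
normal = [i["id"] for i in select_new_items(items, last_update)]

# Re-processing: move the timestamp back so older data is picked up again.
last_update = now - timedelta(hours=6)
reprocessed = [i["id"] for i in select_new_items(items, last_update)]

print(normal)       # ['b', 'c']
print(reprocessed)  # ['a', 'b', 'c']
```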

Performance

The VulnIQ backend processor is a high-performance application that can handle large amounts of data using minimal resources.
Handling a very large git repository for the first time may take a significant amount of time, up to a couple of hours, because of the initial cloning and processing.
Please also note that very large git repositories, e.g. Chrome, will consume GBs of disk space. Enabling diffs for commits will also significantly increase disk space usage. For example, if you enable diffs for a repository that contains 700,000 commits, up to 700,000 diffs need to be created and stored. You may want to disable commit diffs and view them at external sources, e.g. GitHub, when necessary.
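
For a rough sense of scale, the arithmetic can be sketched in Python. The average diff size of 20,000 bytes is purely an assumption chosen for illustration; real sizes vary widely between repositories.

```python
def estimate_diff_storage_bytes(commit_count, average_diff_bytes):
    """Upper-bound estimate: one stored diff per commit."""
    return commit_count * average_diff_bytes


# 700,000 commits with an assumed average diff size of 20,000 bytes.
estimate = estimate_diff_storage_bytes(700_000, 20_000)
print(f"{estimate / 1e9:.1f} GB")  # 14.0 GB
```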