Best Practices for Using ThreatExchange

As with any API, some ways of using ThreatExchange give better throughput and more reliable results than others.

Downloading Data

When you first get access to ThreatExchange, one of the first things you'll want to do is start downloading indicators of compromise so you can begin evaluating or applying the data. There are three APIs (and one UI) that can be used, but only some of them should be used for automated integration into your own systems; the others should be used only for testing.

Sample CSVs from the UI

Some privacy groups let you download samples of indicators from the UI, which is the fastest way to evaluate potential data. Learn more at ThreatExchange UI.

Sampling from /threat_descriptors API

/threat_descriptors allows you to do complex searches on ThreatDescriptors. This can be useful for generating your own narrow samples, but the API is not guaranteed to return all data that matches the filters.

Recommended: Tailing /threat_updates API

/threat_updates allows you to exactly reproduce a ThreatPrivacyGroup's contents. It also allows you to get deletion events as long as you poll within 30 days of the object being deleted. Tailing /threat_updates gives you the lowest latency and complete data, and it is the only API that notifies you of deletes.
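A tailing loop can be sketched roughly as follows. This is a minimal sketch, not an official client: fetch_page is a hypothetical callable that performs the HTTP GET and returns the decoded JSON, and the should_delete flag and paging.next layout are assumptions about the response shape that should be checked against the /threat_updates reference.

```python
from typing import Callable, Dict


def apply_updates(
    store: Dict[str, dict],
    first_url: str,
    fetch_page: Callable[[str], dict],
) -> int:
    """Apply every pending update to `store`; return how many records were seen.

    Assumed response shape (verify against the API reference):
      {"data": [{"id": ..., "should_delete": bool, ...}],
       "paging": {"next": "<url of the next page>"}}
    """
    url = first_url
    applied = 0
    while url:
        page = fetch_page(url)
        for record in page.get("data", []):
            if record.get("should_delete"):
                # Deletion event: drop the local copy if we have one.
                store.pop(record["id"], None)
            else:
                # New or changed record: overwrite the local copy.
                store[record["id"]] = record
            applied += 1
        # Stop when the response carries no link to a further page.
        url = page.get("paging", {}).get("next")
    return applied
```

Passing the fetcher in as a parameter keeps the loop testable without network access; in a real integration it would wrap an HTTP client and add the access token.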

/<TAG_ID>/tagged_objects API

/<TAG_ID>/tagged_objects allows you to reliably download all ThreatDescriptors that carry a given tag. Since most data is tagged, this is a reliable way to get data. However, you must do client-side filtering to remove unwanted data that shares the same tag (for example, data in the wrong privacy group or of the wrong type). Additionally, since you don't learn of deletions or updates, you may want to start over and refresh the data from tagged_since=0 at some interval (for example, every 30 days).
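The client-side filtering step might look like the sketch below. The field names type and privacy_members are assumptions made for illustration, as are the example values; adjust them to whatever fields you actually request from the API.

```python
from typing import Iterable, Iterator, Set


def keep_wanted(
    records: Iterable[dict],
    wanted_types: Set[str],
    wanted_groups: Set[str],
) -> Iterator[dict]:
    """Yield only records of a wanted type that are shared with a wanted group.

    `type` and `privacy_members` are assumed field names for this sketch.
    """
    for record in records:
        if record.get("type") not in wanted_types:
            continue  # wrong indicator type
        if not wanted_groups.intersection(record.get("privacy_members", [])):
            continue  # not shared with any group we care about
        yield record
```

For example, list(keep_wanted(records, {"DOMAIN"}, {"<PRIVACY_GROUP_ID>"})) keeps only domain records shared with that one group.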

Tag your data

Tagging your data makes it easier for you and others to find the indicators you care most about. For example, tagging descriptors with evil allows others to filter descriptor searches down to data with that tag. You can also search the threat_tags endpoint by that tag and see all the tagged objects visible to you. The tagging endpoint supports partial matches on tags, so a query for evil will surface any tags visible to you that match evil*.

Be descriptive with your tags

Threat-tags (also known as "subjective tags") contain metadata fields describing what they are. If you create the tag foo, others can inspect the metadata to see what it means or why the data was tagged. However, it's more helpful to make the tag itself descriptive: tags like campaign_zeusbotnet or malicious_ssl_cert are great examples.

Consider the privacy rules

ThreatTags are visible based on the PrivacyType of the tagged data. For example, if the tag public_tag is on ANY descriptor that is publicly visible (privacy type of VISIBLE), then the tag is visible to all members. Conversely, if the tag nonpublic_tag is ONLY on tagged objects that are shared with specific members (privacy types HAS_WHITELIST or HAS_PRIVACY_GROUP), then the tag will only be visible to those members. Tag your data accordingly. Please review the PrivacyType documentation for more information on privacy in ThreatExchange.
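The visibility rule can be illustrated with a small model. This is only a sketch of the rule as described above, not how ThreatExchange implements it: each tagged object is reduced to a hypothetical (privacy type, allowed members) pair, and privacy-group membership is flattened into a plain set of member names.

```python
from typing import Iterable, Set, Tuple

# (privacy_type, members allowed to see the object); the member set is
# ignored for VISIBLE objects. This pair is a simplification for the sketch.
TaggedObject = Tuple[str, Set[str]]


def tag_visible_to(member: str, tagged: Iterable[TaggedObject]) -> bool:
    """Return True if `member` can see at least one object carrying the tag."""
    for privacy_type, allowed in tagged:
        if privacy_type == "VISIBLE":
            return True  # one public object makes the tag visible to everyone
        if privacy_type in ("HAS_WHITELIST", "HAS_PRIVACY_GROUP") and member in allowed:
            return True
    return False
```

Under this model, a tag stays private only as long as every object carrying it is restricted; adding it to a single VISIBLE descriptor exposes the tag to all members.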

For more use cases with ThreatTags, see the ThreatTag documentation.

Use batch requests for improved throughput

Batch requests allow you to make multiple requests to the Graph API using a single HTTP call. For more information, please review the Graph API Batch Requests documentation.
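As a rough illustration of the request shape, the sketch below builds the form fields for one batch POST that fetches several objects. The helper name build_batch_form is hypothetical, and the per-object fields query is just one possible choice of sub-request; the batch parameter is a JSON array of operations, each with a method and a relative_url.

```python
import json
from typing import Dict, List


def build_batch_form(
    object_ids: List[str],
    fields: List[str],
    access_token: str,
) -> Dict[str, str]:
    """Build form fields for one Graph API batch POST (one GET per object ID)."""
    field_list = ",".join(fields)
    operations = [
        {"method": "GET", "relative_url": f"{object_id}?fields={field_list}"}
        for object_id in object_ids
    ]
    # The whole list of sub-requests travels as a single JSON-encoded parameter.
    return {"access_token": access_token, "batch": json.dumps(operations)}
```

The returned dictionary would be sent as the form body of a single POST to the Graph API root. The response is a list with one entry per sub-request, each of which should be checked individually for errors.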

You can also query for multiple objects by ID using the following syntax.

https://graph.facebook.com/v2.8/?ids=[id1,id2]&access_token=<access_token>

If you only need specific fields, add the fields parameter:

https://graph.facebook.com/v2.8/?ids=[id1,id2]&fields=field1,field2&access_token=<access_token>
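Building these URLs by hand is error-prone, so a small helper can assemble them. The sketch below is one possible approach: multi_id_url is a hypothetical name, and it assumes the square brackets in the syntax above are placeholder notation, so the IDs are sent as a plain comma-separated list.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

GRAPH_ROOT = "https://graph.facebook.com/v2.8/"


def multi_id_url(
    ids: Iterable[str],
    access_token: str,
    fields: Optional[Iterable[str]] = None,
) -> str:
    """Build a multi-object lookup URL, optionally restricted to some fields."""
    params = {"ids": ",".join(ids)}
    if fields:
        params["fields"] = ",".join(fields)
    params["access_token"] = access_token
    # safe="," keeps the comma separators literal instead of encoding them as %2C.
    return GRAPH_ROOT + "?" + urlencode(params, safe=",")
```

Using urlencode also takes care of escaping any characters in the token or field names that are not URL-safe.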

Including nested fields and objects in result data

It can sometimes be more efficient to include various nested fields or related objects in your search results. The following syntax shows how, for the facebook.com indicator object, to pull all of its descriptors without issuing additional API calls:

https://graph.facebook.com/v2.8/788497497903212?fields=descriptors{owner,description,status,share_level},indicator,type&access_token=<access_token>

RESULT:
{
  "descriptors": {
    "data": [
      {
        "owner": {
          "id": "820763734618599",
          "name": "Facebook Administrator"
        },
        "description": "Facebook",
        "status": "NON_MALICIOUS",
        "share_level": "GREEN",
        "id": "834469179976904"
      },
      {
        "owner": {
          "id": "588498724619612",
          "name": "Facebook CERT ThreatExchange"
        },
        "description": "Non malicious",
        "status": "NON_MALICIOUS",
        "share_level": "GREEN",
        "id": "1202389109786630"
      }
    ],
    "paging": {
      "cursors": {
        "before": "ODM0NDY5MTc5OTc2OTA0",
        "after": "MTIwMjM4OTEwOTc4NjYzMAZDZD"
      }
    }
  },
  "indicator": "facebook.com",
  "type": "DOMAIN",
  "id": "788497497903212"
}
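Once the nested response has been decoded, the descriptors can be read straight out of it without further API calls. The following is a minimal sketch that assumes the response has the shape shown above; descriptor_summaries is a hypothetical helper name.

```python
from typing import List, Tuple


def descriptor_summaries(response: dict) -> List[Tuple[str, str, str]]:
    """Return (owner name, status, share level) for each nested descriptor."""
    # Nested edges arrive wrapped in a {"data": [...], "paging": {...}} object.
    descriptors = response.get("descriptors", {}).get("data", [])
    return [
        (d["owner"]["name"], d["status"], d["share_level"])
        for d in descriptors
    ]
```

Note that the nested descriptors edge carries its own paging cursors, so an indicator with many descriptors may still require following those cursors to see all of them.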