Guide to bulk comments data

The bulk comments method lets you retrieve Facebook or Instagram comments on multiple post or comment IDs with a single asynchronous query. Call the Meta Content Library API client post() method with the facebook/comments/job or instagram/comments/job path, and include the parent_ids parameter.

A single query can include up to 250 IDs, in any combination of post and comment IDs. No sorting is applied in the response.
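If you have more than 250 IDs, one way to stay within the limit is to split the list into batches and submit one asynchronous query per batch. The following is a minimal sketch in plain Python. The helper name chunk_parent_ids is hypothetical and is not part of the Meta Content Library API client, and the IDs are made-up placeholders.

```python
MAX_PARENT_IDS = 250  # Per-query limit stated in this guide


def chunk_parent_ids(parent_ids, size=MAX_PARENT_IDS):
    """Split a list of post or comment IDs into batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [parent_ids[i:i + size] for i in range(0, len(parent_ids), size)]


# Example with made-up placeholder IDs
batches = chunk_parent_ids([str(n) for n in range(600)])
print([len(batch) for batch in batches])  # prints [250, 250, 100]
```

Each batch can then be passed as the parent_ids value of a separate query.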

This document describes the parameters and shows how to perform basic queries using the method.

Response size estimates for any asynchronous comments query (bulk or otherwise) are capped at 1 million, so an estimate of 1 million should be interpreted as 1 million or more. Keep in mind the upper limit of 100,000 results for asynchronous query responses: if the response to your query would exceed that limit, the asynchronous search returns an error. See Search guide for more details on how to use asynchronous search, including estimating response size.
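The two thresholds above can be expressed as a small check. This is only a sketch: interpret_estimate is a hypothetical helper, and it takes the estimate as a plain integer because this guide does not show the layout of the estimate response.

```python
ESTIMATE_CAP = 1_000_000      # Estimates are capped at this value
ASYNC_RESULT_LIMIT = 100_000  # Upper limit for asynchronous query responses


def interpret_estimate(estimate):
    """Classify a response size estimate against the two limits in this guide."""
    if estimate >= ESTIMATE_CAP:
        return "capped"       # Means 1 million or more; the query would fail
    if estimate > ASYNC_RESULT_LIMIT:
        return "too_large"    # Above the 100,000 limit; narrow the query
    return "within_limit"     # At or below the 100,000 limit


print(interpret_estimate(42_000))
print(interpret_estimate(1_000_000))
```

Because the estimate is approximate, a value just under the limit does not guarantee that the query will succeed.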

All of the examples in this document are taken from a Secure Research Environment use case and assume you have created a Python or R Jupyter notebook and a client object. See Getting started to learn more.

See Data dictionary for detailed information about the fields that are available on a Facebook or Instagram comments node.

Parameters

Parameter Type Description

PARENT_IDS

List

Comma-separated list of Facebook or Instagram post or comment IDs for which you want to retrieve bulk comments. This parameter can only be used in asynchronous queries.

FIELDS
Optional

List

Comma-separated list of fields you want included in search results. See Data dictionary for descriptions of all available fields.

FETCH_ALL
Optional

Boolean

Boolean value that allows you to specify whether you want to fetch all levels of comments or not. Available options:

  • True: All of the comments of the entity are returned, both nested and unnested.
  • False: Only top level replies of the entity are returned.

Default value: False

SINCE
Optional

String or Integer

Date in YYYY-MM-DD (date only) or UNIX timestamp (translates to a date and time to the second) format. Comments created on or after this date or timestamp are returned (used with UNTIL to define a time range). SINCE and UNTIL are based on UTC time zone, regardless of the local time zone of the user who made the comment.

  • If both SINCE and UNTIL are included, the query includes the time range defined by those values.
  • If SINCE is included and UNTIL is omitted, the query includes the SINCE time through the present time.
  • If SINCE is omitted and UNTIL is included, the query goes from the beginning of Facebook or Instagram time through the UNTIL time.
  • If SINCE and UNTIL are both omitted, the query goes from the beginning of Facebook or Instagram time to the present time.
  • If SINCE and UNTIL are the same UNIX timestamp, the query includes the SINCE time through the SINCE time plus one hour.
  • If SINCE and UNTIL are the same date (YYYY-MM-DD), the query includes the SINCE date through the SINCE date plus one day.

UNTIL
Optional

String or Integer

Date in YYYY-MM-DD (date only) or UNIX timestamp (translates to a date and time to the second) format. Comments created on or before this date or timestamp are returned (used with SINCE to define a time range). SINCE and UNTIL are based on UTC time zone, regardless of the local time zone of the user who made the comment.

  • If both SINCE and UNTIL are included, the query includes the time range defined by those values.
  • If SINCE is included and UNTIL is omitted, the query includes the SINCE time through the present time.
  • If SINCE is omitted and UNTIL is included, the query goes from the beginning of Facebook or Instagram time through the UNTIL time.
  • If SINCE and UNTIL are both omitted, the query goes from the beginning of Facebook or Instagram time to the present time.
  • If SINCE and UNTIL are the same UNIX timestamp, the query includes the SINCE time through the SINCE time plus one hour.
  • If SINCE and UNTIL are the same date (YYYY-MM-DD), the query includes the SINCE date through the SINCE date plus one day.
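Because SINCE and UNTIL accept either format and are interpreted in UTC, it can help to convert a date to its UNIX timestamp before mixing the two. The sketch below uses only the Python standard library. The helper name to_unix_utc is hypothetical, and it assumes that a date-only value corresponds to midnight UTC on that date.

```python
from datetime import datetime, timezone


def to_unix_utc(value):
    """Return epoch seconds for a YYYY-MM-DD date (midnight UTC assumed).

    Integer timestamps, or strings made only of digits, are passed through.
    """
    if isinstance(value, int):
        return value
    text = str(value)
    if text.isdigit():
        return int(text)
    day = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(day.timestamp())


print(to_unix_utc("2025-01-17"))  # prints 1737072000
```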

Sample queries

Request bulk comments

This example shows using the parent_ids parameter to specify the post or comment IDs for which to retrieve bulk comments. Note that the IDs would be known to you after querying for posts or comments. Meta Content Library IDs are unique IDs linked to an entity in Content Library. These IDs cannot be used to search on Meta technologies.

This example also shows checking the status of the asynchronous query and retrieving the results of the query once the status is complete. These are functions available for all asynchronous queries.

library(reticulate)
meta_content_library_api <- import("metacontentlibraryapi")
client <- meta_content_library_api$MetaContentLibraryAPIClient

# Set the default version to the latest available
client$set_default_version(client$LATEST_VERSION)

async_utils <- meta_content_library_api$MetaContentLibraryAPIAsyncUtils

# Submit an asynchronous query using either facebook/comments/job or instagram/comments/job
response <- client$post(path="facebook/comments/job", params=list("parent_ids"=list("1711092159470842", "1326576181874783")))
jsonlite::fromJSON(response$text, flatten=TRUE) # Display the query handle


# Check the status of an asynchronous query
status_response <- async_utils$get_status(response=response)
jsonlite::fromJSON(status_response$text, flatten=TRUE) # Display the query status


# Retrieve the results of an asynchronous query when status is "COMPLETE"
get_data_response <- async_utils$get_data(response=response)
jsonlite::fromJSON(get_data_response$text, flatten=TRUE) # Display the query results

# Python equivalent
from metacontentlibraryapi import (
    MetaContentLibraryAPIClient as client,
    MetaContentLibraryAPIAsyncUtils as async_utils,
)

# Set the default version to the latest available
client.set_default_version(client.LATEST_VERSION)

# Submit an asynchronous query using either facebook/comments/job or instagram/comments/job
response = client.post(
    path="facebook/comments/job",
    params={"parent_ids":["1711092159470842","1326576181874783"]},
)
display(response.json()) # Display the query handle


# Check the status of an asynchronous query
status_response = async_utils.get_status(response=response)
display(status_response.json()) # Display the query status


# Retrieve the results of an asynchronous query when status is "COMPLETE"
get_data_response = async_utils.get_data(response=response)
display(get_data_response.json()) # Display the query results
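Results are only available once the status is "COMPLETE", so a notebook will usually check the status repeatedly before retrieving data. The following is a minimal polling sketch. wait_until_complete and fetch_status are hypothetical names: fetch_status is a function you supply that calls async_utils.get_status and returns the status string. This guide does not show the layout of the status response or any status value other than "COMPLETE", so extracting the string is left to you.

```python
import time


def wait_until_complete(fetch_status, interval_seconds=30, max_attempts=20, sleep=time.sleep):
    """Call `fetch_status` until it returns "COMPLETE" or attempts run out.

    Returns True if the query completed, False otherwise.
    """
    for attempt in range(max_attempts):
        if fetch_status() == "COMPLETE":
            return True
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # Wait before checking again
    return False
```

When the function returns True, retrieve the results with async_utils.get_data as shown above. The interval and attempt count are arbitrary defaults chosen for illustration.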

Request specific fields

To have the API return specific fields of the bulk comments, include the fields parameter as a comma-separated list of fields. If omitted, default fields will be returned.

library(reticulate)
client <- import("metacontentlibraryapi")$MetaContentLibraryAPIClient

client$set_default_version(client$LATEST_VERSION)

# Fetch comments with specific fields
response <- client$post(
    path="facebook/comments/job",
    params=list("parent_ids"=list("1711092159470842", "1326576181874783"), "fields"="text,creation_time")
)

jsonlite::fromJSON(response$text, flatten=TRUE) # Display the query handle

# Python equivalent
from metacontentlibraryapi import MetaContentLibraryAPIClient as client

client.set_default_version(client.LATEST_VERSION)

# Fetch comments with specific fields
response = client.post(
    path="facebook/comments/job",
    params={"parent_ids": ["1711092159470842", "1326576181874783"], "fields": "text,creation_time"},
)
display(response.json()) # Display the query handle

Request comments for a specific time frame

To have the API return comments within a specific time and date range, include the since and until parameters.

library(reticulate)
client <- import("metacontentlibraryapi")$MetaContentLibraryAPIClient

client$set_default_version(client$LATEST_VERSION)

# Fetch comments created on or after 2025-01-16 and on or before
# 1737072000 (UNIX timestamp for 2025-01-17 00:00:00 UTC).
# Both formats (date and UNIX timestamp) are allowed.

response <- client$post(
    path="facebook/comments/job",
    params=list("parent_ids"=list("1711092159470842", "1326576181874783"), "since"="2025-01-16", "until"="1737072000")
)
jsonlite::fromJSON(response$text, flatten=TRUE) # Display the query handle

# Python equivalent
from metacontentlibraryapi import MetaContentLibraryAPIClient as client

client.set_default_version(client.LATEST_VERSION)

# Fetch comments created on or after 2025-01-16 and on or before
# 1737072000 (UNIX timestamp for 2025-01-17 00:00:00 UTC).
# Both formats (date and UNIX timestamp) are allowed.

response = client.post(
    path="facebook/comments/job",
    params={"parent_ids": ["1711092159470842", "1326576181874783"], "since": "2025-01-16", "until": "1737072000"},
)
display(response.json()) # Display the query handle

Request all levels of comments

To have the API return all levels of comments, not just the top-level replies, set the fetch_all Boolean parameter to true.

library(reticulate)
client <- import("metacontentlibraryapi")$MetaContentLibraryAPIClient

client$set_default_version(client$LATEST_VERSION)

# Fetch all levels of comments
response <- client$post(
    path="facebook/comments/job",
    params=list("parent_ids"=list("1711092159470842", "1326576181874783"), "fetch_all"=TRUE)
)
jsonlite::fromJSON(response$text, flatten=TRUE) # Display the query handle

# Python equivalent
from metacontentlibraryapi import MetaContentLibraryAPIClient as client

client.set_default_version(client.LATEST_VERSION)

# Fetch all levels of comments
response = client.post(
    path="facebook/comments/job",
    params={"parent_ids": ["1711092159470842", "1326576181874783"], "fetch_all": True},
)

display(response.json())  # Display the query handle
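The samples above each show one optional parameter at a time. The sketch below assembles a params dictionary from several of them and checks the 250 ID limit before submission. It assumes that the optional parameters can be combined in one query, which this guide does not state explicitly, and build_comment_params is a hypothetical helper, not part of the API client.

```python
def build_comment_params(parent_ids, fields=None, since=None, until=None, fetch_all=None):
    """Assemble a params dictionary for a bulk comments query."""
    if not parent_ids:
        raise ValueError("parent_ids must contain at least one ID")
    if len(parent_ids) > 250:
        raise ValueError("a single query accepts at most 250 IDs")
    params = {"parent_ids": [str(pid) for pid in parent_ids]}
    if fields:
        params["fields"] = ",".join(fields)  # Comma-separated, as in the fields sample
    if since is not None:
        params["since"] = since
    if until is not None:
        params["until"] = until
    if fetch_all is not None:
        params["fetch_all"] = bool(fetch_all)
    return params


params = build_comment_params(
    ["1711092159470842", "1326576181874783"],
    fields=["text", "creation_time"],
    since="2025-01-16",
    fetch_all=True,
)
print(params)
```

The resulting dictionary can be passed as the params argument of client.post() in the same way as in the samples above.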

Estimating the response size

Use the estimate resource to get a rough idea of how much data would be returned from your query. Since the API can only return up to 100,000 results from a single asynchronous query, it can be helpful to know in advance if your query is likely to fail because the response size is too large. If the estimate comes out higher than 100,000, consider modifying the parameters to reduce the response size. You can continue to modify the query parameters and get new estimates until the results are predicted to fall below the maximum allowed.

This example is typically most useful for posts with many comments because the number of results tends to be higher, but it can be used to estimate the size of data that would be returned by any query.

library(reticulate)
client <- import("metacontentlibraryapi")$MetaContentLibraryAPIClient

client$set_default_version(client$LATEST_VERSION)

# Request an estimate of the response size by using the estimate path
response <- client$get(
    path="facebook/comments/estimate",
    params=list("parent_ids"=list("1927859627616307"))
)

# Display the JSON response
jsonlite::fromJSON(response$text, flatten=TRUE)

# Python equivalent
from metacontentlibraryapi import MetaContentLibraryAPIClient as client

client.set_default_version(client.LATEST_VERSION)

# Request an estimate of the response size by using the estimate path
response = client.get(
    path="facebook/comments/estimate",
    params={"parent_ids": ["1927859627616307"]},
)

# Display the JSON response
display(response.json())
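When an estimate comes out above the limit, one way to reduce the response size is to split the time range into smaller windows and query each one separately. The sketch below generates one since and until pair per calendar day. It relies on the rule in the Parameters section that identical SINCE and UNTIL dates cover that single day. The helper name daily_windows is hypothetical.

```python
from datetime import date, timedelta


def daily_windows(start, end):
    """Yield (since, until) pairs, one per calendar day from start to end inclusive."""
    current = date.fromisoformat(start)
    last = date.fromisoformat(end)
    if current > last:
        raise ValueError("start must not be after end")
    while current <= last:
        day = current.isoformat()
        yield day, day  # Same date for since and until covers that one day
        current += timedelta(days=1)


for since, until in daily_windows("2025-01-16", "2025-01-18"):
    print(since, until)
```

This guide does not say whether a comment created exactly at midnight UTC can appear in two adjacent windows, so consider removing duplicates after combining the results.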