The Facebook Crawler scrapes the HTML of a website that was shared on Facebook, either by copying and pasting the link or through a Facebook social plugin on the website. The crawler gathers, caches, and displays information about the website such as its title, description, and thumbnail image.
Your website should either honor the Range header of the crawler request or ignore the Range header altogether.
The Facebook crawler uses the following user agent strings:
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1
To get a current list of IP addresses the crawler uses, run the following command.
whois -h whois.radb.net -- '-i origin AS32934' | grep ^route
These IP addresses change often.
Example output:
...
route: 69.63.176.0/21
route: 69.63.184.0/21
route: 66.220.144.0/20
route: 69.63.176.0/20
route6: 2620:0:1c00::/40
route6: 2a03:2880::/32
route6: 2a03:2880:fffe::/48
route6: 2a03:2880:ffff::/48
route6: 2620:0:1cff::/48
...
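Because the addresses change often, it may help to reduce the whois output to a plain list of CIDR prefixes that can be regenerated on a schedule. The following is a minimal sketch; the function name extract_routes and the output file name are hypothetical, and the filter assumes the "route:" / "route6:" line layout shown in the example output.

```shell
# Read whois output on stdin and print one unique CIDR prefix per line.
# Only lines whose first field is "route:" or "route6:" are kept.
extract_routes() {
  awk '$1 == "route:" || $1 == "route6:" { print $2 }' | sort -u
}

# Possible usage (requires network access and a whois client):
#   whois -h whois.radb.net -- '-i origin AS32934' | extract_routes > facebook_routes.txt
```

The resulting file could then feed a firewall allowlist or a log filter, which would need to be refreshed whenever the command is rerun.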
If your website content is not available at the time of scraping, you can force a scrape once it becomes available either by passing the URL through the Sharing Debugger or by using the Graph API.
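A forced scrape through the Graph API might look like the sketch below. It assumes the commonly documented form of the request, a POST to https://graph.facebook.com/ with id, scrape=true, and access_token parameters; check the current Graph API reference before relying on it. The function name force_scrape and the CURL override variable are hypothetical and exist only for this illustration.

```shell
# Ask the Graph API to re-scrape a URL (sketch under the assumptions above).
# $1 = URL to scrape, $2 = access token.
# CURL can be set to another command (for example "echo") for a dry run.
force_scrape() {
  url="$1"
  token="$2"
  "${CURL:-curl}" -sS -X POST "https://graph.facebook.com/" \
    --data-urlencode "id=${url}" \
    --data-urlencode "scrape=true" \
    --data-urlencode "access_token=${token}"
}

# Possible usage:
#   force_scrape "https://example.com/article" "$ACCESS_TOKEN"
```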
You can simulate a crawler request with the following command if you need to troubleshoot your website:
curl -v --compressed -H "Range: bytes=0-524288" -H "Connection: close" -A "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "$URL"
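When troubleshooting, it can be useful to see which Open Graph tags (for example og:title, og:description, og:image) the simulated request actually returns. The sketch below is one way to do that; list_og_tags is a hypothetical helper, and it only matches tags written on a single line with double-quoted property attributes.

```shell
# Print every <meta ... property="og:..."> tag found in HTML read from stdin.
# Limitations: single-line tags and double-quoted attributes only.
list_og_tags() {
  grep -o -i '<meta[^>]*property="og:[^"]*"[^>]*>'
}

# Possible usage with the simulated crawler request:
#   curl -s --compressed -H "Range: bytes=0-524288" -H "Connection: close" \
#     -A "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "$URL" \
#     | list_og_tags
```

If the command prints nothing, the tags are either missing, formatted in a way this simple pattern does not cover, or located beyond the byte range that was requested.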