Facebook Post Crawler

Purpose

This crawler automatically fetches the first post from a Facebook fanpage using Selenium WebDriver and saves the result to facebook_first_post.json.

System Requirements

Node.js >= 16
Google Chrome browser

Installation

Clone the repository:
```
git clone <repo-url>
cd Crawler-FB
```

Install required packages:

npm install selenium-webdriver chromedriver dotenv

Environment Setup

Create a .env file in the project root with the following content:

FB_USERNAME=your_facebook_email_or_phone
FB_PASSWORD=your_facebook_password

Note: To protect your credentials, do not commit your .env or facebook_cookies.json files to git.

Running the Crawler

Navigate to the script directory:

cd crawler-fb

Run the crawler:

node crawler-post.js "<fanpage_url>"

Example:

node crawler-post.js "http://example.com"

Replace the URL with the address of the fanpage you want to crawl.

If no URL is provided, the script will crawl the default fanpage logisticsarena.bacib.tdtu.
Username and password can be set via environment variables or passed as command-line arguments:
```
node crawler-post.js <fanpage_url> <unused> <unused> <username> <password>
```
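
The argument and environment handling described above could look roughly like the following. This is a minimal sketch, not the script's actual code: the function name `resolveConfig`, the full default URL, and the rule that command-line values take precedence over environment variables are all assumptions made for illustration.

```javascript
// Assumed full URL for the default fanpage; the README only gives the page
// name "logisticsarena.bacib.tdtu".
const DEFAULT_FANPAGE_URL = "https://www.facebook.com/logisticsarena.bacib.tdtu";

/**
 * Resolve crawler settings from positional arguments and environment variables.
 * `argv` is expected in the order shown above:
 *   [fanpage_url, unused, unused, username, password]
 * Hypothetical helper: the name and the precedence rule are assumptions.
 */
function resolveConfig(argv, env) {
  // Skip the two unused positions with array holes.
  const [url, , , cliUsername, cliPassword] = argv;

  // Assumed precedence: a command-line value wins over the environment variable.
  const username = cliUsername || env.FB_USERNAME;
  const password = cliPassword || env.FB_PASSWORD;

  if (!username || !password) {
    throw new Error(
      "Missing credentials: set FB_USERNAME and FB_PASSWORD or pass them as arguments"
    );
  }

  // Fall back to the default fanpage when no URL is given.
  return { url: url || DEFAULT_FANPAGE_URL, username, password };
}

// Typical call from a script entry point:
//   const config = resolveConfig(process.argv.slice(2), process.env);
```

Passing `argv` and `env` in as parameters, rather than reading `process.argv` and `process.env` inside the function, keeps the logic easy to check without touching real credentials.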

Output

The first post will be saved to facebook_first_post.json.
Login cookies will be saved to facebook_cookies.json for future sessions, so you do not need to log in every time.
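
As an illustration of how cookie reuse can work, here is a minimal sketch of saving and loading the cookie file with Node's built-in `fs` module. The helper names `saveCookies` and `loadCookies` are hypothetical, and the cookies are assumed to be an array of plain objects, which is the shape returned by selenium-webdriver's `driver.manage().getCookies()`. The actual script may store them differently.

```javascript
const fs = require("fs");

// File name taken from this README; the helpers below are hypothetical.
const COOKIE_FILE = "facebook_cookies.json";

/** Write an array of cookie objects to disk as pretty-printed JSON. */
function saveCookies(cookies, file = COOKIE_FILE) {
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2), "utf8");
}

/**
 * Read previously saved cookies.
 * Returns an empty array when the file is missing, unreadable, or not a JSON
 * array, so the caller can fall back to a username/password login.
 */
function loadCookies(file = COOKIE_FILE) {
  try {
    const parsed = JSON.parse(fs.readFileSync(file, "utf8"));
    return Array.isArray(parsed) ? parsed : [];
  } catch (err) {
    return [];
  }
}
```

With selenium-webdriver, each loaded cookie could then be restored through `driver.manage().addCookie(cookie)` after the browser has opened a page on the Facebook domain. An empty result from `loadCookies` signals that a fresh login is needed.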

Security Notice

Each user should log in and generate their own cookies; do not share cookie files between machines.
Do not commit .env or facebook_cookies.json to git.

Support

For issues or support, please contact the project administrator.
