# Facebook Post Crawler

## Purpose
This crawler automatically fetches the first post from a Facebook fanpage using Selenium WebDriver and saves the result to `facebook_first_post.json`.

## System Requirements
- Node.js >= 16
- Google Chrome browser

## Installation
1. Clone the repository:
   ```sh
   git clone <repo-url>
   cd Crawler-FB
   ```
2. Install required packages:
   ```sh
   npm install selenium-webdriver chromedriver dotenv
   ```

## Environment Setup
Create a `.env` file in the project root with the following content:
```
FB_USERNAME=your_facebook_email_or_phone
FB_PASSWORD=your_facebook_password
```

> **Note:** Do not commit your `.env` or `facebook_cookies.json` files to git; both contain credentials.
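
As a rough illustration of how these two variables might be consumed, the sketch below checks that both are present before the crawler starts. The helper name `readCredentials` is hypothetical and not part of this project. It assumes the script calls `require('dotenv').config()` beforehand, so that the values from `.env` are available in `process.env`.

```javascript
// Sketch only: `readCredentials` is a hypothetical helper, not project code.
// In the real script, `require('dotenv').config()` would be called first so
// that the values from `.env` are copied into process.env.

function readCredentials(env = process.env) {
  const required = ['FB_USERNAME', 'FB_PASSWORD'];

  // Collect every variable that is absent or blank so the error names them all.
  const missing = required.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  return { username: env.FB_USERNAME, password: env.FB_PASSWORD };
}
```

Failing early with the names of the missing variables tends to be easier to diagnose than a login that fails partway through a browser session.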

## Running the Crawler
Navigate to the script directory:
```sh
cd crawler-fb
```
Run the crawler:
```sh
node crawler-post.js "<fanpage_url>"
```
Example:
```sh
node crawler-post.js "http://example.com"  # replace with the URL of the fanpage you want to crawl
```

- If no URL is provided, the script will crawl the default fanpage `logisticsarena.bacib.tdtu`.
- Username and password can be set via environment variables or passed as command-line arguments:
  ```sh
  node crawler-post.js <fanpage_url> <unused> <unused> <username> <password>
  ```
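
The argument handling described above could be sketched roughly as follows. This is not the project's actual code: `resolveOptions` and `DEFAULT_FANPAGE_URL` are hypothetical names, the full default URL assumes the page lives on `www.facebook.com`, and letting command-line values override the environment variables is an assumption, since the precedence is not stated here.

```javascript
// Assumption: the default page name from this README lives on facebook.com.
const DEFAULT_FANPAGE_URL = 'https://www.facebook.com/logisticsarena.bacib.tdtu';

// Hypothetical helper. `argv` follows the process.argv layout:
// [node, script, fanpage_url, unused, unused, username, password]
function resolveOptions(argv = process.argv, env = process.env) {
  // Skip node, the script path, and the two unused positions.
  const [, , url, , , username, password] = argv;

  return {
    fanpageUrl: url || DEFAULT_FANPAGE_URL,
    // Assumption: command-line values win over environment variables.
    username: username || env.FB_USERNAME,
    password: password || env.FB_PASSWORD,
  };
}
```

With no arguments, `resolveOptions()` falls back to the default fanpage and to the credentials from the environment.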

## Output
- The first post will be saved to `facebook_first_post.json`.
- Login cookies will be saved to `facebook_cookies.json` for future sessions, so you do not need to log in every time.
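
Cookie reuse of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: the helper names are hypothetical, and `driver` is assumed to be a `selenium-webdriver` `WebDriver` instance, whose `manage().getCookies()` and `manage().addCookie()` methods are used.

```javascript
const fs = require('fs');

const COOKIE_FILE = 'facebook_cookies.json';

// Write the cookie array to disk as pretty-printed JSON.
function writeCookieFile(cookies, file = COOKIE_FILE) {
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2), 'utf8');
}

// Return the saved cookie array, or null when no usable file exists.
function readCookieFile(file = COOKIE_FILE) {
  if (!fs.existsSync(file)) return null;
  try {
    const parsed = JSON.parse(fs.readFileSync(file, 'utf8'));
    return Array.isArray(parsed) ? parsed : null;
  } catch (err) {
    return null; // unreadable or corrupt file: fall back to a fresh login
  }
}

// Hypothetical helper: persist the cookies of a logged-in session.
async function saveSession(driver, file = COOKIE_FILE) {
  writeCookieFile(await driver.manage().getCookies(), file);
}

// Hypothetical helper: returns true when cookies were restored.
// The browser must already be on a facebook.com page, because WebDriver
// only accepts cookies for the domain of the current document.
async function restoreSession(driver, file = COOKIE_FILE) {
  const cookies = readCookieFile(file);
  if (!cookies) return false;
  for (const cookie of cookies) {
    await driver.manage().addCookie(cookie);
  }
  return true;
}
```

In this sketch the script would call `restoreSession` first and perform a full login, followed by `saveSession`, only when it returns `false` or the restored cookies turn out to be expired.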

## Security Notice
- Each user should log in and generate their own cookies; do not share cookie files between machines.
- Do not commit `.env` or `facebook_cookies.json` to git.

## Support
For issues or support, please contact the project administrator.