# Facebook Post Crawler
## Purpose
This crawler automatically fetches the first post from a Facebook fanpage using Selenium WebDriver and saves the result to `facebook_first_post.json`.
## System Requirements
- Node.js >= 16
- Google Chrome browser
## Installation
1. Clone the repository:
```sh
git clone <repo-url>
cd Crawler-FB
```
2. Install required packages:
```sh
npm install selenium-webdriver chromedriver dotenv
```
## Environment Setup
Create a `.env` file in the project root with the following content:
```
FB_USERNAME=your_facebook_email_or_phone
FB_PASSWORD=your_facebook_password
```
> **Note:** Do not commit your `.env` or `facebook_cookies.json` files to git; both contain your credentials.
## Running the Crawler
Navigate to the script directory:
```sh
cd crawler-fb
```
Run the crawler:
```sh
node crawler-post.js "<fanpage_url>"
```
Example:
```sh
node crawler-post.js "http://example.com"  # replace with the URL of the fanpage you want to crawl
```
- If no URL is provided, the script will crawl the default fanpage `logisticsarena.bacib.tdtu`.
- Username and password can be set via environment variables or passed as command-line arguments:
```sh
node crawler-post.js <fanpage_url> <unused> <unused> <username> <password>
```
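
The argument handling described above could look roughly like the following. This is a minimal sketch, not the script's actual code: `resolveConfig` is a hypothetical helper, and the precedence (command-line values first, then `.env` values) is an assumption. The argument positions and the default fanpage come from this README.

```javascript
// Hypothetical helper: resolves crawler settings from CLI arguments and
// environment variables. Not taken from crawler-post.js.
const DEFAULT_FANPAGE = 'logisticsarena.bacib.tdtu'; // default named in this README

function resolveConfig(argv, env) {
  // argv is expected to be process.argv.slice(2):
  // [fanpage_url, <unused>, <unused>, username, password]
  const [fanpage, , , username, password] = argv;
  return {
    // Fall back to the default fanpage when no URL is given.
    fanpage: fanpage || DEFAULT_FANPAGE,
    // Assumption: command-line credentials override FB_USERNAME / FB_PASSWORD.
    username: username || env.FB_USERNAME,
    password: password || env.FB_PASSWORD,
  };
}
```

In a script this would typically be called as `resolveConfig(process.argv.slice(2), process.env)` after `dotenv` has loaded the `.env` file.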
## Output
- The first post will be saved to `facebook_first_post.json`.
- Login cookies will be saved to `facebook_cookies.json` for future sessions, so you do not need to log in every time.
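
Cookie reuse between runs can be sketched as a pair of small file helpers. This is an illustration under assumptions, not the crawler's actual implementation: `saveCookies` and `loadCookies` are hypothetical names, and the cookie objects are assumed to be the plain objects a WebDriver session returns.

```javascript
const fs = require('fs');

// Hypothetical helper: write the session cookies to disk as JSON.
function saveCookies(cookies, file = 'facebook_cookies.json') {
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2));
}

// Hypothetical helper: read saved cookies, or return null when a fresh
// login is needed (no file yet, or the file cannot be parsed).
function loadCookies(file = 'facebook_cookies.json') {
  if (!fs.existsSync(file)) return null;
  try {
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  } catch (err) {
    return null;
  }
}
```

With `selenium-webdriver`, the array passed to `saveCookies` would typically come from `driver.manage().getCookies()`, and each loaded cookie can be restored with `driver.manage().addCookie(cookie)` once the browser is on the Facebook domain.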
## Security Notice
- Each user should log in and generate their own cookies; do not share cookie files between machines.
- Do not commit `.env` or `facebook_cookies.json` to git.
## Support
For issues or support, please contact the project administrator.