# Facebook Post Crawler

## Purpose

This crawler automatically fetches the first post from a Facebook fanpage using Selenium WebDriver and saves the result to `facebook_first_post.json`.

## System Requirements

- Node.js >= 16
- Google Chrome browser

## Installation

1. Clone the repository:
   ```sh
   git clone <repo-url>
   cd Crawler-FB
   ```
2. Install required packages:
   ```sh
   npm install selenium-webdriver chromedriver dotenv
   ```

## Environment Setup

Create a `.env` file in the project root with the following content:
```
FB_USERNAME=your_facebook_email_or_phone
FB_PASSWORD=your_facebook_password
```

> **Note:** Do not commit your `.env` or `facebook_cookies.json` files to git; they contain your credentials and login session.

## Running the Crawler

Navigate to the script directory:
```sh
cd crawler-fb
```
Run the crawler:
```sh
node crawler-post.js "<fanpage_url>"
```
Example:
```sh
# Replace the URL with the fanpage you want to crawl
node crawler-post.js "http://example.com"
```

- If no URL is provided, the script crawls the default fanpage `logisticsarena.bacib.tdtu`.
- Username and password can be set via environment variables or passed as the fourth and fifth command-line arguments (the second and third are unused):
  ```sh
  node crawler-post.js <fanpage_url> <unused> <unused> <username> <password>
  ```
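
The argument and environment handling described above can be sketched roughly as follows. This is an illustration only, not the code in `crawler-post.js`: the helper name `resolveConfig` is hypothetical, and the assumption that command-line values take precedence over `.env` values is not confirmed by this README.

```javascript
// Hypothetical sketch: the real argument handling in crawler-post.js may differ.

// Default fanpage named in this README.
const DEFAULT_FANPAGE = 'logisticsarena.bacib.tdtu';

/**
 * Resolve crawler settings from positional arguments and environment variables.
 * argv is expected to be process.argv.slice(2), laid out as
 * [fanpageUrl, unused, unused, username, password].
 * Assumption: command-line values win over environment values.
 */
function resolveConfig(argv, env) {
  const [fanpage, , , cliUsername, cliPassword] = argv;
  return {
    fanpage: fanpage || DEFAULT_FANPAGE,
    username: cliUsername || env.FB_USERNAME,
    password: cliPassword || env.FB_PASSWORD,
  };
}

// Example with explicit inputs instead of the real process.argv / process.env.
const example = resolveConfig([], { FB_USERNAME: 'user', FB_PASSWORD: 'secret' });
console.log(example.fanpage); // falls back to DEFAULT_FANPAGE
```

In a real script you would typically call `require('dotenv').config()` first so that the values from `.env` are available on `process.env`, then pass `process.argv.slice(2)` and `process.env` to the helper.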

## Output

- The first post will be saved to `facebook_first_post.json`.
- Login cookies will be saved to `facebook_cookies.json` for future sessions, so you do not need to log in every time.
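
Cookie reuse of this kind can be sketched with the `getCookies()` and `addCookie()` methods that `selenium-webdriver` exposes through `driver.manage()`. The helper names `saveCookies` and `loadCookies` are hypothetical and the crawler's own implementation may differ; `driver` is assumed to be an already built WebDriver instance.

```javascript
const fs = require('fs');

// File name taken from this README; the helpers themselves are illustrative.
const COOKIE_FILE = 'facebook_cookies.json';

// Save the current session cookies so a later run can skip the login form.
async function saveCookies(driver, file = COOKIE_FILE) {
  const cookies = await driver.manage().getCookies();
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2));
}

// Restore saved cookies. Returns false when no cookie file exists yet.
// The driver should already be on a page of the cookies' domain, because
// WebDriver only accepts cookies for the domain of the current page.
async function loadCookies(driver, file = COOKIE_FILE) {
  if (!fs.existsSync(file)) return false;
  const cookies = JSON.parse(fs.readFileSync(file, 'utf8'));
  for (const cookie of cookies) {
    await driver.manage().addCookie(cookie);
  }
  return true;
}
```

A typical flow under these assumptions would be: open the site, call `loadCookies(driver)`, refresh the page, and only fall back to the username and password login (followed by `saveCookies(driver)`) when it returns `false` or the session has expired.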

## Security Notice

- Each user should log in and generate their own cookies; do not share cookie files between machines.
- Do not commit `.env` or `facebook_cookies.json` to git.

## Support

For issues or support, please contact the project administrator.