Skip to content
This repository was archived by the owner on Feb 10, 2025. It is now read-only.

添加nginx防止爬虫爬取配置 #1187

Merged
merged 4 commits into from
Apr 24, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
- [Docker 参数示例](#docker-参数示例)
- [Docker build \& Run](#docker-build--run)
- [Docker compose](#docker-compose)
- [防止爬虫抓取](#防止爬虫抓取)
- [使用 Railway 部署](#使用-railway-部署)
- [Railway 环境变量](#railway-环境变量)
- [手动打包](#手动打包)
Expand Down Expand Up @@ -234,6 +235,21 @@ services:
```
- `OPENAI_API_BASE_URL` 可选,设置 `OPENAI_API_KEY` 时可用
- `OPENAI_API_MODEL` 可选,设置 `OPENAI_API_KEY` 时可用

#### 防止爬虫抓取

**nginx**

将下面配置填入nginx配置文件中,可以参考 `docker-compose/nginx/nginx.conf` 文件中添加反爬虫的方法

```
# 防止爬虫抓取
if ($http_user_agent ~* "360Spider|JikeSpider|Spider|spider|bot|Bot|2345Explorer|curl|wget|webZIP|qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|MSNBot|ia_archiver|Tomato Bot|NSPlayer|bingbot")
{
return 403;
}
```

### 使用 Railway 部署

[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template/yytmgc)
Expand Down
7 changes: 7 additions & 0 deletions docker-compose/nginx/nginx.conf
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,13 @@ server {
server_name localhost;
charset utf-8;
error_page 500 502 503 504 /50x.html;

# 防止爬虫抓取
if ($http_user_agent ~* "360Spider|JikeSpider|Spider|spider|bot|Bot|2345Explorer|curl|wget|webZIP|qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|MSNBot|ia_archiver|Tomato Bot|NSPlayer|bingbot")
{
return 403;
}

location / {
root /usr/share/nginx/html;
try_files $uri /index.html;
Expand Down