Nginx Installation and Configuration
You can skip straight to the HTTPS section at the bottom.
Installing Nginx
My system is as follows:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
Install (if an Apache server is already present, either uninstall it or change Nginx's default port):
sudo apt-get install nginx
Port 80 is now open, and the configuration lives in /etc/nginx:
lsof -i:80
cd /etc/nginx
General Nginx Service Configuration
Place site configuration files under conf.d/*
PHP configuration (optional)
server{
listen 80;
server_name php.youdomain.com;
charset utf-8;
access_log /data/logs/nginx/www.youdomain.com.log;
#error_log /data/logs/nginx/www.youdomain.com.err;
location / {
root /data/www/php/blog;
index index.html index.php;
# If the requested file does not exist, rewrite the URL and hand it to ThinkPHP
if (!-e $request_filename) {
rewrite ^/(.*)$ /index.php/$1 last;
break;
}
}
## Images and static content are treated differently
location ~* ^.+\.(jpg|jpeg|gif|css|png|js|ico|xml)$ {
access_log off;
expires 30d;
root /data/www/php/blog;
}
location ~\.php/?.*$ {
root /data/www/php/blog;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
# Load Nginx's default "server environment variable" configuration
include fastcgi.conf;
# Set PATH_INFO and override the SCRIPT_FILENAME and SCRIPT_NAME server environment variables
set $fastcgi_script_name2 $fastcgi_script_name;
if ($fastcgi_script_name ~ "^(.+\.php)(/.+)$") {
set $fastcgi_script_name2 $1;
set $path_info $2;
}
fastcgi_param PATH_INFO $path_info;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name2;
fastcgi_param SCRIPT_NAME $fastcgi_script_name2;
}
}
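The PATH_INFO handling above hinges on one regular expression, ^(.+\.php)(/.+)$, which splits the script name from the trailing path. The sketch below replays that split offline with grep and sed so you can see what the two captures contain. The function name split_path_info is a hypothetical helper, not part of Nginx, and it assumes POSIX extended regexes behave like Nginx's PCRE for this simple pattern.

```shell
#!/bin/sh
# split_path_info URI
# Hypothetical helper: mimics the "if ($fastcgi_script_name ~ ...)" block.
# Prints the values that SCRIPT_NAME and PATH_INFO would receive.
split_path_info() {
    if printf '%s\n' "$1" | grep -Eq '^.+\.php/.+$'; then
        # The regex matched: capture 1 is the script, capture 2 the path info.
        script=$(printf '%s\n' "$1" | sed -E 's#^(.+\.php)(/.+)$#\1#')
        info=$(printf '%s\n' "$1" | sed -E 's#^(.+\.php)(/.+)$#\2#')
    else
        # No match: the script name is kept as-is and PATH_INFO stays empty.
        script=$1
        info=
    fi
    printf 'SCRIPT_NAME=%s PATH_INFO=%s\n' "$script" "$info"
}

split_path_info /index.php/blog/read/id/1
# SCRIPT_NAME=/index.php PATH_INFO=/blog/read/id/1
split_path_info /index.php
# SCRIPT_NAME=/index.php PATH_INFO=
```

This is why a URL rewritten to /index.php/$1 still reaches index.php: only the part up to .php is used to build SCRIPT_FILENAME.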
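Reverse Proxy Configuration
Requests made by domain name all arrive on port 80. Nginx matches them against server_name and forwards them to port 8080.
Point the domain's A record at this machine's IP address.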
vim /etc/nginx/conf.d/www.youdomain.com.conf
server{
listen 80;
# For local testing, you can change the domain to 127.0.0.1
server_name www.youdomain.com;
charset utf-8;
access_log /root/logs/nginx/www.youdomain.com.log;
#error_log /data/logs/nginx/www.youdomain.com.err;
location / {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;
proxy_redirect off;
proxy_pass http://localhost:8080;
}
# This is the anti-crawler file
include /etc/nginx/anti_spider.conf;
}
Create the log directory first:
sudo mkdir -p /root/logs/nginx
Check the configuration for errors, then restart or reload:
sudo nginx -t
sudo service nginx restart
sudo nginx -s reload
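With server_name set to 127.0.0.1 for local testing, visiting 127.0.0.1 returns a 502 error, because nothing is listening on port 8080 yet. Visiting localhost shows the Nginx welcome page instead: localhost does not match that server_name, so the request falls through to the default port-80 site.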
Anti-Crawler Configuration
Add an anti-crawler configuration file:
sudo vim /etc/nginx/anti_spider.conf
# Block crawling by tools such as Scrapy
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
return 403;
}
# Block the listed UAs and requests with an empty UA
if ($http_user_agent ~ "WinHttp|WebZIP|FetchURL|node-superagent|java/|FeedDemon|Jullo|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|Java|Feedly|Apache-HttpAsyncClient|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|BOT/0.1|YandexBot|FlightDeckReports|Linguee Bot|^$" ) {
return 403;
}
# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
return 403;
}
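# Block a single IP:
#deny 123.45.6.7;
# Block the whole /8, i.e. 123.0.0.1 through 123.255.255.254:
#deny 123.0.0.0/8;
# Block the /16, i.e. 123.45.0.1 through 123.45.255.254:
#deny 123.45.0.0/16;
# Block the /24, i.e. 123.45.6.1 through 123.45.6.254:
#deny 123.45.6.0/24;
# The following IPs are all abusive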
deny 58.95.66.0/24;
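To double-check which addresses a CIDR block in a deny rule covers, the prefix arithmetic can be done by hand. The sketch below is a hypothetical helper (cidr_range and to_dotted are names made up for illustration) that prints the first and last host address of an IPv4 block, which is the range quoted in the comments above. Note that deny itself matches every address in the block, including the network and broadcast addresses.

```shell
#!/bin/sh
# to_dotted N: print a 32-bit integer as a dotted-quad IPv4 address.
to_dotted() {
    printf '%d.%d.%d.%d' \
        $(( ($1 >> 24) & 255 )) $(( ($1 >> 16) & 255 )) \
        $(( ($1 >> 8) & 255 )) $(( $1 & 255 ))
}

# cidr_range A.B.C.D/N: print "<first host> <last host>" of the block.
cidr_range() {
    ip=${1%/*}      # part before the slash
    bits=${1#*/}    # prefix length after the slash
    # Split the dotted quad into the four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $ip
    IFS=$old_ifs
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))                       # network address
    bcast=$(( net | (~mask & 0xFFFFFFFF) ))      # broadcast address
    printf '%s %s\n' "$(to_dotted $(( net + 1 )))" "$(to_dotted $(( bcast - 1 )))"
}

cidr_range 123.0.0.0/8
# 123.0.0.1 123.255.255.254
cidr_range 123.45.6.0/24
# 123.45.6.1 123.45.6.254
```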
Insert "include /etc/nginx/anti_spider.conf;" into the server block of every site configuration, as shown above. You can also add it to the default port-80 configuration: sudo vim sites-available/default
Reload:
sudo nginx -s reload
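Common crawler UAs:
FeedDemon: content scraping
BOT/0.1 (BOT for JCE): SQL injection
CrawlDaddy: SQL injection
Java: content scraping
Jullo: content scraping
Feedly: content scraping
UniversalFeedParser: content scraping
ApacheBench: CC attack tool
Swiftbot: useless crawler
YandexBot: useless crawler
AhrefsBot: useless crawler
YisouSpider: useless crawler (since acquired by UC's Shenma search, so this spider can be allowed)
jikeSpider: useless crawler
MJ12bot: useless crawler
ZmEu: phpMyAdmin vulnerability scanning
WinHttp: scraping and CC attacks
EasouSpider: useless crawler
HttpClient: TCP attacks
Microsoft URL Control: scanning
YYSpider: useless crawler
jaunty: WordPress brute-force scanner
oBot: useless crawler
Python-urllib: content scraping
Indy Library: scanning
FlightDeckReports Bot: useless crawler
Linguee Bot: useless crawler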
Use curl -A to simulate a crawl, for example:
# -A sets the User-Agent
# -X sets the request method: POST/GET
# -I shows only the response headers
curl -X GET -I -A 'YYSpider' localhost
HTTP/1.1 403 Forbidden
Server: nginx/1.10.3 (Ubuntu)
Date: Fri, 08 Dec 2017 10:07:15 GMT
Content-Type: text/html
Content-Length: 178
Connection: keep-alive
Simulate a crawl with an empty UA:
curl -I -A ' ' localhost
Simulate a crawl by Baidu's spider:
curl -I -A 'Baiduspider' localhost
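If you want to preview how the two User-Agent rules in anti_spider.conf treat a given UA without touching the server, the matching can be approximated offline. This is a rough sketch: ua_verdict is a hypothetical helper, the second pattern is an abbreviated subset of the full list, and grep -E is assumed to behave like Nginx's regex engine for these literal alternatives.

```shell
#!/bin/sh
# ua_verdict UA
# Hypothetical helper: prints "403" if either User-Agent rule would reject
# the request, otherwise "pass".
ua_verdict() {
    # Rule 1 uses "~*" in Nginx, so it matches case-insensitively.
    if printf '%s\n' "$1" | grep -Eiq 'Scrapy|Curl|HttpClient'; then
        echo 403
        return
    fi
    # Rule 2 uses "~" (case-sensitive); "^$" catches an empty UA.
    # Abbreviated subset of the list in anti_spider.conf.
    if printf '%s\n' "$1" | grep -Eq 'WinHttp|FeedDemon|Python-urllib|MJ12bot|YYSpider|YandexBot|^$'; then
        echo 403
        return
    fi
    echo pass
}

ua_verdict 'YYSpider'      # 403
ua_verdict ''              # 403
ua_verdict 'Baiduspider'   # pass
ua_verdict 'curl/7.47.0'   # 403
```

The last call shows that curl's own default UA is rejected by the first rule, which is why the curl tests above override it with -A.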
Redirect and Static File Configuration
# Root directory for static assets
root /data/index/;
# Static files
location /cn {
index index.html;
try_files $uri $uri/ /cn/index.html;
}
# Redirect
location / {
rewrite ^(.*)$ https://${server_name}/cn permanent;
}
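The try_files line in the /cn location checks three candidates in order: the URI as a file, the URI as a directory, and finally /cn/index.html as the fallback. The sketch below imitates that lookup against a directory on disk. It is a simplification under stated assumptions: try_files_cn is a hypothetical helper, and it only reports which candidate would match, not how Nginx then serves it.

```shell
#!/bin/sh
# try_files_cn ROOT URI
# Hypothetical helper: reports which candidate of
#   try_files $uri $uri/ /cn/index.html;
# would match for URI under the document root ROOT.
try_files_cn() {
    root=${1%/}    # drop a trailing slash so "$root$uri" joins cleanly
    uri=$2
    if [ -f "$root$uri" ]; then
        echo "file $uri"                  # $uri exists as a regular file
    elif [ -d "$root$uri" ]; then
        echo "dir $uri"                   # $uri/ exists as a directory
    else
        echo "fallback /cn/index.html"    # nothing matched: use the last argument
    fi
}

# Example (paths depend on what is actually on disk):
#   try_files_cn /data/index/ /cn/about
```

The fallback is what lets a single-page front end answer any /cn/... path with the same index.html.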
HTTPS Support
Generate a free certificate. As the prompts explain, you must prove control of the domain by adding a DNS TXT record.
certbot certonly --preferred-challenges dns --manual -d "yourdomain.com" --server https://acme-v02.api.letsencrypt.org/directory
IMPORTANT NOTES:
- Congratulations! Your certificate and chain have been saved at:
/etc/letsencrypt/live/yourdomain.com/fullchain.pem
Your key file has been saved at:
/etc/letsencrypt/live/yourdomain.com/privkey.pem
Your cert will expire on 2019-11-05. To obtain a new or tweaked
version of this certificate in the future, simply run certbot
again. To non-interactively renew *all* of your certificates, run
"certbot renew"
- If you like Certbot, please consider supporting our work by:
Donating to ISRG / Let's Encrypt: https://letsencrypt.org/donate
Donating to EFF: https://eff.org/donate-le
Renew the certificate:
certbot renew
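The generated certificate and key:
/etc/letsencrypt/live/yourdomain.com/fullchain.pem
/etc/letsencrypt/live/yourdomain.com/privkey.pem
In any directory of your choice, generate a session ticket key and DH parameters to strengthen the TLS configuration: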
cd /data/cert
openssl rand 48 > session_ticket.key
openssl dhparam -out dhparam.pem 2048
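Nginx expects the session ticket key file to hold exactly 48 bytes of random data (newer releases also accept 80), which is why the command above asks openssl for 48 bytes. A quick size check before reloading can catch a truncated or regenerated file. The function name ticket_key_ok is a hypothetical helper for illustration.

```shell
#!/bin/sh
# ticket_key_ok FILE
# Hypothetical helper: succeeds if FILE is a readable key file of 48 or
# 80 bytes, the sizes Nginx accepts for ssl_session_ticket_key.
ticket_key_ok() {
    [ -f "$1" ] || return 1
    # Strip padding that some wc implementations add around the count.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -eq 48 ] || [ "$size" -eq 80 ]
}

# Example:
#   ticket_key_ok /data/cert/session_ticket.key && echo "ticket key size ok"
```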
A hardened Nginx configuration, yourdomain.com.conf:
server {
listen 443 ssl http2;
server_name yourdomain.com;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
#ssl on;
ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
ssl_dhparam /data/cert/dhparam.pem;
ssl_session_timeout 5m;
ssl_session_cache builtin:1000 shared:SSL:10m;
ssl_session_tickets on;
# openssl rand 48 > session_ticket.key
ssl_session_ticket_key /data/cert/session_ticket.key;
#ssl_protocols SSLv2 SSLv3 TLSv1;
# TLSv1 and TLSv1.1 are deprecated (RFC 8996); remove them if no legacy clients need them
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
#ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
ssl_prefer_server_ciphers on;
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 10s;
add_header Strict-Transport-Security "max-age=31536000; includeSubdomains;";
# Other configuration goes here
access_log /root/logs/nginx/www.youdomain.com.log;
#error_log /data/logs/nginx/www.youdomain.com.err;
# Root directory for static assets
root /data/index/;
# Static files
location /cn {
index index.html;
try_files $uri $uri/ /cn/index.html;
}
# Redirect
location / {
rewrite ^(.*)$ https://${server_name}/cn permanent;
}
# Reverse proxy
location /api {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;
proxy_redirect off;
proxy_pass http://localhost:8080;
}
# This is the anti-crawler file
include /etc/nginx/anti_spider.conf;
}
Original article: https://www.cnblogs.com/nima/p/11751206.html
Time: 2024-10-08 00:07:18