We use nginx as a front-end proxy. When we reload nginx, the old nginx workers stay in the "shutting down" state indefinitely. After nginx has been reloaded many times, CPU usage spikes to 100% and the gateway also goes into the busy state.
The symptoms are as follows:
// nginx processes
$ ps aux | grep nginx
**www 12384 0.6 0.2 110752 37424 ? SN Jan20 12:51 nginx: worker process is shutting down**
www 12385 0.1 0.1 102508 29260 ? SN Jan20 3:18 nginx: worker process is shutting down
www 12386 0.5 0.2 112744 39616 ? SN Jan20 12:45 nginx: worker process is shutting down
www 12387 0.2 0.1 104556 31228 ? SN Jan20 5:56 nginx: worker process is shutting down
www 27928 1.0 0.1 102508 28252 ? SN 11:25 0:08 nginx: worker process
www 27929 0.5 0.1 102508 27932 ? SN 11:25 0:04 nginx: worker process
www 27930 1.2 0.1 102508 28512 ? SN 11:25 0:10 nginx: worker process
www 27931 0.2 0.1 102508 27900 ? SN 11:25 0:02 nginx: worker process
www 29369 0.1 0.1 102508 27712 ? SN Jan21 0:52 nginx: worker process is shutting down
www 29370 0.5 0.1 102804 29400 ? SN Jan21 3:42 nginx: worker process is shutting down
www 29371 0.2 0.1 102508 28460 ? SN Jan21 1:39 nginx: worker process is shutting down
www 29372 0.4 0.1 102804 29360 ? SN Jan21 3:23 nginx: worker process is shutting down
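To track how many old workers pile up across reloads, the `ps` listing above can be filtered by script. The following is only a minimal sketch: the helper name `shutting_down_workers` is hypothetical, and it assumes the `ps aux` column layout shown above (USER, PID, ...) and the process title text "nginx: worker process is shutting down".

```python
from typing import List

# Process title nginx gives to old workers after a reload (as seen in the listing above).
SHUTTING_MARKER = "nginx: worker process is shutting down"


def shutting_down_workers(ps_output: str) -> List[int]:
    """Return the PIDs of nginx workers that are in the shutting-down state.

    Assumes `ps aux` formatting, where the second column is the PID.
    """
    pids = []
    for line in ps_output.splitlines():
        if SHUTTING_MARKER not in line:
            continue
        # Drop surrounding whitespace and any "**" emphasis markers from a pasted listing.
        fields = line.strip("* \t").split()
        if len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids
```

On a live host the input could come from something like `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`; comparing the returned list before and after each reload shows whether old workers are actually exiting.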
Looking further at the process with PID 12384, we found that it is still holding a connection to the gateway:
$ lsof -i :60877 # 60877 is taken from the last line of the lsof output
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
php 11789 root 65u IPv4 2550865341 0t0 TCP mt-web1:8282->mt-web1:60877 (ESTABLISHED)
nginx 12384 www 1184u IPv4 2550863681 0t0 TCP mt-web1:60877->mt-web1:8282 (ESTABLISHED)
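The same check can be repeated for every old worker rather than one port at a time. As a rough sketch, the hypothetical helper below parses `lsof -i` text and returns the established connections that a given PID holds to a given remote port. It assumes the column layout shown above and numeric ports (running `lsof` with `-n -P` avoids name translation).

```python
from typing import List, Tuple


def established_to_port(lsof_output: str, pid: int, port: int) -> List[Tuple[str, str]]:
    """Return (local, remote) endpoint pairs for ESTABLISHED connections
    that process `pid` holds to remote port `port`.

    Assumes `lsof -i` columns:
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME (STATE)
    """
    matches = []
    for line in lsof_output.splitlines():
        fields = line.split()
        # The header line and non-TCP rows have fewer than 10 fields; skip them.
        if len(fields) < 10 or fields[1] != str(pid):
            continue
        name, state = fields[8], fields[9]
        if state != "(ESTABLISHED)" or "->" not in name:
            continue
        local, remote = name.split("->", 1)
        # The port is whatever follows the last ":" of the remote endpoint.
        if remote.rsplit(":", 1)[-1] == str(port):
            matches.append((local, remote))
    return matches
```

For the listing above, calling it with PID 12384 and port 8282 would report the single connection from the old nginx worker to the gateway; an empty result for a worker that is still shutting down would suggest it is being held open by something other than the gateway.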
The gateway status is shown below, and process 11789 appears in it. (When nginx has been reloaded many times, the gateway goes into the busy state.)
Workerman version:3.5.1 PHP version:7.1.6
start time:2017-12-07 09:03:05 run 46 days 4 hours
load average: 0.14, 0, 0 event-loop:\Workerman\Events\Select
1 workers 4 processes
worker_name exit_status exit_count
zhibo-gateway-1 0 0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid memory listening worker_name connections total_request send_fail timers status
11785 8M websocket://ip:8282 zhibo-gateway-1 436 32233007 579 3
11786 8M websocket://ip:8282 zhibo-gateway-1 431 36117127 644 3
11788 8M websocket://ip:8282 zhibo-gateway-1 432 36397854 592 3
11789 8M websocket://ip:8282 zhibo-gateway-1 447 33917464 642 3
因?yàn)槲覀兩暇€新功能后,一般只會(huì)reload nginx, 所以懷疑是gateway和nginx鏈接導(dǎo)致的,但是gateway reload后,gateway的進(jìn)程還是那幾個(gè)并沒(méi)有重新啟動(dòng),這正常嗎?
ps:如果把gateway restart了,gateway的進(jìn)程id會(huì)改變,nginx中的shutting狀態(tài)的進(jìn)程也會(huì)消失,但是gateway restart是不是會(huì)斷掉和客戶(hù)端的連接呀?
It looks like nginx reload does not close the connections, and gateway reload does not close them either (gateway restart does).
So the two remain connected to each other.
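Hi walkor, yes, that is exactly the problem we are hitting. When both nginx and the gateway are reloaded, nginx spawns new worker processes, but the old workers that should shut down still have connections to the gateway, so they are never destroyed. Over time this leaves many processes in the shutting-down state, all of which consume resources.
Do you have a recommended solution?
A restart would disconnect the clients and force them to reconnect, and the messages sent to each client when its connection is initialized would be sent all over again. Is there a way to close only the connections between the shutting-down nginx processes and the gateway?
In theory, an nginx process that is shutting down no longer accepts new connections, so once all the clients originally connected to it have disconnected, it should disconnect from the gateway as well. Why do the connections persist? Networking is not my strong suit and I haven't figured this out, heh.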
As for why the CPU is at 100%, you would need to strace the process that is using 100% CPU to find out.
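We are not yet sure whether the 100% CPU problem is caused by nginx having too many shutting-down processes; we will investigate the next time it happens. On one occasion, though, everything went back to normal after we restarted the gateway, and the shutting-down nginx processes were gone as well, so we suspect this is where the problem lies.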