Nginx accept_mutex and the Thundering Herd Problem

Reader submission, 2022-11-02



Let's start with what the official Nginx documentation says.

Syntax:  accept_mutex on | off;
Default: accept_mutex off;
Context: events

If accept_mutex is enabled, worker processes will accept new connections by turn. Otherwise, all worker processes will be notified about new connections, and if volume of new connections is low, some of the worker processes may just waste system resources.

Connection Handling

There are several tuning options related to connection handling. Please refer to the linked reference documentation for details on proper syntax, applicable configuration blocks, and so on.

accept_mutex off – All worker processes are notified about new connections (the default in NGINX 1.11.3 and later, and NGINX Plus R10 and later). If enabled, worker processes accept new connections by turns.

We recommend keeping the default value (off) unless you have extensive knowledge of your app's performance and the opportunity to test under a variety of conditions, but it can lead to inefficient use of system resources if the volume of new connections is low. Changing the value to on might be beneficial under some high loads.

Definition of the thundering herd

First, the definition:

The thundering herd problem occurs when a large number of processes waiting for an event are awoken when that event occurs, but only one process is able to proceed at a time. After the processes wake up, they all demand the resource and a decision must be made as to which process can continue. After the decision is made, the remaining processes are put back to sleep, only to all wake up again to request access to the resource. This occurs repeatedly, until there are no more processes to be woken up. Because all the processes use system resources upon waking, it is more efficient if only one process was woken up at a time. This may render the computer unusable, but it can also be used as a technique if there is no other way to decide which process should continue (for example when programming with semaphores).

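In short, a thundering herd occurs when multiple processes or threads wait on the same event: when the event fires, all of them are woken, but only one can actually handle it, and the rest go back to sleep after failing to claim it.

It is much like throwing food into a flock of chickens: every chicken rushes to grab it. If the food is a single grain of rice, waking the whole flock to fight over it is pointless.

Thundering herds typically occur in network servers. The parent process binds a port and creates a listening socket, then forks several child processes, each of which loops waiting on that socket (for example, in accept). Whenever a client opens a TCP connection, several children are woken at once; one of them accepts the new connection successfully, and the rest fail and go back to sleep.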
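The fork-and-wait pattern described above can be reproduced in a few lines. The following is a minimal sketch, assuming a POSIX system where os.fork is available; the function name run_herd, the one-byte report codes, and the sleep intervals are illustrative choices, not anything taken from Nginx. Each worker waits in select() on the same listening socket, and after one client connects the script counts how many workers were woken and how many actually accepted.

```python
import os
import select
import socket
import time


def run_herd(num_workers=4, wait_timeout=2.0):
    """Fork workers that all wait on one listening socket.

    Returns (woken, accepted): how many workers saw the socket become
    readable, and how many of those actually won the accept() race.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(16)
    listener.setblocking(False)          # losers get BlockingIOError, not a hang
    port = listener.getsockname()[1]

    report_r, report_w = os.pipe()       # each worker reports one byte here
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:                     # child: one "worker process"
            outcome = b"-"               # never woken
            try:
                ready, _, _ = select.select([listener], [], [], wait_timeout)
                if ready:
                    time.sleep(0.2)      # let the siblings wake up before racing
                    try:
                        conn, _ = listener.accept()
                        conn.close()
                        outcome = b"A"   # accepted the connection
                    except BlockingIOError:
                        outcome = b"W"   # woken for nothing
                os.write(report_w, outcome)
            finally:
                os._exit(0)              # never return into the parent's code
        pids.append(pid)
    os.close(report_w)                   # parent keeps only the read end

    time.sleep(0.3)                      # give the workers time to block in select()
    client = socket.create_connection(("127.0.0.1", port))
    for pid in pids:
        os.waitpid(pid, 0)
    report = os.read(report_r, num_workers)
    client.close()
    listener.close()
    os.close(report_r)
    woken = report.count(b"A") + report.count(b"W")
    return woken, report.count(b"A")


if __name__ == "__main__":
    woken, accepted = run_herd()
    print(f"{woken} workers woken, {accepted} accepted the connection")
```

Exactly one worker can accept the single connection; how many of the others are woken depends on the kernel and on timing, so the woken count is best treated as an observation rather than a guarantee.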

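How do we solve the thundering herd problem?

Note: on recent Linux kernels, accept itself no longer causes a thundering herd, but operations such as epoll_wait still do.

The fix is conceptually simple: put the processes (the chickens) in a queue, and when a new request (the food) arrives, wake only the one at the head of the queue.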
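To make the difference between "wake everyone" and "wake only the head of the queue" concrete, here is a toy model in plain Python. It is a simulation under simplifying assumptions, not a measurement of any real kernel: the function name simulate is hypothetical, workers are just integers in a FIFO queue, and under wake-all the first worker in the queue is arbitrarily declared the winner.

```python
from collections import deque


def simulate(num_workers, num_events, wake_all):
    """Count wakeups for a stream of events under two wake policies.

    Returns (total_wakeups, handled_by), where handled_by lists the
    worker that handled each event.
    """
    wait_queue = deque(range(num_workers))   # sleeping workers, FIFO order
    total_wakeups = 0
    handled_by = []
    for _ in range(num_events):
        if wake_all:
            woken = list(wait_queue)         # thundering herd: everyone wakes
            wait_queue.clear()
        else:
            woken = [wait_queue.popleft()]   # only the head of the queue wakes
        total_wakeups += len(woken)
        winner = woken[0]                    # exactly one worker handles the event
        handled_by.append(winner)
        wait_queue.extend(woken[1:])         # losers go straight back to sleep
        wait_queue.append(winner)            # winner rejoins at the back
    return total_wakeups, handled_by


if __name__ == "__main__":
    print(simulate(4, 3, wake_all=True))
    print(simulate(4, 3, wake_all=False))
```

With 4 workers and 3 events, the wake-all policy costs 12 wakeups while the wake-one policy costs 3, and in this model both hand the events to the same workers.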

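In fact, the thundering herd in the accept system call was already fixed back in Linux 2.6. Nginx worker processes, however, wait for new events with epoll_wait (the epoll model), and without any protection several workers are woken from epoll_wait when a new connection arrives. In other words, the epoll thundering herd still has to be solved.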

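Nginx's setting for dealing with the thundering herd is accept_mutex. With accept_mutex on, only one worker at a time (the one holding the mutex) adds the listening socket to its epoll set, so only that worker is woken when a new connection arrives. Before version 1.11.3, Nginx enabled accept_mutex by default.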
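The idea behind accept_mutex can be sketched outside Nginx as well. The following is an illustration only, assuming a POSIX system; Nginx is written in C and its real implementation differs. Here a temporary file locked with fcntl.lockf stands in for the mutex, select() stands in for epoll, and the names worker and run_with_mutex, the back-off interval, and the report codes are all hypothetical. A worker watches the listening socket only while it holds the lock, so a readiness notification is never wasted.

```python
import fcntl
import os
import select
import socket
import tempfile
import time


def worker(listener, lock_fd, report_w, deadline):
    """Worker loop: watch the listening socket only while holding the lock."""
    while time.monotonic() < deadline:
        try:
            # Non-blocking try-lock; POSIX record locks are owned per process.
            fcntl.lockf(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            time.sleep(0.05)                 # lock is taken: back off and retry
            continue
        try:
            ready, _, _ = select.select([listener], [], [], 0.1)
            if ready:
                try:
                    conn, _ = listener.accept()
                    conn.close()
                    os.write(report_w, b"A")  # accepted a connection
                except BlockingIOError:
                    os.write(report_w, b"W")  # woken for nothing
        finally:
            fcntl.lockf(lock_fd, fcntl.LOCK_UN)


def run_with_mutex(num_workers=4, num_clients=3, run_seconds=1.0):
    """Fork workers sharing one lock; return their reports, e.g. b"AAA"."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    listener.setblocking(False)
    address = listener.getsockname()

    lock_file = tempfile.TemporaryFile()     # stands in for the accept mutex
    report_r, report_w = os.pipe()
    deadline = time.monotonic() + run_seconds
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:                         # child: run the worker loop
            try:
                worker(listener, lock_file.fileno(), report_w, deadline)
            finally:
                os._exit(0)
        pids.append(pid)
    os.close(report_w)                       # parent keeps only the read end

    clients = [socket.create_connection(address) for _ in range(num_clients)]
    for pid in pids:
        os.waitpid(pid, 0)
    report = os.read(report_r, 4096)
    for client in clients:
        client.close()
    listener.close()
    lock_file.close()
    os.close(report_r)
    return report


if __name__ == "__main__":
    print(run_with_mutex())
```

Because accept() is only ever called by the lock holder, every report byte is an "A" and none is a "W". The price is that workers without the lock must poll for it, which is the kind of trade-off that makes the setting worth testing under realistic load.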

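Linux 4.5 and later add EPOLLEXCLUSIVE support to address the epoll thundering herd. Turning accept_mutex off can still produce some thundering-herd symptoms, such as a rise in load, but if your site handles a lot of traffic, I still recommend turning it off for the sake of system throughput.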
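For reference, this is roughly how the EPOLLEXCLUSIVE flag is used from user space. The sketch below relies on Python's select module, which exposes select.EPOLLEXCLUSIVE on Linux builds; the helper name register_listener and the fallback behaviour are assumptions made for illustration. In a multi-process server, each worker would call it on its own epoll instance for the shared listening socket, and the kernel then wakes one or more of the waiters instead of all of them.

```python
import select
import socket


def register_listener(epoll, listener):
    """Register a listening socket, using EPOLLEXCLUSIVE when possible.

    Returns the event mask that was actually registered.
    """
    # The constant is missing on Python builds without support for it.
    exclusive = getattr(select, "EPOLLEXCLUSIVE", 0)
    mask = select.EPOLLIN | exclusive
    try:
        epoll.register(listener.fileno(), mask)
    except OSError:
        # Kernels older than 4.5 reject the flag: fall back to plain EPOLLIN.
        mask = select.EPOLLIN
        epoll.register(listener.fileno(), mask)
    return mask


if __name__ == "__main__" and hasattr(select, "epoll"):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(16)
    server.setblocking(False)
    ep = select.epoll()
    used = register_listener(ep, server)
    print("exclusive wakeups requested:",
          bool(used & getattr(select, "EPOLLEXCLUSIVE", 0)))
    ep.close()
    server.close()
```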

Finally, a picture: Yichun Zhang (章亦春), founder and CEO of OpenResty Inc. and creator of the OpenResty open-source project.

