最近一直在解决线上一个问题,表现是:
Tomcat每到凌晨会有一个高峰,峰值的并发达到了3000以上,最后的结果是Tomcat线程池满了,日志看很多请求超过了1s。
服务器性能很好,Tomcat版本是7.0.54,配置如下:
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-" maxThreads="3000" minSpareThreads="800"/> <Connector executor="tomcatThreadPool" port="8084" protocol="org.apache.coyote.http11.Http11AprProtocol" connectionTimeout="60000" keepAliveTimeout="30000" maxKeepAliveRequests="8000" maxHttpHeaderSize="8192" URIEncoding="UTF-8" enableLookups="false" acceptCount="1000" disableUploadTimeout="true" redirectPort="8443" />
事后thread dump看其实真正处于RUNNABLE状态的线程很少,绝大部分线程都处于TIMED_WAITING状态:
于是大伙都开始纠结为什么线程会涨到3000,而且发现即使峰值过了线程数并不会降下来。我们首先想到的是:
后端应用的处理瞬间比较慢,“堵住了”导致前端线程数涨了起来。
但是优化一个版本上线后发现虽然涨的情况有所好转,但是最终线程池还是会达到3000这个最大值。
==================================分割线=========================================
以上是大背景,中间的过程省略,直接跟各位说下目前我得到的结论:
1、首先是为什么线程不释放的问题?
简单说下我验证的Tomcat(7.0.54)线程池大概的工作机制
- Tomcat启动时如果没有请求过来,那么线程数(都是指线程池的)为0;
- 一旦有请求,Tomcat会初始化minSapreThreads设置的线程数;
- Tomcat不会主动对线程池进行收缩,除非确定没有任何请求的时候,Tomcat才会将线程池收缩到minSpareThreads设置的大小;
- Tomcat6之前的版本有一个maxSpareThreads参数,但是在7中已经移除了,所以只要前面哪怕只有一个请求,Tomcat也不会释放多于空闲的线程。
至于Tomcat为什么移除maxSpareThreads这个参数,我想也是出于性能的考虑,不停的收缩线程池性能肯定不高,而多余的线程处于等待状态的好处是一有新请求过来立刻可以处理。
- 而且大量的Tomcat线程处于等待状态不会消耗CPU,但是会消耗一些JVM存储。
补充:上面标红的一句有点问题,进一步验证发现只有使用Keep-Alive(客户端和服务端都支持)时才是这种表现,如果客户端没有使用Keep-Alive那么线程会随着TCP连接的释放而回收。
Tomcat中Keep-Alive相关的参数:
maxKeepAliveRequests:
The maximum number of HTTP requests which can be pipelined until the connection is closed by the server. Setting this attribute to 1 will disable HTTP/1.0 keep-alive, as well as HTTP/1.1 keep-alive and pipelining. Setting this to -1 will allow an unlimited amount of pipelined or keep-alive HTTP requests. If not specified, this attribute is set to 100.
keepAliveTimeout:
The number of milliseconds this Connector will wait for another HTTP request before closing the connection. The default value is to use the value that has been set for the connectionTimeout attribute. Use a value of -1 to indicate no (i.e. infinite) timeout.
2、为什么线程池会满?
这是我现在纠结的核心。到底是不是应用的性能慢导致的,我现在的结论是有关系,但关键是并发。
- Tomcat的线程池的线程数跟你的瞬间并发有关系,比如maxThreads设置为1000,当瞬间并发达到1000那么Tomcat就会起1000个线程来处理,这时候跟你应用的快慢关系不大。
那么是不是并发多少Tomcat就会起多少个线程呢?这里还跟Tomcat的这几个参数设置有关系,看官方的解释是最靠谱的:
maxThreads:
The maximum number of request processing threads to be created by this Connector, which therefore determines the maximum number of simultaneous requests that can be handled. If not specified, this attribute is set to 200. If an executor is associated with this connector, this attribute is ignored as the connector will execute tasks using the executor rather than an internal thread pool.
maxConnections:
The maximum number of connections that the server will accept and process at any given time. When this number has been reached, the server will accept, but not process, one further connection. This additional connection be blocked until the number of connections being processed falls below maxConnections at which point the server will start accepting and processing new connections again. Note that once the limit has been reached, the operating system may still accept connections based on the
acceptCount
setting. The default value varies by connector type. For BIO the default is the value of maxThreads unless an Executor is used in which case the default will be the value of maxThreads from the executor. For NIO the default is10000
. For APR/native, the default is8192
.……
acceptCount:
The maximum queue length for incoming connection requests when all possible request processing threads are in use. Any requests received when the queue is full will be refused. The default value is 100.
minSpareThreads:
The minimum number of threads always kept running. If not specified, the default of
10
is used.
我简单理解就是:
maxThreads:Tomcat线程池最多能起的线程数;
maxConnections:Tomcat最多能并发处理的请求(连接);
acceptCount:Tomcat维护最大的对列数;
minSpareThreads:Tomcat初始化的线程池大小或者说Tomcat线程池最少会有这么多线程。
比较容易弄混的是maxThreads和maxConnections这两个参数:
maxThreads是指Tomcat线程池做多能起的线程数,而maxConnections则是Tomcat一瞬间做多能够处理的并发连接数。比如maxThreads=1000,maxConnections=800,假设某一瞬间的并发时1000,那么最终Tomcat的线程数将会是800,即同时处理800个请求,剩余200进入队列“排队”,如果acceptCount=100,那么有100个请求会被拒掉。
注意:根据前面所说,只是并发那一瞬间Tomcat会起800个线程处理请求,但是稳定后,某一瞬间可能只有很少的线程处于RUNNABLE状态,大部分线程是TIMED_WAITING,如果你的应用处理时间够快的话。所以真正决定Tomcat最大可能达到的线程数是maxConnections这个参数和并发数,当并发数超过这个参数则请求会排队,这时响应的快慢就看你的程序性能了。