背景
公司有個渠道系統(tǒng),專門對接三方渠道使用,沒有什么業(yè)務(wù)邏輯,主要是轉(zhuǎn)換報文和參數(shù)校驗(yàn)之類的工作,起著一個承上啟下的作用。
最近在優(yōu)化接口的響應(yīng)時間,優(yōu)化了代碼之后,但是時間還是達(dá)不到要求;有一個詭異的100ms左右的耗時問題,在接口中打印了請求處理時間后,和調(diào)用方的響應(yīng)時間還有差了100ms左右。比如程序里記錄150ms,但是調(diào)用方等待時間卻為250ms左右。
下面記錄下當(dāng)時詳細(xì)的定位&解決流程(其實(shí)解決很簡單,關(guān)鍵在于怎么定位并找到解決問題的方法)
基于 Spring Boot + MyBatis Plus + Vue & Element 實(shí)現(xiàn)的后臺管理系統(tǒng) + 用戶小程序,支持 RBAC 動態(tài)權(quán)限、多租戶、數(shù)據(jù)權(quán)限、工作流、三方登錄、支付、短信、商城等功能
定位過程
分析代碼
渠道系統(tǒng)是一個常見的spring-boot web工程,使用了集成的tomcat。分析了代碼之后,發(fā)現(xiàn)并沒有特殊的地方,沒有特殊的過濾器或者攔截器,所以初步排除是業(yè)務(wù)代碼問題
分析調(diào)用流程
出現(xiàn)這個問題之后,首先確認(rèn)了下接口的調(diào)用流程。由于是內(nèi)部測試,所以調(diào)用流程較少。
Nginx-反向代理->渠道系統(tǒng)
公司是云服務(wù)器,網(wǎng)絡(luò)走的也是云的內(nèi)網(wǎng)。由于不明確問題的原因,所以用排除法,首先確認(rèn)服務(wù)器網(wǎng)絡(luò)是否有問題。
先確認(rèn)發(fā)送端到Nginx Host是否有問題:
[jboss@VM_0_139_centos~]$ping10.0.0.139
PING10.0.0.139(10.0.0.139)56(84)bytesofdata.
64bytesfrom10.0.0.139:icmp_seq=1ttl=64time=0.029ms
64bytesfrom10.0.0.139:icmp_seq=2ttl=64time=0.041ms
64bytesfrom10.0.0.139:icmp_seq=3ttl=64time=0.040ms
64bytesfrom10.0.0.139:icmp_seq=4ttl=64time=0.040ms
從ping結(jié)果上看,發(fā)送端到Nginx主機(jī)的延遲是無問題的,接下來查看Nginx到渠道系統(tǒng)的網(wǎng)絡(luò)。
>基于SpringCloudAlibaba+Gateway+Nacos+RocketMQ+Vue&Element實(shí)現(xiàn)的后臺管理系統(tǒng)+用戶小程序,支持RBAC動態(tài)權(quán)限、多租戶、數(shù)據(jù)權(quán)限、工作流、三方登錄、支付、短信、商城等功能
>
>*項(xiàng)目地址://gitee.com/zhijiantianya/yudao-cloud>
>*視頻教程://doc.iocoder.cn/video/>
#由于日志是沒問題的,這里直接復(fù)制上面日志了
[jboss@VM_0_139_centos~]$ping10.0.0.139
PING10.0.0.139(10.0.0.139)56(84)bytesofdata.
64bytesfrom10.0.0.139:icmp_seq=1ttl=64time=0.029ms
64bytesfrom10.0.0.139:icmp_seq=2ttl=64time=0.041ms
64bytesfrom10.0.0.139:icmp_seq=3ttl=64time=0.040ms
64bytesfrom10.0.0.139:icmp_seq=4ttl=64time=0.040ms
從ping結(jié)果上看,Nginx到渠道系統(tǒng)服務(wù)器網(wǎng)絡(luò)延遲也是沒問題的
既然網(wǎng)絡(luò)看似沒問題,那么可以繼續(xù)排除法,砍掉Nginx,客戶端直接再渠道系統(tǒng)的服務(wù)器上,通過回環(huán)地址(localhost)直連,避免經(jīng)過網(wǎng)卡/dns,縮小問題范圍看看能否復(fù)現(xiàn)(這個應(yīng)用和地址是我后期模擬的,測試的是一個空接口):
[jboss@VM_10_91_centostmp]$curl-w"@curl-time.txt"http://127.0.0.1:7744/send
success
http:200
dns:0.001s
redirect:0.000s
time_connect:0.001s
time_appconnect:0.000s
time_pretransfer:0.001s
time_starttransfer:0.073s
size_download:7bytes
speed_download:95.000B/s
----------
time_total:0.073s請求總耗時
從curl日志上看,通過回環(huán)地址調(diào)用一個空接口耗時也有73ms。這就奇怪了,跳過了中間所有調(diào)用節(jié)點(diǎn)(包括過濾器&攔截器之類),直接請求應(yīng)用一個空接口,都有73ms的耗時,再請求一次看看:
[jboss@VM_10_91_centostmp]$curl-w"@curl-time.txt"http://127.0.0.1:7744/send
success
http:200
dns:0.001s
redirect:0.000s
time_connect:0.001s
time_appconnect:0.000s
time_pretransfer:0.001s
time_starttransfer:0.003s
size_download:7bytes
speed_download:2611.000B/s
----------
time_total:0.003s
更奇怪的是,第二次請求耗時就正常了,變成了3ms。經(jīng)查閱資料,linux curl是默認(rèn)開啟http keep-alive的。就算不開啟keep-alive,每次重新handshake,也不至于需要70ms。
經(jīng)過不斷分析測試發(fā)現(xiàn),連續(xù)請求的話時間就會很短,每次請求只需要幾毫秒,但是如果隔一段時間再請求,就會花費(fèi)70ms以上。
從這個現(xiàn)象猜想,可能是某些緩存機(jī)制導(dǎo)致的,連續(xù)請求因?yàn)橛芯彺?,所以速度快,時間長緩存失效后導(dǎo)致時間長。
那么這個問題點(diǎn)到底在哪一層呢?tomcat層還是spring-webmvc呢?
光猜想定位不了問題,還是得實(shí)際測試一下,把渠道系統(tǒng)的代碼放到本地ide里啟動測試能否復(fù)現(xiàn)
但是導(dǎo)入本地Ide后,在Ide中啟動后并不能復(fù)現(xiàn)問題,并沒有70+ms的延遲問題。這下頭疼了,本地?zé)o法復(fù)現(xiàn),不能Debug,由于問題點(diǎn)不在業(yè)務(wù)代碼,也不能通過加日志的方式來Debug這時候可以祭出神器Arthas了
Arthas分析問題
Arthas 是Alibaba開源的Java診斷工具,深受開發(fā)者喜愛。當(dāng)你遇到以下類似問題而束手無策時,Arthas可以幫助你解決:
- 這個類從哪個 jar 包加載的?為什么會報各種類相關(guān)的 Exception?
- 我改的代碼為什么沒有執(zhí)行到?難道是我沒 commit?分支搞錯了?
- 遇到問題無法在線上 debug,難道只能通過加日志再重新發(fā)布嗎?
- 線上遇到某個用戶的數(shù)據(jù)處理有問題,但線上同樣無法 debug,線下無法重現(xiàn)!
- 是否有一個全局視角來查看系統(tǒng)的運(yùn)行狀況?
- 有什么辦法可以監(jiān)控到JVM的實(shí)時運(yùn)行狀態(tài)?
上面是Arthas的官方簡介,這次我只需要用他的一個小功能trace。動態(tài)計(jì)算方法調(diào)用路徑和時間,這樣我就可以定位時間在哪個地方被消耗了。
- trace 方法內(nèi)部調(diào)用路徑,并輸出方法路徑上的每個節(jié)點(diǎn)上耗時
- trace 命令能主動搜索 class-pattern/method-pattern
- 對應(yīng)的方法調(diào)用路徑,渲染和統(tǒng)計(jì)整個調(diào)用鏈路上的所有性能開銷和追蹤調(diào)用鏈路。
有了神器,那么該追蹤什么方法呢?由于我對Tomcat源碼不是很熟,所以只能從spring mvc下手,先來trace一下spring mvc的入口:
[arthas@24851]$traceorg.springframework.web.servlet.DispatcherServlet*
PressQorCtrl+Ctoabort.
Affect(class-cnt:1,method-cnt:44)costin508ms.
`---ts=2019-09-1421:07:44;thread_name=http-nio-7744-exec-2;id=11;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
`---[2.952142ms]org.springframework.web.servlet.DispatcherServlet:buildLocaleContext()
`---ts=2019-09-1421:07:44;thread_name=http-nio-7744-exec-2;id=11;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
`---[18.08903ms]org.springframework.web.servlet.DispatcherServlet:doService()
+---[0.041346ms]org.apache.commons.logging.Log:isDebugEnabled()#889
+---[0.022398ms]org.springframework.web.util.WebUtils:isIncludeRequest()#898
+---[0.014904ms]org.springframework.web.servlet.DispatcherServlet:getWebApplicationContext()#910
+---[1.071879ms]javax.servlet.http.HttpServletRequest:setAttribute()#910
+---[0.020977ms]javax.servlet.http.HttpServletRequest:setAttribute()#911
+---[0.017073ms]javax.servlet.http.HttpServletRequest:setAttribute()#912
+---[0.218277ms]org.springframework.web.servlet.DispatcherServlet:getThemeSource()#913
|`---[0.137568ms]org.springframework.web.servlet.DispatcherServlet:getThemeSource()
|`---[min=0.00783ms,max=0.014251ms,total=0.022081ms,count=2]org.springframework.web.servlet.DispatcherServlet:getWebApplicationContext()#782
+---[0.019363ms]javax.servlet.http.HttpServletRequest:setAttribute()#913
+---[0.070694ms]org.springframework.web.servlet.FlashMapManager:retrieveAndUpdate()#916
+---[0.01839ms]org.springframework.web.servlet.FlashMap:()#920
+---[0.016943ms]javax.servlet.http.HttpServletRequest:setAttribute()#920
+---[0.015268ms]javax.servlet.http.HttpServletRequest:setAttribute()#921
+---[15.050124ms]org.springframework.web.servlet.DispatcherServlet:doDispatch()#925
|`---[14.943477ms]org.springframework.web.servlet.DispatcherServlet:doDispatch()
|+---[0.019135ms]org.springframework.web.context.request.async.WebAsyncUtils:getAsyncManager()#953
|+---[2.108373ms]org.springframework.web.servlet.DispatcherServlet:checkMultipart()#960
||`---[2.004436ms]org.springframework.web.servlet.DispatcherServlet:checkMultipart()
||`---[1.890845ms]org.springframework.web.multipart.MultipartResolver:isMultipart()#1117
|+---[2.054361ms]org.springframework.web.servlet.DispatcherServlet:getHandler()#964
||`---[1.961963ms]org.springframework.web.servlet.DispatcherServlet:getHandler()
||+---[0.02051ms]java.util.List:iterator()#1183
||+---[min=0.003805ms,max=0.009641ms,total=0.013446ms,count=2]java.util.Iterator:hasNext()#1183
||+---[min=0.003181ms,max=0.009751ms,total=0.012932ms,count=2]java.util.Iterator:next()#1183
||+---[min=0.005841ms,max=0.015308ms,total=0.021149ms,count=2]org.apache.commons.logging.Log:isTraceEnabled()#1184
||`---[min=0.474739ms,max=1.19145ms,total=1.666189ms,count=2]org.springframework.web.servlet.HandlerMapping:getHandler()#1188
|+---[0.013071ms]org.springframework.web.servlet.HandlerExecutionChain:getHandler()#971
|+---[0.372236ms]org.springframework.web.servlet.DispatcherServlet:getHandlerAdapter()#971
||`---[0.280073ms]org.springframework.web.servlet.DispatcherServlet:getHandlerAdapter()
||+---[0.004804ms]java.util.List:iterator()#1224
||+---[0.003668ms]java.util.Iterator:hasNext()#1224
||+---[0.003038ms]java.util.Iterator:next()#1224
||+---[0.006451ms]org.apache.commons.logging.Log:isTraceEnabled()#1225
||`---[0.012683ms]org.springframework.web.servlet.HandlerAdapter:supports()#1228
|+---[0.012848ms]javax.servlet.http.HttpServletRequest:getMethod()#974
|+---[0.013132ms]java.lang.String:equals()#975
|+---[0.003025ms]org.springframework.web.servlet.HandlerExecutionChain:getHandler()#977
|+---[0.008095ms]org.springframework.web.servlet.HandlerAdapter:getLastModified()#977
|+---[0.006596ms]org.apache.commons.logging.Log:isDebugEnabled()#978
|+---[0.018024ms]org.springframework.web.context.request.ServletWebRequest:()#981
|+---[0.017869ms]org.springframework.web.context.request.ServletWebRequest:checkNotModified()#981
|+---[0.038542ms]org.springframework.web.servlet.HandlerExecutionChain:applyPreHandle()#986
|+---[0.00431ms]org.springframework.web.servlet.HandlerExecutionChain:getHandler()#991
|+---[4.248493ms]org.springframework.web.servlet.HandlerAdapter:handle()#991
|+---[0.014805ms]org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted()#993
|+---[1.444994ms]org.springframework.web.servlet.DispatcherServlet:applyDefaultViewName()#997
||`---[0.067631ms]org.springframework.web.servlet.DispatcherServlet:applyDefaultViewName()
|+---[0.012027ms]org.springframework.web.servlet.HandlerExecutionChain:applyPostHandle()#998
|+---[0.373997ms]org.springframework.web.servlet.DispatcherServlet:processDispatchResult()#1008
||`---[0.197004ms]org.springframework.web.servlet.DispatcherServlet:processDispatchResult()
||+---[0.007074ms]org.apache.commons.logging.Log:isDebugEnabled()#1075
||+---[0.005467ms]org.springframework.web.context.request.async.WebAsyncUtils:getAsyncManager()#1081
||+---[0.004054ms]org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted()#1081
||`---[0.011988ms]org.springframework.web.servlet.HandlerExecutionChain:triggerAfterCompletion()#1087
|`---[0.004015ms]org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted()#1018
+---[0.005055ms]org.springframework.web.context.request.async.WebAsyncUtils:getAsyncManager()#928
`---[0.003422ms]org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted()#928
~
[jboss@VM_10_91_centostmp]$curl-w"@curl-time.txt"http://127.0.0.1:7744/send
success
http:200
dns:0.001s
redirect:0.000s
time_connect:0.001s
time_appconnect:0.000s
time_pretransfer:0.001s
time_starttransfer:0.115s
size_download:7bytes
speed_download:60.000B/s
----------
time_total:0.115s
本次調(diào)用,調(diào)用端時間花費(fèi)115ms,但是從arthas trace上看,spring mvc只消耗了18ms,那么剩下的97ms去哪了呢?
本地測試后已經(jīng)可以排除spring mvc的問題了,最后也是唯一可能出問題的點(diǎn)就是tomcat
可是本人并不熟悉tomcat中的源碼,就連請求入口都不清楚,tomcat里需要trace的類都不好找。。。
不過沒關(guān)系,有神器Arthas,可以通過stack命令來反向查找調(diào)用路徑,以org.springframework.web.servlet.DispatcherServlet
作為參數(shù):
stack 輸出當(dāng)前方法被調(diào)用的調(diào)用路徑
很多時候我們都知道一個方法被執(zhí)行,但這個方法被執(zhí)行的路徑非常多,或者你根本就不知道這個方法是從那里被執(zhí)行了,此時你需要的是 stack 命令。
[arthas@24851]$stackorg.springframework.web.servlet.DispatcherServlet*
PressQorCtrl+Ctoabort.
Affect(class-cnt:1,method-cnt:44)costin495ms.
ts=2019-09-1421:15:19;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
@org.springframework.web.servlet.FrameworkServlet.processRequest()
atorg.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:866)
atjavax.servlet.http.HttpServlet.service(HttpServlet.java:635)
atorg.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851)
atjavax.servlet.http.HttpServlet.service(HttpServlet.java:742)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:109)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
atorg.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
atorg.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:496)
atorg.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
atorg.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
atorg.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
atorg.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
atorg.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803)
atorg.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
atorg.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790)
atorg.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1468)
atorg.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
atjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
atorg.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
atjava.lang.Thread.run(Thread.java:748)
ts=2019-09-1421:15:19;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
@org.springframework.web.servlet.DispatcherServlet.doService()
atorg.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:974)
atorg.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:866)
atjavax.servlet.http.HttpServlet.service(HttpServlet.java:635)
atorg.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851)
atjavax.servlet.http.HttpServlet.service(HttpServlet.java:742)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:109)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
atorg.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
atorg.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
atorg.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
atorg.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
atorg.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
atorg.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:496)
atorg.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
atorg.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
atorg.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
atorg.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
atorg.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803)
atorg.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
atorg.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790)
atorg.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1468)
atorg.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
atjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
atjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
atorg.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
atjava.lang.Thread.run(Thread.java:748)
從stack日志上可以很直觀的看出DispatchServlet的調(diào)用棧,那么這么長的路徑,該trace哪個類呢(這里跳過spring mvc中的過濾器的trace過程,實(shí)際排查的時候也trace了一遍,但這詭異的時間消耗不是由這里過濾器產(chǎn)生的)?
有一定經(jīng)驗(yàn)的老司機(jī)從名字上大概也能猜出來從哪里下手比較好,那就是org.apache.coyote.http11.Http11Processor.service
,從名字上看,http1.1處理器,這可能是一個比較好的切入點(diǎn)。下面來trace一下:
[arthas@24851]$traceorg.apache.coyote.http11.Http11Processorservice
PressQorCtrl+Ctoabort.
Affect(class-cnt:1,method-cnt:1)costin269ms.
`---ts=2019-09-1421:22:51;thread_name=http-nio-7744-exec-8;id=17;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
`---[131.650285ms]org.apache.coyote.http11.Http11Processor:service()
+---[0.036851ms]org.apache.coyote.Request:getRequestProcessor()#667
+---[0.009986ms]org.apache.coyote.RequestInfo:setStage()#668
+---[0.008928ms]org.apache.coyote.http11.Http11Processor:setSocketWrapper()#671
+---[0.013236ms]org.apache.coyote.http11.Http11InputBuffer:init()#672
+---[0.00981ms]org.apache.coyote.http11.Http11OutputBuffer:init()#673
+---[min=0.00213ms,max=0.007317ms,total=0.009447ms,count=2]org.apache.coyote.http11.Http11Processor:getErrorState()#683
+---[min=0.002098ms,max=0.008888ms,total=0.010986ms,count=2]org.apache.coyote.ErrorState:isError()#683
+---[min=0.002448ms,max=0.007149ms,total=0.009597ms,count=2]org.apache.coyote.http11.Http11Processor:isAsync()#683
+---[min=0.002399ms,max=0.00852ms,total=0.010919ms,count=2]org.apache.tomcat.util.net.AbstractEndpoint:isPaused()#683
+---[min=0.033587ms,max=0.11832ms,total=0.151907ms,count=2]org.apache.coyote.http11.Http11InputBuffer:parseRequestLine()#687
+---[0.005384ms]org.apache.tomcat.util.net.AbstractEndpoint:isPaused()#695
+---[0.007924ms]org.apache.coyote.Request:getMimeHeaders()#702
+---[0.006744ms]org.apache.tomcat.util.net.AbstractEndpoint:getMaxHeaderCount()#702
+---[0.012574ms]org.apache.tomcat.util.http.MimeHeaders:setLimit()#702
+---[0.14319ms]org.apache.coyote.http11.Http11InputBuffer:parseHeaders()#703
+---[0.003997ms]org.apache.coyote.Request:getMimeHeaders()#743
+---[0.026561ms]org.apache.tomcat.util.http.MimeHeaders:values()#743
+---[min=0.002869ms,max=0.01203ms,total=0.014899ms,count=2]java.util.Enumeration:hasMoreElements()#745
+---[0.070114ms]java.util.Enumeration:nextElement()#746
+---[0.010921ms]java.lang.String:toLowerCase()#746
+---[0.008453ms]java.lang.String:contains()#746
+---[0.002698ms]org.apache.coyote.http11.Http11Processor:getErrorState()#775
+---[0.00307ms]org.apache.coyote.ErrorState:isError()#775
+---[0.002708ms]org.apache.coyote.RequestInfo:setStage()#777
+---[0.171139ms]org.apache.coyote.http11.Http11Processor:prepareRequest()#779
+---[0.009349ms]org.apache.tomcat.util.net.SocketWrapperBase:decrementKeepAlive()#794
+---[0.002574ms]org.apache.coyote.http11.Http11Processor:getErrorState()#800
+---[0.002696ms]org.apache.coyote.ErrorState:isError()#800
+---[0.002499ms]org.apache.coyote.RequestInfo:setStage()#802
+---[0.005641ms]org.apache.coyote.http11.Http11Processor:getAdapter()#803
+---[129.868916ms]org.apache.coyote.Adapter:service()#803
+---[0.003859ms]org.apache.coyote.http11.Http11Processor:getErrorState()#809
+---[0.002365ms]org.apache.coyote.ErrorState:isError()#809
+---[0.003844ms]org.apache.coyote.http11.Http11Processor:isAsync()#809
+---[0.002382ms]org.apache.coyote.Response:getStatus()#809
+---[0.002476ms]org.apache.coyote.http11.Http11Processor:statusDropsConnection()#809
+---[0.002284ms]org.apache.coyote.RequestInfo:setStage()#838
+---[0.00222ms]org.apache.coyote.http11.Http11Processor:isAsync()#839
+---[0.037873ms]org.apache.coyote.http11.Http11Processor:endRequest()#843
+---[0.002188ms]org.apache.coyote.RequestInfo:setStage()#845
+---[0.002112ms]org.apache.coyote.http11.Http11Processor:getErrorState()#849
+---[0.002063ms]org.apache.coyote.ErrorState:isError()#849
+---[0.002504ms]org.apache.coyote.http11.Http11Processor:isAsync()#853
+---[0.009808ms]org.apache.coyote.Request:updateCounters()#854
+---[0.002008ms]org.apache.coyote.http11.Http11Processor:getErrorState()#855
+---[0.002192ms]org.apache.coyote.ErrorState:isIoAllowed()#855
+---[0.01968ms]org.apache.coyote.http11.Http11InputBuffer:nextRequest()#856
+---[0.010065ms]org.apache.coyote.http11.Http11OutputBuffer:nextRequest()#857
+---[0.002576ms]org.apache.coyote.RequestInfo:setStage()#870
+---[0.016599ms]org.apache.coyote.http11.Http11Processor:processSendfile()#872
+---[0.008182ms]org.apache.coyote.http11.Http11InputBuffer:getParsingRequestLinePhase()#688
+---[0.0075ms]org.apache.coyote.http11.Http11Processor:handleIncompleteRequestLineRead()#690
+---[0.001979ms]org.apache.coyote.RequestInfo:setStage()#875
+---[0.001981ms]org.apache.coyote.http11.Http11Processor:getErrorState()#877
+---[0.001934ms]org.apache.coyote.ErrorState:isError()#877
+---[0.001995ms]org.apache.tomcat.util.net.AbstractEndpoint:isPaused()#877
+---[0.002403ms]org.apache.coyote.http11.Http11Processor:isAsync()#879
`---[0.006176ms]org.apache.coyote.http11.Http11Processor:isUpgrade()#881
日志里有一個129ms的耗時點(diǎn)(時間比沒開arthas的時候更長是因?yàn)閍rthas本身帶來的性能消耗,所以生產(chǎn)環(huán)境小心使用),這個就是要找的問題點(diǎn)。
打問題點(diǎn)找到了,那怎么定位是什么導(dǎo)致的問題呢,又如何解決呢?
繼續(xù)trace吧,細(xì)化到具體的代碼塊或者內(nèi)容。trace由于性能考慮,不會展示所有的調(diào)用路徑,如果調(diào)用路徑過深,只有手動深入trace,原則就是trace耗時長的那個方法:
[arthas@24851]$traceorg.apache.coyote.Adapterservice
PressQorCtrl+Ctoabort.
Affect(class-cnt:1,method-cnt:1)costin608ms.
`---ts=2019-09-1421:34:33;thread_name=http-nio-7744-exec-1;id=10;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
`---[81.70999ms]org.apache.catalina.connector.CoyoteAdapter:service()
+---[0.032546ms]org.apache.coyote.Request:getNote()#302
+---[0.007148ms]org.apache.coyote.Response:getNote()#303
+---[0.007475ms]org.apache.catalina.connector.Connector:getXpoweredBy()#324
+---[0.00447ms]org.apache.coyote.Request:getRequestProcessor()#331
+---[0.007902ms]java.lang.ThreadLocal:get()#331
+---[0.006522ms]org.apache.coyote.RequestInfo:setWorkerThreadName()#331
+---[73.793798ms]org.apache.catalina.connector.CoyoteAdapter:postParseRequest()#336
+---[0.001536ms]org.apache.catalina.connector.Connector:getService()#339
+---[0.004469ms]org.apache.catalina.Service:getContainer()#339
+---[0.007074ms]org.apache.catalina.Engine:getPipeline()#339
+---[0.004334ms]org.apache.catalina.Pipeline:isAsyncSupported()#339
+---[0.002466ms]org.apache.catalina.connector.Request:setAsyncSupported()#339
+---[6.01E-4ms]org.apache.catalina.connector.Connector:getService()#342
+---[0.001859ms]org.apache.catalina.Service:getContainer()#342
+---[9.65E-4ms]org.apache.catalina.Engine:getPipeline()#342
+---[0.005231ms]org.apache.catalina.Pipeline:getFirst()#342
+---[7.239154ms]org.apache.catalina.Valve:invoke()#342
+---[0.006904ms]org.apache.catalina.connector.Request:isAsync()#345
+---[0.00509ms]org.apache.catalina.connector.Request:finishRequest()#372
+---[0.051461ms]org.apache.catalina.connector.Response:finishResponse()#373
+---[0.007244ms]java.util.concurrent.atomic.AtomicBoolean:()#379
+---[0.007314ms]org.apache.coyote.Response:action()#380
+---[0.004518ms]org.apache.catalina.connector.Request:isAsyncCompleting()#382
+---[0.001072ms]org.apache.catalina.connector.Request:getContext()#394
+---[0.007166ms]java.lang.System:currentTimeMillis()#401
+---[0.004367ms]org.apache.coyote.Request:getStartTime()#401
+---[0.011483ms]org.apache.catalina.Context:logAccess()#401
+---[0.0014ms]org.apache.coyote.Request:getRequestProcessor()#406
+---[min=8.0E-4ms,max=9.22E-4ms,total=0.001722ms,count=2]java.lang.Integer:()#406
+---[0.001082ms]java.lang.reflect.Method:invoke()#406
+---[0.001851ms]org.apache.coyote.RequestInfo:setWorkerThreadName()#406
+---[0.035805ms]org.apache.catalina.connector.Request:recycle()#410
`---[0.007849ms]org.apache.catalina.connector.Response:recycle()#411
一段無聊的手動深入trace之后………………
[arthas@24851]$traceorg.apache.catalina.webresources.AbstractArchiveResourceSetgetArchiveEntries
PressQorCtrl+Ctoabort.
Affect(class-cnt:4,method-cnt:2)costin150ms.
`---ts=2019-09-1421:36:26;thread_name=http-nio-7744-exec-3;id=12;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
`---[75.743681ms]org.apache.catalina.webresources.JarWarResourceSet:getArchiveEntries()
+---[0.025731ms]java.util.HashMap:()#106
+---[0.097729ms]org.apache.catalina.webresources.JarWarResourceSet:openJarFile()#109
+---[0.091037ms]java.util.jar.JarFile:getJarEntry()#110
+---[0.096325ms]java.util.jar.JarFile:getInputStream()#111
+---[0.451916ms]org.apache.catalina.webresources.TomcatJarInputStream:()#113
+---[min=0.001175ms,max=0.001176ms,total=0.002351ms,count=2]java.lang.Integer:()#114
+---[0.00104ms]java.lang.reflect.Method:invoke()#114
+---[0.045105ms]org.apache.catalina.webresources.TomcatJarInputStream:getNextJarEntry()#114
+---[min=5.02E-4ms,max=0.008531ms,total=0.028864ms,count=31]java.util.jar.JarEntry:getName()#116
+---[min=5.39E-4ms,max=0.022805ms,total=0.054647ms,count=31]java.util.HashMap:put()#116
+---[min=0.004452ms,max=34.479307ms,total=74.206249ms,count=31]org.apache.catalina.webresources.TomcatJarInputStream:getNextJarEntry()#117
+---[0.018358ms]org.apache.catalina.webresources.TomcatJarInputStream:getManifest()#119
+---[0.006429ms]org.apache.catalina.webresources.JarWarResourceSet:setManifest()#120
+---[0.010904ms]org.apache.tomcat.util.compat.JreCompat:isJre9Available()#121
+---[0.003307ms]org.apache.catalina.webresources.TomcatJarInputStream:getMetaInfEntry()#133
+---[5.5E-4ms]java.util.jar.JarEntry:getName()#135
+---[6.42E-4ms]java.util.HashMap:put()#135
+---[0.001981ms]org.apache.catalina.webresources.TomcatJarInputStream:getManifestEntry()#137
+---[0.064484ms]org.apache.catalina.webresources.TomcatJarInputStream:close()#141
+---[0.007961ms]org.apache.catalina.webresources.JarWarResourceSet:closeJarFile()#151
`---[0.004643ms]java.io.InputStream:close()#155
發(fā)現(xiàn)了一個值得暫停思考的點(diǎn):
+---[min=0.004452ms,max=34.479307ms,total=74.206249ms,count=31]org.apache.catalina.webresources.TomcatJarInputStream:getNextJarEntry()#117
這行代碼加載了31次,一共耗時74ms;從名字上看,應(yīng)該是tomcat加載jar包時的耗時,那么是加載了31個jar包的耗時,還是加載了jar包內(nèi)的某些資源31次耗時呢?
TomcatJarInputStream這個類源碼的注釋寫到:
Thepurposeofthissub-classistoobtainreferencestotheJarEntryobjectsforMETA-INF/andMETA-INF/MANIFEST.MFthatareotherwiseswallowedbytheJarInputStreamimplementation.
大概意思也就是,獲取jar包內(nèi)META-INF/,META-INF/MANIFEST的資源,這是一個子類,更多的功能在父類JarInputStream里。
其實(shí)看到這里大概也能猜到問題了,tomcat加載jar包內(nèi)META-INF/,META-INF/MANIFEST的資源導(dǎo)致的耗時,至于為什么連續(xù)請求不會耗時,應(yīng)該是tomcat的緩存機(jī)制(下面介紹源碼分析)
不著急定位問題,試著通過Arthas最終定位問題細(xì)節(jié),繼續(xù)手動深入trace
[arthas@24851]$traceorg.apache.catalina.webresources.TomcatJarInputStream*
PressQorCtrl+Ctoabort.
Affect(class-cnt:1,method-cnt:4)costin44ms.
`---ts=2019-09-1421:37:47;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
`---[0.234952ms]org.apache.catalina.webresources.TomcatJarInputStream:createZipEntry()
+---[0.039455ms]java.util.jar.JarInputStream:createZipEntry()#43
`---[0.007827ms]java.lang.String:equals()#44
`---ts=2019-09-1421:37:47;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
`---[0.050222ms]org.apache.catalina.webresources.TomcatJarInputStream:createZipEntry()
+---[0.001889ms]java.util.jar.JarInputStream:createZipEntry()#43
`---[0.001643ms]java.lang.String:equals()#46
#這里一共31個trace日志,刪減了剩下的
從方法名上看,還是加載資源之類的意思。都已經(jīng)到j(luò)dk源碼了,這時候來看一下TomcatJarInputStream
這個類的源碼:
/**
*CreatesanewJarEntry
(ZipEntry
)forthe
*specifiedJARfileentryname.Themanifestattributesof
*thespecifiedJARfileentrynamewillbecopiedtothenew
*JarEntry
.
*
*@paramnamethenameoftheJAR/ZIPfileentry
*@returntheJarEntry
objectjustcreated
*/
protectedZipEntrycreateZipEntry(Stringname){
JarEntrye=newJarEntry(name);
if(man!=null){
e.attr=man.getAttributes(name);
}
returne;
}
這個createZipEntry
有個name參數(shù),從注釋上看,是jar/zip文件名,如果能得到文件名這種關(guān)鍵信息,就可以直接定位問題了;還是通過Arthas,使用watch命令,動態(tài)監(jiān)測方法調(diào)用數(shù)據(jù)
watch方法執(zhí)行數(shù)據(jù)觀測
讓你能方便的觀察到指定方法的調(diào)用情況。能觀察到的范圍為:返回值、拋出異常、入?yún)?,通過編寫 OGNL 表達(dá)式進(jìn)行對應(yīng)變量的查看。
watch 該方法的入?yún)?/p>
[arthas@24851]$watchorg.apache.catalina.webresources.TomcatJarInputStreamcreateZipEntry"{params[0]}"
PressQorCtrl+Ctoabort.
Affect(class-cnt:1,method-cnt:1)costin27ms.
ts=2019-09-1421:51:14;[cost=0.14547ms]result=@ArrayList[
@String[META-INF/],
]
ts=2019-09-1421:51:14;[cost=0.048028ms]result=@ArrayList[
@String[META-INF/MANIFEST.MF],
]
ts=2019-09-1421:51:14;[cost=0.046071ms]result=@ArrayList[
@String[META-INF/resources/],
]
ts=2019-09-1421:51:14;[cost=0.033855ms]result=@ArrayList[
@String[META-INF/resources/swagger-ui.html],
]
ts=2019-09-1421:51:14;[cost=0.039138ms]result=@ArrayList[
@String[META-INF/resources/webjars/],
]
ts=2019-09-1421:51:14;[cost=0.033701ms]result=@ArrayList[
@String[META-INF/resources/webjars/springfox-swagger-ui/],
]
ts=2019-09-1421:51:14;[cost=0.033644ms]result=@ArrayList[
@String[META-INF/resources/webjars/springfox-swagger-ui/favicon-16x16.png],
]
ts=2019-09-1421:51:14;[cost=0.033976ms]result=@ArrayList[
@String[META-INF/resources/webjars/springfox-swagger-ui/springfox.css],
]
ts=2019-09-1421:51:14;[cost=0.032818ms]result=@ArrayList[
@String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui-standalone-preset.js.map],
]
ts=2019-09-1421:51:14;[cost=0.04651ms]result=@ArrayList[
@String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui.css],
]
ts=2019-09-1421:51:14;[cost=0.034793ms]result=@ArrayList[
@String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui.js.map],
這下直接看到了具體加載的資源名,這么熟悉的名字:swagger-ui,一個國外的rest接口文檔工具,又有國內(nèi)開發(fā)者基于swagger-ui做了一套spring mvc的集成工具,通過注解就可以自動生成swagger-ui需要的接口定義json文件,用起來還比較方便,就是侵入性較強(qiáng)。
刪除swagger的jar包后問題,詭異的70+ms就消失了
<dependency>
<groupId>io.springfoxgroupId>
<artifactId>springfox-swagger2artifactId>
<version>2.9.2version>
dependency>
<dependency>
<groupId>io.springfoxgroupId>
<artifactId>springfox-swagger-uiartifactId>
<version>2.9.2version>
dependency>
那么為什么swagger會導(dǎo)致請求耗時呢,為什么每次請求偶讀會加載swagger內(nèi)部的靜態(tài)資源呢?
其實(shí)這是tomcat-embed的一個bug吧,下面詳細(xì)介紹一下該Bug
Tomcat embed Bug分析&解決
源碼分析過程實(shí)在太漫長,而且也不是本文的重點(diǎn),所以就不介紹了, 下面直接介紹下分析結(jié)果
順便貼一張tomcat處理請求的核心類圖
為什么每次請求會加載Jar包內(nèi)的靜態(tài)資源
關(guān)鍵在于org.apache.catalina.mapper.Mapper#internalMapWrapper
這個方法,該版本下處理請求的方式有問題,導(dǎo)致每次都校驗(yàn)靜態(tài)資源。
為什么連續(xù)請求不會出現(xiàn)問題
因?yàn)門omcat對于這種靜態(tài)資源的解析是有緩存的,優(yōu)先從緩存查找,緩存過期后再重新解析。具體參考org.apache.catalina.webresources.Cache
,默認(rèn)過期時間ttl是5000ms。
為什么本地不會復(fù)現(xiàn)
其實(shí)確切的說,是通過spring-boot打包插件后不能復(fù)現(xiàn)。由于啟動方式的不同,tomcat使用了不同的類去處理靜態(tài)資源,所以沒問題
如何解決
升級tomcat-embed版本即可
當(dāng)前出現(xiàn)Bug的版本為:
spring-boot:2.0.2.RELEASE,內(nèi)置的tomcat embed版本為8.5.31
升級tomcat embed版本至8.5.40+即可解決此問題,新版本已經(jīng)修復(fù)了
通過替換springboot pom properties方式
如果項(xiàng)目是maven是繼承的springboot,即parent配置為springboot的,或者dependencyManagement中import spring boot包的
<parent>
<groupId>org.springframework.bootgroupId>
<artifactId>spring-boot-starter-parentartifactId>
<version>2.0.2.RELEASEversion>
<relativePath/>
parent>
pom中直接覆蓋properties即可:
<properties>
<tomcat.version>8.5.40tomcat.version>
properties>
升級spring boot版本
springboot 2.1.0.RELEASE中的tomcat embed版本已經(jīng)大于8.5.31了,所以直接將springboot升級至該版本及以上版本就可以解決此問題
審核編輯 :李倩
-
接口
+關(guān)注
關(guān)注
33文章
8459瀏覽量
150748 -
代碼
+關(guān)注
關(guān)注
30文章
4728瀏覽量
68251 -
spring
+關(guān)注
關(guān)注
0文章
338瀏覽量
14298 -
SpringBoot
+關(guān)注
關(guān)注
0文章
173瀏覽量
161
原文標(biāo)題:快速定位SpringBoot接口超時問題的神器!
文章出處:【微信號:芋道源碼,微信公眾號:芋道源碼】歡迎添加關(guān)注!文章轉(zhuǎn)載請注明出處。
發(fā)布評論請先 登錄
相關(guān)推薦
評論