使用方法
可用来分析 java线程的锁等待或者死锁情况。
Usage:
jstack [-l] <pid>
(to connect to running process)
jstack -F [-m] [-l] <pid>
(to connect to a hung process)
jstack [-m] [-l] <executable> <core>
(to connect to a core file)
jstack [-m] [-l] [server_id@]<remote server IP or hostname>
(to connect to a remote debug server)
Options:
-F to force a thread dump. Use when jstack <pid> does not respond (process is hung)
-m to print both java and native frames (mixed mode)
-l long listing. Prints additional information about locks
-h or -help to print this help message
jstack -m -l tid > 1.txt
示例
2017-06-23 08:57:30 Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.111-b01 mixed mode): "Attach Listener" daemon prio=10 tid=0x00007f028c00e000 nid=0x17f7 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "shellexecutor-7" prio=10 tid=0x00007f028c00d800 nid=0x5eb0 waiting on condition [0x00007f02659cc000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000786ff3210> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:957) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:917) at java.lang.Thread.run(Thread.java:682) Locked ownable synchronizers: - None
上面的stack trace一共可以拆解成5个部分。
第一部分: Full thread dump identifier
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.111-b01 mixed mode):
可以看到dump时间,虚拟机的信息。
第二部分: Java EE middleware, third party & custom application Threads
"shellexecutor-7" prio=10 tid=0x00007f028c00d800 nid=0x5eb0 waiting on condition [0x00007f02659cc000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000786ff3210> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:957) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:917) at java.lang.Thread.run(Thread.java:682) Locked ownable synchronizers: - None
此处是我们最为关心的地方,也是通过stack trace查找问题的地方,每个item的具体含义是
-Thread name: “shellexecutor-7”
–线程优先级: prio=10
-java线程的identifier:tid=0x00007f028c00d800
-native线程的identifier: nid=0x5eb0
–线程的状态: java.lang.Thread.State: WAITING (parking)
–线程栈起始地址:[0x00007f02659cc000]
第三部分:HotSpot VM Thread
"Low Memory Detector" daemon prio=10 tid=0x00007f02b0092000 nid=0x1db5 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "C2 CompilerThread1" daemon prio=10 tid=0x00007f02b008f800 nid=0x1db4 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "C2 CompilerThread0" daemon prio=10 tid=0x00007f02b008c800 nid=0x1db3 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Signal Dispatcher" daemon prio=10 tid=0x00007f02b008a800 nid=0x1db2 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Finalizer" daemon prio=10 tid=0x00007f02b006c000 nid=0x1db1 in Object.wait() [0x00007f02ac829000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000007855b45c8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked <0x00000007855b45c8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:171) Locked ownable synchronizers: - None "Reference Handler" daemon prio=10 tid=0x00007f02b006a000 nid=0x1db0 in Object.wait() [0x00007f02ac92a000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000007855b4588> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) - locked <0x00000007855b4588> (a java.lang.ref.Reference$Lock) Locked ownable synchronizers: - None
第四部分: HotSpot GC Thread
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f02b001c800 nid=0x1dad runnable "GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f02b001e000 nid=0x1dae runnable
第五部分:JNI global references count
JNI global references: 1660
问题定位分析步骤
jstack可以定位到线程堆栈,根据堆栈信息我们可以定位到具体代码,所以它在JVM性能调优中使用得非常多。下面我们来一个实例找出某个Java进程中最耗费CPU的Java线程并定位堆栈信息,用到的命令有ps、top、printf、jstack、grep。
第一步先找出Java进程ID,我部署在服务器上的Java应用名称为mrf-center:
root@ubuntu:/# ps -ef | grep mrf-center | grep -v grep root 21711 1 1 14:47 pts/3 00:02:10 java -jar mrf-center.jar
得到进程ID为21711,第二步找出该进程内最耗费CPU的线程,可以使用ps -Lfp pid或者ps -mp pid -o THREAD, tid, time或者top -Hp pid,我这里用第三个,输出如下:
TIME列就是各个Java线程耗费的CPU时间,CPU时间最长的是线程ID为21742的线程,用
printf "%x\n" 21742
得到21742的十六进制值为54ee,下面会用到。
OK,下一步终于轮到jstack上场了,它用来输出进程21711的堆栈信息,然后根据线程ID的十六进制值grep,如下:
root@ubuntu:/# jstack 21711 | grep 54ee "PollIntervalRetrySchedulerThread" prio=10 tid=0x00007f950043e000 nid=0x54ee in Object.wait() [0x00007f94c6eda000]
可以看到CPU消耗在PollIntervalRetrySchedulerThread这个类的Object.wait(),我找了下我的代码,定位到下面的代码:
// Idle wait getLog().info("Thread [" + getName() + "] is idle waiting..."); schedulerThreadState = PollTaskSchedulerThreadState.IdleWaiting; long now = System.currentTimeMillis(); long waitTime = now + getIdleWaitTime(); long timeUntilContinue = waitTime - now; synchronized(sigLock) { try { if(!halted.get()) { sigLock.wait(timeUntilContinue); } } catch (InterruptedException ignore) { } }
它是轮询任务的空闲等待代码,上面的sigLock.wait(timeUntilContinue)就对应了前面的Object.wait()。