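A brief introduction to the JVM command-line tools: jps, jstack, jstat, jmap, and jinfo.
Preface
The JDK ships with several command-line tools for inspecting the state of a running Java program.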
Jps
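The first is jps, which lists the Java processes (not threads) running on the machine, one per line, as the PID followed by the main class name.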
hadoop@spark6:~$ jps
131649 KyuubiSubmit
39087 KyuubiSubmit
144342 Jps
136470 KyuubiSubmit
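When the jps output has to be consumed by a script, it can be turned into a PID-to-class map. The following is a minimal sketch that works on the text shown above; the function name `parse_jps` is an illustrative assumption, and it assumes the default jps output format (PID, a space, then the name).

```python
def parse_jps(output: str) -> dict:
    """Map PID -> main class name, skipping the jps process itself."""
    processes = {}
    for line in output.splitlines():
        parts = line.split(None, 1)  # PID, then the rest of the line
        if len(parts) == 2 and parts[0].isdigit() and parts[1] != "Jps":
            processes[int(parts[0])] = parts[1]
    return processes


# Sample text copied from the jps output above.
sample = """131649 KyuubiSubmit
39087 KyuubiSubmit
144342 Jps
136470 KyuubiSubmit"""

for pid, name in parse_jps(sample).items():
    print(pid, name)
```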
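Once you know the PID of a Java process, ps -ef shows how it was started.
Here -e selects all processes (the same as -A) and -f selects full-format output, which includes the complete command line.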
hadoop@spark6:~$ ps -ef|grep KyuubiSubmit
hadoop 39087 1 99 05:29 ? 13:15:44 /home/hadoop/java-current/bin/java -cp /home/hadoop/kyuubi_hz_cluster_10/kyuubi-0.6.2-bin-spark-2.1.3/lib/kyuubi-server-0.6.2.jar:/home/hadoop/kyuubi_hz_cluster_10/spark-2.3.2-bin-ne-0.1.0/conf/:/home/hadoop/kyuubi_hz_cluster_10/spark-2.3.2-bin-ne-0.1.0/jars/*:/home/hadoop/hadoop-client4cluster10/etc/hadoop/ -Xmx164g -XX:+PrintFlagsFinal -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -XX:PermSize=1024m -XX:MaxPermSize=1024m -XX:MaxDirectMemorySize=8192m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -XX:OnOutOfMemoryError=kill -9 %p -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=50 -XX:GCLogFileSize=5M -XX:NewRatio=3 -Dio.netty.noPreferDirect=true -Dio.netty.recycler.maxCapacity=0 -Dio.netty.noUnsafe=true org.apache.spark.deploy.KyuubiSubmit --class yaooqinn.kyuubi.server.KyuubiServer /home/hadoop/kyuubi_hz_cluster_10/kyuubi-0.6.2-bin-spark-2.1.3/lib/kyuubi-server-0.6.2.jar
The ps -aux command shows more information, in the following columns:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
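VSZ: virtual memory used by the process, in KB;
RSS: resident set size, the physical memory the process currently occupies, in KB;
TTY: the terminal the process is attached to (the terminal of the logged-in user); ? means the process has no controlling terminal;
START: the time the process was started;
TIME: the CPU time the process has actually used;
COMMAND: the command name and its arguments.
For the other columns, see a detailed reference for the ps -aux command.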
hadoop@spark6:~$ ps -aux|grep KyuubiSubmit
hadoop 39087 179 3.2 197330936 8695348 ? Sl 05:29 828:00 /home/hadoop/java-current/bin/java -cp /home/hadoop/kyuubi_hz_cluster_10/kyuubi-0.6.2-bin-spark-2.1.3/lib/kyuubi-server-0.6.2.jar:/home/hadoop/kyuubi_hz_cluster_10/spark-2.3.2-bin-ne-0.1.0/conf/:/home/hadoop/kyuubi_hz_cluster_10/spark-2.3.2-bin-ne-0.1.0/jars/*:/home/hadoop/hadoop-client4cluster10/etc/hadoop/ -Xmx164g -XX:+PrintFlagsFinal -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -XX:PermSize=1024m -XX:MaxPermSize=1024m -XX:MaxDirectMemorySize=8192m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -XX:OnOutOfMemoryError=kill -9 %p -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=50 -XX:GCLogFileSize=5M -XX:NewRatio=3 -Dio.netty.noPreferDirect=true -Dio.netty.recycler.maxCapacity=0 -Dio.netty.noUnsafe=true org.apache.spark.deploy.KyuubiSubmit --class yaooqinn.kyuubi.server.KyuubiServer /home/hadoop/kyuubi_hz_cluster_10/kyuubi-0.6.2-bin-spark-2.1.3/lib/kyuubi-server-0.6.2.jar
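To make the VSZ and RSS columns easier to read, a ps -aux line can be split into its columns and the KB values converted to GiB. This is only a sketch: the helper names are assumptions, the sample line uses the numbers from the output above with the command line shortened for readability, and it assumes none of the first ten columns contains a space.

```python
PS_AUX_COLUMNS = ["USER", "PID", "%CPU", "%MEM", "VSZ", "RSS",
                  "TTY", "STAT", "START", "TIME", "COMMAND"]


def parse_ps_aux_line(line: str) -> dict:
    """Split one `ps aux` line into its 11 columns; COMMAND keeps its spaces."""
    fields = line.split(None, 10)  # split on whitespace at most 10 times
    return dict(zip(PS_AUX_COLUMNS, fields))


def kib_to_gib(kib: str) -> float:
    """Convert a KB (KiB) value printed by ps into GiB."""
    return int(kib) / (1024 * 1024)


# Numbers taken from the output above; the command line is shortened.
sample = ("hadoop 39087 179 3.2 197330936 8695348 ? Sl 05:29 828:00 "
          "/home/hadoop/java-current/bin/java -Xmx164g org.apache.spark.deploy.KyuubiSubmit")

row = parse_ps_aux_line(sample)
print(f"PID {row['PID']}: RSS = {kib_to_gib(row['RSS']):.2f} GiB, "
      f"VSZ = {kib_to_gib(row['VSZ']):.2f} GiB")
```

The large gap between VSZ and RSS is expected here: with -Xmx164g the JVM reserves address space for the maximum heap, while RSS only counts pages that are actually resident.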
Jstack
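The jstack command prints the stack traces of the threads in a running Java process. For example, if a Java program has been unresponsive for a long time, jstack shows where each thread is stuck and whether there is a deadlock.
The thread dump generated by jstack covers every live thread in the JVM.
In a dump, threads are usually in one of the following states:
1. RUNNABLE: the thread is executing;
2. BLOCKED: the thread is blocked waiting to acquire a monitor lock;
3. WAITING: the thread is waiting for another thread to act. A wait with a timeout is reported as TIMED_WAITING.
Below is an example.
A single Java program contains many threads. Some are internal JVM service threads and the rest are the user's own threads; see a reference on JVM internal threads for details.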
$ jstack 29047
2019-06-29 16:51:28
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode):
"Attach Listener" #13 daemon prio=9 os_prio=31 tid=0x00007fc168804800 nid=0xe13 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"NGSession 1: (idle)" #12 prio=5 os_prio=31 tid=0x00007fc168146000 nid=0x4603 in Object.wait() [0x000070000ec8b000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007aae55e68> (a java.lang.Object)
at java.lang.Object.wait(Object.java:502)
at com.martiansoftware.nailgun.NGSession.nextSocket(NGSession.java:167)
- locked <0x00000007aae55e68> (a java.lang.Object)
at com.martiansoftware.nailgun.NGSession.run(NGSession.java:186)
"DestroyJavaVM" #11 prio=5 os_prio=31 tid=0x00007fc168112800 nid=0x2803 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"NGServer(localhost/127.0.0.1, 65319,27633241-7c9d-4458-9831-527f37863f1d)" #9 prio=5 os_prio=31 tid=0x00007fc169095800 nid=0x4703 runnable [0x000070000eb88000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:545)
at java.net.ServerSocket.accept(ServerSocket.java:513)
at com.martiansoftware.nailgun.NGServer.run(NGServer.java:418)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" #8 daemon prio=9 os_prio=31 tid=0x00007fc168002000 nid=0x3a03 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fc16883f000 nid=0x3803 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=31 tid=0x00007fc169816800 nid=0x3603 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fc16900f000 nid=0x3503 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fc16803b800 nid=0x3403 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fc168026800 nid=0x2f03 in Object.wait() [0x000070000e473000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007aab08e98> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000007aab08e98> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fc169020000 nid=0x2e03 in Object.wait() [0x000070000e370000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007aab06b40> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000007aab06b40> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"VM Thread" os_prio=31 tid=0x00007fc169001000 nid=0x2c03 runnable
"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fc168004800 nid=0x1d07 runnable
"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fc168005800 nid=0x1e03 runnable
"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fc169002000 nid=0x5403 runnable
"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fc169003000 nid=0x5303 runnable
"VM Periodic Task Thread" os_prio=31 tid=0x00007fc16804d000 nid=0x3c03 waiting on condition
JNI global references: 47
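A long dump is easier to triage once it is summarized. The sketch below counts threads per state and converts each nid (the native thread ID, printed in hexadecimal) to decimal; on Linux the decimal value can usually be matched against per-thread tools such as top -H, although that correspondence is an assumption about the platform rather than something this dump shows. The function name and regular expressions are illustrative, and the sample is a few lines copied from the dump above.

```python
import re
from collections import Counter

# Thread header line: "name" ... nid=0x... ; state line follows it.
HEADER = re.compile(r'^"(?P<name>.*)" .*\bnid=(?P<nid>0x[0-9a-fA-F]+)')
STATE = re.compile(r'^\s*java\.lang\.Thread\.State: (?P<state>\w+)')


def summarize_dump(dump: str):
    """Return (Counter of thread states, {thread name: native id in decimal})."""
    states = Counter()
    native_ids = {}
    for line in dump.splitlines():
        header = HEADER.match(line)
        if header:
            native_ids[header.group("name")] = int(header.group("nid"), 16)
            continue
        state = STATE.match(line)
        if state:
            states[state.group("state")] += 1
    return states, native_ids


sample = '''"NGSession 1: (idle)" #12 prio=5 os_prio=31 tid=0x00007fc168146000 nid=0x4603 in Object.wait() [0x000070000ec8b000]
   java.lang.Thread.State: WAITING (on object monitor)
"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fc16803b800 nid=0x3403 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=31 tid=0x00007fc169001000 nid=0x2c03 runnable'''

states, native_ids = summarize_dump(sample)
print(dict(states))
print(native_ids)
```

JVM-internal threads such as "VM Thread" have no java.lang.Thread.State line, so they appear in the ID map but not in the state counts.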
Jstat
jstat is used as follows:
jstat -<option> [-t] [-h<lines>] <vmid> [<interval> [<count>]]
- -<option> selects which statistics to print. The options are:
  - -class: class-loading statistics (classes loaded and unloaded, their sizes, and the time spent).
    hadoop@spark6:~$ jstat -class 131649
    Loaded Bytes Unloaded Bytes Time
    21038 42844.6 904 1394.4 28.32
  - -compiler: JIT (just-in-time) compilation statistics.
    hadoop@spark6:~$ jstat -compiler 131649
    Compiled Failed Invalid Time FailedType FailedMethod
    39298 3 0 216.39 1 org/apache/spark/ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1 apply$mcV$sp
  - -gc: garbage-collection statistics.
    hadoop@spark6:~$ jstat -gc 131649
    Warning: Unresolved Symbol: sun.gc.generation.2.space.0.capacity substituted NaN
    Warning: Unresolved Symbol: sun.gc.generation.2.space.0.used substituted NaN
    S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
    186240.0 186240.0 186240.0 0.0 1490240.0 326860.8 5587220.0 3700631.8 � � 32818 2921.766 11684 4006.977 6928.743
  - -gccapacity: heap capacity statistics for each generation.
  - -gccause: the same summary as -gcutil, plus the cause of the last and the current garbage collection.
  - -gcnew: young-generation statistics.
  - -gcnewcapacity: young-generation size statistics.
  - -gcold: old-generation statistics.
  - -gcoldcapacity: old-generation size statistics.
  - -gcpermcapacity: permanent-generation size statistics.
  - -gcutil: a summary of garbage-collection statistics.
  - -printcompilation: HotSpot compilation method statistics.
- -t adds a Timestamp column as the first column, showing the time in seconds since the target JVM started. Compare the output without and with -t:
    hadoop@spark6:~$ jstat -gc 131649
    Warning: Unresolved Symbol: sun.gc.generation.2.space.0.capacity substituted NaN
    Warning: Unresolved Symbol: sun.gc.generation.2.space.0.used substituted NaN
    S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
    186240.0 186240.0 186240.0 0.0 1490240.0 1341107.0 5587220.0 3388251.2 � � 32748 2914.187 11656 3990.174 6904.362
    hadoop@spark6:~$ jstat -gc -t 131649
    Warning: Unresolved Symbol: sun.gc.generation.2.space.0.capacity substituted NaN
    Warning: Unresolved Symbol: sun.gc.generation.2.space.0.used substituted NaN
    Timestamp S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
    247190.2 186240.0 186240.0 186240.0 0.0 1490240.0 1341242.8 5587220.0 3102583.1 � � 32748 2914.187 11657 3990.174 6904.362
- -h<lines> reprints the column header every <lines> samples; it does not average anything.
- vmid identifies the target JVM; for a local JVM it is the Java process ID.
- interval is the sampling interval, in milliseconds by default.
- count is the number of samples to take.
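The raw -gc columns are easier to interpret as a few derived numbers. The sketch below pairs the header row with the sample row from the jstat -gc output shown above and computes old-generation occupancy and the average pause per collection. The function name is an assumption, cells that are not numeric (the PC and PU cells, which jstat could not resolve) are mapped to NaN, and the sample keeps only one of the two warning lines.

```python
def parse_jstat(output: str) -> dict:
    """Pair the jstat header row with the first sample row, skipping warnings."""
    rows = [line.split() for line in output.splitlines()
            if line.strip() and not line.startswith("Warning:")]
    header, values = rows[0], rows[1]
    stats = {}
    for name, raw in zip(header, values):
        try:
            stats[name] = float(raw)
        except ValueError:
            stats[name] = float("nan")  # unprintable cells such as PC/PU
    return stats


# Values copied from the jstat -gc output above; \ufffd is the unprintable cell.
sample = (
    "Warning: Unresolved Symbol: sun.gc.generation.2.space.0.capacity substituted NaN\n"
    "S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT\n"
    "186240.0 186240.0 186240.0 0.0 1490240.0 326860.8 5587220.0 3700631.8 "
    "\ufffd \ufffd 32818 2921.766 11684 4006.977 6928.743\n"
)

stats = parse_jstat(sample)
old_used_pct = 100.0 * stats["OU"] / stats["OC"]  # OU and OC are both in KB
avg_young_gc = stats["YGCT"] / stats["YGC"]       # seconds per young GC
avg_full_gc = stats["FGCT"] / stats["FGC"]        # seconds per full GC

print(f"old generation used: {old_used_pct:.1f}%")
print(f"average young GC: {avg_young_gc * 1000:.0f} ms")
print(f"average full GC: {avg_full_gc * 1000:.0f} ms")
```

The same function works on output produced with -t, because the extra Timestamp column appears in both the header and the sample row.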
Jmap
Its usage is as follows:
hadoop@spark6:~$ jmap
Usage:
jmap [option] <pid>
(to connect to running process)
jmap [option] <executable <core>
(to connect to a core file)
jmap [option] [server_id@]<remote server IP or hostname>
(to connect to remote debug server)
where <option> is one of:
<none> to print same info as Solaris pmap
-heap to print java heap summary
-histo[:live] to print histogram of java object heap; if the "live"
suboption is specified, only count live objects
-permstat to print permanent generation statistics
-finalizerinfo to print information on objects awaiting finalization
-dump:<dump-options> to dump java heap in hprof binary format
dump-options:
live dump only live objects; if not specified,
all objects in the heap are dumped.
format=b binary format
file=<file> dump heap to <file>
Example: jmap -dump:live,format=b,file=heap.bin <pid>
-F force. Use with -dump:<dump-options> <pid> or -histo
to force a heap dump or histogram when <pid> does not
respond. The "live" suboption is not supported
in this mode.
-h | -help to print this help message
-J<flag> to pass <flag> directly to the runtime system
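First, a word about core dumps. When a program terminates abnormally or crashes, the operating system records the program's memory state at that moment and saves it to a file; this is called a core dump. A core dump can be thought of as a memory snapshot, but besides the memory contents it also captures key execution state, such as register values (including the program counter and stack pointer), memory-management information, and other processor and operating-system state. Core dumps are very helpful for diagnosing and debugging programs, because some errors, such as invalid pointer accesses, are hard to reproduce, and a core dump file recreates the scene at the moment of failure. On Linux, ulimit -c shows whether core dumps are currently enabled; a value of 0 means they are disabled. Running ulimit -c unlimited enables them. A memory snapshot is saved on a crash only while core dumps are enabled.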
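The same limit can be checked from a program. As a rough sketch, Python's standard resource module (Unix only) reads the RLIMIT_CORE limit that ulimit -c reports; the function name is an illustrative assumption, and the value returned is the limit of the Python process itself, normally inherited from the shell that launched it.

```python
import resource


def core_dump_limit():
    """Return the soft RLIMIT_CORE value: 0 = disabled, None = unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return None if soft == resource.RLIM_INFINITY else soft


limit = core_dump_limit()
if limit == 0:
    print("core dumps are disabled")
elif limit is None:
    print("core dumps are enabled with no size limit")
else:
    print(f"core dumps are enabled, limited to {limit} bytes")
```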
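MAT
In production we usually run jmap -dump:live,format=b,file=heap.bin <pid> to write a heap dump (a dump of the Java heap, not a core dump) to a file, and then analyze that file with MAT (Eclipse Memory Analyzer). MAT generates HTML reports, which can be downloaded and viewed by opening their index.html.
MAT can be downloaded from https://www.eclipse.org/mat/downloads.php.
Extracting the archive yields a mat directory. Change into it and run the following command, where jmap.info is the heap dump file: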
./ParseHeapDump.sh jmap.info org.eclipse.mat.api:suspects org.eclipse.mat.api:overview org.eclipse.mat.api:top_components
This produces the following three zip files, which are small enough to copy straight to your local machine.
jmap_Leak_Suspects.zip
jmap_System_Overview.zip
jmap_Top_Components.zip
After that, unzip each file and open the index.html inside it.
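The unzip step can also be scripted. The following is a small sketch using only the standard library; the function name and the mat_reports output directory are assumptions, and it assumes index.html sits at the top level of each archive.

```python
import zipfile
from pathlib import Path


def extract_report(zip_path, dest_root):
    """Unpack one report archive into its own folder; return the index.html path or None."""
    target = Path(dest_root) / Path(zip_path).stem
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    index = target / "index.html"
    return index if index.exists() else None


if __name__ == "__main__":
    # File names from the listing above; archives that are missing are skipped.
    for name in ["jmap_Leak_Suspects.zip",
                 "jmap_System_Overview.zip",
                 "jmap_Top_Components.zip"]:
        if Path(name).exists():
            print(extract_report(name, "mat_reports"))
```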
Jinfo (Java Configuration Info)
Its usage is as follows:
hadoop@spark6:~$ jinfo
Usage:
jinfo [option] <pid>
(to connect to running process)
jinfo [option] <executable <core>
(to connect to a core file)
jinfo [option] [server_id@]<remote server IP or hostname>
(to connect to remote debug server)
where <option> is one of:
-flag <name> to print the value of the named VM flag
-flag [+|-]<name> to enable or disable the named VM flag
-flag <name>=<value> to set the named VM flag to the given value
-flags to print VM flags
-sysprops to print Java system properties
<no option> to print both of the above
-h | -help to print this help message
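The usage above shows that some flags can be changed at run time. Which ones?
The official JVM documentation says the following. In other words, the writable flags are those marked manageable, and they can also be obtained through the com.sun.management.HotSpotDiagnosticMXBean interface: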
Flags marked as manageable are dynamically writeable through the JDK management interface (com.sun.management.HotSpotDiagnosticMXBean API) and also through JConsole.
Filtering on the manageable marker is the more convenient way to find them. The command is:
hadoop@spark6:~$ java -XX:+PrintFlagsInitial | grep manageable
intx CMSAbortablePrecleanWaitMillis = 100 {manageable}
intx CMSWaitDuration = 2000 {manageable}
bool HeapDumpAfterFullGC = false {manageable}
bool HeapDumpBeforeFullGC = false {manageable}
bool HeapDumpOnOutOfMemoryError = false {manageable}
ccstr HeapDumpPath = {manageable}
uintx MaxHeapFreeRatio = 70 {manageable}
uintx MinHeapFreeRatio = 40 {manageable}
bool PrintClassHistogram = false {manageable}
bool PrintClassHistogramAfterFullGC = false {manageable}
bool PrintClassHistogramBeforeFullGC = false {manageable}
bool PrintConcurrentLocks = false {manageable}
bool PrintGC = false {manageable}
bool PrintGCDateStamps = false {manageable}
bool PrintGCDetails = false {manageable}
bool PrintGCTimeStamps = false {manageable}
So on this JVM only the flags listed above can be changed dynamically, with jinfo -flag [+|-]<name> <pid> or jinfo -flag <name>=<value> <pid>.
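If the flag listing needs to be processed rather than grepped, the lines can be parsed into name, type, and default value. This is a sketch under assumptions: the regular expression is written against the line format shown above, the function name is illustrative, and the UseG1GC line in the sample is not from the output above but is added only to show a flag that is not manageable being filtered out.

```python
import re

# type, name, "=" (or ":="), optional value, then the {kind} marker.
FLAG_LINE = re.compile(
    r"^\s*(?P<type>\S+)\s+(?P<name>\S+)\s+:?=\s*(?P<value>.*?)\s*\{(?P<kind>[^}]*)\}"
)


def manageable_flags(output: str) -> dict:
    """Return {flag name: (type, default value)} for flags marked manageable."""
    flags = {}
    for line in output.splitlines():
        match = FLAG_LINE.match(line)
        if match and "manageable" in match.group("kind"):
            flags[match.group("name")] = (match.group("type"), match.group("value"))
    return flags


sample = """intx CMSWaitDuration = 2000 {manageable}
bool HeapDumpOnOutOfMemoryError = false {manageable}
ccstr HeapDumpPath = {manageable}
bool UseG1GC = false {product}"""

for name, (flag_type, default) in manageable_flags(sample).items():
    print(f"{name} ({flag_type}) default={default!r}")
```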