Analyzing a Java Thread Leak

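Symptoms
Developers reported that they could no longer log in to the server. Developers log in with the work account by default; the root account was not yet affected.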

[root@VM_101_65_centos ~]# su -  work
Last login: Mon Apr  8 10:20:16 CST 2019 from 10.2.8.60 on pts/11
su: failed to execute /bin/bash: Resource temporarily unavailable

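Analysis
1. Most write-ups found online attribute this error to the limits on open files and on the number of processes, but both limits had already been raised on this machine, as shown below: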

[root@VM_101_65_centos ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 31216
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 100001
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 31216
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
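
One detail worth keeping in mind when reading this output: on Linux the max user processes limit (RLIMIT_NPROC) counts threads as well as processes for a given user, and it is generally not enforced for root, which is consistent with only the work account being locked out. The sketch below is a minimal helper, not anything from the original setup, that reads the same soft limit from /proc/self/limits inside a JVM. The class name ProcLimits and the method parseSoftMaxProcesses are made up for illustration, and the code assumes a Linux host.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalLong;

// Hypothetical helper: reads the soft "Max processes" limit of the current JVM.
class ProcLimits {

    // Parses the text of /proc/<pid>/limits and returns the soft limit for
    // "Max processes". Returns empty if the line is missing or says "unlimited".
    static OptionalLong parseSoftMaxProcesses(String limitsText) {
        for (String line : limitsText.split("\n")) {
            if (line.startsWith("Max processes")) {
                String[] fields = line.substring("Max processes".length())
                        .trim().split("\\s+");
                if (fields.length > 0 && !fields[0].isEmpty()
                        && !fields[0].equals("unlimited")) {
                    return OptionalLong.of(Long.parseLong(fields[0]));
                }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits"); // Linux only (assumption)
        if (!Files.isReadable(limits)) {
            System.out.println("/proc/self/limits is not available on this system");
            return;
        }
        String text = new String(Files.readAllBytes(limits), StandardCharsets.UTF_8);
        System.out.println("soft max processes: " + parseSoftMaxProcesses(text));
    }
}
```

Run as the work user, this reports the limit that the account's processes and threads together must stay under.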

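2. The cause was therefore probably the application itself: a memory leak, threads that are never released, and so on. CPU and memory usage had both been low during earlier observation, so threads were the most likely culprit, especially since a steadily growing thread count is a fairly common problem in Java programs.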

top - 11:05:19 up 29 min,  5 users,  load average: 0.00, 0.02, 0.05
Tasks: 111 total,   2 running, 109 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.3 sy,  0.0 ni, 99.3 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  8010308 total,  5559104 free,  1809964 used,   641240 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  5937736 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                             
10924 work      20   0 5066300 975784  23676 S   1.3 12.2   1:54.09 java                                                
11120 work      20   0 4202736 577132  19704 S   0.0  7.2   1:10.61 java                                                
11357 work      20   0   95996  29560   1184 S   0.0  0.4   0:00.83 nginx                                               
11359 work      20   0   95936  29560   1232 S   0.0  0.4   0:00.88 nginx                                               
11358 work      20   0   95728  29292   1156 S   0.0  0.4   0:00.77 nginx                                               
11360 work      20   0   95964  29284   1152 S   0.7  0.4   0:01.04 nginx      
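
To illustrate what "threads that are never released" can look like in Java, here is a minimal sketch of one common pattern: creating a new thread pool per request and never shutting it down. This is only an assumed example of how such a leak can arise, not the actual bug in the application above. ThreadLeakDemo, handleRequestLeaky and the thread name leaky-worker are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo of a per-request thread pool that is never shut down.
class ThreadLeakDemo {
    static final String WORKER_NAME = "leaky-worker";

    // Simulates a request handler that builds its own pool on every call.
    // Real leaky code would simply drop the pool without calling shutdown();
    // it is returned here only so the demo can clean up afterwards.
    static ExecutorService handleRequestLeaky() {
        ExecutorService pool = Executors.newFixedThreadPool(
                1, r -> new Thread(r, WORKER_NAME));
        try {
            pool.submit(() -> { }).get(); // run one trivial task to completion
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return pool; // the worker thread stays alive, idle, waiting for more work
    }

    // Counts live threads created by the leaky handler.
    static long liveWorkers() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(WORKER_NAME))
                .count();
    }

    // Handles the given number of requests and reports how many idle worker
    // threads are left behind, then shuts the pools down so the JVM can exit.
    static long idleWorkersAfter(int requests) {
        List<ExecutorService> pools = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            pools.add(handleRequestLeaky());
        }
        long alive = liveWorkers();
        pools.forEach(ExecutorService::shutdownNow);
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("idle workers left by 50 requests: " + idleWorkersAfter(50));
    }
}
```

Each call leaves one idle thread behind even though its task has finished, so the thread count grows with traffic while CPU and memory stay low. The usual remedy is a single shared, bounded pool, or calling shutdown() once the work is done.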

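3. The threads of a process can be inspected with top or ps. The thread count kept increasing, so hitting the default limit of a little over 30,000 (max user processes is 31216 in the ulimit output above) was only a matter of time. From here the issue was handed over to the developers.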

[root@VM_101_65_centos ~]# top -H -p 10924
top - 10:50:22 up 14 min,  5 users,  load average: 0.13, 0.12, 0.13
Threads: 120 total,   0 running, 120 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.0 us,  0.1 sy,  0.0 ni, 98.8 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem :  8010308 total,  5682612 free,  1722432 used,   605264 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  6032076 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                              
10940 work      20   0 4979964 930064  23660 S  1.0 11.6   0:08.97 java                                                 
10927 work      20   0 4979964 930064  23660 S  0.3 11.6   0:17.75 java                                                 
10933 work      20   0 4979964 930064  23660 S  0.3 11.6   0:00.67 java                                                 
11017 work      20   0 4979964 930064  23660 S  0.3 11.6   0:00.11 java                                                 
11025 work      20   0 4979964 930064  23660 S  0.3 11.6   0:00.09 java                                                 
11202 work      20   0 4979964 930064  23660 S  0.3 11.6   0:00.03 java                                                 
10924 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.00 java                                                 
10928 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.36 java                                                 
10929 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.39 java                                                 
10930 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.35 java                                                 
10931 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.34 java                                                 
10932 work      20   0 4979964 930064  23660 S  0.0 11.6   0:02.64 java                                                 
10934 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.01 java                                                 
10935 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.01 java                                                 
10936 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.00 java                                                 
10937 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.00 java                                                 
10938 work      20   0 4979964 930064  23660 S  0.0 11.6   0:25.85 java                                                 
10939 work      20   0 4979964 930064  23660 S  0.0 11.6   0:28.59 java                                                 
10941 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.00 java                                                 
10942 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.42 java                                                 
10992 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.00 java                                                 
11000 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.01 java                                                 
11008 work      20   0 4979964 930064  23660 S  0.0 11.6   0:00.00 java                                                 
[root@VM_101_65_centos ~]# ps huH 10924|wc -l
120
[root@VM_101_65_centos ~]# ps huH 10924|wc -l
130
[root@VM_101_65_centos ~]# ps huH 10924|wc -l