Resolving java.lang.OutOfMemoryError: unable to create new native thread


Item          Value
Java version  java version "1.8.0_161"

1. Symptoms and Analysis

1.1. When the fault occurred and what was observed (SYMPTOMS)


Caused by: java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(
    at java.util.concurrent.ThreadPoolExecutor.addWorker(
    at java.util.concurrent.ThreadPoolExecutor.execute(
    at org.apache.dubbo.remoting.transport.dispatcher.all.AllChannelHandler.received(
    ... 28 common frames omitted


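1.2. Analysis (ANALYSIS)

1) Identify the trigger. This error usually has one of two causes:

  • There is not enough memory to allocate the stack that each new thread needs:
     virtual memory < stack size * number of threads

  • The number of threads has reached the operating system's "max user processes" limit

2) Check the JVM thread stack size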
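
The first condition comes down to simple arithmetic. The sketch below uses a hypothetical helper, stack_reserve_mb (not part of the original investigation), to estimate the address space reserved for thread stacks as thread count times stack size. The 4000-thread figure is an illustrative assumption; 1024 KB is the ThreadStackSize value the JVM reports on this host.

```shell
#!/usr/bin/env bash
# Estimate the memory reserved for thread stacks: threads * stack size.
# Usage: stack_reserve_mb THREADS STACK_KB  -> prints whole megabytes
stack_reserve_mb() {
    local threads=$1 stack_kb=$2
    echo $(( threads * stack_kb / 1024 ))
}

# Example: 4000 threads (illustrative) with a 1024 KB stack each
stack_reserve_mb 4000 1024
```

If the estimate approaches the virtual memory available to the process, the first cause is plausible; otherwise the thread-count limit is the more likely suspect.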

~]$ java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
     intx CompilerThreadStackSize                   = 0                                   {pd product}
     intx ThreadStackSize                           = 1024                                {pd product}
     intx VMThreadStackSize                         = 1024                                {pd product}
java version "1.8.0_351"
Java(TM) SE Runtime Environment (build 1.8.0_351-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.351-b10, mixed mode)


3) Check the operating system resource limits

ulimit -v


ulimit -s


ulimit -u

Result: 1024

ulimit -a

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 514740
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
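
On Linux every thread counts against "max user processes", so it helps to compare that limit with the number of threads the user is currently running. The following is a minimal sketch that assumes a Linux /proc filesystem; count_user_threads is a hypothetical helper name, not a standard tool.

```shell
#!/usr/bin/env bash
# Sum the "Threads:" field of every process whose real UID matches (Linux /proc).
count_user_threads() {
    local uid=$1 total=0 n f
    for f in /proc/[0-9]*/status; do
        # Processes may exit while we iterate, so read errors are ignored.
        n=$(awk -v uid="$uid" '
            /^Uid:/     { if ($2 == uid) owned = 1 }
            /^Threads:/ { t = $2 }
            END         { if (owned) print t }' "$f" 2>/dev/null)
        total=$(( total + ${n:-0} ))
    done
    echo "$total"
}

used=$(count_user_threads "$(id -u)")
limit=$(ulimit -u)
echo "threads in use: $used, max user processes: $limit"
```

A thread count close to the reported limit points to the second cause listed in the analysis.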



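4) Check the open files and threads of the process

# Count the files opened by all processes of the current user
lsof | wc -l
# Count the files opened by one process
lsof -p pid | wc -l
# Show the threads of one process
pstree -p pid
# Count the threads of one process
pstree -p pid | wc -l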
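
Counting pstree output lines gives only an approximate thread count, because one line can hold more than one entry. As an alternative sketch, assuming a Linux /proc filesystem, the threads of a process can be counted from its task directory; thread_count is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Count the threads of one process by listing /proc/<pid>/task (Linux only).
thread_count() {
    local pid=$1
    ls "/proc/$pid/task" 2>/dev/null | wc -l
}

# Example: threads of the current shell; substitute the Java process PID.
thread_count "$$"
```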


5) Check memory usage

free -m


  • List the PID and swap usage of the top N swap-consuming processes
    for i in $( cd /proc;ls |grep "^[0-9]"|awk ' $0 >100') ;do awk '/Swap:/{a=a+$2}END{print '"$i"',a/1024"M"}' /proc/$i/smaps 2>/dev/null ; done | sort -k2nr | head -10
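
The one-liner above sums the Swap entries in /proc/<pid>/smaps. A more readable sketch of the same idea, assuming a kernel that exposes the VmSwap field in /proc/<pid>/status, is shown below; top_swap is a hypothetical helper name and its output format (kB, PID, name) is an illustrative choice.

```shell
#!/usr/bin/env bash
# Print "swap_kB pid name" for the N processes using the most swap,
# read from the VmSwap field of /proc/<pid>/status (Linux only).
top_swap() {
    local n=${1:-10} f
    for f in /proc/[0-9]*/status; do
        awk '/^Name:/   { name = $2 }
             /^Pid:/    { pid = $2 }
             /^VmSwap:/ { if ($2 > 0) print $2, pid, name }' "$f" 2>/dev/null
    done | sort -k1,1nr | head -n "$n"
}

top_swap 10
```

Processes that use no swap are skipped, so the output can be empty on a host that is not swapping.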


1.3. Root cause (CAUSE)


2. Solution and Open Issues

2.1. Solution (SOLUTION)

Create the file 20-nproc.conf in the /etc/security/limits.d/ directory with the following content:

# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

# By default, every user may run at most 8192 processes
*          soft    nproc     8192

# The root user has no limit on the number of processes.
root       soft    nproc     unlimited


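  1. Override behavior: a setting in a file under /etc/security/limits.d/ overrides the same setting in /etc/security/limits.conf.
  2. nofile cannot be set to unlimited.
  3. The largest value nofile accepts is 1048576 (2**20, the default of the kernel parameter fs.nr_open); with a larger value, users can no longer log in.
  4. The soft value must be less than or equal to the hard value.
  5. After changing resource limits, the Linux user must log in again, or the server must be rebooted, before they take effect.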
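
Because a process that is already running keeps the limits it was started with, it is worth confirming what the Java process actually received after the change. The sketch below assumes a Linux /proc filesystem and reads the "Max processes" row of /proc/<pid>/limits; proc_nproc_limit is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the soft and hard "Max processes" limits of a running process,
# as recorded in /proc/<pid>/limits (Linux only).
proc_nproc_limit() {
    local pid=$1
    awk '/^Max processes/ { print "soft=" $3, "hard=" $4 }' "/proc/$pid/limits"
}

# Example: limits of the current shell; substitute the Java process PID.
proc_nproc_limit "$$"
```

If the soft value still shows the old limit, the process was started from a session that predates the change and needs to be restarted from a fresh login.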

2.2. Follow-up (TRACKING)

  • Check the user's current resource limits
    ~]$ ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 128305
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 204800
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 10240
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 8192
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

3. Lessons learned (SUMMARY)

Appendix C. /etc/security/limits.conf explained

# /etc/security/limits.conf
#This file sets the resource limits for the users logged in via PAM.
#It does not affect resource limits of the system services.
#Also note that configuration files in /etc/security/limits.d directory,
#which are read in alphabetical order, override the settings in this
#file in case the domain is the same or more specific.
Note that the configuration files under /etc/security/limits.d, read in alphabetical order, override the settings in /etc/security/limits.conf.
#That means for example that setting a limit for wildcard domain here
#can be overriden with a wildcard setting in a config file in the
#subdirectory, but a user specific setting here can be overriden only
#with a user specific setting in the subdirectory.
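If a configuration file under /etc/security/limits.d also contains settings for user A, the matching settings for A in this file are overridden. The effective values are the ones from the file under /etc/security/limits.d.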

#Each line describes a limit for a user in the form:
#<domain> <type> <item> <value>

#<domain> can be:
# - a user name
# - a group name, with @group syntax
# - the wildcard *, for default entry (applies to all users)
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
#<domain> <type> <item> <value>

#* soft core 0
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0