Message-ID: <Y+o6HKUY/OzThJe/@dhcp22.suse.cz>
Date:   Mon, 13 Feb 2023 14:24:44 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     huyd12@...natelecom.cn
Cc:     liuq131@...natelecom.cn, akpm@...ux-foundation.org,
        agruenba@...hat.com, 'Christian Brauner' <christian@...uner.io>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: Reply: [PATCH] pid: add handling of too many zombie processes

On Thu 09-02-23 15:14:57, huyd12@...natelecom.cn wrote:
> 
> Any comments will be appreciated.
> 
> 
> 
> -----Original Message-----
> From: liuq131@...natelecom.cn <liuq131@...natelecom.cn>
> Sent: February 8, 2023 17:49
> To: akpm@...ux-foundation.org
> Cc: agruenba@...hat.com; linux-mm@...ck.org; linux-kernel@...r.kernel.org;
> huyd12@...natelecom.cn; liuq <liuq131@...natelecom.cn>
> Subject: [PATCH] pid: add handling of too many zombie processes
> 
> A common situation is that a parent process forks many child processes
> to execute tasks but never calls wait/waitpid when the children exit,
> so a large number of child processes become zombie processes.
> 
> If the number of processes in the system then reaches kernel.pid_max,
> new fork syscalls fail and the system cannot execute any command
> (unless an old process exits).
> 
> eg:
> [root@...workstation ~]# ls
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: Resource temporarily unavailable
> [root@...workstation ~]# reboot
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: retry: Resource temporarily unavailable
> -bash: fork: Resource temporarily unavailable
> 
> I handle this situation in the alloc_pid function: find the process with
> the most zombie child processes and, if it has more than 10 (or another
> reasonable value?) of them, try to kill that process to release the pid
> resources.

Abusing oom_kill_process is not the right approach. Also, any hard-coded
limit on the number of zombies can turn out to be really tricky and can
cause regressions.

Is there any reason you cannot contain those misbehaving workloads in a
pid controller?
-- 
Michal Hocko
SUSE Labs
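
[Editorial sketch, not part of the original message.] The pid controller
suggested above is exposed in cgroup v2 through the pids.max file of a
group. The following is a minimal sketch of containing a workload that
way, assuming cgroup v2 is mounted at /sys/fs/cgroup and the caller has
permission to write there. The function name limit_pids, the group name
"workload", and the limit of 512 are illustrative assumptions, not
anything from the thread.

```python
from pathlib import Path


def limit_pids(cgroup_root, name, max_pids, pid=None):
    """Create a child cgroup under cgroup_root and cap its task count.

    Hypothetical helper for illustration only. On a real system
    cgroup_root would typically be /sys/fs/cgroup (an assumption).
    """
    root = Path(cgroup_root)
    group = root / name
    # Enable the pids controller for children of the parent group.
    (root / "cgroup.subtree_control").write_text("+pids\n")
    # Creating the directory creates the cgroup.
    group.mkdir(exist_ok=True)
    # Once the group holds max_pids tasks, fork() inside it fails with
    # EAGAIN, while processes outside the group can still fork.
    (group / "pids.max").write_text(f"{max_pids}\n")
    if pid is not None:
        # Move the misbehaving parent into the group; children it forks
        # afterwards are accounted to the same group.
        (group / "cgroup.procs").write_text(f"{pid}\n")
    return group
```

Usage on a live system might look like
limit_pids("/sys/fs/cgroup", "workload", 512, pid=parent_pid), where
parent_pid is the process that forks without reaping. The limit then
applies only to that group rather than relying on a system-wide
heuristic in alloc_pid.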
