Date:   Fri, 25 Jan 2019 08:52:11 +0100
From:   Arkadiusz Miśkiewicz <a.miskiewicz@...il.com>
To:     cgroups@...r.kernel.org
Cc:     Aleksa Sarai <asarai@...e.de>, Jay Kamat <jgkamat@...com>,
        Roman Gushchin <guro@...com>, Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>,
        linux-kernel@...r.kernel.org
Subject: Re: pids.current with invalid value for hours [5.0.0 rc3 git]

On 24/01/2019 12:21, Arkadiusz Miśkiewicz wrote:
> On 17/01/2019 14:17, Arkadiusz Miśkiewicz wrote:
>> On 17/01/2019 13:25, Aleksa Sarai wrote:
>>> On 2019-01-17, Arkadiusz Miśkiewicz <a.miskiewicz@...il.com> wrote:
>>>> Using kernel 4.19.13.
>>>>
>>>> For one cgroup I noticed weird behaviour:
>>>>
>>>> # cat pids.current
>>>> 60
>>>> # cat cgroup.procs
>>>> #
>>>
>>> Are there any zombies in the cgroup? pids.current is linked up directly
>>> to __put_task_struct (so exit(2) won't decrease it, only the task_struct
>>> actually being freed will decrease it).
>>>
>>
>> There are no zombie processes.
>>
>> In the meantime the problem has shown up on multiple servers, and so far
>> I have seen it only in cgroups that were OOMed.
>>
>> What has changed on these servers (yesterday) is turning on
>> memory.oom.group=1 for all cgroups and changing memory.high from 1G to
>> "max" (leaving memory.max=2G limit only).
>>
>> Previously there was no such problem.
>>
> 
> I'm attaching a reproducer. This time I tried it on a different
> distribution kernel (Arch Linux).
> 
> After 60s pids.current still shows 37 processes even though there are no
> processes running (according to ps aux).


I ran the same test on 5.0.0-rc3-00104-gc04e2a780caf and the bug is easy
to reproduce there as well: there are no processes in the cgroup, but
pids.current reports 91.

memory.oom.group=0 - everything works fine, pids are counted properly
memory.oom.group=1 - the bug becomes visible

[root@xps test]# python3 cg.py
Created cgroup: /sys/fs/cgroup/test_5277
Start: pids.current: 0
Start: cgroup.procs:
0: pids.current: 103
0: cgroup.procs:
1: pids.current: 91
1: cgroup.procs:
2: pids.current: 91
2: cgroup.procs:
3: pids.current: 91
3: cgroup.procs:
4: pids.current: 91
4: cgroup.procs:
5: pids.current: 91
5: cgroup.procs:
6: pids.current: 91
6: cgroup.procs:
7: pids.current: 91
7: cgroup.procs:
8: pids.current: 91
8: cgroup.procs:
9: pids.current: 91
9: cgroup.procs:
10: pids.current: 91
10: cgroup.procs:
11: pids.current: 91
11: cgroup.procs:
[root@xps test]# uname -a
Linux xps 5.0.0-rc3-00104-gc04e2a780caf #288 SMP PREEMPT Thu Jan 24
19:00:32 CET 2019 x86_64 Intel(R)_Core(TM)_i9-8950HK_CPU_@...90GHz PLD Linux
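
For anyone who wants to watch for the same symptom on their own machines,
here is a minimal sketch of the check behind the output above. It is not the
cg.py reproducer itself; the function names are hypothetical, and it assumes
a cgroup v2 directory that exposes pids.current and cgroup.procs. A non-zero
counter can be legitimate for a short time after exit (the count drops only
when the task_struct is freed), so the check should be repeated over time
before calling it a leak.

```python
from pathlib import Path


def read_pids_state(cgroup_dir):
    """Return (pids.current value, list of PIDs from cgroup.procs).

    Assumes cgroup_dir is a cgroup v2 directory with both files present.
    """
    cg = Path(cgroup_dir)
    current = int((cg / "pids.current").read_text().strip())
    # cgroup.procs holds one PID per line; an empty file yields an empty list.
    procs = [int(tok) for tok in (cg / "cgroup.procs").read_text().split()]
    return current, procs


def looks_leaked(cgroup_dir):
    """True when no processes are listed but pids.current is still non-zero."""
    current, procs = read_pids_state(cgroup_dir)
    return current > 0 and not procs
```

Calling looks_leaked("/sys/fs/cgroup/test_5277") once per second after the
OOM kill would correspond to the numbered lines in the output above.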


CCing relevant people.

The reproducer script is here: https://www.spinics.net/lists/cgroups/msg21330.html

> 
> [root@...m ~]# uname -a
> Linux warm 4.20.3-arch1-1-ARCH #1 SMP PREEMPT Wed Jan 16 22:38:58 UTC
> 2019 x86_64 GNU/Linux
> [root@...m ~]# python3 cg.py
> Created cgroup: /sys/fs/cgroup/test_26207
> Start: pids.current: 0
> Start: cgroup.procs:
> 0: pids.current: 62
> 0: cgroup.procs:
> 1: pids.current: 37
> 1: cgroup.procs:
> 2: pids.current: 37
> 2: cgroup.procs:
> 3: pids.current: 37
> 3: cgroup.procs:
> 4: pids.current: 37
> 4: cgroup.procs:
> 5: pids.current: 37
> 5: cgroup.procs:
> 6: pids.current: 37
> 6: cgroup.procs:
> 7: pids.current: 37
> 7: cgroup.procs:
> 8: pids.current: 37
> 8: cgroup.procs:
> 9: pids.current: 37
> 9: cgroup.procs:
> 10: pids.current: 37
> 10: cgroup.procs:
> 11: pids.current: 37
> 11: cgroup.procs:
> 


-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
