linux-kernel - Re: Question about kill a process group


Message-ID: <87zgk5v148.ffs@tglx>
Date:   Thu, 28 Apr 2022 14:33:27 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Zhang Qiao <zhangqiao22@...wei.com>
Cc:     lkml <linux-kernel@...r.kernel.org>, keescook@...omium.org,
        Peter Zijlstra <peterz@...radead.org>, elver@...gle.com,
        legion@...nel.org, oleg@...hat.com, brauner@...nel.org
Subject: Re: Question about kill a process group

On Thu, Apr 21 2022 at 11:12, Eric W. Biederman wrote:
> Zhang Qiao <zhangqiao22@...wei.com> writes:
>>> How many children are being created in this test?  Several million?
>>
>>   There are about 300,000+ processes.
>
> Not as many as I was guessing, but still enough to cause a huge
> wait on locks.

Indeed. It's about 4-5us per process to send the signal on a 2GHz
SKL-X. So with 200k processes tasklist lock is read held for 1 second.

> I do agree over 1 second for holding a spin lock is ridiculous and a
> denial of service attack.

Exactly. Even holding it for 100ms (20k forks) is daft.
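
For reference, the arithmetic behind these figures can be sketched as
below. This is a back-of-the-envelope estimate, not a measurement: it
assumes a constant per-signal cost (the 4-5us quoted above for a 2GHz
SKL-X), so one second corresponds to roughly 200,000 processes. The
function name is made up for illustration.

```python
def tasklist_read_hold_seconds(nr_processes, per_signal_us=5.0):
    """Estimate how long tasklist lock stays read-held while the
    parent signals every process in the group.

    per_signal_us is an assumption taken from the 4-5us figure in
    this thread; it will differ on other hardware.
    """
    return nr_processes * per_signal_us / 1_000_000


if __name__ == "__main__":
    # 20k and 200k are the figures discussed in the thread,
    # 300k is the process count reported for the failing test.
    for n in (20_000, 200_000, 300_000):
        print(f"{n:>7} processes: ~{tasklist_read_hold_seconds(n):.2f}s")
```

With the lower bound of 4us per signal the 300,000 process case still
comes out above one second.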

So unless the number of PIDs for a user is limited this _is_ an
unprivileged DoS vector.

> Anyway I am very curious why you are the only one seeing a problem with
> fork12.

It's fully reproducible. It's just a question of how big the machine is
and what the PID limits are on the box you are testing on.
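
To compare boxes, something along these lines can print the limits
that bound how many processes one user can create. It is only a
sketch: it assumes a Linux system with the usual /proc/sys/kernel
files, and the helper names are hypothetical. Cgroup pids limits are
not covered here.

```python
import resource


def read_sysctl_int(path):
    """Return the integer stored in a /proc/sys file, or None if it
    cannot be read or parsed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def fork_limits():
    """Collect the system-wide and per-user limits on process count."""
    # RLIMIT_NPROC is per real user ID; resource.RLIM_INFINITY
    # (reported as -1) means no per-user limit is set.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "pid_max": read_sysctl_int("/proc/sys/kernel/pid_max"),
        "threads_max": read_sysctl_int("/proc/sys/kernel/threads-max"),
        "rlimit_nproc_soft": soft,
        "rlimit_nproc_hard": hard,
    }


if __name__ == "__main__":
    for name, value in fork_limits().items():
        print(f"{name}: {value}")
```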

>>> I suspect the issue really is the thundering herd of a million+
>>> processes synchronizing on a single lock.

There are several issues:

 1) The parent sending the signal is holding the lock for an
    obscenely long time.

 2) Every signaled child runs into tasklist lock contention as all of
    them need to acquire it for write in do_exit(). That means within
    (NR_CPUS - 1) * 5usec all CPUs are spinning on tasklist lock with
    interrupts disabled up to the point where #1 has finished.

So depending on the number of children and the configured limits of a
lockup detector this is sufficient to trigger a warning.
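
A rough model of the two points above, for illustration only. It
assumes a fixed 5us per signal, that each signaled child immediately
lands on a free CPU and starts spinning, and that the detector
threshold is simply a number of seconds passed in by the caller. The
function names and the example values are hypothetical.

```python
def spin_window_us(nr_cpus, nr_processes, per_signal_us=5.0):
    """Return (time until all other CPUs spin, time they keep spinning).

    The sender occupies one CPU, so the remaining nr_cpus - 1 CPUs are
    all spinning once that many children have been signaled. They stay
    there until the sender has walked the whole process group.
    """
    all_spinning_after = (nr_cpus - 1) * per_signal_us
    sender_done_after = nr_processes * per_signal_us
    return all_spinning_after, max(0.0, sender_done_after - all_spinning_after)


def would_trip_detector(nr_cpus, nr_processes, threshold_s, per_signal_us=5.0):
    """True if the modelled spin time reaches the given threshold."""
    _, spin_us = spin_window_us(nr_cpus, nr_processes, per_signal_us)
    return spin_us / 1_000_000 >= threshold_s


if __name__ == "__main__":
    ready_us, spin_us = spin_window_us(nr_cpus=64, nr_processes=300_000)
    print(f"all CPUs spinning after ~{ready_us:.0f}us, "
          f"spinning for ~{spin_us / 1_000_000:.2f}s")
```

In this model the time until every CPU is stuck is negligible compared
to the time the sender needs, so the spin duration is dominated by the
number of children.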

Thanks,

        tglx