linux-kernel - Re: [PATCH] fs/exec.c: Add fast path for ENOENT on PATH search before allocating mm

Message-ID: <87a5rmw54w.fsf@email.froward.int.ebiederm.org>
Date:   Thu, 09 Nov 2023 23:26:23 -0600
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Mateusz Guzik <mjguzik@...il.com>
Cc:     Peter Zijlstra <peterz@...radead.org>, Kees Cook <kees@...nel.org>,
        Josh Triplett <josh@...htriplett.org>,
        Alexander Viro <viro@...iv.linux.org.uk>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] fs/exec.c: Add fast path for ENOENT on PATH search
 before allocating mm

Mateusz Guzik <mjguzik@...il.com> writes:

> On 11/9/23, Eric W. Biederman <ebiederm@...ssion.com> wrote:
>> Mateusz Guzik <mjguzik@...il.com> writes:
>>> sched_exec causes migration for only a few % of execs in the bench,
>>> but when it does happen there is tons of overhead elsewhere.
>>>
>>> I expect real programs which get past execve will be prone to
>>> migrating anyway, regardless of what sched_exec is doing.
>>>
>>> That is to say, while sched_exec buggering off here would be nice, I
>>> think for real-world wins the thing to investigate is the overhead
>>> which comes from migration to begin with.
>>
>> I have a vague memory that the idea is that there is a point during exec
>> when it should be much less expensive than normal to allow migration
>> between cpus because all of the old state has gone away.
>>
>> Assuming that is the rationale, if we are getting lock contention
>> then either there is a global lock in there, or there is the potential
>> to pick a less expensive location within exec.
>>
>
> Given the commit below I think the term "migration cost" is overloaded here.
>
> By migration cost in my previous mail I meant the immediate cost
> (stop_one_cpu and so on), but also the aftermath -- for example tlb
> flushes on another CPU when tearing down your now-defunct mm after you
> switched.
>
> For testing purposes I verified commenting out sched_exec and not
> using taskset still gives me about 9.5k ops/s.
>
> I 100% agree that if the task is to be moved between NUMA domains, it makes
> sense to do it when it has the smallest footprint. I don't know what
> the original patch did, the current code just picks a CPU and migrates
> to it, regardless of NUMA considerations. I will note that the goal
> would still be achieved by comparing domains and doing nothing if they
> match.
>
> I think this would be nice to fix, but it is definitely not a big
> deal. I guess the question is to Peter Zijlstra if this sounds
> reasonable.
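
[Editorial sketch, not part of the original message.] The "compare domains
and do nothing if they match" idea quoted above could look roughly like the
following userspace C sketch. Everything in it is an assumption made for
illustration: the 8-CPU, two-node table and the names sketch_cpu_to_node()
and sketch_should_migrate() are hypothetical, and this is not the kernel's
sched_exec() code, which selects a destination CPU through the scheduler
and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology: 8 CPUs, two NUMA nodes of 4 CPUs each. */
enum { SKETCH_NR_CPUS = 8 };
static const int sketch_cpu_node_map[SKETCH_NR_CPUS] = {
	0, 0, 0, 0, 1, 1, 1, 1
};

/* Look up the NUMA node of a CPU in the toy table above. */
static int sketch_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_cpu_node_map[cpu];
}

/*
 * Decide whether an exec-time migration is worth paying for.
 * Assumption for this sketch: a move is only worthwhile when it crosses
 * NUMA nodes, so a destination on the current node means "do nothing".
 */
static bool sketch_should_migrate(int cur_cpu, int dest_cpu)
{
	if (dest_cpu == cur_cpu)
		return false;	/* already there */

	return sketch_cpu_to_node(cur_cpu) != sketch_cpu_to_node(dest_cpu);
}
```

With this table, a candidate move from CPU 1 to CPU 3 stays on node 0 and
is skipped, while a move from CPU 2 to CPU 5 crosses nodes and is allowed.
Whether skipping same-node moves is acceptable for load balancing is
exactly the open question the quoted text directs at Peter Zijlstra.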

Perhaps I misread the trace. My point was simply that the sched_exec
seemed to be causing lock contention because what was on one cpu is
now on another cpu, and we are now getting cross cpu lock ping-pongs.

If sched_exec is making exec trigger cross cpu lock ping-pongs,
then we can move sched_exec to a better place within exec.  It has
already been moved once, shortly after it was introduced.

Ultimately we want the sched_exec to be in the cheapest place within
exec that we can find.

Eric