linux-kernel - Re: [PATCH 0/2] execve scalability issues, part 1


Message-ID: <CAGudoHFuBwq78nZOJJ8itg0Kj8B2K1z5uRh2VEVNuBM=6wp0Wg@mail.gmail.com>
Date:   Wed, 23 Aug 2023 14:01:31 +0200
From:   Mateusz Guzik <mjguzik@...il.com>
To:     David Laight <David.Laight@...lab.com>
Cc:     Jan Kara <jack@...e.cz>, Dennis Zhou <dennis@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "tj@...nel.org" <tj@...nel.org>, "cl@...ux.com" <cl@...ux.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "shakeelb@...gle.com" <shakeelb@...gle.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH 0/2] execve scalability issues, part 1

On 8/23/23, David Laight <David.Laight@...lab.com> wrote:
> From: Jan Kara
>> Sent: Wednesday, August 23, 2023 10:49 AM
> ....
>> > --- a/include/linux/mm_types.h
>> > +++ b/include/linux/mm_types.h
>> > @@ -737,7 +737,11 @@ struct mm_struct {
>> >
>> >                 unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */
>> >
>> > -               struct percpu_counter rss_stat[NR_MM_COUNTERS];
>> > +               union {
>> > +                       struct percpu_counter rss_stat[NR_MM_COUNTERS];
>> > +                       u64 *rss_stat_single;
>> > +               };
>> > +               bool    magic_flag_stuffed_elsewhere;
>
> I wouldn't use a union to save a pointer - it is asking for trouble.
>

I may need to abandon this bit anyway -- counter init adds counters to
a global list and I can't easily call it like that.

>> >
>> >                 struct linux_binfmt *binfmt;
>> >
>> >
>> > Then for the single-threaded case an area is allocated for NR_MM_COUNTERS
>> > counters * 2 -- the first set is updated without any synchronization by
>> > the current thread. The second set is only to be modified by others and is
>> > protected with mm->arg_lock. The lock protects remote access to the union
>> > to begin with.
>>
>> arg_lock seems a bit like a hack. How is it related to rss_stat? The scheme
>> with two counters is clever but I'm not 100% convinced the complexity is
>> really worth it. I'm not sure the overhead of always using an atomic
>> counter would really be measurable as atomic counter ops in local CPU cache
>> tend to be cheap. Did you try to measure the difference?
>
> A separate lock is worse than atomics.
> (Although some 32bit arch may have issues with 64bit atomics.)
>

But in my proposal the separate lock is used to facilitate *NOT* using
atomics by the most common consumer -- the only thread.

The lock is only used for the transition to the multithreaded state and
for updates by remote parties (both rare compared to updates by the
current thread).
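
To make the scheme concrete, here is a rough userspace sketch of the
idea, not kernel code and not the actual patch. Everything in it is
made up for illustration: struct rss_sketch, the rss_* helpers and
NR_COUNTERS are hypothetical names, a pthread mutex stands in for
mm->arg_lock, and C11 atomics stand in for percpu_counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_COUNTERS 4 /* hypothetical stand-in for NR_MM_COUNTERS */

struct rss_sketch {
	pthread_mutex_t lock;	/* plays the role of mm->arg_lock */
	bool multithreaded;	/* written only by the owner, under lock */
	int64_t *single;	/* 2 * NR_COUNTERS slots: owner set, remote set */
	atomic_int_fast64_t shared[NR_COUNTERS]; /* stand-in for percpu_counter */
};

int rss_init(struct rss_sketch *r)
{
	pthread_mutex_init(&r->lock, NULL);
	r->multithreaded = false;
	for (int i = 0; i < NR_COUNTERS; i++)
		atomic_init(&r->shared[i], 0);
	r->single = calloc(2 * NR_COUNTERS, sizeof(*r->single));
	return r->single ? 0 : -1;
}

/* Owner thread: no lock and no atomic op while single-threaded. */
void rss_add_local(struct rss_sketch *r, int idx, int64_t delta)
{
	assert(idx >= 0 && idx < NR_COUNTERS);
	if (!r->multithreaded)
		r->single[idx] += delta;
	else
		atomic_fetch_add(&r->shared[idx], delta);
}

/* Any other party: takes the lock and touches only the second set. */
void rss_add_remote(struct rss_sketch *r, int idx, int64_t delta)
{
	assert(idx >= 0 && idx < NR_COUNTERS);
	pthread_mutex_lock(&r->lock);
	if (!r->multithreaded)
		r->single[NR_COUNTERS + idx] += delta;
	else
		atomic_fetch_add(&r->shared[idx], delta);
	pthread_mutex_unlock(&r->lock);
}

/* Owner thread, before a second thread exists: fold both sets. */
void rss_go_multithreaded(struct rss_sketch *r)
{
	pthread_mutex_lock(&r->lock);
	if (!r->multithreaded) {
		for (int i = 0; i < NR_COUNTERS; i++)
			atomic_store(&r->shared[i],
				     r->single[i] + r->single[NR_COUNTERS + i]);
		free(r->single);
		r->single = NULL;
		r->multithreaded = true;
	}
	pthread_mutex_unlock(&r->lock);
}

/*
 * Read as done by the owner in this sketch. A remote reader would race
 * with rss_add_local() on the owner slots; that case is not handled here.
 */
int64_t rss_read(struct rss_sketch *r, int idx)
{
	int64_t v;

	assert(idx >= 0 && idx < NR_COUNTERS);
	pthread_mutex_lock(&r->lock);
	if (r->multithreaded)
		v = atomic_load(&r->shared[idx]);
	else
		v = r->single[idx] + r->single[NR_COUNTERS + idx];
	pthread_mutex_unlock(&r->lock);
	return v;
}
```

The point is that rss_add_local() on the single-threaded path is a
plain add with neither a lock nor an atomic; the lock only shows up in
rss_add_remote() and rss_go_multithreaded().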

> I think you'll be surprised just how slow atomic ops are.
> Even when present in the local cache.
> (Probably because any other copies have to be invalidated.)
>

Agreed. They have always been super expensive on x86-64 (and continue
to be). I keep running into claims that they are not; I don't know
where that's coming from.
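
For anyone who wants to check on their own hardware, a crude userspace
comparison could look like the sketch below. It is an illustration
only: bench_plain() and bench_atomic() are hypothetical helpers, the
timing uses clock() and is coarse, and a single uncontended loop like
this says nothing about cache line bouncing between CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Plain increments; volatile keeps the compiler from collapsing the loop. */
uint64_t bench_plain(uint64_t iters, double *secs)
{
	volatile uint64_t counter = 0;
	clock_t start;

	assert(secs != NULL);
	start = clock();
	for (uint64_t i = 0; i < iters; i++)
		counter = counter + 1;
	*secs = (double)(clock() - start) / CLOCKS_PER_SEC;
	return counter;
}

/* Atomic read-modify-write increments on a counter local to this thread. */
uint64_t bench_atomic(uint64_t iters, double *secs)
{
	atomic_uint_fast64_t counter;
	clock_t start;

	assert(secs != NULL);
	atomic_init(&counter, 0);
	start = clock();
	for (uint64_t i = 0; i < iters; i++)
		atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
	*secs = (double)(clock() - start) / CLOCKS_PER_SEC;
	return atomic_load(&counter);
}
```

Call both with the same large iteration count and compare the two
elapsed times; both return the final counter value, which should equal
the iteration count.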

-- 
Mateusz Guzik <mjguzik gmail.com>