lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGudoHEU62Rpp+uJakCFgKC36y3th=Qx+jE+b7KJh6_jeH7cNg@mail.gmail.com>
Date:   Wed, 9 Aug 2023 23:07:45 +0200
From:   Mateusz Guzik <mjguzik@...il.com>
To:     Suren Baghdasaryan <surenb@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        akpm@...ux-foundation.org, regressions@...mhuis.info,
        bagasdotme@...il.com, jacobly.alt@...il.com, willy@...radead.org,
        liam.howlett@...cle.com, david@...hat.com, peterx@...hat.com,
        ldufour@...ux.ibm.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
        linux-arm-kernel@...ts.infradead.org, gregkh@...uxfoundation.org,
        regressions@...ts.linux.dev, Jiri Slaby <jirislaby@...nel.org>,
        Holger Hoffstätte <holger@...lied-asynchrony.com>,
        stable@...r.kernel.org
Subject: Re: [PATCH v2 3/3] fork: lock VMAs of the parent process when forking

On 8/5/23, Suren Baghdasaryan <surenb@...gle.com> wrote:
> On Fri, Aug 4, 2023 at 6:06 PM Mateusz Guzik <mjguzik@...il.com> wrote:
>>
>> On 8/5/23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>> > On Fri, 4 Aug 2023 at 16:25, Mateusz Guzik <mjguzik@...il.com> wrote:
>> >>
>> >> I know of these guys, I think they are excluded as is -- they go
>> >> through access_remote_vm, starting with:
>> >>         if (mmap_read_lock_killable(mm))
>> >>                 return 0;
>> >>
>> >> while dup_mmap already write locks the parent's mm.
>> >
>> > Oh, you're only worried about vma_start_write()?
>> >
>> > That's a non-issue. It doesn't take the lock normally, since it starts
>> > off
>> > with
>> >
>> >         if (__is_vma_write_locked(vma, &mm_lock_seq))
>> >                 return;
>> >
>> > which catches on the lock sequence number already being set.
>> >
>> > So no extra locking there.
>> >
>> > Well, technically there's extra locking because the code stupidly
>> > doesn't initialize new vma allocations to the right sequence number,
>> > but that was talked about here:
>> >
>> >
>> > https://lore.kernel.org/all/CAHk-=wiCrWAoEesBuoGoqqufvesicbGp3cX0LyKgEvsFaZNpDA@mail.gmail.com/
>> >
>> > and it's a separate issue.
>> >
>>
>> I'm going to bet one beer this is the issue.
>>
>> The patch I'm responding to only consists of adding the call to
>> vma_start_write and claims the 5% slowdown from it, while fixing
>> crashes if the forking process is multithreaded.
>>
>> For the fix to work it has to lock something against the parent.
>>
>>         VMA_ITERATOR(old_vmi, oldmm, 0);
>> [..]
>>         for_each_vma(old_vmi, mpnt) {
>> [..]
>>                 vma_start_write(mpnt);
>>
>> the added line locks an obj in the parent's vm space.
>>
>> The problem you linked looks like pessimization for freshly allocated
>> vmas, but that's what is being operated on here.
>
> Sorry, now I'm having trouble understanding the problem you are
> describing. We are locking the parent's vma before copying it and the
> newly created vma is locked before it's added into the vma tree. What
> is the problem then?
>

Sorry for the late reply!

Looks there has been a bunch of weird talking past one another in this
thread and I don't think trying to straighten it all out is worth any
time.

I think at least the two of us agree that if a single-threaded process
enters dup_mmap an
down_writes the mmap semaphore, then no new thread can pop up in said
process, thus no surprise page faults from that angle. 3rd parties are
supposed to interfaces like access_remote_vm, which down_read said
semaphore and are consequently also not a problem. The only worry here
is that someone is messing with another process memory without the
semaphore, but is very unlikely and patchable in the worst case -- but
someone(tm) has to audit. With all these conditions satisfied one can
elide vma_start_write for a perf win.

Finally, I think we agreed you are going to do the audit ;)

Cheers,
-- 
Mateusz Guzik <mjguzik gmail.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ