Message-ID: <YNlapAKObfeVPoQO@gmail.com>
Date: Mon, 28 Jun 2021 07:14:12 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <christian.brauner@...ntu.com>,
Oleg Nesterov <oleg@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] sigqueue cache fix
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Sun, Jun 27, 2021 at 11:52 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > Ok, I may have confused myself looking at all this, but it does all
> > make me think this is dodgy.
>
> I also couldn't convince myself that the memory ordering is correct
> for the _contents_ of the sigqueue entry that had its pointer cached,
> although I suspect that is purely a theoretical concern (certainly a
> non-issue on x86).
>
> So I've reverted the sigqueue cache code, in that I haven't heard
> anything back and I'm not going to delay 5.13 over something small and
> easily undone like this.
I concur that reverting was the safest option, given how close this was to
the final release.
I think the code is safe, but only by accident. The most critical data race
isn't well-documented, unless I missed something.
The most fundamental race we can have is this:
   CPU#0

     __sigqueue_alloc()
       [ holds sighand->siglock ]
       [ IRQs off. ]
       q = READ_ONCE(t->sigqueue_cache);
       if (!q || sigqueue_flags)
               q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
       else
               WRITE_ONCE(t->sigqueue_cache, NULL);

   CPU#1

     __sigqueue_free()
       [ IRQs off. ]
       if (!READ_ONCE(current->sigqueue_cache))
               WRITE_ONCE(current->sigqueue_cache, q);
       else
               kmem_cache_free(sigqueue_cachep, q);
( Let's assume exit_task_sigqueue_cache() happens while there's no new
signal sending going on, so that angle is safe. )
Somewhat confusingly, *alloc() is the consumer and *free() is the producer
of the sigqueue_cache.
Here's how I see the 3 fundamental races these two pieces of code may have:
- Producer <-> producer: The producer cannot race with itself, because it
only ever produces into current->sigqueue_cache and has interrupts
disabled. We don't send signals from NMI context.
- Consumer <-> consumer: multiple consumers cannot race with themselves,
because they serialize on sighand->siglock.
- Producer <-> consumer: this is the most interesting race, and I think
it's unsafe in theory, because the producer doesn't make sure that any
previous writes to the actual queue entry (struct sigqueue *q) have
reached storage before the new 'free' entry is advertised to consumers.
So in principle CPU#0 could see a new sigqueue entry and use it, before
it's fully freed.
In *practice* it's probably safe by accident (or by undocumented
intent), because there's an atomic op we have shortly before putting the
queue entry into the sigqueue_cache, in __sigqueue_free():
       if (atomic_dec_and_test(&q->user->sigpending))
               free_uid(q->user);
And atomic_dec_and_test() implies a full barrier - although I haven't
found the place where we document it and
Documentation/memory-ordering.txt is silent on it. We should probably
fix that too.
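
To make the producer <-> consumer ordering concrete, here is a minimal
user-space sketch using C11 atomics rather than the kernel primitives. It
is not the reverted patch: struct entry, struct slot_cache, cache_put()
and cache_get() are hypothetical names, and malloc()/free() stand in for
the slab allocator. It assumes, as in the analysis above, a single
producer at a time and externally serialized consumers. The release store
and acquire load spell out the ordering that the real code only gets
implicitly:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sigqueue; not the kernel's layout. */
struct entry {
	int payload;
};

/* One-slot cache, analogous in spirit to t->sigqueue_cache. */
struct slot_cache {
	_Atomic(struct entry *) slot;
};

/*
 * Producer side (the "free" path). Assumes only one producer runs at a
 * time; the kernel gets this from current-only access with IRQs off.
 */
static void cache_put(struct slot_cache *c, struct entry *q)
{
	assert(q != NULL);
	q->payload = 0;	/* last write to the entry before it is advertised */

	if (atomic_load_explicit(&c->slot, memory_order_relaxed) == NULL) {
		/*
		 * Release: all earlier writes to *q are visible to any
		 * thread that acquires this pointer value.
		 */
		atomic_store_explicit(&c->slot, q, memory_order_release);
	} else {
		free(q);
	}
}

/*
 * Consumer side (the "alloc" path). Assumes callers serialize among
 * themselves; the kernel uses sighand->siglock for that.
 */
static struct entry *cache_get(struct slot_cache *c)
{
	/* Acquire: pairs with the release store in cache_put(). */
	struct entry *q = atomic_load_explicit(&c->slot, memory_order_acquire);

	if (q != NULL) {
		atomic_store_explicit(&c->slot, NULL, memory_order_relaxed);
		return q;
	}

	q = malloc(sizeof(*q));
	if (q != NULL)
		q->payload = 0;
	return q;
}
```

In kernel terms the rough equivalents would be smp_store_release() on the
producer side and smp_load_acquire() on the consumer side, in place of the
plain WRITE_ONCE()/READ_ONCE() pair. Whether that is preferable to relying
on the barrier implied by the atomic op is a judgment call.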
At minimum, the patch adding the ->sigqueue_cache should include a race
analysis that explicitly documents the implicit barrier provided by the
atomic_dec_and_test().
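
As an illustration of what documenting that barrier could look like, here
is a small user-space sketch in C11, again with hypothetical names
(dec_and_test(), struct item, cache_slot, publish_after_dec()). It is an
analogue only: a C11 seq_cst read-modify-write is not defined the same way
as the kernel's fully ordered atomic ops, so the sketch adds an explicit
fence and a comment stating what the fence orders. It assumes a single
producer at a time:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space analogue of atomic_dec_and_test(). */
static bool dec_and_test(atomic_int *v)
{
	/* fetch_sub returns the old value; old == 1 means we reached zero. */
	return atomic_fetch_sub_explicit(v, 1, memory_order_seq_cst) == 1;
}

/* Hypothetical stand-in for the queue entry. */
struct item {
	int data;
};

/* Single cache slot; zero-initialized, i.e. empty. */
static _Atomic(struct item *) cache_slot;

/*
 * Returns true if q was placed in the cache, false if the slot was
 * already occupied and the caller has to free q itself.
 * *hit_zero reports whether the pending counter dropped to zero.
 */
static bool publish_after_dec(atomic_int *pending, struct item *q,
			      bool *hit_zero)
{
	assert(q != NULL);
	q->data = 0;			/* final write to the entry */
	*hit_zero = dec_and_test(pending);

	/*
	 * Documented barrier: orders the writes to *q above before the
	 * store that advertises q below. A consumer that reads q from
	 * cache_slot with acquire semantics then sees those writes.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&cache_slot, memory_order_relaxed) != NULL)
		return false;

	atomic_store_explicit(&cache_slot, q, memory_order_relaxed);
	return true;
}
```

The point of the sketch is the comment above the fence, not the fence
itself: the ordering requirement is stated right where the code depends on
it.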
Anyway, I agree with the revert.
Thanks,
Ingo