[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+YKmhz-+sugKt86CF2715YtoVz4ee+sRfDp2Wu==A4zPQ@mail.gmail.com>
Date: Thu, 17 Sep 2015 18:41:46 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: ebiederm@...ssion.com, Al Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...nel.org>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>, mhocko@...e.cz,
LKML <linux-kernel@...r.kernel.org>, ktsan@...glegroups.com,
Kostya Serebryany <kcc@...gle.com>,
Andrey Konovalov <andreyknvl@...gle.com>,
Alexander Potapenko <glider@...gle.com>,
Hans Boehm <hboehm@...gle.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] kernel: fix data race in put_pid
What happens here exactly matches what is described in CONTROL
DEPENDENCIES section of Documentation/memory-barriers.txt. So all the
bad things described there are possible here. The document explicitly
requires usage of rmb/acquire/READ_ONCE_CTRL in such cases. I don't
know what to add to that.
Regarding reordering of "ns = pid->numbers[pid->level].ns". If we are
talking about the thread that releases the last reference via the
fast-path atomic_read check, then, yes, they can be reordered by both
compiler and hardware, but only in a non-observable way, and so it
does not matter. Only in a non-observable way because the freeing
thread is the same as the thread that does "ns =
pid->numbers[pid->level].ns" and both compilers and hardware visibly
preserve program order for single-thread.
If we are talking about threads that release all but last reference,
then, no, they can't be reordered, because those threads execute
atomic_dec_and_test which is a release operation.
On Thu, Sep 17, 2015 at 6:08 PM, Oleg Nesterov <oleg@...hat.com> wrote:
> Honestly, I can not see how this can happen. So I do not really
> understand the problem and the fix.
>
> And if this can happen I can't understand how this patch can help.
> What about "ns = pid->numbers[pid->level].ns" ? It can be reordered
> with atomic_read_acquire().
>
> I leave this to other reviewers, but perhaps you can spell the
> "For example" part of the changelog.
>
>
> On 09/17, Dmitry Vyukov wrote:
>>
>> put_pid checks whether the current thread has the only reference
>> to the pid with atomic_read() which does not have any memory
>> barriers, and if so proceeds directly to kmem_cache_free().
>> As the result memory accesses to the object in kmem_cache_free()
>> or user accesses to the object after reallocation (again without
>> any memory barriers on fast path) can hoist above the atomic_read()
>> check and conflict with memory accesses to the pid object in other
>> threads before they released their references.
>>
>> There is a control dependency between the atomic_read() check and
>> kmem_cache_free(), but control dependencies are disregarded by some
>> architectures. Documentation/memory-barriers.txt explicitly states:
>> "A load-load control dependency requires a full read memory barrier.
>> ... please note that READ_ONCE_CTRL() is not optional! [even for stores]"
>> in the CONTROL DEPENDENCIES section.
>>
>> For example, if store to the first word of the object to build a freelist
>> in kmem_cache_free() hoists above the check, stores to the first word
>> in other threads can corrupt the memory allocator freelist.
>>
>> Use atomic_read_acquire() for the fast path check to hand off properly
>> acquired object to memory allocator.
>>
>> The data race was found with KernelThreadSanitizer (KTSAN).
>>
>> Signed-off-by: Dmitry Vyukov <dvyukov@...gle.com>
>> ---
>> kernel/pid.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/pid.c b/kernel/pid.c
>> index ca36879..3b0b13d 100644
>> --- a/kernel/pid.c
>> +++ b/kernel/pid.c
>> @@ -242,7 +242,7 @@ void put_pid(struct pid *pid)
>> return;
>>
>> ns = pid->numbers[pid->level].ns;
>> - if ((atomic_read(&pid->count) == 1) ||
>> + if ((atomic_read_acquire(&pid->count) == 1) ||
>> atomic_dec_and_test(&pid->count)) {
>> kmem_cache_free(ns->pid_cachep, pid);
>> put_pid_ns(ns);
>> --
>> 2.6.0.rc0.131.gf624c3d
>>
>
--
Dmitry Vyukov, Software Engineer, dvyukov@...gle.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists