Message-Id: <5DE00B41-835C-4E68-B192-2A3C7ACB4392@gmail.com>
Date:   Sun, 9 Dec 2018 16:57:43 -0800
From:   Nadav Amit <nadav.amit@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Matthew Wilcox <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...nel.org>
Subject: Re: Should this_cpu_read() be volatile?

> On Dec 8, 2018, at 2:52 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> 
> On Fri, Dec 07, 2018 at 04:40:52PM -0800, Nadav Amit wrote:
> 
>>> I'm actually having difficulty finding the this_cpu_read() in any of the
>>> functions you mention, so I cannot make any concrete suggestions other
>>> than pointing at the alternative functions available.
>> 
>> 
>> So I dug deeper into the code to understand a couple of differences. In the
>> case of select_idle_sibling(), the patch (Peter’s) increases the function’s
>> code size by 123 bytes (over a baseline of 986). The per-cpu variable is
>> read through the following call chain:
>> 
>> 	select_idle_sibling()
>> 	=> select_idle_cpu()
>> 	=> local_clock()
>> 	=> raw_smp_processor_id()
>> 
>> And results in 2 more calls to sched_clock_cpu(), as the compiler assumes
>> the processor id changes in between (which obviously wouldn’t happen).
> 
> That is the thing with raw_smp_processor_id(), it is allowed to be used
> in preemptible context, and there it _obviously_ can change between
> subsequent invocations.
> 
> So again, this change is actually good.
> 
> If we want to fix select_idle_cpu(), we should maybe not use
> local_clock() there but use sched_clock_cpu() with a stable argument,
> this code runs with IRQs disabled and therefore the CPU number is stable
> for us here.
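
For reference, that change would look something like the following. This is
my sketch of the idea, not a tested patch, and the exact line in
select_idle_cpu() is an assumption:

```diff
-	time = local_clock();
+	time = sched_clock_cpu(smp_processor_id());
```

Since select_idle_cpu() runs with IRQs disabled, smp_processor_id() is
stable there, so the compiler has no reason to redo the clock read.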
> 
>> There may be more changes around, which I didn’t fully analyze. But
>> the very least reading the processor id should not get “volatile”.
>> 
>> As for finish_task_switch(), the impact is only a few bytes, but still
>> unnecessary. It appears that with your patch, preempt_count() causes
>> multiple reads of __preempt_count in this code:
>> 
>>       if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
>>                     "corrupted preempt_count: %s/%d/0x%x\n",
>>                     current->comm, current->pid, preempt_count()))
>>               preempt_count_set(FORK_PREEMPT_COUNT);
> 
> My patch proposed here:
> 
>  https://marc.info/?l=linux-mm&m=154409548410209
> 
> would actually fix that one, I think: preempt_count() uses
> raw_cpu_read_4(), which will lose the volatile with that patch.

Sorry for the spam from yesterday. That’s what happens when I try to write
emails on my phone while distracted.

I tested the patch you referenced, and it certainly improves the situation
for reads, but there are still small and big issues lying around.

The biggest one is that smp_processor_id() should, I think, use
__this_cpu_read(). After all, when preemption checks are on, it is validated
that smp_processor_id() is not used while preemption is enabled, and IRQs
are not supposed to change its value. Otherwise, the generated code of many
functions is affected.

There are all kinds of other, smaller issues, such as set_irq_regs() and
get_irq_regs(), which should only run with interrupts disabled. These affect
the generated code of do_IRQ() and others.

But beyond that, there are so many places in the code that use
this_cpu_read() while IRQs are guaranteed to be disabled. For example,
arch/x86/mm/tlb.c is full of this_cpu_read/write(), and almost(?) all of it
should be running with interrupts disabled. Having said that, in my build
only flush_tlb_func_common() was affected.
