Message-ID: <1770596925.11115.1562455450167.JavaMail.zimbra@efficios.com>
Date:   Sat, 6 Jul 2019 19:24:10 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, Nadav Amit <namit@...are.com>,
        paulmck <paulmck@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>
Subject: Re: [PATCH] cpu/hotplug: Cache number of online CPUs

----- On Jul 5, 2019, at 5:00 PM, Thomas Gleixner tglx@...utronix.de wrote:

> On Fri, 5 Jul 2019, Thomas Gleixner wrote:
>> On Fri, 5 Jul 2019, Mathieu Desnoyers wrote:
>> > ----- On Jul 5, 2019, at 4:49 AM, Ingo Molnar mingo@...nel.org wrote:
>> > > * Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>> > >> The semantic I am looking for here is C11's relaxed atomics.
>> > > 
>> > > What does this mean?
>> > 
>> > C11 states:
>> > 
>> > "Atomic operations specifying memory_order_relaxed are  relaxed  only  with
>> > respect
>> > to memory ordering.  Implementations must still guarantee that any given atomic
>> > access
>> > to a particular atomic object be indivisible with respect to all other atomic
>> > accesses
>> > to that object."
>> > 
>> > So I am concerned that num_online_cpus() as proposed in this patch
>> > tries to access __num_online_cpus non-atomically, and without using
>> > READ_ONCE().
>> >
>> > Similarly, the update side should use WRITE_ONCE(). Protecting with a mutex
>> > does not provide mutual exclusion against concurrent readers of that variable.
>> 
>> Again. This is nothing new. The current implementation of num_online_cpus()
>> has no guarantees whatsoever.
>> 
>> bitmap_weight() can be hit by a concurrent update of the mask it is
>> looking at.
>> 
>> num_online_cpus() gives you the correct number only if you invoke it inside
>> a cpuhp_lock held section. So why do we need that fuss about atomicity now?
>> 
>> It's racy and was racy forever, and even if we add that READ/WRITE_ONCE muck
>> it still won't give you a reliable answer unless you hold cpuhp_lock at
>> least for read. So for me that READ/WRITE_ONCE is just a cosmetic and
>> misleading reality distortion.
> 
> That said. If it makes everyone happy and feel better, I'm happy to add it
> along with a big fat comment which explains that it's just preventing a
> theoretical store/load tearing issue and does not provide any guarantees
> other than that.
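
For the record, the annotation I am asking for is small. A rough sketch only
(the __num_online_cpus counter is the one introduced by your patch; where
exactly the accessor and the update end up living is your call, the
READ_ONCE()/WRITE_ONCE() usage is the point):

static inline unsigned int num_online_cpus(void)
{
	/*
	 * No ordering implied (C11 memory_order_relaxed-like): READ_ONCE()
	 * only guarantees a single, non-torn load which the compiler cannot
	 * re-fetch behind our back.
	 */
	return READ_ONCE(__num_online_cpus);
}

and on the update side, which is already serialized by the hotplug locking:

	WRITE_ONCE(__num_online_cpus, __num_online_cpus + 1);	/* cpu onlined  */
	WRITE_ONCE(__num_online_cpus, __num_online_cpus - 1);	/* cpu offlined */

Readers which need a value that stays correct across a region of code still
have to hold cpuhp_lock for read, as you point out; the annotation only rules
out compiler-level tearing and re-fetching.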

One example where the lack of READ_ONCE() makes me uneasy is found
in drivers/net/wireless/intel/iwlwifi/pcie/trans.c: iwl_pcie_set_interrupt_capa():

static void iwl_pcie_set_interrupt_capa(struct pci_dev *pdev,
                                        struct iwl_trans *trans)
{
        struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
        int max_irqs, num_irqs, i, ret;
[...]
        max_irqs = min_t(u32, num_online_cpus() + 2, IWL_MAX_RX_HW_QUEUES);
        for (i = 0; i < max_irqs; i++)
                trans_pcie->msix_entries[i].entry = i;

msix_entries is an array of IWL_MAX_RX_HW_QUEUES entries. AFAIU, if the compiler
decides to re-fetch num_online_cpus() for the loop condition after a few CPUs
have been hot-plugged, we end up doing an out-of-bounds access to the array.

The scenario would be:

- load __num_online_cpus into a register,
- compare register + 2 to IWL_MAX_RX_HW_QUEUES
  (let's say the current number of CPUs is lower than IWL_MAX_RX_HW_QUEUES - 2),
- a few other CPUs are brought online,
- the compiler decides to re-fetch __num_online_cpus for the loop bound, because
  it is within its rights to assume the value has not changed in the meantime,
- corruption and sadness follow.

I'm not saying that current compiler implementations actually generate this
for this specific function, but I'm concerned by the fact that they are
within their rights to do so. It seems quite fragile to expose a kernel API
which can lead to this kind of subtle bug.
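
With READ_ONCE() in the accessor, that particular transformation is off the
table, because a volatile access has to be performed exactly once and its
result reused. Roughly (illustrative only, not a proposed iwlwifi change):

	/*
	 * num_online_cpus() now does a single volatile load of
	 * __num_online_cpus; the compiler has to keep the result and cannot
	 * re-read the counter when evaluating the loop bound, so 'i' stays
	 * below IWL_MAX_RX_HW_QUEUES.
	 */
	max_irqs = min_t(u32, num_online_cpus() + 2, IWL_MAX_RX_HW_QUEUES);
	for (i = 0; i < max_irqs; i++)
		trans_pcie->msix_entries[i].entry = i;

The value can of course still be stale by the time it is used; only holding
the hotplug lock prevents that, as you said.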

A quick kernel tree grep for both "num_online_cpus" and "min" on the same line
gives 64 results, so it's not an isolated case.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
