Date:   Mon, 29 Mar 2021 18:01:00 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Jonas Malaco <jonas@...tocubo.io>
Cc:     Jean Delvare <jdelvare@...e.com>, linux-hwmon@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hwmon: (nzxt-kraken2) mark and order concurrent accesses

On 3/29/21 5:21 PM, Jonas Malaco wrote:
> On Mon, Mar 29, 2021 at 02:53:39PM -0700, Guenter Roeck wrote:
>> On Mon, Mar 29, 2021 at 05:22:01AM -0300, Jonas Malaco wrote:
>>> To avoid a spinlock, the driver exploits concurrent memory accesses
>>> between _raw_event and _read, with the former updating fields of a
>>> data structure while the latter could be reading from them.  Because
>>> these are "plain" accesses, those are data races according to the Linux
>>> kernel memory model (LKMM).
>>>
>>> Data races are undefined behavior in both C11 and LKMM.  In practice,
>>> the compiler is free to make optimizations assuming there is no data
>>> race, including load tearing, load fusing and many others,[1] most of
>>> which could result in corruption of the values reported to user-space.
>>>
>>> Prevent undesirable optimizations to those concurrent accesses by
>>> marking them with READ_ONCE() and WRITE_ONCE().  This also removes the
>>> data races, according to the LKMM, because both loads and stores to each
>>> location are now "marked" accesses.
>>>
>>> As a special case, use smp_load_acquire() and smp_store_release() when
>>> loading and storing ->updated, as it is used to track the validity of
>>> the other values, and thus has to be stored after and loaded before
>>> them.  These imply READ_ONCE()/WRITE_ONCE() but also ensure the desired
>>> order of memory accesses.
>>>
>>> [1] https://lwn.net/Articles/793253/
>>>
>>
>> I think you lost me a bit there. What out-of-order accesses that would be
>> triggered by a compiler optimization are you concerned about here?
>> The only "problem" I can think of is that priv->updated may have been
>> written before the actual values. The impact would be ... zero. An
>> attribute read would return "stale" data for a few microseconds.
>> Why is that a concern, and what difference does it make?
> 
> The impact of out-of-order accesses to priv->updated is indeed minimal.
> 
> That said, smp_load_acquire() and smp_store_release() were meant to
> prevent reordering at runtime, and only affect architectures other than
> x86.  READ_ONCE() and WRITE_ONCE() would already prevent reordering from
> compiler optimizations, and x86 provides the load-acquire/store-release
> semantics by default.
> 
> But the reordering issue is not a concern to me, I got carried away when
> adding READ_ONCE()/WRITE_ONCE().  While smp_load_acquire() and
> smp_store_release() make the code work more like I intend it to, they
> are (small) costs we can spare.
> 
> I still think that READ_ONCE()/WRITE_ONCE() are necessary, including for
> priv->updated.  Do you agree?
> 

No. What is the point? The order of writes doesn't matter, the writes won't
be randomly dropped, and it doesn't matter if the reader reports old values
for a couple of microseconds either. This would be different if the values
were used as synchronization primitives or similar, but that isn't the case
here. As for priv->updated, if you are concerned about lost reports and
the 4th report is received a few microseconds before the read, I'd suggest
loosening the interval a bit instead.
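
For readers unfamiliar with the primitives being debated, the pattern from
the patch description can be sketched in userspace with C11 atomics. This
is only a rough analogue under stated assumptions: relaxed atomics stand in
for READ_ONCE()/WRITE_ONCE(), and release/acquire stand in for
smp_store_release()/smp_load_acquire(). The struct, its fields, and the
function names are hypothetical and are not taken from the driver.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's private data (names assumed). */
struct sketch_data {
	_Atomic int32_t temp_input;
	_Atomic uint16_t fan_input;
	_Atomic uint64_t updated;	/* time of last report; 0 = never */
};

/* Writer side, playing the role of the report handler. */
static void sketch_publish(struct sketch_data *d, int32_t temp,
			   uint16_t fan, uint64_t now)
{
	/* Relaxed stores: no tearing, but no ordering between them. */
	atomic_store_explicit(&d->temp_input, temp, memory_order_relaxed);
	atomic_store_explicit(&d->fan_input, fan, memory_order_relaxed);
	/* Release store: the values above are visible before the stamp. */
	atomic_store_explicit(&d->updated, now, memory_order_release);
}

/* Reader side, playing the role of the attribute read. */
static bool sketch_read(struct sketch_data *d, int32_t *temp, uint16_t *fan)
{
	/* Acquire load: the value loads below cannot move before it. */
	uint64_t stamp = atomic_load_explicit(&d->updated,
					      memory_order_acquire);

	if (stamp == 0)
		return false;	/* nothing published yet */

	*temp = atomic_load_explicit(&d->temp_input, memory_order_relaxed);
	*fan = atomic_load_explicit(&d->fan_input, memory_order_relaxed);
	return true;
}
```

Dropping the release/acquire pair and using relaxed accesses throughout
would correspond to the READ_ONCE()/WRITE_ONCE()-only variant; the reader
could then observe a fresh stamp alongside values from the previous report.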

Supposedly we are getting reports every 500ms. We have two situations:
- More than three reports are lost, making priv->updated somewhat relevant.
  In this case, it doesn't matter if outdated values are reported for
  a few microseconds, since most/many/some reports are outdated by more
  than a second anyway.
- A report is received but old values are reported for a few microseconds.
  That doesn't matter either, because reports are always outdated by
  much more than a few microseconds anyway, and the code already
  tolerates up to 2 seconds of lost reports.
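
The timing argument above can be sketched as a small userspace check. The
two constants come from the figures in this message (500ms reports, 2
seconds tolerated); the millisecond clock, the macro names, and the function
names are assumptions made for illustration and are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_REPORT_INTERVAL_MS 500u	/* "reports every 500ms" */
#define SKETCH_VALIDITY_MS        2000u	/* "up to 2 seconds of lost reports" */

/*
 * True if the last report is older than the validity window.
 * The unsigned subtraction keeps the result correct across a
 * wraparound of the (assumed) 32-bit millisecond counter.
 */
static bool sketch_is_stale(uint32_t now_ms, uint32_t updated_ms)
{
	return (uint32_t)(now_ms - updated_ms) > SKETCH_VALIDITY_MS;
}

/* Number of report intervals that fit into the validity window. */
static unsigned int sketch_reports_per_window(void)
{
	return SKETCH_VALIDITY_MS / SKETCH_REPORT_INTERVAL_MS;
}
```

With these numbers the window spans four report intervals, so a reordering
delay of a few microseconds is about six orders of magnitude smaller than
the tolerance already built into the check.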

Sorry, I completely fail to see the problem you are trying to solve here.

Guenter
