linux-kernel - Re: [PATCH v2] i2c: designware: Fix corrupted memory seen in the ISR

Message-ID: <a7a85428-d40d-4adb-8f84-75e1dabe19c9@os.amperecomputing.com>
Date:   Fri, 15 Sep 2023 18:47:55 -0700
From:   Jan Bottorff <janb@...amperecomputing.com>
To:     Serge Semin <fancer.lancer@...il.com>
Cc:     Jarkko Nikula <jarkko.nikula@...ux.intel.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Jan Dabros <jsd@...ihalf.com>,
        Andi Shyti <andi.shyti@...nel.org>,
        Philipp Zabel <p.zabel@...gutronix.de>,
        linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org,
        Yann Sionneau <ysionneau@...rayinc.com>
Subject: Re: [PATCH v2] i2c: designware: Fix corrupted memory seen in the ISR

On 9/15/2023 8:21 AM, Serge Semin wrote:
...
> 
> Based on the patch log and the comment, smp_wmb() seems to be more
> suitable here since the problem looks like SMP-specific. Most
> importantly the smp_wmb() will get to be just the compiler barrier on
> the UP system, so no cache and pipeline flushes in that case.
> Meanwhile
> 
> I am not ARM expert, but based on the problem and the DMB/DSB barriers
> descriptions using DMB should be enough in your case since you only
> need memory syncs.
> 
Hi Serge,

I looked at the definition of smp_wmb(), and it looks like on arm64 it
uses a DMB barrier, not a DSB barrier.

In /arch/arm64/include/asm/barrier.h:
...
#define __arm_heavy_mb(x...) dsb(x)
...
#if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
...
#define wmb()		__arm_heavy_mb(st)
...
#define __smp_wmb()	dmb(ishst)

And then in /include/asm-generic/barrier.h it says:
#ifdef CONFIG_SMP
...
#ifndef smp_wmb
#define smp_wmb()	do { kcsan_wmb(); __smp_wmb(); } while (0)
#endif

This looks like wmb() is a DSB and smp_wmb() is a DMB on SMP systems, so 
the two functions are not equivalent on SMP systems.
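
To make the expansion easy to check, here is a small user-space sketch
(not kernel code). It copies the shape of the macros quoted above but
swaps dsb()/dmb() for stringifying stand-ins, so the selected instruction
can be inspected as text. The helpers wmb_insn() and smp_wmb_insn() are
hypothetical names added for illustration, and kcsan_wmb() is left out.

```c
/* Stand-ins: the real macros emit an instruction; these only name it. */
#define dsb(opt) "dsb " #opt
#define dmb(opt) "dmb " #opt

/* Same shape as the fragments quoted above (GNU-style named variadic). */
#define __arm_heavy_mb(x...) dsb(x)
#define wmb()       __arm_heavy_mb(st)
#define __smp_wmb() dmb(ishst)
#define smp_wmb()   __smp_wmb()   /* kcsan_wmb() omitted in this sketch */

/* Hypothetical helpers: return the instruction each macro would select. */
const char *wmb_insn(void)     { return wmb(); }
const char *smp_wmb_insn(void) { return smp_wmb(); }
```

Built with gcc or clang, the two helpers return different strings, which
is the point: under these definitions wmb() and smp_wmb() do not select
the same instruction.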

So let's explore whether DMB or DSB is the correct barrier.

The ARM barrier docs I referred to have a specific example that says this:

"In some message passing systems, it is common for one observer to 
update memory and then send an interrupt using a mailbox of some sort to 
a second observer to indicate that memory has been updated and the new
contents have been read. Even though the sending of the interrupt using 
a mailbox might be initiated using a memory access, a DSB barrier
must be used to ensure the completion of previous memory accesses.

Therefore the following sequence is needed to ensure that P2 sees the 
updated value.

P1:
  STR R5, [R1] ; message stored to shared memory location
  DSB [ST]
  STR R1, [R4] ; R4 contains the address of a mailbox

P2:
  ; interrupt service routine
  LDR R5, [R1]

Even if R4 is a pointer to Strongly-Ordered memory, the update to R1 
might not be visible without the DSB executed by P1.
It should be appreciated that these rules are required in connection to 
the ARM Generic Interrupt Controller (GIC).
"

I don't fully understand why it needs to be a DSB and not just a
DMB, but this example matches what happens in the driver. The ARM docs
do some hand waving that DSB is required because of the GIC.
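
For reference, the P1 sequence maps onto driver-style C roughly as
follows. This is a user-space model under stated assumptions, not the
actual i2c-designware code: struct xfer_state, fake_intr_mask,
start_xfer(), read_intr_mask() and heavy_wmb() are all hypothetical
names, the "register" is an ordinary volatile variable rather than MMIO,
and heavy_wmb() only stands in for wmb() (gcc/clang assumed).

```c
#include <stdint.h>

/* Hypothetical stand-in for wmb(): DSB ST on arm64, a full fence elsewhere. */
#if defined(__aarch64__)
#define heavy_wmb() __asm__ volatile("dsb st" ::: "memory")
#else
#define heavy_wmb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Hypothetical shared state in normal memory that the ISR reads later. */
struct xfer_state {
	uint32_t msg_count;
	uint32_t status;
};

/* Stand-in for a device interrupt-mask register (MMIO in a real driver). */
static volatile uint32_t fake_intr_mask;

void start_xfer(struct xfer_state *st, uint32_t msg_count, uint32_t mask)
{
	st->msg_count = msg_count;	/* P1: STR R5, [R1]  (message store) */
	st->status = 0;
	heavy_wmb();			/* P1: DSB ST */
	fake_intr_mask = mask;		/* P1: STR R1, [R4]  (may raise the IRQ) */
}

/* Hypothetical accessor so the stand-in register can be inspected. */
uint32_t read_intr_mask(void)
{
	return fake_intr_mask;
}
```

The only thing this sketch shows is the ordering of the three steps: the
stores to shared state, then the barrier, then the write that can
trigger the interrupt.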

Unless we can come up with a reason why this example in the ARM Barrier
docs is not a match for what happens in the i2c driver, then ARM is
saying it has to be a DSB, not a DMB. If it needs to be a DSB, then
smp_wmb() is insufficient.

Does anybody else have a different interpretation of this section in the 
ARM barrier docs? They use the word mailbox, and show a shared memory 
write, an interrupt triggering write, and a read of shared memory on a 
different core. Some would describe that as a software mailbox.

I did read somewhere (although I don't have a specific reference I can
give) that ordering applied to normal memory writes is in a different
group than ordering applied between strongly ordered accesses. The
excerpt from the ARM barrier document above does say "Even if R4 is a 
pointer to Strongly-Ordered memory, the update to R1 might not be 
visible without the DSB executed by P1", which implies a DMB is 
insufficient to cause ordering between normal memory writes and 
strongly-ordered device memory writes.

I know currently on ARM64 Windows, the low-level kernel device MMIO 
access functions (like WRITE_REGISTER_ULONG) all have a DSB before the 
MMIO memory access. That seems a little heavy handed to me, but it also 
may be that was required to get all the current driver code written for 
AMD/Intel processors to work correctly on ARM64 without adding barriers 
in the drivers. There are also non-barrier variants that can be used if 
a driver wants to optimize performance. Defaulting to correct operation 
with minimal code changes would reduce the risk to delivery schedules.

Linux doesn't seem to make any attempt to have barriers in the low level 
MMIO access functions. If Linux had chosen to do that on ARM64, this 
patch would not have been required. For a low speed device like an i2c
controller, optimizing barriers likely makes little difference in
performance.

Let's look at it from a risk analysis viewpoint. If a DMB is sufficient
and we use the stronger DSB, the downside is that a few CPU cycles are
wasted in i2c transfers. If we use a DMB when a DSB is required for
correct operation, the downside is that i2c operations may malfunction.
In this case, spending a few extra CPU cycles on an operation that does
not happen at high frequency is lower risk than failures in i2c
transfers. If there is any uncertainty about which barrier type to use,
picking DSB over DMB is the safer choice. We determined from the include
fragments above that wmb() gives the DSB and smp_wmb() does not.

Based on the above info, I think wmb() is still the correct function, 
and a change to smp_wmb() would not be correct.

Sorry for the long message, I know some of you will be inspired to think 
deeply about barriers, and some will be annoyed that I spent this much 
space to explain how I came to the choice of wmb().

Thanks,
Jan