lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7865beac-cd03-4242-aab0-bbf05c60391a@cjdns.fr>
Date: Wed, 29 Oct 2025 21:40:39 +0100
From: Caleb James DeLisle <cjd@...ns.fr>
To: Jonas Gorski <jonas.gorski@...il.com>
Cc: nbd@....name, lorenzo@...nel.org, ryder.lee@...iatek.com,
 shayne.chen@...iatek.com, sean.wang@...iatek.com, matthias.bgg@...il.com,
 angelogioacchino.delregno@...labora.com, linux-wireless@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-mediatek@...ts.infradead.org,
 Daniel Golle <daniel@...rotopia.org>
Subject: Re: [PATCH] wifi: mt76: mmio_(read|write)_copy byte swap when on Big
 Endian


On 29/10/2025 21:12, Jonas Gorski wrote:
> On Wed, Oct 29, 2025 at 4:24 PM Caleb James DeLisle <cjd@...ns.fr> wrote:
>>
>> On 29/10/2025 10:15, Jonas Gorski wrote:
>>> On Tue, Oct 28, 2025 at 10:42 PM Caleb James DeLisle <cjd@...ns.fr> wrote:
>>>> On 28/10/2025 21:19, Jonas Gorski wrote:
>>>>> Hi,
>>>>>
>>>>> On Mon, Oct 27, 2025 at 6:19 PM Caleb James DeLisle <cjd@...ns.fr> wrote:
>>>>>> When on a Big Endian machine, PCI swaps words to/from LE when
>>>>>> reading/writing them. This presents a problem when we're trying
>>>>>> to copy an opaque byte array such as firmware or encryption key.
>>>>>>
>>>>>> Byte-swapping during copy results in two swaps, but solves the
>>>>>> problem.
>>>>>>
>>>>>> Fixes:
>>>>>> mt76x2e 0000:02:00.0: ROM patch build: 20141115060606a
>>>>>> mt76x2e 0000:02:00.0: Firmware Version: 0.0.00
>>>>>> mt76x2e 0000:02:00.0: Build: 1
>>>>>> mt76x2e 0000:02:00.0: Build Time: 201607111443____
>>>>>> mt76x2e 0000:02:00.0: Firmware failed to start
>>>>>> mt76x2e 0000:02:00.0: probe with driver mt76x2e failed with error -145
>>>>>>
>>>>>> Signed-off-by: Caleb James DeLisle <cjd@...ns.fr>
>>>>>> ---
>>>>>>     drivers/net/wireless/mediatek/mt76/mmio.c | 34 +++++++++++++++++++++++
>>>>>>     1 file changed, 34 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/net/wireless/mediatek/mt76/mmio.c b/drivers/net/wireless/mediatek/mt76/mmio.c
>>>>>> index cd2e9737c3bf..776dbaacc8a3 100644
>>>>>> --- a/drivers/net/wireless/mediatek/mt76/mmio.c
>>>>>> +++ b/drivers/net/wireless/mediatek/mt76/mmio.c
>>>>>> @@ -30,15 +30,49 @@ static u32 mt76_mmio_rmw(struct mt76_dev *dev, u32 offset, u32 mask, u32 val)
>>>>>>            return val;
>>>>>>     }
>>>>>>
>>>>>> +static void mt76_mmio_write_copy_portable(void __iomem *dst,
>>>>>> +                                         const u8 *src, int len)
>>>>>> +{
>>>>>> +       __le32 val;
>>>>>> +       int i = 0;
>>>>>> +
>>>>>> +       for (i = 0; i < ALIGN(len, 4); i += 4) {
>>>>>> +               memcpy(&val, src + i, sizeof(val));
>>>>>> +               writel(cpu_to_le32(val), dst + i);
>>>>>> +       }
>>>>>> +}
>>>>>> +
>>>>>>     static void mt76_mmio_write_copy(struct mt76_dev *dev, u32 offset,
>>>>>>                                     const void *data, int len)
>>>>>>     {
>>>>>> +       if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) {
>>>>>> +               mt76_mmio_write_copy_portable(dev->mmio.regs + offset, data,
>>>>>> +                                             len);
>>>>>> +               return;
>>>>>> +       }
>>>>>>            __iowrite32_copy(dev->mmio.regs + offset, data, DIV_ROUND_UP(len, 4));
>>>>> Maybe just replace this with memcpy_toio() which does no swapping at
>>>>> all instead of double swapping on BE?
>>>> I'm not that informed about how PCI works so I had to test to confirm
>>>> my understanding, but I can confirm that memcpy_toio() does not solve
>>>> the problem.
>>> Ah, right, I misread _iowrite32_copy() to do conversion to LE, but it doesn't.
>>>
>>> What architecture is this you have? PowerPC? ARM? MIPS? 32 bit? 64 bit?
>>
>> MIPS32 (EcoNet EN751221 34Kc)
>>
>>
>>> So the differences I see are:
>>>
>>> 1. __iowrite32_copy() uses __raw_writel(), which has different memory
>>> semantics than writel()
>>> 2. __iowrite32_copy() assumed src is aligned to 32 bit, while you
>>> explicitly align it
>>> 3. memcpy_toio() will handle unaligned src properly, but does 64 bit
>>> accesses on 64 bit systems (and uses __raw* again).
>>>
>>> Is src aligned? If not, then the issue might be 2. And if your system
>>> is 64 bit, it would explain why 3 didn't help.
>>
>> I'm not a regular developer of mt76 so I wasn't sure if that was
>> guaranteed and I just wanted to code for safety.
>>
>> After reviewing the code, I see that there are a few places where
>> mt76_mmio_write_copy is being called with stack-allocated u8 arrays
>> so it's pretty clear to me that this is being treated as a memcpy-like
>> function and we should be handling unaligned inputs.
>>
>>
>>> As a first step you could try to replace the writel(cpu_to_le32(val)
>>> with a iowrite32be(val, ...) which should do the same except avoiding
>>> the doubled byte swapping. If that works, you can try to replace it
>> This works.
>>
>> These symbols are a bit of a nightmare to trace, so I ended up making
>> an .i file so I could confirm what's happening.
>>
>> iowrite32be() uses the version in iomap.c so I understand that's using
>> writel(swab32(val),port), so a writel with an unconditional byte swap.
>> writel() is more complicated, it's an inline function that is generated
>> in a rat's nest of preprocessor macros in mips/include/asm/io.h
>>
>> The preprocessed is this:
>>
>> __mem = (void *)((unsigned long)(mem)); __val = (val); if (sizeof(u32)
>> != sizeof(u64) || sizeof(u64) == sizeof(long)) { *__mem = __val;
>>
>> The source is this:
>>
>>       __mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));    \
>>       __val = pfx##ioswab##bwlq(__mem, val);                \
>>       if (sizeof(type) != sizeof(u64) || sizeof(u64) == sizeof(long)) \
>>           *__mem = __val;                        \
>>
>> The line "pfx##ioswab##bwlq(__mem, val);" is ioswabl() and the source
>> of that explains the issue:
>>
>>    * Sane hardware offers swapping of PCI/ISA I/O space accesses in hardware;
>>    * less sane hardware forces software to fiddle with this...
>>
>> So this confirms my initial understanding, the PCI hardware is doing the
>> byte swapping unconditionally.
>>
>>
>>> with __raw_writel(), which then would make this the same as
>>> __iowrite32_copy, except making sure that src is aligned.
>>
>> This fails.
>>
>> Since I'm the maintainer of this SoC and it's still fairly new, I wrote
>> a trivial kmod to verify that unaligned access is not just silently
>> returning trash, it works as though it were aligned so alignment is
>> not the issue.
>>
>>
>>> Also you could replace your memcpy() with get_unaligned((u32 *)(src +
>>> i)); Should do the same but inline.
>> Good idea, I will do this.
>>>> The issue as I understand it is that rather than making every driver
>>>> carefully call cpu_to_le*() every MMIO write, someone decided to make
>>>> the PCI host bridge itself transparently byte-swap all MMIO on the
>>>> wire. Since most MMIO is hitting registers and most buffers are
>>>> transferred by DMA, for the most part everything works and nobody
>>>> notices.
>>>>
>>>> But in the rare case that we need to write a blob to MMIO, it gets
>>>> transparently swapped in hardware so you need to use cpu_to_le in that
>>>> case. Doing a search of ./drivers for write.*cpu_to_le I can see this
>>>> issue comes up a bit.
>>> Every (PCI) driver does conversion to LE implicitly by using
>>> writel/readl (or iowrite32/ioread32) etc do the conversion to/from LE.
>>> So writel(foo, dst )is a __raw_writel(cpu_to_le32(foo), dst) etc. PCI
>>> memory is assumed to be in LE. If you are on a little endian system,
>>> then no byte swapping happens, and on BE it will do byte swapping
>>> before writing the value.
>> Okay so it seems that in the case of MIPS, that's not always how it
>> works.
>>
>> https://github.com/torvalds/linux/blob/e53642b87a4f4b03a8d7e5f8507fc3cd0c595ea6/arch/mips/include/asm/mach-generic/mangle-port.h#L17
>>
>> Since we don't know if the swap will happen in hardware or software
>> and (AFAIK) this is not a hot enough path that double-swapping will
>> have a notable performance penalty, I think the most sane thing to
>> do is use writel(cpu_to_le32()) and not care if it's swapped back
>> in the kernel or hardware.
> Oh, I think I see what it happening here. ECONET is a Big Endian MIPS
> platform, but does not select SWAP_IO_SPACE (most other BE platforms
> do).
>
> Does that mean the PCI space is swapped in hardware there?
>
> I guess that means that anything that uses __raw accessors to PCI
> space directly or indirectly is broken, as the raw data is now
> actually in the wrong order and needs to be swab'd.
>
> I don't know if it is a good idea to change this in __iowrite32_copy()
> / __ioread32_copy() (and other helpers), or if there are drivers that
> use it on non-PCI spaces and would be broken by that.
>
> If there is a way, I would suggest disabling hardware conversion and
> selecting SWAP_IO_SPACE, but that will affect a lot of your code that
> assumes that writel() etc don't convert to/from little endian.


I can look around in the hardware registers and see if I can shut it
off for EcoNet, but if you're saying MT76 should not support BE unless
they disable hardware swapping and use SWAP_IO_SPACE, that means the
majority of BE hardware on OpenWrt is not going to be supported. If
that's the decision then it at least warrants clear documentation.

Thanks,
Caleb


user@...-dev:~/en7526/openwrt$ find ./ -name 'config-6.12' | while read 
x; do grep -q 'CPU_BIG_ENDIAN=y' "$x" && ( grep -q 'SWAP_IO_SPACE=y' 
"$x" || echo "$x
  does not use SWAP_IO_SPACE" ) ; done
./target/linux/apm821xx/config-6.12 does not use SWAP_IO_SPACE
./target/linux/realtek/rtl931x/config-6.12 does not use SWAP_IO_SPACE
./target/linux/realtek/rtl930x_nand/config-6.12 does not use SWAP_IO_SPACE
./target/linux/realtek/rtl839x/config-6.12 does not use SWAP_IO_SPACE
./target/linux/realtek/rtl931x_nand/config-6.12 does not use SWAP_IO_SPACE
./target/linux/realtek/rtl930x/config-6.12 does not use SWAP_IO_SPACE
./target/linux/realtek/rtl838x/config-6.12 does not use SWAP_IO_SPACE
./target/linux/ath79/config-6.12 does not use SWAP_IO_SPACE
./target/linux/octeon/config-6.12 does not use SWAP_IO_SPACE
./target/linux/ixp4xx/config-6.12 does not use SWAP_IO_SPACE
./target/linux/econet/en751221/config-6.12 does not use SWAP_IO_SPACE
user@...-dev:~/en7526/openwrt$ find ./ -name 'config-6.12' | while read 
x; do grep -q 'CPU_BIG_ENDIAN=y' "$x" && ( grep -q 'SWAP_IO_SPACE=y' 
"$x" && echo "$x
  does use SWAP_IO_SPACE" ) ; done
./target/linux/bmips/bcm6358/config-6.12 does use SWAP_IO_SPACE
./target/linux/bmips/bcm6328/config-6.12 does use SWAP_IO_SPACE
./target/linux/bmips/bcm6318/config-6.12 does use SWAP_IO_SPACE
./target/linux/bmips/bcm6368/config-6.12 does use SWAP_IO_SPACE
./target/linux/bmips/bcm6362/config-6.12 does use SWAP_IO_SPACE
./target/linux/bmips/bcm63268/config-6.12 does use SWAP_IO_SPACE
./target/linux/lantiq/config-6.12 does use SWAP_IO_SPACE
user@...-dev:~/en7526/openwrt$


>
> Best regards,
> Jonas
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ