[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <871qkjxztc.fsf@toke.dk>
Date: Sun, 16 Apr 2023 23:49:35 +0200
From: Toke Høiland-Jørgensen <toke@...e.dk>
To: Christian Lamparter <chunkeey@...il.com>,
Álvaro Fernández Rojas
<noltari@...il.com>, f.fainelli@...il.com, jonas.gorski@...il.com,
nbd@....name, kvalo@...nel.org, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ath9k: fix calibration data endianness
Christian Lamparter <chunkeey@...il.com> writes:
> On 4/16/23 12:50, Toke Høiland-Jørgensen wrote:
>> Christian Lamparter <chunkeey@...il.com> writes:
>>
>>> On 4/15/23 18:02, Christian Lamparter wrote:
>>>> Hi,
>>>>
>>>> On 4/15/23 17:25, Toke Høiland-Jørgensen wrote:
>>>>> Álvaro Fernández Rojas <noltari@...il.com> writes:
>>>>>
>>>>>> BCM63xx (Big Endian MIPS) devices store the calibration data in MTD
>>>>>> partitions but it needs to be swapped in order to work, otherwise it fails:
>>>>>> ath9k 0000:00:01.0: enabling device (0000 -> 0002)
>>>>>> ath: phy0: Ignoring endianness difference in EEPROM magic bytes.
>>>>>> ath: phy0: Bad EEPROM VER 0x0001 or REV 0x00e0
>>>>>> ath: phy0: Unable to initialize hardware; initialization status: -22
>>>>>> ath9k 0000:00:01.0: Failed to initialize device
>>>>>> ath9k: probe of 0000:00:01.0 failed with error -22
>>>>>
>>>>> How does this affect other platforms? Why was the NO_EEP_SWAP flag set
>>>>> in the first place? Christian, care to comment on this?
>>>>
>>>> I knew this would come up. I've written what I know and remember in the
>>>> pull-request/buglink.
>>>>
>>>> Maybe this can be added to the commit?
>>>> Link: https://github.com/openwrt/openwrt/pull/12365
>>>>
>>>> | From what I remember, the ah->ah_flags |= AH_NO_EEP_SWAP; was copied verbatim from ath9k_of_init's request_eeprom.
>>>>
>>>> Since the existing request_firmware eeprom fetcher code set the flag,
>>>> the nvmem code had to do it too.
>>>>
>>>> In theory, I don't think that not setting the AH_NO_EEP_SWAP flag will cause havoc.
>>>> I don't know if there are devices out there, which have a swapped magic (which is
>>>> used to detect the endianess), but the caldata is in the correct endiannes (or
>>>> vice versa - Magic is correct, but data needs swapping).
>>>>
>>>> I can run tests with it on a Netzgear WNDR3700v2 (AR7161+2xAR9220)
>>>> and FritzBox 7360v2 (Lantiq XWAY+AR9220). (But these worked fine.
>>>> So I don't expect there to be a new issue there).
>>>
>>> Nope! This is a classic self-own!... Well at least, this now gets documented!
>>>
>>> Here are my findings. Please excuse the overlong lines.
>>>
>>> ## The good news / AVM FritzBox 7360v2 ##
>>>
>>> The good news: The AVM FritzBox 7360v2 worked the same as before.
>>
>> [...]
>>
>>> ## The not so good news / Netgear WNDR3700v2 ##
>>>
>>> But not the Netgar WNDR3700v2. One WiFi (The 2.4G, reported itself now as the 5G @0000:00:11.0 -
>>> doesn't really work now), and the real 5G WiFi (@0000:00:12.0) failed with:
>>> "phy1: Bad EEPROM VER 0x0001 or REV 0x06e0"
>>
>> [...]
>>
>> Alright, so IIUC, we have a situation where some devices only work
>> *with* the flag, and some devices only work *without* the flag? So we'll
>> need some kind of platform-specific setting? Could we put this in the
>> device trees, or is there a better solution?
>
> Depends. From what I gather, ath9k calls this "need_swap". Thing is,
> the flag in the EEPROM is called "AR5416_EEPMISC_BIG_ENDIAN". In the
> official documentation about the AR9170 Base EEPROM (has the same base
> structure as AR5008 up to AR92xx) this is specified as:
>
> "Only bit 0 is defined as Big Endian. This bit should be written as 1
> when the structure is interpreted in big Endian byte ordering. This bit
> must be reviewed before any larger than byte parameters can be interpreted."
>
> It makes sense that on a Big-Endian MIPS device (like the Netgear WNDR3700v2),
> the caldata should be in "Big-Endian" too... so no swapping is necessary.
>
> Looking in ath9k's eeprom.c function ath9k_hw_nvram_swap_data() that deals
> with this eepmisc flag:
>
> | if (ah->eep_ops->get_eepmisc(ah) & AR5416_EEPMISC_BIG_ENDIAN) {
> | *swap_needed = true;
> | ath_dbg(common, EEPROM,
> | "Big Endian EEPROM detected according to EEPMISC register.\n");
> | } else {
> | *swap_needed = false;
> | }
>
> This doesn't take into consideration that swapping is not needed if
> the data is in big endian format on a big endian device. So, this
> could be changed so that the *swap_needed is only true if the flag and
> device endiannes disagrees?
>
> That said, Martin and Felix have written their reasons in the cover letter
> and patches for why the code is what it is:
> <https://ath9k-devel.ath9k.narkive.com/2q5A6nu0/patch-0-5-ath9k-eeprom-swapping-improvements>
>
> Toke, What's your take on this? Having something similar like the
> check_endian bool... but for OF? Or more logic that can somehow
> figure out if it's big or little endian.
Digging into that old thread, it seems we are re-hashing a lot of the
old discussion when those patches went in. Basically, the code you
quoted above is correct because the commit that introduced it sets all
fields to be __le16 and __le32 types and reads them using the
leXX_to_cpu() macros.
The code *further up* in that function is what is enabled by Alvaro's
patch. Which is a different type of swapping (where the whole eeprom is
swab16()'ed, not just the actual multi-byte data fields in them).
However, in OpenWrt the in-driver code to do this is not used; instead,
a hotplug script applies the swapping before the device is seen by the
driver, as described in this commit[0]. Martin indeed mentions that this
is a device-specific thing, so the driver can't actually do the right
thing without some outside feature flag[1]. The commit[0] also indicates
that there used used to exist a device-tree binding in the out-of-tree
device trees used in OpenWrt to do the unconditional swab16().
The code in [0] still exists in OpenWrt today, albeit in a somewhat
modified form[2]. I guess the question then boils down to, Álvaro, can
your issue be resolved by a pre-processing step similar to that which is
done in [2]? Or do we need the device tree flag after all?
-Toke
[0] https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=afa37092663d00aa0abf8c61943d9a1b5558b144
[1] https://narkive.com/2q5A6nu0.34
[2] https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=target/linux/lantiq/xway/base-files/etc/hotplug.d/firmware/12-ath9k-eeprom;h=98bb9af6947a298775ff7fa26ac6501c57df8378;hb=HEAD
Powered by blists - more mailing lists