lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250731125839.1137083-1-gur.stavi@huawei.com>
Date: Thu, 31 Jul 2025 15:58:39 +0300
From: Gur Stavi <gur.stavi@...wei.com>
To: <horms@...nel.org>
CC: <andrew+netdev@...n.ch>, <christophe.jaillet@...adoo.fr>,
	<corbet@....net>, <davem@...emloft.net>, <edumazet@...gle.com>,
	<fuguiming@...artners.com>, <gongfan1@...wei.com>, <guoxin09@...wei.com>,
	<gur.stavi@...wei.com>, <helgaas@...nel.org>, <jdamato@...tly.com>,
	<kuba@...nel.org>, <lee@...ger.us>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <luosifu@...wei.com>,
	<meny.yossefi@...wei.com>, <mpe@...erman.id.au>, <netdev@...r.kernel.org>,
	<pabeni@...hat.com>, <przemyslaw.kitszel@...el.com>,
	<shenchenyang1@...ilicon.com>, <shijing34@...wei.com>, <sumang@...vell.com>,
	<vadim.fedorenko@...ux.dev>, <wulike1@...wei.com>, <zhoushuai28@...wei.com>,
	<zhuyikai1@...artners.com>
Subject: Re: [PATCH net-next v10 1/8] hinic3: Async Event Queue interfaces

> On Thu, Jul 24, 2025 at 09:45:51PM +0800, Fan Gong wrote:
> > > > +
> > > > +/* Data provided to/by cmdq is arranged in structs with little endian fields but
> > > > + * every dword (32bits) should be swapped since HW swaps it again when it
> > > > + * copies it from/to host memory.
> > > > + */
> > >
> > > This scheme may work on little endian hosts.
> > > But if so it seems unlikely to work on big endian hosts.
> > >
> > > I expect you want be32_to_cpu_array() for data coming from hw,
> > > with a source buffer as an array of __be32 while
> > > the destination buffer is an array of u32.
> > >
> > > And cpu_to_be32_array() for data going to the hw,
> > > with the types of the source and destination buffers reversed.
> > >
> > > If those types don't match your data, then we have
> > > a framework to have that discussion.
> > >
> > >
> > > That said, it is more usual for drivers to keep structures in the byte
> > > order they are received. Stored in structures with members with types, in
> > > this case it seems that would be __be32, and accessed using a combination
> > > of BIT/GENMASK, FIELD_PREP/FIELD_GET, and cpu_to_be*/be*_to_cpu (in this
> > > case cpu_to_be32/be32_to_cpu).
> > >
> > > An advantage of this approach is that the byte order of
> > > data is only changed when needed. Another is that it is clear
> > > what the byte order of data is.
> >
> > There is a simplified example:
> >
> > Here is a 64 bit little endian that may appear in cmdq:
> > __le64 x
> > After the swap it will become:
> > __be32 x_lo
> > __be32 x_hi
> > This is NOT __be64.
> > __be64 looks like this:
> > __be32 x_hi
> > __be32 x_lo
>
> Sure, byte swapping 64 bit entities is different to byte swapping two
> consecutive 32 bit entities. I completely agree.
>
> >
> > So the swapped data by HW is neither BE or LE. In this case, we should use
> > swab32 to obtain the correct LE data because our driver currently supports LE.
> > This is for compensating for bad HW decisions.
>
> Let us assume that the host is reading data provided by HW.
>
> If the swab32 approach works on a little endian host
> to allow the host to access 32-bit values in host byte order.
> Then this is because it outputs a 32-bit little endian values

Values can be any size. 32 bit is arbitrary.
.
>
> But, given the same input, it will not work on a big endian host.
> This is because the same little endian output will be produced,
> while the host byte order is big endian.
>
> I think you need something based on be32_to_cpu()/cpu_to_be32().
> This will effectively be swab32 on little endian hosts (no change!).
> And a no-op on big endian hosts (addressing my point above).
>
> More specifically, I think you should use be32_to_cpu_array() and
> cpu_to_be32_array() instead of swab32_array().
>

Lets define a "coherent struct" as a structure made of fields that makes sense
to human beings. Every field endianity is defined and fields are arranged in
order that "makes sense". Fields can be of any integer size 8,16,32,64 and not
necessarily naturally aligned.

swab32_array transforms a coherent struct into "byte jumble". Small fields are
reordered and larger (misaligned) fields may be split into 2 (or even 3) parts.
swab32_array is reversible so a 2nd call with byte jumble as input will produce
the original coherent struct.

hinic3 dma has "swab32_array" built in.
On send-to-device it expects a byte jubmle so the DMA engine will transform it
into a coherent struct.
On receive-from-device it provides a byte jumble so the driver needs
to call swab32_array to transform it into a coherent struct.

The hinic3_cmdq_buf_swab32 function will work correctly, producing byte jumble,
on little endian and big endian hosts.

The code that runs prior to hinic3_cmdq_buf_swab32 that initializes a coherent
struct is endianity sensitive. It needs to initialize fields based on their
coherent endianity with or without byte swap. Practically use cpu_to_le or
cpu_to_be based on the coherent definition.

Specifically, cmdq "coherent structs" in hinic3 use little endian and since
Kconfig currently declares that big endian hosts are not supported then
coherent structs are initialized without explicit cpu_to_le macros.

And this is what the comment says:

/* Data provided to/by cmdq is arranged in structs with little endian fields but
 * every dword (32bits) should be swapped since HW swaps it again when it
 * copies it from/to host memory.
 */

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ