linux-kernel - Re: [PATCH v8 2/2] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <25f108d1-827f-4a18-bee4-4105fbd45974@gmx.de>
Date: Fri, 16 Feb 2024 06:54:55 +0100
From: Helge Deller <deller@....de>
To: Guenter Roeck <linux@...ck-us.net>, Charlie Jenkins <charlie@...osinc.com>
Cc: David Laight <David.Laight@...lab.com>,
 Palmer Dabbelt <palmer@...belt.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Parisc List <linux-parisc@...r.kernel.org>, Al Viro
 <viro@...iv.linux.org.uk>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 2/2] lib: checksum: Use aligned accesses for
 ip_fast_csum and csum_ipv6_magic tests

On 2/15/24 02:58, Guenter Roeck wrote:
> Hi Charlie,
>
> On 2/14/24 17:30, Charlie Jenkins wrote:
>> On Wed, Feb 14, 2024 at 03:03:07PM -0800, Guenter Roeck wrote:
>>> On 2/14/24 13:41, Charlie Jenkins wrote:
>>>> The test cases for ip_fast_csum and csum_ipv6_magic were failing on a
>>>> variety of architectures that are big endian or do not support
>>>> misalgined accesses. Both of these test cases are changed to support big
>>>> and little endian architectures.
>>>>
>>>> The test for ip_fast_csum is changed to align the data along (14 +
>>>> NET_IP_ALIGN) bytes which is the alignment of an IP header. The test for
>>>> csum_ipv6_magic aligns the data using a struct. An extra padding field
>>>> is added to the struct to ensure that the size of the struct is the same
>>>> on all architectures (44 bytes).
>>>>
>>>> The test for csum_ipv6_magic somewhat arbitrarily aligned saddr and
>>>> daddr. This would fail on parisc64 due to the following code snippet in
>>>> arch/parisc/include/asm/checksum.h:
>>>>
>>>> add        %4, %0, %0\n"
>>>> ldd,ma        8(%1), %6\n"
>>>> ldd,ma        8(%2), %7\n"
>>>> add,dc        %5, %0, %0\n"
>>>>
>>>> The second add is expecting carry flags from the first add. Normally,
>>>> a double word load (ldd) does not modify the carry flags. However,
>>>> because saddr and daddr may be misaligned, ldd triggers a misalignment
>>>> trap that gets handled in arch/parisc/kernel/unaligned.c. This causes
>>>> many additional instructions to be executed between the two adds. This
>>>> can be easily solved by adding the carry into %0 before executing the
>>>> ldd.
>>>>
>>>
>>> I really think this is a bug either in the trap handler or in the hppa64
>>> qemu emulation. Only unaligned ldd instructions affect (actually,
>>> unconditionally set) the carry flag. That doesn't happen with unaligned
>>> ldw instructions. It would be worthwhile tracking this down since there are
>>> lots of unaligned data accesses (8-byte accesses on 4-byte aligned addresses)
>>> when running the kernel in 64-bit mode. On the other side, I guess this
>>> is a different problem. Not sure though if that should even be mentioned
>>> here since that makes it sound as if it would be expected that such
>>> accesses impact the carry flag.
>>
>> I wasn't confident it was a bug somewhere so that's why I sent this patch.
>>
>> However, I have just found the section of the processor manual [1] I was
>> looking for (Section Privileged Software-Accessible Registers subsection
>> Processor Status Word (PSW)):
>>
>> "Processor state is encoded in a 64-bit register called the Processor
>> Status Word (PSW). When an interruption occurs, the current value of the
>> PSW is saved in the Interruption Processor Status Word (IPSW) and
>> usually all defined PSW bits are set to 0.
>>
>> "The PSW is set to the contents of the IPSW by the RETURN FROM
>> INTERRUPTION instruction. The interruption handler may restore the
>> original PSW, modify selected bits, or may change the PSW to an entirely
>> new value."
>>
>> Stored in the PSW register are the "Carry/borrow bits". This confirms
>> that the carry/borrow bits should be restored. The save is supposed to
>> automatically happen upon an interrupt and restored by the RETURN FROM
>> INTERRUPTION, thus this is a QEMU bug and not a Linux bug (please
>> correct me if I am wrong).
>>
>
> I know that much (I looked into the manual as well), I just really don't
> know if this is a Linux bug or a QEMU bug, and I have not been able to
> nail it down. I think someone with access to hardware will need to confirm.
>
> Specifically: Yes, the carry/borrow bits should be restored. Question is
> if the Linux kernel's interrupt handler doesn't restore the carry bits
> or if the problem is on the qemu side.
>
>> This v8 was not needed after-all it seems. It would be best to stick
>> with the v7.
>>
> I tend to agree; after all, v7 exposes the problem, making it easier to
> determine if the problem can be reproduced on real hardware.
>
> FWIW,I wrote some test code which exposes the problem.

Can you please give a pointer to this test code?
I'm happy to try it on real hardware.

> It is quite easy to show that carry is always set after executing ldd
> on an unaligned address. That is also why I know for sure that the
> problem is not seen with ldw on unaligned addresses.
Interesting.
In general I think it's quite important to differentiate between
running on qemu or running on physical hardware.
Qemu just recently got 64-bit support, and it's not yet behaving
like real hardware. One thing I noticed is, that read hardware
does not seem to jump into the exception handler twice, while
qemu does. So, if you run into an exception (e.g. unaligned ldd)
then if a second exception happens in the fault handler (e.g. second
unaligned ldd to resolve wrongly-coded code lookup), you will
get different behaviour between hardware and emulation.
This is also the reason why qemu still fails to emulate newer
64-bit Linux kernels which uses kernel modules.

Helge