lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Fri, 16 Feb 2024 13:22:52 -0500
From: John David Anglin <dave.anglin@...l.net>
To: Helge Deller <deller@....de>, Guenter Roeck <linux@...ck-us.net>,
 Charlie Jenkins <charlie@...osinc.com>
Cc: David Laight <David.Laight@...lab.com>,
 Palmer Dabbelt <palmer@...belt.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Parisc List <linux-parisc@...r.kernel.org>, Al Viro
 <viro@...iv.linux.org.uk>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 2/2] lib: checksum: Use aligned accesses for
 ip_fast_csum and csum_ipv6_magic tests

On 2024-02-16 12:54 a.m., Helge Deller wrote:
> On 2/15/24 02:58, Guenter Roeck wrote:
>> Hi Charlie,
>>
>> On 2/14/24 17:30, Charlie Jenkins wrote:
>>> On Wed, Feb 14, 2024 at 03:03:07PM -0800, Guenter Roeck wrote:
>>>> On 2/14/24 13:41, Charlie Jenkins wrote:
>>>>> The test cases for ip_fast_csum and csum_ipv6_magic were failing on a
>>>>> variety of architectures that are big endian or do not support
>>>>> misalgined accesses. Both of these test cases are changed to support big
>>>>> and little endian architectures.
>>>>>
>>>>> The test for ip_fast_csum is changed to align the data along (14 +
>>>>> NET_IP_ALIGN) bytes which is the alignment of an IP header. The test for
>>>>> csum_ipv6_magic aligns the data using a struct. An extra padding field
>>>>> is added to the struct to ensure that the size of the struct is the same
>>>>> on all architectures (44 bytes).
>>>>>
>>>>> The test for csum_ipv6_magic somewhat arbitrarily aligned saddr and
>>>>> daddr. This would fail on parisc64 due to the following code snippet in
>>>>> arch/parisc/include/asm/checksum.h:
>>>>>
>>>>> add        %4, %0, %0\n"
>>>>> ldd,ma        8(%1), %6\n"
>>>>> ldd,ma        8(%2), %7\n"
>>>>> add,dc        %5, %0, %0\n"
>>>>>
>>>>> The second add is expecting carry flags from the first add. Normally,
>>>>> a double word load (ldd) does not modify the carry flags. However,
>>>>> because saddr and daddr may be misaligned, ldd triggers a misalignment
>>>>> trap that gets handled in arch/parisc/kernel/unaligned.c. This causes
>>>>> many additional instructions to be executed between the two adds. This
>>>>> can be easily solved by adding the carry into %0 before executing the
>>>>> ldd.
>>>>>
>>>>
>>>> I really think this is a bug either in the trap handler or in the hppa64
>>>> qemu emulation. Only unaligned ldd instructions affect (actually,
>>>> unconditionally set) the carry flag. That doesn't happen with unaligned
>>>> ldw instructions. It would be worthwhile tracking this down since there are
>>>> lots of unaligned data accesses (8-byte accesses on 4-byte aligned addresses)
>>>> when running the kernel in 64-bit mode. On the other side, I guess this
>>>> is a different problem. Not sure though if that should even be mentioned
>>>> here since that makes it sound as if it would be expected that such
>>>> accesses impact the carry flag.
>>>
>>> I wasn't confident it was a bug somewhere so that's why I sent this patch.
>>>
>>> However, I have just found the section of the processor manual [1] I was
>>> looking for (Section Privileged Software-Accessible Registers subsection
>>> Processor Status Word (PSW)):
>>>
>>> "Processor state is encoded in a 64-bit register called the Processor
>>> Status Word (PSW). When an interruption occurs, the current value of the
>>> PSW is saved in the Interruption Processor Status Word (IPSW) and
>>> usually all defined PSW bits are set to 0.
>>>
>>> "The PSW is set to the contents of the IPSW by the RETURN FROM
>>> INTERRUPTION instruction. The interruption handler may restore the
>>> original PSW, modify selected bits, or may change the PSW to an entirely
>>> new value."
>>>
>>> Stored in the PSW register are the "Carry/borrow bits". This confirms
>>> that the carry/borrow bits should be restored. The save is supposed to
>>> automatically happen upon an interrupt and restored by the RETURN FROM
>>> INTERRUPTION, thus this is a QEMU bug and not a Linux bug (please
>>> correct me if I am wrong).
>>>
>>
>> I know that much (I looked into the manual as well), I just really don't
>> know if this is a Linux bug or a QEMU bug, and I have not been able to
>> nail it down. I think someone with access to hardware will need to confirm.
>>
>> Specifically: Yes, the carry/borrow bits should be restored. Question is
>> if the Linux kernel's interrupt handler doesn't restore the carry bits
>> or if the problem is on the qemu side.
>>
>>> This v8 was not needed after-all it seems. It would be best to stick
>>> with the v7.
>>>
>> I tend to agree; after all, v7 exposes the problem, making it easier to
>> determine if the problem can be reproduced on real hardware.
>>
>> FWIW,I wrote some test code which exposes the problem.
>
> Can you please give a pointer to this test code?
> I'm happy to try it on real hardware.
>
>> It is quite easy to show that carry is always set after executing ldd
>> on an unaligned address. That is also why I know for sure that the
>> problem is not seen with ldw on unaligned addresses.
> Interesting.
> In general I think it's quite important to differentiate between
> running on qemu or running on physical hardware.
> Qemu just recently got 64-bit support, and it's not yet behaving
> like real hardware. One thing I noticed is, that read hardware
> does not seem to jump into the exception handler twice, while
> qemu does. So, if you run into an exception (e.g. unaligned ldd)
See page 2-13 of arch.  An interruption sets the PSW Q bit to 0 freezing the IIA
queues.  If an interruption occurs with Q=0, the interruption parameter registers
are left unchanged, so I don't think there's a way to handle double exceptions
on real hardware (Q would have to be set back to 1 after the IPRs are read).

Dave

-- 
John David Anglin  dave.anglin@...l.net


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ