linux-kernel - Re: [PATCH v10] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABVgOSn0uD+595fLvC0FWUoznCQTv0NeMAykyiOaS956eXQvGQ@mail.gmail.com>
Date: Thu, 29 Feb 2024 16:07:03 +0800
From: David Gow <davidgow@...gle.com>
To: Guenter Roeck <linux@...ck-us.net>
Cc: Geert Uytterhoeven <geert@...ux-m68k.org>, Christophe Leroy <christophe.leroy@...roup.eu>, 
	Charlie Jenkins <charlie@...osinc.com>, "Russell King (Oracle)" <linux@...linux.org.uk>, 
	David Laight <David.Laight@...lab.com>, Palmer Dabbelt <palmer@...belt.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Helge Deller <deller@....de>, 
	"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>, 
	Parisc List <linux-parisc@...r.kernel.org>, Arnd Bergmann <arnd@...db.de>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Palmer Dabbelt <palmer@...osinc.com>, 
	Linux ARM <linux-arm-kernel@...ts.infradead.org>, 
	"open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@...r.kernel.org>, 
	KUnit Development <kunit-dev@...glegroups.com>
Subject: Re: [PATCH v10] lib: checksum: Use aligned accesses for ip_fast_csum
 and csum_ipv6_magic tests

On Wed, 28 Feb 2024 at 23:40, Guenter Roeck <linux@...ck-us.net> wrote:
>
> On 2/28/24 02:15, Geert Uytterhoeven wrote:
> > CC testing
> >
> > On Wed, Feb 28, 2024 at 8:59 AM Guenter Roeck <linux@...ck-us.net> wrote:
> >> On 2/27/24 23:25, Christophe Leroy wrote:
> >> [ ... ]
> >>>>
> >>>> This test case is supposed to be as true to the "general case" as
> >>>> possible, so I have aligned the data along 14 + NET_IP_ALIGN. On ARM
> >>>> this will be a 16-byte boundary since NET_IP_ALIGN is 2. A driver that
> >>>> does not follow this may not be appropriately tested by this test case,
> >>>> but anyone is welcome to submit additional test cases that address this
> >>>> additional alignment concern.
> >>>
> >>> But then this test case is becoming less and less true to the "general
> >>> case" with this patch, whereas your initial implementation was almost
> >>> perfect as it was covering most cases, a lot more than what we get with
> >>> that patch applied.
> >>>
> >> NP with me if that is where people want to go. I'll simply disable checksum
> >> tests on all architectures which don't support unaligned accesses (so far
> >> it looks like that is only arm with thumb instructions, and possibly nios2).
> >> I personally find that less desirable and would have preferred a second
> >> configurable set of tests for unaligned accesses, but I have no problem
> >> with it.
> >
> > IMHO the tests should validate the expected functionality.  If a test
> > fails, either functionality is missing or behaves wrong, or the test
> > is wrong.
> >
> > What is the point of writing tests for a core functionality like network
> > checksumming that do not match the expected functionality?
> >
>
> Tough one. I can't enable CONFIG_NET_TEST on nios2, parisc, and arm with THUMB
> enabled due to crashes or hangs in gso tests. I accept that. Downside is that I
> have to disable CONFIG_NET_TEST on those architectures/platforms entirely,
> meaning a whole class of tests are missing for those architectures. I would
> prefer to have a configuration option such as CONFIG_NET_GSO_TEST to let me
> disable the problematic tests for the affected platforms so I can run all
> the other network unit tests. Yes, obviously something is wrong either with
> the affected tests or with the implementation of the tested functionality
> on the affected systems, but that could be handled separately if a separate
> configuration option existed, and new regressions in other tests on the affected
> architectures could be identified as they happen.
>
> This case is similar. I'd prefer to have a separate configuration option,
> say, CONFIG_CHECKSUM_MISALIGNED_KUNIT, which I can disable to be able to
> run the common checksum tests on platforms / architectures which don't
> support unaligned accesses.
>
> However, as I said, if the community wants to take a harsh stance, I have no
> problem with just disabling groups of tests entirely on platforms which have
> a problem with part of it.
>
> Guenter
>

I think the ideal solution is for there to be some official stance on
the required alignment, for every architecture to support that, and
for the tests to exercise it. Now, judging from the sheer number of
replies in this thread, it seems like there isn't any real agreement
on that. (From my quick reading of some of the checksum code, my
assumption was that this was either 1- or 2- byte alignment required,
with 4-byte alignment being ideal for performance reasons in most
setups).

If different architectures have different alignment requirements
(ouch!), my feeling is that the test suite should be written to the
maximum such alignment (as any non-architecture-specific code will
need to align things anyway), and architectures/drivers with
non-aligned buffers can have their own tests. If it turns out there
are a lot of such drivers/architectures, then we can add the extra
config option.

I'd rather, if there is a config option to disable these tests, it be
of the form ARCH_HAS_UNALIGNED_CHECKSUM to enable it, or similar.
There's also the option of having the test 'skip' itself on a
configuration which doesn't support it. That way it'll still show up
in the list of tests, but with a description, like "Disabled due to
checksum alignment requirements" or something, which may be more
obvious to people debugging it later.

For the gso test hangs, I think it's probably quite sensible to have a
config option for the GSO tests generally. I'd be more hesitant to
have a separate CONFIG_NET_GSO_FREQUENTLY_BROKEN_TESTS, which is
selected automatically by a bunch of architectures. At that point, I
think we need to either just fix the bugs, or start thinking about a
better solution for these tests / architectures.

One of the things I'm hoping to work on this year is some improvements
to KUnit tooling to automatically run tests across a wider set of
architectures and configs, so test authors can catch this sort of
thing before even sending patches out. We can do a bit of this with
the manual --arch <arch> option to kunit.py, but very few people will
test things across more than a couple of architectures, and rarely
will we get good testing on the less common architectures, like 32-bit
ones, big endian ones, or ones with stricter alignment requirements.
So we can do better there.

tl;dr: I think it's a good idea for tests to sit behind config
options. Obviously they shouldn't be either too broad, or too
granular, but common sense usually prevails here. I'd rather not have
config options explicitly for "broken" tests, though: if you have to,
try to make the config option for the missing/broken feature (HAS_xxx)
rather than the test if possible. Otherwise, 'skip' the test, with a
suitable reason string if you can.

-- David

Download attachment "smime.p7s" of type "application/pkcs7-signature" (4014 bytes)