Message-ID: <CANn89iJNvxatTTcHvzNKuUu2HyNfH=O7XesA3pHUwfn4Qy=pJQ@mail.gmail.com>
Date: Fri, 12 Nov 2021 06:21:38 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Alexander Duyck <alexander.duyck@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>,
"David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
netdev <netdev@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: [PATCH v1] x86/csum: rewrite csum_partial()
On Fri, Nov 12, 2021 at 1:13 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Thu, Nov 11, 2021 at 02:30:50PM -0800, Eric Dumazet wrote:
> > > For values 7 through 1 I wonder if you wouldn't be better served by
> > > just doing a single QWORD read and a pair of shifts. Something along
> > > the lines of:
> > > if (len) {
> > > shift = (8 - len) * 8;
> > > temp64 = (*(unsigned long *)buff << shift) >> shift;
> > > result += temp64;
> > > result += result < temp64;
> > > }
> >
> > Again, KASAN will not be happy.
>
> If you do it in asm, kasan will not know, so who cares :-) as long as
> the load is aligned, loading beyond @len shouldn't be a problem,
> otherwise there's load_unaligned_zeropad().
OK, but in that case we have to align buff on a qword boundary,
or risk crossing a page boundary.
So the partial-qword handling has to be done both at the beginning and at the end.
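To make the head/tail handling concrete, here is a minimal userspace sketch of the masked aligned loads being discussed. The helper names (load_tail, load_head, add64_carry) are hypothetical and not from the patch. It assumes little-endian byte order as on x86, a len of 1 to 7, and a pointer that is 8-byte aligned with the whole qword readable. It says nothing about KASAN or about how the kernel would actually issue the load.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the buffer ends inside this aligned qword and only
 * its first `len` bytes (1..7) are valid. Little-endian assumed, so those
 * are the low-order bytes: shift the unwanted high bytes out and back. */
static uint64_t load_tail(const void *aligned_qword, unsigned int len)
{
	uint64_t v;
	unsigned int shift = (8 - len) * 8;

	memcpy(&v, aligned_qword, sizeof(v));	/* one full 8-byte load */
	return (v << shift) >> shift;
}

/* Hypothetical helper: the buffer starts inside this aligned qword and only
 * its last `len` bytes (1..7) are valid. Shifting right drops the bytes
 * that precede the buffer and zero-extends what remains. */
static uint64_t load_head(const void *aligned_qword, unsigned int len)
{
	uint64_t v;
	unsigned int shift = (8 - len) * 8;

	memcpy(&v, aligned_qword, sizeof(v));
	return v >> shift;
}

/* 64-bit add with end-around carry, as in the quoted snippet. */
static uint64_t add64_carry(uint64_t sum, uint64_t v)
{
	sum += v;
	sum += sum < v;
	return sum;
}
```

Both loads stay within a single aligned qword, so neither can cross a page boundary. The cost is the extra branch and shift at each end of the buffer.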
And with NET_IP_ALIGN==0, this will unfortunately trigger for the 40-byte
IPv6 header.
IPv6 header: <2 bytes before qword boundary> <4 * 8 bytes> <6 bytes at tail>
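The split above can be checked with a few lines of arithmetic. This is only an illustrative sketch: struct csum_split and split_on_qwords() are hypothetical names, and it assumes the header starts 14 bytes (one Ethernet header) past an 8-byte-aligned address, which is what NET_IP_ALIGN==0 would typically give.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, for illustration only. */
struct csum_split {
	size_t head;	/* bytes before the first qword boundary */
	size_t qwords;	/* full aligned 8-byte words */
	size_t tail;	/* bytes left after the last full qword */
};

static struct csum_split split_on_qwords(uintptr_t addr, size_t len)
{
	struct csum_split s;

	s.head = (8 - (addr & 7)) & 7;	/* distance to the next boundary */
	if (s.head > len)
		s.head = len;		/* short buffer: all of it is head */
	len -= s.head;
	s.qwords = len / 8;
	s.tail = len % 8;
	return s;
}
```

For addr = 14 and len = 40 this gives head = 2, qwords = 4, tail = 6, so both partial paths are taken. With an 8-byte-aligned start the same 40 bytes would be five full qwords and neither path would run.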
I will try, but I have some doubts it can save one or two cycles...