netdev - Re: [RFC PATCH net-next] net: add an entry for CONFIG_NET_RX_BUSY


Message-ID: <CAL+tcoCC2g1iHA__vr8bbUX-kba2bBi2NbQNZnxOAMTJOQQAWg@mail.gmail.com>
Date: Tue, 23 Jul 2024 23:12:20 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, horms@...nel.org, 
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [RFC PATCH net-next] net: add an entry for CONFIG_NET_RX_BUSY_POLL

On Tue, Jul 23, 2024 at 11:09 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Tue, Jul 23, 2024 at 10:57 PM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Tue, Jul 23, 2024 at 3:57 PM Jason Xing <kerneljasonxing@...il.com> wrote:
> > >
> > > From: Jason Xing <kernelxing@...cent.com>
> > >
> > > > When I was running a performance test on unix_poll(), I found that
> > > > accessing sk->sk_ll_usec in sock_poll()->sk_can_busy_loop()
> > > > takes too much time, which causes a degradation of around 16%. So I
> > > > decided to turn off this config, which apparently cannot be done
> > > > before this patch.
> >
> > Too many CONFIG_ options, distros will enable it anyway.
> >
> > In my builds, offset of sk_ll_usec is 0xe8.
> >
> > Are you using some debug options or an old tree ?

I forgot to say: I'm running the latest kernel, which I pulled around
two hours ago. Whichever config I use, with or without debug options,
I can still reproduce it.

> >
> > I can not understand how a 16% degradation can occur, reading a field
> > in a cache line which contains read mostly fields for af_unix socket.
> >
> > I think you need to provide more details / analysis, and perhaps come
> > to a different conclusion.
>
> Thanks for your comments.
>
> I'm also confused about the result. The design of the cache line is
> correct from my perspective because they are all read mostly fields as
> you said.
>
> I was running some tests with libmicro[1] and found this line '41.30
> │      test  %r14d,%r14d' in the perf output. So I realised that there is
> something strange here. Then I disabled that config, and the result turned
> out to be better than before. One of my colleagues can confirm it.
>
> In this patch, I explained why I would like to let people
> disable/enable it, but investigating this part may be a separate
> matter, I think. I will keep trying.
>
> [1]: https://github.com/redhat-performance/libMicro.git
> running 'https://github.com/redhat-performance/libMicro.git' to see the results
>
> >
> > >
> > > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > > ---
> > > More data not much related if you're interested:
> > >   5.82 │      mov   0x18(%r13),%rdx
> > >   0.03 │      mov   %rsi,%r12
> > >   1.76 │      mov   %rdi,%rbx
> > >        │    sk_can_busy_loop():
> > >   0.50 │      mov   0x104(%rdx),%r14d
> > >  41.30 │      test  %r14d,%r14d
> > > > Note: I ran 'perf record -e  L1-dcache-load-misses' to diagnose this
> > > ---
> > >  net/Kconfig | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/net/Kconfig b/net/Kconfig
> > > index d27d0deac0bf..1f1b793984fe 100644
> > > --- a/net/Kconfig
> > > +++ b/net/Kconfig
> > > @@ -335,8 +335,10 @@ config CGROUP_NET_CLASSID
> > >           being used in cls_cgroup and for netfilter matching.
> > >
> > >  config NET_RX_BUSY_POLL
> > > -       bool
> > > +       bool "Low latency busy poll timeout"
> > >         default y if !PREEMPT_RT || (PREEMPT_RT && !NETCONSOLE)
> > > +       help
> > > +         Approximate time in us to spin waiting for packets on the device queue.
> >
> > Wrong comment. It is a y/n choice, no 'usec' at this stage.
>
> Oh, I see.
>
> Thanks,
> Jason
>
> >
> > >
> > >  config BQL
> > >         bool
> > > --
> > > 2.37.3
> > >