lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iK-kaAUBF4MkAZJuRiJOO5LCE-SCRKDLBjz6gGoR5G4cA@mail.gmail.com>
Date: Thu, 11 Jul 2024 17:32:30 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: heze0908 <heze0908@...il.com>, davem@...emloft.net, dsahern@...nel.org, 
	kuba@...nel.org, pabeni@...hat.com, netdev@...r.kernel.org, 
	kernelxing@...cent.com
Subject: Re: [PATCH net-next] inet: reduce the execution time of getsockname()

On Thu, Jul 11, 2024 at 4:58 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> Hello Eric,
>
> On Fri, Jul 12, 2024 at 1:26 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Thu, Jul 11, 2024 at 12:10 AM heze0908 <heze0908@...il.com> wrote:
> > >
> > > From: Ze He <zanehe@...cent.com>
> > >
> > > Recently, we received feedback regarding an increase
> > > in the time consumption of getsockname() in production.
> > > Therefore, we conducted tests based on the
> > > "getsockname" test item in libmicro. The test results
> > > indicate that compared to the kernel 5.4, the latest
> > > kernel indeed has an increased time consumption
> > > in getsockname().
> > > The test results are as follows:
> > >
> > > case_name       kernel 5.4      latest kernel     diff
> > > ----------      -----------     -------------   --------
> > > getsockname       0.12278         0.18246       +48.61%
> > >
> > > It was discovered that the introduction of lock_sock() in
> > > commit 9dfc685e0262 ("inet: remove races in inet{6}_getname()")
> > > to solve the data race problem between __inet_hash_connect()
> > > and inet_getname() has led to the increased time consumption.
> > > This patch attempts to propose a lockless solution to replace
> > > the spinlock solution.
> > >
> > > We have to solve the race issue without heavy spin lock:
> > > one reader is reading some members in struct inet_sock
> > > while the other writer is trying to modify them. Those
> > > members are "inet_sport" "inet_saddr" "inet_dport"
> > > "inet_rcv_saddr". Therefore, in the path of getname, we
> > > use READ_ONCE to read these data, and correspondingly,
> > > in the path of tcp connect, we use WRITE_ONCE to write
> > > these data.
> > >
> > > Using this patch, we conducted the getsockname test again,
> > > and the results are as follows:
> > >
> > > case_name       latest kernel   latest kernel(patched)
> > > ----------      -----------     ---------------------
> > > getsockname       0.18246             0.14423
> > >
> > > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > > Signed-off-by: Ze He <zanehe@...cent.com>
> >
> > There is no way you can implement a correct getsockname() without
> > extra synchronization.
> >
> > When multiple fields are read, READ_ONCE() will not ensure
> > consistency, especially for IPv6 addresses
> > which are too big to fit in a single word.
> >
>
> Thanks for your reply.
>
> I was thinking two ways at the beginning, one is using lockless way as
> this patch does which apparently is a little bit complicated, the
> other one is reverting commit 9dfc685e0262 ("inet: remove races in
> inet{6}_getname()") because in the real world I don't think the
> software programmer could call this two syscalls (connect and
> getsockname) concurrently. What is the use/meaning of calling those
> two concurrently? Even if there is data-race in this case, programmers
> cannot trust the results of getsockname one way or another. The fact
> is the degradation of performance, which the users complain about
> after upgrading the kernel from 5.4 to the latest. What do you
> suggest, Eric?

In the 'real world' we need results that we can trust.

Fact that it was missing in the past is not an excuse, this was a bug
in the first place.

Feel free to implement an alternate protection, if you really need to.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ