lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 30 Nov 2023 10:29:33 +0100
From: Jakub Sitnicki <>
To: David Laight <David.Laight@...LAB.COM>
Cc: "" <>, 'Jakub Kicinski'
 <>, "David S. Miller" <>, Stephen
 Hemminger <>, Eric Dumazet
 <>, 'David Ahern' <>, Paolo Abeni
Subject: Re: [PATCH net-next] ipv4: Use READ/WRITE_ONCE() for IP

On Thu, Nov 30, 2023 at 09:04 AM GMT, David Laight wrote:
> From: Jakub Sitnicki
>> Sent: 29 November 2023 20:17
>> Hey David,
>> On Wed, Nov 29, 2023 at 07:26 PM GMT, David Laight wrote:
>> > Commit 227b60f5102cd added a seqlock to ensure that the low and high
>> > port numbers were always updated together.
>> > This is overkill because the two 16bit port numbers can be held in
>> > a u32 and read/written in a single instruction.
>> >
>> > More recently 91d0b78c5177f added support for finer per-socket limits.
>> > The user-supplied value is 'high << 16 | low' but they are held
>> > separately and the socket options protected by the socket lock.
>> >
>> > Use a u32 containing 'high << 16 | low' for both the 'net' and 'sk'
>> > fields and use READ_ONCE()/WRITE_ONCE() to ensure both values are
>> > always updated together.
>> >
>> > Change (the now trival) inet_get_local_port_range() to a static inline
>> > to optimise the calling code.
>> > (In particular avoiding returning integers by reference.)
>> >
>> > Signed-off-by: David Laight <>
>> > ---
>> Regarding the per-socket changes - we don't expect contention on sock
>> lock between inet_stream_connect / __inet_bind, where we grab it and
>> eventually call inet_sk_get_local_port_range, and sockopt handlers, do
>> we?
>> The motivation is not super clear for me for that of the changes.
> The locking in the getsockopt() code is actually quite horrid.
> Look at the conditionals for the rntl lock.
> It is conditionally acquired based on a function that sets a flag,
> but released based on the exit path from the switch statement.
> But there are only two options that need the rtnl lock and the socket
> lock, and two trivial ones (including this one) that need the socket
> lock.
> So the code can be simplified by moving the locking into the case branches.
> With only 2 such cases the overhead will be minimal (compared the to
> setsockopt() case where a lot of options need locking.
> This is all part of a big patchset I'm trying to write that converts
> all the setsockopt code to take an 'unsigned int optlen' parameter
> and return the length to pass back to the caller.
> So the copy_to_user() of the updated length is done by the syscall
> stub rather than inside every separate function (and sometimes in
> multiple places in a function).
> After all, if the copy fails EFAULT the application is broken.
> It really doesn't matter if any side effects have happened.
> If you get a fault reading from a pipe the data is lost.

OK, so you're trying to refactor the setsockopt handler. Now it makes
more sense what is driving this work. Thanks for the context.


Powered by blists - more mailing lists