Message-ID: <20a08d9d62f442fcb710a2bbfae47289@AcuMS.aculab.com>
Date: Mon, 17 Jan 2022 03:23:50 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Eric Dumazet' <eric.dumazet@...il.com>,
"David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
CC: netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
syzbot <syzkaller@...glegroups.com>,
Ido Schimmel <idosch@...lanox.com>,
"Jiri Pirko" <jiri@...lanox.com>
Subject: RE: [PATCH v2 net] ipv4: update fib_info_cnt under spinlock
protection
From: Eric Dumazet
> Sent: 16 January 2022 09:02
>
> In the past, free_fib_info() was supposed to be called
> under RTNL protection.
>
> This eventually was no longer the case.
>
> Instead of enforcing RTNL it seems we simply can
> move fib_info_cnt changes to occur when fib_info_lock
> is held.
>
> v2: David Laight suggested to update fib_info_cnt
> only when an entry is added/deleted to/from the hash table,
> as fib_info_cnt is used to make sure hash table size
> is optimal.
Already applied, but:
Acked-by: David Laight
...
If you are going to add READ_ONCE() markers, then one on
'fib_info_hash_size' would be much more appropriate since
that value is read twice.
> err = -ENOBUFS;
> - if (fib_info_cnt >= fib_info_hash_size) {
> +
> + /* Paired with WRITE_ONCE() in fib_release_info() */
> + if (READ_ONCE(fib_info_cnt) >= fib_info_hash_size) {
> unsigned int new_size = fib_info_hash_size << 1;
> struct hlist_head *new_info_hash;
> struct hlist_head *new_laddrhash;
> @@ -1462,7 +1467,6 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
It is also possible for two (or many) threads to decide to
increase the hash table size at the same time.
The code that moves the items to the new hash tables should
probably discard the new tables if they aren't larger than
the existing ones.
The copy does look safe - just a waste of time.
It is also technically possible (but very unlikely) that the table
will get shrunk!
It will grow again on the next allocate.
But this is a different bug.
I also thought Linus said that the WRITE_ONCE() markers weren't
needed here because the kernel basically assumes the compiler isn't
stupid enough to do 'write tearing' on word-sized items
(or to write zero before every write).
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)