[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241020013039.4151-1-kuniyu@amazon.com>
Date: Sat, 19 Oct 2024 18:30:39 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <remi@...lab.net>
CC: <kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v1 net-next 5/9] phonet: Don't hold RTNL for getaddr_dumpit().
From: "RĂ©mi Denis-Courmont" <remi@...lab.net>
Date: Sat, 19 Oct 2024 10:48:57 +0300
> > > > list_for_each_entry_rcu(pnd, &pndevs->list, list) {
> > > >
> > > > + DECLARE_BITMAP(addrs, 64);
> > > >
> > > > u8 addr;
> > > >
> > > > if (dev_idx > dev_start_idx)
> > > >
> > > > @@ -143,23 +146,26 @@ static int getaddr_dumpit(struct sk_buff *skb,
> > > > struct
> > > > netlink_callback *cb) continue;
> > > >
> > > > addr_idx = 0;
> > > >
> > > > - for_each_set_bit(addr, pnd->addrs, 64) {
> > > > + memcpy(addrs, pnd->addrs, sizeof(pnd->addrs));
> > >
> > > Is that really safe? Are we sure that the bit-field writers are atomic
> > > w.r.t. memcpy() on all platforms? If READ_ONCE is needed for an integer,
> > > using memcpy() seems sketchy, TBH.
> >
> > I think bit-field read/write need not be atomic here because even
> > if a data-race happens, for_each_set_bit() iterates each bit, which
> > is the real data, regardless of whether data-race happened or not.
>
> Err, it looks to me that a corrupt bit would lead to the index getting corrupt
> and addresses getting skipped or repeated. AFAICT, the RTNL lock is still
> needed here.
Let's say pnd->addrs has 0b00010001 as the lowest in 8 bits
and addr_dumpit() and addr_doit() are executed concurrently,
and addr_doit() tries to change it to 0b00010011.
If we have a lock, there could be two results of dumpit().
1.
lock()
dumpit() -> 0b00010001
unlock()
lock()
doit()
unlock()
2.
lock()
doit()
unlock()
lock()
dumpit() -> 0b00010011
unlock()
If we don't have a lock and dumpit()'s read and doit()'s write
are split to upper 4 bits (0b0001) and lower 4 bits (0b0011),
there are 6 patterns of occurences, but the results are only
2 patterns and the same with the locked version.
If you get 0b00010001, you can think dumpit() completes earlier,
and the opposite if you get 0b00010011.
This is how we think with lockless algo. In this case, we do
not require lock to iterate bitmaps.
1.
upper half read of dumpit()
lower half read of dumpit()
upper half write of doit() -> 0b0001
lower half write of doit() -> 0b0011
-> 0b00010001
2.
upper half read of dumpit()
upper half write of doit() -> 0b0001
lower half read of dumpit()
lower half write of doit() -> 0b0011
-> 0b00010001
3.
upper half write of doit() -> 0b0001
upper half read of dumpit()
lower half read of dumpit()
lower half write of doit() -> 0b0011
-> 0b00010001
4.
upper half write of doit() -> 0b0001
lower half write of doit() -> 0b0011
upper half read of dumpit()
lower half read of dumpit()
-> 0b00010011
5.
upper half write of doit() -> 0b0001
upper half read of dumpit()
lower half write of doit() -> 0b0011
lower half read of dumpit()
-> 0b00010011
6.
upper half read of dumpit()
upper half write of doit() -> 0b0001
lower half write of doit() -> 0b0011
lower half read of dumpit()
-> 0b00010011
Powered by blists - more mailing lists