netdev - Re: [PATCH v1] neighbour: Don't let neigh_forced


Message-ID: <CANn89iLzmKOGhMeUUxeM=1b2PP3kieTeYsmpfA0GvJdcQMkgtQ@mail.gmail.com>
Date: Fri, 1 Dec 2023 16:58:21 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Doug Anderson <dianders@...omium.org>
Cc: Judy Hsiao <judyhsiao@...omium.org>, David Ahern <dsahern@...nel.org>, 
	Simon Horman <horms@...nel.org>, Brian Haley <haleyb.dev@...il.com>, 
	"David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Joel Granados <joel.granados@...il.com>, Julian Anastasov <ja@....bg>, Leon Romanovsky <leon@...nel.org>, 
	Luis Chamberlain <mcgrof@...nel.org>, Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org, 
	netdev@...r.kernel.org
Subject: Re: [PATCH v1] neighbour: Don't let neigh_forced_gc() disable
 preemption for long

On Fri, Dec 1, 2023 at 4:16 PM Doug Anderson <dianders@...omium.org> wrote:
>
> Hi,
>
> On Fri, Dec 1, 2023 at 1:10 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Fri, Dec 1, 2023 at 9:39 AM Judy Hsiao <judyhsiao@...omium.org> wrote:
> > >
> > > We are seeing cases where neigh_cleanup_and_release() is called by
> > > neigh_forced_gc() many times in a row with preemption turned off.
> > > When running on a low powered CPU at a low CPU frequency, this has
> > > been measured to keep preemption off for ~10 ms. That's not great on a
> > > system with HZ=1000 which expects tasks to be able to schedule in
> > > with ~1ms latency.
> >
> > This will not work in general, because this code runs with BH blocked.
> >
> > jiffies will stay untouched for many more ms on systems with only one CPU.
> >
> > I would rather not rely on jiffies here but ktime_get_ns() [1]
> >
> > Also if we break the loop based on time, we might be unable to purge
> > the last elements in gc_list.
> > We might need to use a second list to make sure to cycle over all
> > elements eventually.
> >
> >
> > [1]
> > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > > index df81c1f0a57047e176b7c7e4809d2dae59ba6be5..e2340e6b07735db8cf6e75d23ef09bb4b0db53b4 100644
> > --- a/net/core/neighbour.c
> > +++ b/net/core/neighbour.c
> > @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> >  {
> >         int max_clean = atomic_read(&tbl->gc_entries) -
> >                         READ_ONCE(tbl->gc_thresh2);
> > +       u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
>
> It might be nice to make the above timeout based on jiffies. On a
> HZ=100 system it's probably OK to keep preemption disabled for 10 ms
> but on a HZ=1000 system you'd want 1 ms. ...so maybe you'd want to use
> jiffies_to_nsecs(1)?

I do not think so. 10ms would be awfully long.

We have an nsec-based time service, so why downgrade to jiffies resolution?
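
To make the numbers in this exchange concrete, here is a minimal userspace sketch of the arithmetic behind the jiffies_to_nsecs(1) suggestion. The helper name one_jiffy_ns() is hypothetical and is not a kernel function; it only assumes that one jiffy lasts 1/HZ seconds.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical userspace stand-in (not a kernel API): the length of one
 * jiffy in nanoseconds for a given tick rate, assuming 1 jiffy = 1/HZ s.
 */
static uint64_t one_jiffy_ns(unsigned int hz)
{
	return NSEC_PER_SEC / hz;
}
```

Under that assumption, a one-jiffy budget is 10,000,000 ns (10 ms) at HZ=100 and 1,000,000 ns (1 ms) at HZ=1000, whereas the NSEC_PER_MSEC budget in the diff is 1 ms regardless of HZ.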

>
> One worry might be that we disabled preemption _right before_ we were
> supposed to be scheduled out. In that case we'll end up blocking some
> other task for another full timeslice, but maybe there's not a lot we
> can do there?

Can you tell us in which scenario this gc_list can get so big, other
than with fuzzers?
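
For readers following the thread, the deadline idea from the quoted diff [1] (compute tmax once from a nanosecond clock, then stop when it is exceeded) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct item, bounded_purge() and fake_now_ns() are hypothetical names, the clock is passed in as a callback where the kernel would call ktime_get_ns(), and the sketch always unlinks from the head, so it does not model entries that have to be skipped.

```c
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical list node standing in for an entry on a GC list. */
struct item {
	struct item *next;
};

/* Clock callback; the kernel equivalent would be ktime_get_ns(). */
typedef uint64_t (*now_ns_fn)(void);

/*
 * Unlink entries from *head until the list is empty or the time budget
 * is spent. Returns the number of entries removed. Anything left stays
 * on the list and is picked up by the next call.
 */
static int bounded_purge(struct item **head, now_ns_fn now_ns,
			 uint64_t budget_ns)
{
	uint64_t tmax = now_ns() + budget_ns;
	int removed = 0;

	while (*head) {
		struct item *it = *head;

		*head = it->next;	/* a real caller would also release it */
		it->next = NULL;
		removed++;

		if (now_ns() > tmax)
			break;		/* budget exhausted, resume later */
	}
	return removed;
}

/* Deterministic stand-in clock: every call advances time by 0.4 ms. */
static uint64_t fake_now_ns(void)
{
	static uint64_t t;

	t += 400000ULL;
	return t;
}
```

Because leftovers remain at the head here, repeated calls eventually drain the list. A walk that skips busy entries and restarts from the front each time would not have that property, which is the concern about needing a second list raised earlier in the thread.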