Message-ID: <4b095b1c-9fa8-4df9-846b-c33c01e15d97@kernel.org>
Date: Mon, 4 Dec 2023 18:08:38 -0700
From: David Ahern <dsahern@...nel.org>
To: Doug Anderson <dianders@...omium.org>, Eric Dumazet <edumazet@...gle.com>
Cc: Judy Hsiao <judyhsiao@...omium.org>, Simon Horman <horms@...nel.org>,
 Brian Haley <haleyb.dev@...il.com>, "David S. Miller" <davem@...emloft.net>,
 Jakub Kicinski <kuba@...nel.org>, Joel Granados <joel.granados@...il.com>,
 Julian Anastasov <ja@....bg>, Leon Romanovsky <leon@...nel.org>,
 Luis Chamberlain <mcgrof@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v1] neighbour: Don't let neigh_forced_gc() disable
 preemption for long

On 12/4/23 4:40 PM, Doug Anderson wrote:
> Hi,
> 
> On Fri, Dec 1, 2023 at 1:10 AM Eric Dumazet <edumazet@...gle.com> wrote:
>>
>> On Fri, Dec 1, 2023 at 9:39 AM Judy Hsiao <judyhsiao@...omium.org> wrote:
>>>
>>> We are seeing cases where neigh_cleanup_and_release() is called by
>>> neigh_forced_gc() many times in a row with preemption turned off.
>>> When running on a low powered CPU at a low CPU frequency, this has
>>> been measured to keep preemption off for ~10 ms. That's not great on a
>>> system with HZ=1000 which expects tasks to be able to schedule in
>>> with ~1ms latency.
>>
>> This will not work in general, because this code runs with BH blocked.
>>
>> jiffies will stay untouched for many more ms on systems with only one CPU.
>>
>> I would rather not rely on jiffies here but ktime_get_ns() [1]
>>
>> Also if we break the loop based on time, we might be unable to purge
>> the last elements in gc_list.
>> We might need to use a second list to make sure to cycle over all
>> elements eventually.
>>
>>
>> [1]
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index df81c1f0a57047e176b7c7e4809d2dae59ba6be5..e2340e6b07735db8cf6e75d23ef09bb4b0db53b4 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>>  {
>>         int max_clean = atomic_read(&tbl->gc_entries) -
>>                         READ_ONCE(tbl->gc_thresh2);
>> +       u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
>>         unsigned long tref = jiffies - 5 * HZ;
>>         struct neighbour *n, *tmp;
>>         int shrunk = 0;
>> +       int loop = 0;
>>
>>         NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>>
>> @@ -279,10 +281,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>>                         if (shrunk >= max_clean)
>>                                 break;
>>                 }
>> +               if (++loop == 16) {
>> +                       if (ktime_get_ns() > tmax)
>> +                               goto unlock;
>> +                       loop = 0;
>> +               }
>>         }
>>
>>         WRITE_ONCE(tbl->last_flush, jiffies);
>>
>> +unlock:
>>         write_unlock_bh(&tbl->lock);
> 
> I'm curious what the plan here is. Your patch looks OK to me and I
> could give it a weak Reviewed-by, but I don't know the code well
> enough to know if we also need to address your second comment that we
> need to "use a second list to make sure to cycle over all elements
> eventually". Is that something you'd expect to get resolved before
> landing?
> 
> Thanks! :-)

Entries are added to the gc_list at the tail, so it should be OK to take
a break. It will pick up at the head on the next trip through.
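
To illustrate the reasoning above, here is a minimal userspace sketch, not
kernel code. It assumes a hypothetical FIFO (struct gc_queue, gc_add_tail(),
gc_run()) in place of tbl->gc_list, treats every entry as removable, and takes
no lock. It reads the clock only every 16 entries, as in Eric's diff, and
stops once the time budget is used up. It compares with >= rather than > so
that a zero budget deterministically stops at the first check. Entries that
were not reached stay at the head, and new ones go on the tail, so the next
call resumes where the previous one stopped.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for a neighbour entry linked on gc_list. */
struct entry {
	int id;
	struct entry *next;
};

/* Hypothetical FIFO: new entries go on the tail, GC scans from the head. */
struct gc_queue {
	struct entry *head;
	struct entry *tail;
	int count;
};

/* Monotonic clock in nanoseconds (userspace analogue of ktime_get_ns()). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Append a new entry at the tail. Returns 0 on success, -1 on allocation failure. */
static int gc_add_tail(struct gc_queue *q, int id)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->id = id;
	e->next = NULL;
	if (q->tail)
		q->tail->next = e;
	else
		q->head = e;
	q->tail = e;
	q->count++;
	return 0;
}

/*
 * Free up to max_clean entries starting at the head. The clock is read
 * once per 16 entries; the loop stops when budget_ns has elapsed.
 * Returns the number of entries freed.
 */
static int gc_run(struct gc_queue *q, int max_clean, uint64_t budget_ns)
{
	uint64_t tmax;
	int shrunk = 0;
	int loop = 0;

	assert(q != NULL);
	tmax = now_ns() + budget_ns;

	while (q->head && shrunk < max_clean) {
		struct entry *e = q->head;

		q->head = e->next;
		if (!q->head)
			q->tail = NULL;
		free(e);
		q->count--;
		shrunk++;

		if (++loop == 16) {
			if (now_ns() >= tmax)
				break;	/* out of time; the rest waits for the next call */
			loop = 0;
		}
	}
	return shrunk;
}
```

With 40 queued entries and a zero budget, the first gc_run() call frees the
16 oldest entries and leaves entry 16 at the head. A later call with a larger
budget continues from there, including anything appended in between.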