Message-ID: <5df6bc58-df2d-4d11-9447-b3bf06876cdc@uliege.be>
Date: Tue, 15 Apr 2025 13:56:51 +0200
From: Justin Iurman <justin.iurman@...ege.be>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Sebastian Sewior <bigeasy@...utronix.de>,
Stanislav Fomichev <stfomichev@...il.com>,
Network Development <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, Kuniyuki Iwashima <kuniyu@...zon.com>,
bpf <bpf@...r.kernel.org>, Andrea Mayer <andrea.mayer@...roma2.it>
Subject: Re: [PATCH net] net: lwtunnel: disable preemption when required

On 4/15/25 01:13, Alexei Starovoitov wrote:
> On Fri, Apr 11, 2025 at 11:34 AM Justin Iurman <justin.iurman@...ege.be> wrote:
>>
>> On 4/7/25 19:54, Alexei Starovoitov wrote:
>>> On Sun, Apr 6, 2025 at 1:59 AM Justin Iurman <justin.iurman@...ege.be> wrote:
>>>>
>>>> On 4/4/25 16:19, Sebastian Sewior wrote:
>>>>> Alexei, thank you for the Cc.
>>>>>
>>>>> On 2025-04-03 13:35:10 [-0700], Alexei Starovoitov wrote:
>>>>>> Stating the obvious...
>>>>>> Sebastian did a lot of work removing preempt_disable from the networking
>>>>>> stack.
>>>>>> We're certainly not adding them back.
>>>>>> This patch is no go.
>>>>>
>>>>> While looking through the code, it looks as if lwtunnel_xmit() lacks a
>>>>> local_bh_disable().
>>>>
>>>> Thanks Sebastian for the confirmation, as the initial idea was to use
>>>> local_bh_disable() as well. Then I thought preempt_disable() would be
>>>> enough in this context, but I didn't realize you made efforts to remove
>>>> it from the networking stack.
>>>>
>>>> @Alexei, just to clarify: would you ACK this patch if we do
>>>> s/preempt_{disable|enable}()/local_bh_{disable|enable}()/g ?
>>>
>>> You need to think it through and not sprinkle local_bh_disable in
>>> every lwt related function.
>>> Like lwtunnel_input should be running with bh disabled already.
>>
>> Having nested calls to local_bh_{disable|enable}() is fine (i.e.,
>> disabling BHs when they're already disabled), but I guess it's cleaner
>> to avoid it here as you suggest. And since lwtunnel_input() is indeed
>> (always) running with BHs disabled, no changes needed. Thanks for the
>> reminder.
>>
>>> I don't remember the exact conditions where bh is disabled in xmit path.
>>
>> Right. Not sure for lwtunnel_xmit(), but lwtunnel_output() can
>> definitely run with or without BHs disabled. So, what I propose is the
>> following logic (applied to lwtunnel_xmit() too): if BHs disabled then
>> NOP else local_bh_disable(). Thoughts on this new version? (sorry, my
>> mailer messes it up, but you got the idea):
>>
>> diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
>> index e39a459540ec..d44d341683c5 100644
>> --- a/net/core/lwtunnel.c
>> +++ b/net/core/lwtunnel.c
>> @@ -331,8 +331,13 @@ int lwtunnel_output(struct net *net, struct sock
>> *sk, struct sk_buff *skb)
>> const struct lwtunnel_encap_ops *ops;
>> struct lwtunnel_state *lwtstate;
>> struct dst_entry *dst;
>> + bool in_softirq;
>> int ret;
>>
>> + in_softirq = in_softirq();
>> + if (!in_softirq)
>> + local_bh_disable();
>> +
>
> This looks like a hack to me.
>
> Instead analyze the typical xmit path. If bh is not disabled
> then add local_bh_disable(). It's fine if it happens to be nested
> in some cases.

FYI, and based on my previous response, the patch would look like this
in that case (again, my mailer messes long lines up, sorry). I'll let
others comment on which solution/tradeoff seems better.

diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
index e39a459540ec..d0cb0f2f9efe 100644
--- a/net/core/lwtunnel.c
+++ b/net/core/lwtunnel.c
@@ -333,6 +333,8 @@ int lwtunnel_output(struct net *net, struct sock
*sk, struct sk_buff *skb)
 	struct dst_entry *dst;
 	int ret;
 
+	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
 	if (dev_xmit_recursion()) {
 		net_crit_ratelimited("%s(): recursion limit reached on datapath\n",
 				     __func__);
@@ -380,6 +382,8 @@ int lwtunnel_xmit(struct sk_buff *skb)
 	struct dst_entry *dst;
 	int ret;
 
+	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
 	if (dev_xmit_recursion()) {
 		net_crit_ratelimited("%s(): recursion limit reached on datapath\n",
 				     __func__);
@@ -428,6 +432,8 @@ int lwtunnel_input(struct sk_buff *skb)
 	struct dst_entry *dst;
 	int ret;
 
+	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
 	if (dev_xmit_recursion()) {
 		net_crit_ratelimited("%s(): recursion limit reached on datapath\n",
 				     __func__);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 6e18d7ec5062..89bda2f424bb 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -124,10 +124,13 @@ int ip_local_out(struct net *net, struct sock *sk,
struct sk_buff *skb)
 {
 	int err;
 
+	local_bh_disable();
+
 	err = __ip_local_out(net, sk, skb);
 	if (likely(err == 1))
 		err = dst_output(net, sk, skb);
 
+	local_bh_enable();
 	return err;
 }
 EXPORT_SYMBOL_GPL(ip_local_out);
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index 806d4b5dd1e6..bb40196edeb6 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -150,10 +150,13 @@ int ip6_local_out(struct net *net, struct sock
*sk, struct sk_buff *skb)
 {
 	int err;
 
+	local_bh_disable();
+
 	err = __ip6_local_out(net, sk, skb);
 	if (likely(err == 1))
 		err = dst_output(net, sk, skb);
 
+	local_bh_enable();
 	return err;
 }
 EXPORT_SYMBOL_GPL(ip6_local_out);
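
To make the tradeoff easier to reason about outside a kernel tree, here is a
rough user-space C model of the approach in the patch above. It is not kernel
code: every model_* / MODEL_* name is hypothetical, the nesting counter is a
plain global rather than whatever the kernel really tracks, and the model
ignores that re-enabling BHs can run pending softirqs. It only sketches two
points from the thread: nested disable/enable pairs stay balanced, and a
warn-once assertion fires only for a caller that forgot to disable BHs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; assumptions for illustration. */
static unsigned int model_bh_depth;   /* nesting depth of "BH disabled" */
static unsigned int model_warn_count; /* number of warnings emitted */

static void model_local_bh_disable(void)
{
	model_bh_depth++;
}

static void model_local_bh_enable(void)
{
	assert(model_bh_depth > 0); /* an unbalanced enable is a bug */
	model_bh_depth--;
}

/* In this model, "in softirq" simply means the depth is non-zero. */
static bool model_in_softirq(void)
{
	return model_bh_depth > 0;
}

/* Warn at most once per call site, loosely mimicking a WARN_ON_ONCE. */
#define MODEL_WARN_ON_ONCE(cond) do {					\
	static bool warned;						\
	if ((cond) && !warned) {					\
		warned = true;						\
		model_warn_count++;					\
		fprintf(stderr, "warning in %s()\n", __func__);	\
	}								\
} while (0)

/* Stand-in for lwtunnel_output(): expects BHs to be disabled already. */
static int model_lwtunnel_output(void)
{
	MODEL_WARN_ON_ONCE(!model_in_softirq());
	return 0;
}

/* Stand-in for ip_local_out(): disables BHs around the output path. */
static int model_ip_local_out(void)
{
	int err;

	model_local_bh_disable();
	err = model_lwtunnel_output();
	model_local_bh_enable();
	return err;
}

/* Exercises the three cases and returns the total warning count. */
static unsigned int model_run_demo(void)
{
	/* Caller with BHs enabled: the wrapper disables them, no warning. */
	model_ip_local_out();
	assert(model_bh_depth == 0);

	/* Caller with BHs already disabled: depth goes 1 -> 2 -> 1. */
	model_local_bh_disable();
	model_ip_local_out();
	assert(model_bh_depth == 1);
	model_local_bh_enable();

	/* Direct calls without the wrapper: only the first one warns. */
	model_lwtunnel_output();
	model_lwtunnel_output();

	return model_warn_count;
}
```

Calling model_run_demo() from a main() should report a single warning, from
the first unprotected call. The nested case costs one extra increment and
decrement, which is the "fine if it happens to be nested" tradeoff.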