lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5df6bc58-df2d-4d11-9447-b3bf06876cdc@uliege.be>
Date: Tue, 15 Apr 2025 13:56:51 +0200
From: Justin Iurman <justin.iurman@...ege.be>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Sebastian Sewior <bigeasy@...utronix.de>,
 Stanislav Fomichev <stfomichev@...il.com>,
 Network Development <netdev@...r.kernel.org>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Simon Horman <horms@...nel.org>, Kuniyuki Iwashima <kuniyu@...zon.com>,
 bpf <bpf@...r.kernel.org>, Andrea Mayer <andrea.mayer@...roma2.it>
Subject: Re: [PATCH net] net: lwtunnel: disable preemption when required

On 4/15/25 01:13, Alexei Starovoitov wrote:
> On Fri, Apr 11, 2025 at 11:34 AM Justin Iurman <justin.iurman@...ege.be> wrote:
>>
>> On 4/7/25 19:54, Alexei Starovoitov wrote:
>>> On Sun, Apr 6, 2025 at 1:59 AM Justin Iurman <justin.iurman@...ege.be> wrote:
>>>>
>>>> On 4/4/25 16:19, Sebastian Sewior wrote:
>>>>> Alexei, thank you for the Cc.
>>>>>
>>>>> On 2025-04-03 13:35:10 [-0700], Alexei Starovoitov wrote:
>>>>>> Stating the obvious...
>>>>>> Sebastian did a lot of work removing preempt_disable from the networking
>>>>>> stack.
>>>>>> We're certainly not adding them back.
>>>>>> This patch is no go.
>>>>>
>>>>> While looking through the code, it looks as if lwtunnel_xmit() lacks a
>>>>> local_bh_disable().
>>>>
>>>> Thanks Sebastian for the confirmation, as the initial idea was to use
>>>> local_bh_disable() as well. Then I thought preempt_disable() would be
>>>> enough in this context, but I didn't realize you made efforts to remove
>>>> it from the networking stack.
>>>>
>>>> @Alexei, just to clarify: would you ACK this patch if we do
>>>> s/preempt_{disable|enable}()/local_bh_{disable|enable}()/g ?
>>>
>>> You need to think it through and not sprinkle local_bh_disable in
>>> every lwt related function.
>>> Like lwtunnel_input should be running with bh disabled already.
>>
>> Having nested calls to local_bh_{disable|enable}() is fine (i.e.,
>> disabling BHs when they're already disabled), but I guess it's cleaner
>> to avoid it here as you suggest. And since lwtunnel_input() is indeed
>> (always) running with BHs disabled, no changes needed. Thanks for the
>> reminder.
>>
>>> I don't remember the exact conditions where bh is disabled in xmit path.
>>
>> Right. Not sure for lwtunnel_xmit(), but lwtunnel_output() can
>> definitely run with or without BHs disabled. So, what I propose is the
>> following logic (applied to lwtunnel_xmit() too): if BHs disabled then
>> NOP else local_bh_disable(). Thoughts on this new version? (sorry, my
>> mailer messes it up, but you got the idea):
>>
>> diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
>> index e39a459540ec..d44d341683c5 100644
>> --- a/net/core/lwtunnel.c
>> +++ b/net/core/lwtunnel.c
>> @@ -331,8 +331,13 @@ int lwtunnel_output(struct net *net, struct sock
>> *sk, struct sk_buff *skb)
>>          const struct lwtunnel_encap_ops *ops;
>>          struct lwtunnel_state *lwtstate;
>>          struct dst_entry *dst;
>> +       bool in_softirq;
>>          int ret;
>>
>> +       in_softirq = in_softirq();
>> +       if (!in_softirq)
>> +               local_bh_disable();
>> +
> 
> This looks like a hack to me.
> 
> Instead analyze the typical xmit path. If bh is not disabled
> then add local_bh_disable(). It's fine if it happens to be nested
> in some cases.

FYI, and based on my previous response, the patch would look like this 
in that case (again, my mailer messes long lines up, sorry). I'll let 
others comment on which solution/tradeoff seems better.

diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
index e39a459540ec..d0cb0f2f9efe 100644
--- a/net/core/lwtunnel.c
+++ b/net/core/lwtunnel.c
@@ -333,6 +333,8 @@ int lwtunnel_output(struct net *net, struct sock 
*sk, struct sk_buff *skb)
  	struct dst_entry *dst;
  	int ret;

+	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
  	if (dev_xmit_recursion()) {
  		net_crit_ratelimited("%s(): recursion limit reached on datapath\n",
  				     __func__);
@@ -380,6 +382,8 @@ int lwtunnel_xmit(struct sk_buff *skb)
  	struct dst_entry *dst;
  	int ret;

+	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
  	if (dev_xmit_recursion()) {
  		net_crit_ratelimited("%s(): recursion limit reached on datapath\n",
  				     __func__);
@@ -428,6 +432,8 @@ int lwtunnel_input(struct sk_buff *skb)
  	struct dst_entry *dst;
  	int ret;

+	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
  	if (dev_xmit_recursion()) {
  		net_crit_ratelimited("%s(): recursion limit reached on datapath\n",
  				     __func__);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 6e18d7ec5062..89bda2f424bb 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -124,10 +124,13 @@ int ip_local_out(struct net *net, struct sock *sk, 
struct sk_buff *skb)
  {
  	int err;

+	local_bh_disable();
+
  	err = __ip_local_out(net, sk, skb);
  	if (likely(err == 1))
  		err = dst_output(net, sk, skb);

+	local_bh_enable();
  	return err;
  }
  EXPORT_SYMBOL_GPL(ip_local_out);
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index 806d4b5dd1e6..bb40196edeb6 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -150,10 +150,13 @@ int ip6_local_out(struct net *net, struct sock 
*sk, struct sk_buff *skb)
  {
  	int err;

+	local_bh_disable();
+
  	err = __ip6_local_out(net, sk, skb);
  	if (likely(err == 1))
  		err = dst_output(net, sk, skb);

+	local_bh_enable();
  	return err;
  }
  EXPORT_SYMBOL_GPL(ip6_local_out);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ