netdev - Re: [PATCH net] net: fix stack overflow when LRO is disabled for virtual interfaces

Message-ID: <eeff656b-22ac-082d-9b94-62980e806f0f@blackwall.org>
Date: Mon, 15 May 2023 09:24:14 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Taehee Yoo <ap420073@...il.com>, davem@...emloft.net, kuba@...nel.org,
 pabeni@...hat.com, edumazet@...gle.com, jiri@...nulli.us,
 j.vosburgh@...il.com, andy@...yhouse.net, netdev@...r.kernel.org
Cc: jarod@...hat.com, wangyufen@...wei.com,
 syzbot+60748c96cf5c6df8e581@...kaller.appspotmail.com
Subject: Re: [PATCH net] net: fix stack overflow when LRO is disabled for
 virtual interfaces

On 15/05/2023 08:37, Taehee Yoo wrote:
> When a virtual interface's features are updated, the update is
> synchronized to its lower interfaces.
> This propagation logic should work iteratively, not recursively.
> But it unexpectedly works recursively because of the netdev notification.
> This problem occurs when LRO is disabled, and only for the team and
> bonding interface types.
> 
>        team0
>          |
>   +------+------+-----+-----+
>   |      |      |     |     |
> team1  team2  team3  ...  team200
> 
> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> event for its lower interfaces (team1 ~ team200).
> This is done by netdev_sync_lower_features().
> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> works iteratively.
> But the generated NETDEV_FEAT_CHANGE event is also sent to the upper
> interface.
> The upper interface (team0) then generates the NETDEV_FEAT_CHANGE event
> for its lower interfaces again.
> The lower and upper interfaces receive this event and generate it
> again and again.
> So, the stack overflows.
> 
> But this is not an infinite loop,
> because netdev_sync_lower_features() updates features before
> generating the NETDEV_FEAT_CHANGE event.
> Already synchronized lower interfaces skip the notification logic.
> So, the only problem is that the iteration is unexpectedly turned into
> recursion by the notification mechanism.
> 
> Reproducer:
> 
> ip link add team0 type team
> ethtool -K team0 lro on
> for i in {1..200}
> do
>         ip link add team$i master team0 type team
>         ethtool -K team$i lro on
> done
> 
> ethtool -K team0 lro off
> 
> To fix it, the priv_notifier_ctx net_device member is introduced.
> Each interface can use this variable in its own way in the
> notification context. The bonding and team interfaces are going to use it
> to avoid duplicated NETDEV_FEAT_CHANGE event handling.
> 
> Reported-by: syzbot+60748c96cf5c6df8e581@...kaller.appspotmail.com
> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
> Signed-off-by: Taehee Yoo <ap420073@...il.com>
> ---
>  drivers/net/bonding/bond_main.c | 6 +++++-
>  drivers/net/team/team.c         | 6 +++++-
>  include/linux/netdevice.h       | 1 +
>  net/core/dev.c                  | 2 ++
>  4 files changed, 13 insertions(+), 2 deletions(-)
> 

Since you're syncing to lower devices, can't you check in the affected drivers
whether the event source device is a lower device of the current one (i.e. reverse
propagation has happened)? Adding a new struct net_device member just for this
seems unnecessary to me, especially since a bond of bonds or a team of teams is a
corner-case setup that shouldn't exist in general. :)
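
To illustrate the check, here is a toy user-space model, not kernel code.
Every toy_* name is hypothetical and only the LRO bit is modelled. It also
assumes the upper device can simply ignore events coming from its own lowers,
which a real driver may not be able to do if it needs those events to
recompute its own features.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LOWERS 256

/* Toy stand-in for a net device; not the kernel's struct net_device. */
struct toy_dev {
	bool lro;
	struct toy_dev *upper;   /* NULL for the top device */
	struct toy_dev *lowers;  /* contiguous array of lower devices */
	int n_lowers;
};

static bool toy_guard;               /* drop events that come from a lower */
static int toy_depth, toy_max_depth; /* nesting depth of toy_sync_lower() */

static void toy_feat_change_event(struct toy_dev *listener,
				  struct toy_dev *src);

/* Models the sync step: push "LRO off" down to every lower device. */
static void toy_sync_lower(struct toy_dev *upper)
{
	if (++toy_depth > toy_max_depth)
		toy_max_depth = toy_depth;

	for (int i = 0; i < upper->n_lowers; i++) {
		struct toy_dev *lower = &upper->lowers[i];

		if (!upper->lro && lower->lro) {
			lower->lro = false;  /* update first ... */
			if (lower->upper)    /* ... then notify the upper */
				toy_feat_change_event(lower->upper, lower);
		}
	}
	toy_depth--;
}

/* Models the upper device's handler for a feature-change event. */
static void toy_feat_change_event(struct toy_dev *listener,
				  struct toy_dev *src)
{
	/* The suggested check: the event came back up from one of our lowers. */
	if (toy_guard && src->upper == listener)
		return;
	toy_sync_lower(listener);
}

/* Builds one top device with n_lowers lowers, turns LRO off on the top
 * device and returns the deepest nesting of toy_sync_lower() observed. */
int toy_simulate(int n_lowers, bool guard)
{
	static struct toy_dev top, lowers[TOY_MAX_LOWERS];

	assert(n_lowers >= 0 && n_lowers <= TOY_MAX_LOWERS);

	top = (struct toy_dev){ .lro = true, .upper = NULL,
				.lowers = lowers, .n_lowers = n_lowers };
	for (int i = 0; i < n_lowers; i++)
		lowers[i] = (struct toy_dev){ .lro = true, .upper = &top };

	toy_guard = guard;
	toy_depth = toy_max_depth = 0;

	top.lro = false;  /* corresponds to "ethtool -K team0 lro off" */
	toy_sync_lower(&top);
	return toy_max_depth;
}
```

In this model toy_simulate(200, false) returns 201: each nested call handles
one more lower before recursing again, so the depth grows with the number of
lowers but still terminates. toy_simulate(200, true) returns 1, because the
single outer loop handles all lowers.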

Cheers,
 Nik