netdev - Re: [PATCH net v2] net: team: get rid of team->lock in team module


Message-ID: <9ac76b6a-d490-f633-ba90-f0851f5a3b6f@gmail.com>
Date: Mon, 18 Sep 2023 16:42:12 +0900
From: Taehee Yoo <ap420073@...il.com>
To: Jiri Pirko <jiri@...nulli.us>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
 edumazet@...gle.com, netdev@...r.kernel.org,
 syzbot+9bbbacfbf1e04d5221f7@...kaller.appspotmail.com,
 syzbot+1c71587a1a09de7fbde3@...kaller.appspotmail.com
Subject: Re: [PATCH net v2] net: team: get rid of team->lock in team module



On 2023. 9. 18. 4:19 PM, Jiri Pirko wrote:
 > Mon, Sep 18, 2023 at 03:16:26AM CEST, ap420073@...il.com wrote:
 >>
 >>
 >> On 2023. 9. 17. 1:47 AM, Jiri Pirko wrote:
 >>
 >> Hi Jiri,
 >> Thank you so much for your review!
 >>
 >>> Sat, Sep 16, 2023 at 03:11:15PM CEST, ap420073@...il.com wrote:
 >>>> The purpose of team->lock is to protect the private data of the team
 >>>> interface. But RTNL already protects it all well.
 >>>> The precise purpose of the team->lock is to reduce contention of
 >>>> RTNL due to GENL operations such as getting the team port list, and
 >>>> configuration dump.
 >>>>
 >>>> team interface has used a dynamic lockdep key to avoid false-positive
 >>>> lockdep deadlock detection. Virtual interfaces such as team usually
 >>>> have their own lock for protecting private data.
 >>>> These interfaces can be nested.
 >>>> team0
 >>>>    |
 >>>> team1
 >>>>
 >>>> Each interface's lock is actually different (team0->lock and team1->lock).
 >>>> So,
 >>>> mutex_lock(&team0->lock);
 >>>> mutex_lock(&team1->lock);
 >>>> mutex_unlock(&team1->lock);
 >>>> mutex_unlock(&team0->lock);
 >>>> The above case is absolutely safe, but lockdep warns about a deadlock
 >>>> because it treats these two locks as the same lock class. This is a
 >>>> false-positive lockdep warning.
 >>>>
 >>>> So, in order to avoid this problem, the team interfaces started to use
 >>>> a dynamic lockdep key. The false-positive problem was fixed, but it
 >>>> introduced a new problem.
 >>>>
 >>>> When a new team virtual interface is created, it registers a dynamic
 >>>> lockdep key and uses it. But the number of lockdep keys is limited.
 >>>> So, if many team interfaces are created, they consume all lockdep keys.
 >>>> Then lockdep stops working and warns about it.
 >>>
 >>> What about fixing the lockdep instead? I bet this is not the only
 >>> occurrence of this problem.
 >>
 >> There were many similar patches for fixing lockdep false-positive problems.
 >> But I didn't consider fixing lockdep because I thought the limit on
 >> lockdep keys was normal.
 >> So, I still think that lockdep stopping when the lockdep keys are exhausted
 >> is not a problem of lockdep itself.
 >
 > Lockdep is a diagnostic tool. The fact the tool is not working properly
 > does not mean we need to change the code the tool is working with. Fix
 > the tool.

I agree with you.
Fixing this on the lockdep side looks like the more correct approach.
I will look for a way to fix this problem on the lockdep side.

Thank you so much!
Taehee Yoo

 >
 >
 >>
 >>>
 >>>
 >>>>
 >>>> So, in order to fix this issue, this patch just removes team->lock and uses
 >>>> RTNL instead.
 >>>>
 >>>> The previous approach to fix this issue was to use the subclass lockdep
 >>>> key instead of the dynamic lockdep key. It requires RTNL before acquiring
 >>>> a nested lock because the subclass variable (dev->nested_lock) is
 >>>> protected by RTNL.
 >>>> However, the coverage of team->lock is too wide, so sometimes it would
 >>>> have to use the subclass variable before it is initialized.
 >>>> So, it can't work well in the port initialization and unregister logic.
 >>>>
 >>>> This approach simply removes team->lock entirely.
 >>>> So there is no special locking scenario in the team module.
 >>>> Also, RTNL could later be converted to RCU for read-mostly operations
 >>>> such as GENL dump, but that is not yet adopted.
 >>>>
 >>>> Reproducer:
 >>>>     for i in {0..1000}
 >>>>     do
 >>>>             ip link add team$i type team
 >>>>             ip link add dummy$i master team$i type dummy
 >>>>             ip link set dummy$i up
 >>>>             ip link set team$i up
 >>>>     done
 >>>>
 >>
 >> Thanks a lot!
 >> Taehee Yoo