Message-ID: <CAMArcTVy1vf38ktQY4e_V7ZnCq+pDf49jFHYGnZHSEy1zjinkg@mail.gmail.com>
Date: Mon, 13 Jan 2020 19:13:43 +0900
From: Taehee Yoo <ap420073@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: syzbot <syzbot+4ec99438ed7450da6272@...kaller.appspotmail.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: WARNING: bad unlock balance in sch_direct_xmit
On Sun, 12 Jan 2020 at 06:53, Cong Wang <xiyou.wangcong@...il.com> wrote:
>
> On Thu, Jan 9, 2020 at 10:02 PM Taehee Yoo <ap420073@...il.com> wrote:
> > ndo_get_lock_subclass() was used to calculate subclass which was used by
> > netif_addr_lock_nested().
> >
> > -static inline void netif_addr_lock_nested(struct net_device *dev)
> > -{
> > -	int subclass = SINGLE_DEPTH_NESTING;
> > -
> > -	if (dev->netdev_ops->ndo_get_lock_subclass)
> > -		subclass = dev->netdev_ops->ndo_get_lock_subclass(dev);
> > -
> > -	spin_lock_nested(&dev->addr_list_lock, subclass);
> > -}
> >
> > The most important thing about a nested lock is getting the correct
> > subclass. nest_level was used as the subclass, and it was calculated by
> > ->ndo_get_lock_subclass().
> > But ->ndo_get_lock_subclass() didn't calculate the correct subclass.
> > After "master" and "nomaster" operations, nest_level should be updated
> > recursively, but it wasn't. So an incorrect subclass was used.
> >
> > team3 <-- subclass 0
> >
> > "ip link set team3 master team2"
> >
> > team2 <-- subclass 0
> > team3 <-- subclass 1
> >
> > "ip link set team2 master team1"
> >
> > team1 <-- subclass 0
> > team2 <-- subclass 1
> > team3 <-- subclass 1
> >
> > "ip link set team1 master team0"
> >
> > team0 <-- subclass 0
> > team1 <-- subclass 1
> > team2 <-- subclass 1
> > team3 <-- subclass 1
> >
> > After a "master" or "nomaster" operation, the subclass values of all
> > lower or upper interfaces would change. But because
> > ->ndo_get_lock_subclass() didn't update the subclass recursively, a
> > lockdep warning appeared.
> > I had two ways to fix this:
> > 1. use dynamic keys instead of static keys.
> > 2. fix ndo_get_lock_subclass().
> >
> > I chose dynamic keys over fixing ->ndo_get_lock_subclass() because
> > ->ndo_get_lock_subclass() isn't a common helper function, so every
> > driver writer has to implement it.
> > If we use dynamic keys, the ->ndo_get_lock_subclass() code can be removed.
> >
>
> The details you provide here are really helpful for me to understand
> the reasons behind your changes. Let me think about this and see how
> I could address both problems. This appears to be harder than I originally
> thought.
>
> >
> > The problems I fixed with dynamic lockdep keys could be fixed with
> > nested locks too. I think that if the subclass value synchronization
> > routine works well, there will be no problem.
>
> Great! We are on the same page.
>
> Thanks for all the information and the reproducer too!
I'm really glad my explanation helped!
Thank you so much!
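
For reference, the "subclass value synchronization routine" mentioned in the
quoted text could look roughly like the userspace sketch below. It is only an
illustration, not kernel code: struct dev, sync_subclass(), set_master() and
set_nomaster() are hypothetical names invented here, the fixed-size lower[]
array stands in for the real adjacency lists, and nothing guards against
loops in the device stack. The point is just that every "master" or
"nomaster" change walks all lower devices and re-derives their subclass.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOWER 4

/* Hypothetical stand-in for struct net_device; not a kernel type. */
struct dev {
	const char *name;
	struct dev *upper;            /* master device, NULL if none */
	struct dev *lower[MAX_LOWER]; /* devices enslaved to this one */
	int nr_lower;
	int subclass;                 /* nesting depth below the top master */
};

/* Assign subclass to dev, then walk every lower device recursively. */
static void sync_subclass(struct dev *dev, int subclass)
{
	assert(dev != NULL);
	dev->subclass = subclass;
	for (int i = 0; i < dev->nr_lower; i++)
		sync_subclass(dev->lower[i], subclass + 1);
}

/* Models "ip link set <lower> master <upper>". Returns 0 on success. */
static int set_master(struct dev *lower, struct dev *upper)
{
	if (lower->upper != NULL || upper->nr_lower == MAX_LOWER)
		return -1;
	lower->upper = upper;
	upper->lower[upper->nr_lower++] = lower;
	/* The whole subtree under "lower" moves one or more levels down. */
	sync_subclass(lower, upper->subclass + 1);
	return 0;
}

/* Models "ip link set <lower> nomaster". Returns 0 on success. */
static int set_nomaster(struct dev *lower)
{
	struct dev *upper = lower->upper;

	if (upper == NULL)
		return -1;
	for (int i = 0; i < upper->nr_lower; i++) {
		if (upper->lower[i] == lower) {
			/* Fill the hole with the last entry. */
			upper->lower[i] = upper->lower[--upper->nr_lower];
			break;
		}
	}
	lower->upper = NULL;
	/* "lower" is a top-level device again; its subtree moves up. */
	sync_subclass(lower, 0);
	return 0;
}
```

Replaying the three "ip link set ... master ..." commands from the quoted
example through set_master() leaves team0 at subclass 0, team1 at 1, team2
at 2 and team3 at 3. A real implementation would also have to keep the value
below lockdep's MAX_LOCKDEP_SUBCLASSES limit.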