Message-Id: <20080306135625.M25627@visp.net.lb>
Date: Thu, 6 Mar 2008 15:57:30 +0200
From: "Denys Fedoryshchenko" <denys@...p.net.lb>
To: Jarek Poplawski <jarkao2@...il.com>
Cc: netdev@...r.kernel.org, jamal <hadi@...erus.ca>
Subject: Re: circular locking, mirred, 2.6.24.2
I am able to reproduce this warning with this relatively simple shell script
on my Gentoo PC (2.6.25-rc3):
http://www.nuclearcat.com/files/bug_feb.txt
It will probably help more experienced developers debug the issue. Note:
the warning does not appear immediately. The second time I tested, it appeared
after a while, but within a matter of seconds.

Note also that it can stop traffic on the PC completely. It also seems to have
crashed my desktop PC: I was not able to execute "tc qdisc del dev eth0 root",
and the system hung completely. I have had a few similar issues on my PPPoE
servers (with different shaper scripts), where the system hangs and even
"reboot -f" sometimes doesn't work.
On Thu, 6 Mar 2008 13:40:15 +0000, Jarek Poplawski wrote
> On Wed, Mar 05, 2008 at 12:45:51PM +0200, Denys Fedoryshchenko wrote:
> > I did test on vanilla 2.6.25-rc3, on clean Gentoo distro and got
> > similar message. The strange thing, message appeared not immediately
> > after launching script, but after few seconds.
> >
> > Scripts is the same. I have same message on another script, used for ppp
> > shaper.
> >
> > [ 10.536424] =======================================================
> > [ 10.536424] [ INFO: possible circular locking dependency detected ]
> > [ 10.536424] 2.6.25-rc3-devel #3
> > [ 10.536424] -------------------------------------------------------
> > [ 10.536424] swapper/0 is trying to acquire lock:
> > [ 10.536424] (&dev->queue_lock){-+..}, at: [<c0299b4a>] dev_queue_xmit+0x175/0x2f3
> > [ 10.536424]
> > [ 10.536424] but task is already holding lock:
> > [ 10.536424] (&p->tcfc_lock){-+..}, at: [<f8a67154>] tcf_mirred+0x20/0x178 [act_mirred]
> > [ 10.536424]
> > [ 10.536424] which lock already depends on the new lock.
> ....
>
> Hi,
>
> I'm not sure this lockdep report is because of this, but there really
> is a problem with lock order when using sch_ingress with act_mirred:
> dev->queue_lock is taken after dev->ingress_lock, i.e. in the reverse
> order to e.g. qdisc_lock_tree(). This shouldn't be a problem, though,
> when one of the devices is ifb.
>
> Regards,
> Jarek P.
>
> Here is a patch for testing:
>
> ---
>
> drivers/net/ifb.c | 19 +++++++++++++++++++
> 1 files changed, 19 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
> index 15949d3..2bc71df 100644
> --- a/drivers/net/ifb.c
> +++ b/drivers/net/ifb.c
> @@ -227,6 +227,22 @@ static struct rtnl_link_ops ifb_link_ops __read_mostly = {
>  module_param(numifbs, int, 0);
>  MODULE_PARM_DESC(numifbs, "Number of ifb devices");
>
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +/*
> + * dev_ifb->queue_lock is usually taken after dev->ingress_lock,
> + * so let's tell lockdep it's different from dev->queue_lock
> + */
> +static struct lock_class_key ifb_queue_lock_key;
> +static inline void ifb_set_lock_class(spinlock_t *lock)
> +{
> +	lockdep_set_class(lock, &ifb_queue_lock_key);
> +}
> +#else
> +static inline void ifb_set_lock_class(spinlock_t *lock)
> +{
> +}
> +#endif /* CONFIG_DEBUG_LOCK_ALLOC */
> +
>  static int __init ifb_init_one(int index)
>  {
>  	struct net_device *dev_ifb;
> @@ -246,6 +262,9 @@ static int __init ifb_init_one(int index)
>  	err = register_netdevice(dev_ifb);
>  	if (err < 0)
>  		goto err;
> +
> +	ifb_set_lock_class(&dev_ifb->queue_lock);
> +
>  	return 0;
>
>  err:
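
For anyone following the thread, here is a rough userspace sketch of the
lock-order reasoning in the quoted message. It is only a toy model, not how
lockdep is implemented: it records "lock class B was taken while holding
class A" edges and refuses an edge that would close a cycle. The enum names
and the functions reset_graph(), reachable() and acquire_while_holding() are
made up for this illustration. The only facts taken from the thread are that
qdisc_lock_tree() takes queue_lock before ingress_lock, that the ingress path
takes them the other way round, and that the patch puts the ifb queue_lock
into its own class.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this sketch; these are not kernel identifiers. */
enum lock_class {
    CLASS_QUEUE_LOCK,     /* dev->queue_lock of an ordinary device */
    CLASS_INGRESS_LOCK,   /* dev->ingress_lock */
    CLASS_IFB_QUEUE_LOCK, /* separate class for the ifb queue_lock, as in the patch */
    NR_CLASSES
};

/* depends[a][b] is true once class b has been taken while holding class a. */
static bool depends[NR_CLASSES][NR_CLASSES];

/* Forget all recorded dependencies. */
static void reset_graph(void)
{
    memset(depends, 0, sizeof(depends));
}

/* Depth-first search: is there a recorded path from 'from' to 'to'? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (depends[from][next] && !seen[next] &&
            reachable((enum lock_class)next, to, seen))
            return true;
    return false;
}

/*
 * Record "acquired was taken while holding held".
 * Returns false, without recording the edge, if 'held' is already
 * reachable from 'acquired', i.e. the new edge would close a cycle.
 */
static bool acquire_while_holding(enum lock_class held, enum lock_class acquired)
{
    bool seen[NR_CLASSES] = { false };

    assert(held < NR_CLASSES && acquired < NR_CLASSES);
    if (reachable(acquired, held, seen))
        return false;
    depends[held][acquired] = true;
    return true;
}
```

In this model, recording queue_lock -> ingress_lock (the qdisc_lock_tree()
order) and then ingress_lock -> queue_lock makes the second call return
false. Recording ingress_lock -> the ifb class instead returns true, because
that class has no recorded path back to ingress_lock.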
--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.