lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20070626160753.f86d8803.akpm@linux-foundation.org>
Date:	Tue, 26 Jun 2007 16:07:53 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Jarek Poplawski <jarkao2@...pl>
Cc:	Folkert van Heusden <folkert@...heusden.com>,
	linux-kernel@...r.kernel.org,
	Jason Wessel <jason.wessel@...driver.com>,
	Thomas Gleixner <tglx@...utronix.de>, stable@...nel.org,
	netdev@...r.kernel.org
Subject: Re: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole
 module

On Wed, 13 Jun 2007 11:25:37 +0200
Jarek Poplawski <jarkao2@...pl> wrote:

> On Tue, Jun 12, 2007 at 01:02:33PM +0200, Jarek Poplawski wrote:
> ...
> > Of course such a problem should preferably be fixed by somebody who
> > knows the code (alas I don't know netconsole), to be sure all needed
> > cancels are still done after this change. I hope Jason's patch is
> > right but I'm a little surprised I can't see netdev in cc (I'll try
> > to fix this).
> 
> So, I've had a look into netpoll and, unfortunately, I don't
> think this patch is right... 
> 
> > > From: Jason Wessel <jason.wessel@...driver.com>
> > > 
> > > Do not call cancel_rearming_delayed_work() if there is no
> > > pending work.
> > > 
> > > Signed-off-by: Jason Wessel <jason.wessel@...driver.com>
> > > Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> > > ---
> > > 
> > >  net/core/netpoll.c |    6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff -puN net/core/netpoll.c~a net/core/netpoll.c
> > > --- a/net/core/netpoll.c~a
> > > +++ a/net/core/netpoll.c
> > > @@ -784,8 +784,10 @@ void netpoll_cleanup(struct netpoll *np)
> > >  			if (atomic_dec_and_test(&npinfo->refcnt)) {
> > >  				skb_queue_purge(&npinfo->arp_tx);
> > >  				skb_queue_purge(&npinfo->txq);
> > > -				cancel_rearming_delayed_work(&npinfo->tx_work);
> > > -				flush_scheduled_work();
> > > +				if (delayed_work_pending(&npinfo->tx_work)) {
> > > +					cancel_rearming_delayed_work(&npinfo->tx_work);
> > > +					flush_scheduled_work();
> > > +				}
> > >  
> > >  				kfree(npinfo);
> > >  			}
> > > _
> 
> There are such possibilities:
> 
> 1. After positive delayed_work_pending(&npinfo->tx_work) test
> some work is queued, but there is no guarantee that when running
> it'll rearm again, so cancel_rearming_delayed_work can loop again;
> 
> 2. After negative delayed_work_pending(&npinfo->tx_work) test
> a work is just running, eg. waiting on netif_tx_lock, while
> kfree(npinfo) is done here (oops?!).
> 
> I've found an additional problem here with or without this patch:
> after deleting a timer in cancel_rearming_delayed_work() there could
> stay a last skb queued in npinfo->txq, and after kfree(npinfo)
> we have small memory leak. If I'm right here similar fix is needed
> in the current netpoll code: additional npinfo->txq purging only
> or maybe the whole cancel_rearming_ changed like this.
> 
> I've tried to eliminate these problems in attached below patch
> proposal. I'm not sure it's all right: as I've written earlier I
> don't know netconsole enough, but it's probably a little better
> than above solution.
> 
> I've some doubts yet (I didn't have time to check this all):
> 
> 1. I hope this other schedule_delayed_work() from netpoll_send_skb()
> is not possible when netpoll_cleanup() runs - if I'm wrong additional
> check of npinfo->refcnt should be done there;
> 2. I also hope npinfo->refcnt before scheduling should be enough here
> - if not - another possibility is adding some locking eg.:
> netif_tx_lock before cancel for synchronization.
> 
> Of course it would be very nice if somebody could test or verify
> this patch more.
> 
> Regards,
> Jarek P.
> 
> 
> Signed-off-by: Jarek Poplawski <jarkao2@...pl>
> 
> ---
> 
> diff -Nurp 2.6.21-/net/core/netpoll.c 2.6.21/net/core/netpoll.c
> --- 2.6.21-/net/core/netpoll.c	2007-04-26 15:08:32.000000000 +0200
> +++ 2.6.21/net/core/netpoll.c	2007-06-12 21:05:23.000000000 +0200
> @@ -73,7 +73,8 @@ static void queue_process(struct work_st
>  			netif_tx_unlock(dev);
>  			local_irq_restore(flags);
>  
> -			schedule_delayed_work(&npinfo->tx_work, HZ/10);
> +			if (atomic_read(&npinfo->refcnt))
> +				schedule_delayed_work(&npinfo->tx_work, HZ/10);
>  			return;
>  		}
>  		netif_tx_unlock(dev);
> @@ -780,9 +781,15 @@ void netpoll_cleanup(struct netpoll *np)
>  			if (atomic_dec_and_test(&npinfo->refcnt)) {
>  				skb_queue_purge(&npinfo->arp_tx);
>  				skb_queue_purge(&npinfo->txq);
> -				cancel_rearming_delayed_work(&npinfo->tx_work);
> +				cancel_delayed_work(&npinfo->tx_work);
>  				flush_scheduled_work();
>  
> +				/* clean after last, unfinished work */
> +				if (!skb_queue_empty(&npinfo->txq)) {
> +					struct sk_buff *skb;
> +					skb = __skb_dequeue(&npinfo->txq);
> +					kfree_skb(skb);
> +				}
>  				kfree(npinfo);
>  			}
>  		}

Everything went quiet?

If this patch has been tested and fixes the bug, can you please send a
version which is ready for merging?  (ie: add a suitable description of
what it does).


Thanks.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ