netdev - Re: [RFC PATCH] e100: Fix workqueue race

Date:	Fri, 22 Jan 2010 09:38:34 +0000
From:	Jarek Poplawski <jarkao2@...il.com>
To:	Alan Cox <alan@...ux.intel.com>
Cc:	jesse.brandeburg@...el.com, netdev@...r.kernel.org
Subject: Re: [RFC PATCH] e100: Fix workqueue race

On Fri, Jan 22, 2010 at 09:07:31AM +0000, Jarek Poplawski wrote:
> On Fri, Jan 22, 2010 at 08:42:00AM +0000, Jarek Poplawski wrote:
> > On 21-01-2010 17:48, Alan Cox wrote:
> > > (Incidentally this doesn't seem to be the only net driver that looks
> > > suspect here)
> > > 
> > > e100: Fix the TX workqueue race
> > > 
> > > From: Alan Cox <alan@...ux.intel.com>
> > > 
> > > Nothing stops the workqueue being left to run in parallel with close or a
> > > few other operations. This causes double unmaps and the like.
> > > 
> > > See kerneloops.org #1041230 for an example
> > > 
> > > Signed-off-by: Alan Cox <alan@...ux.intel.com>
> > > ---
> > > 
> > >  drivers/net/e100.c |   13 +++++++++++--
> > >  1 files changed, 11 insertions(+), 2 deletions(-)
> > > 
> > > 
> > > diff --git a/drivers/net/e100.c b/drivers/net/e100.c
> > > index 5c7a155..5e02e4f 100644
> > > --- a/drivers/net/e100.c
> > > +++ b/drivers/net/e100.c
> > > @@ -2232,7 +2232,7 @@ err_rx_clean_list:
> > >  	return err;
> > >  }
> > >  
> > > -static void e100_down(struct nic *nic)
> > > +static void e100_do_down(struct nic *nic)
> > >  {
> > >  	/* wait here for poll to complete */
> > >  	napi_disable(&nic->napi);
> > > @@ -2245,6 +2245,15 @@ static void e100_down(struct nic *nic)
> > >  	e100_rx_clean_list(nic);
> > >  }
> > >  
> > > +/* For the non TX timeout case we want to kill the tx timeout before
> > > +   we do this otherwise a parallel tx timeout will make a nasty mess. */
> > > +
> > > +static void e100_down(struct nic *nic)
> > > +{
> > > +	cancel_work_sync(&nic->tx_timeout_task);
> > 
> > Can't tx_timeout_task be triggered just between these two calls here?
> 
> More exactly: except when this is called from dev_close(), where it
> should work OK. (At least as long as tx_timeout_task doesn't take any
> lock held here - especially rtnl_lock.)

Hmm... Even more exactly: since tx_timeout_task can be triggered by
more than just dev_watchdog(), the dev_close() path is suspect too.
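
To make the window concrete, here is a rough userspace sketch, not driver
code. It replays one fixed interleaving in a single thread: cancel the
work, let a late trigger requeue it, tear down, then let the worker run.
Every name in it (model_nic, model_schedule, going_down, ...) is
hypothetical, and the going_down guard is only an assumption about one
possible way to close the window; it is not what the patch does.

```c
#include <stdbool.h>

/* Toy single-threaded model; none of these names are kernel APIs. */
struct model_nic {
	bool work_pending;	/* tx_timeout_task queued but not yet run */
	bool mapped;		/* stands in for the DMA mappings */
	bool going_down;	/* hypothetical guard flag, not in the patch */
	int double_unmaps;	/* teardown attempts on already-freed state */
};

static void model_init(struct model_nic *nic)
{
	nic->work_pending = false;
	nic->mapped = true;
	nic->going_down = false;
	nic->double_unmaps = 0;
}

/* Stand-in for schedule_work(); refuses only if the guard is honoured. */
static bool model_schedule(struct model_nic *nic, bool use_guard)
{
	if (use_guard && nic->going_down)
		return false;
	nic->work_pending = true;
	return true;
}

/* Stand-in for the unmap done by both the close path and the work item. */
static void model_unmap(struct model_nic *nic)
{
	if (!nic->mapped)
		nic->double_unmaps++;
	nic->mapped = false;
}

/* Replays one fixed interleaving and returns the double-unmap count. */
static int model_close(struct model_nic *nic, bool use_guard,
		       bool trigger_in_window)
{
	if (use_guard)
		nic->going_down = true;
	nic->work_pending = false;		/* cancel_work_sync() stand-in */
	if (trigger_in_window)
		model_schedule(nic, use_guard);	/* late trigger in the window */
	model_unmap(nic);			/* e100_do_down() stand-in */
	if (nic->work_pending) {		/* worker finally runs */
		nic->work_pending = false;
		model_unmap(nic);
	}
	return nic->double_unmaps;
}
```

Without the guard, a trigger landing between the cancel and the teardown
leaves the work queued, so the unmap runs twice. With the guard set
before the cancel, the late trigger is refused and the unmap runs once.
The model says nothing about the rtnl_lock question, which is a separate
problem.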

Jarek P.