linux-kernel - Re: [50/98] [SCSI] libfc: fix free of fc_rport

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100127020306.GA12276@suse.de>
Date:	Tue, 26 Jan 2010 18:03:06 -0800
From:	Greg KH <gregkh@...e.de>
To:	Joe Eykholt <jeykholt@...co.com>
Cc:	linux-kernel@...r.kernel.org, stable@...nel.org,
	stable-review@...nel.org, torvalds@...ux-foundation.org,
	akpm@...ux-foundation.org, alan@...rguk.ukuu.org.uk,
	Robert Love <robert.w.love@...el.com>,
	James Bottomley <James.Bottomley@...e.de>
Subject: Re: [50/98] [SCSI] libfc: fix free of fc_rport_priv with timer
 pending

On Tue, Jan 26, 2010 at 04:03:13PM -0800, Joe Eykholt wrote:
> Greg KH wrote:
> > 2.6.32-stable review patch.  If anyone has any objections, please let us know.
> > 
> > ------------------
> > 
> > From: Joe Eykholt <jeykholt@...co.com>
> > 
> > commit b4a9c7ede96e90f7b1ec009ce7256059295e76df upstream.
> > 
> > Timer crashes were caused by freeing a struct fc_rport_priv
> > with a timer pending, causing the timer facility list to be
> > corrupted.  This was during FC uplink flap tests with a lot
> > of targets.
> > 
> > After discovery, we were doing an PLOGI on an rdata that was
> > in DELETE state but not yet removed from the lookup list.
> > This moved the rdata from DELETE state to PLOGI state.
> > If the PLOGI exchange allocation failed and needed to be
> > retried, the timer scheduling could race with the free
> > being done by fc_rport_work().
> > 
> > When fc_rport_login() is called on a rport in DELETE state,
> > move it to a new state RESTART.  In fc_rport_work, when
> > handling a LOGO, STOPPED or FAILED event, look for restart
> > state.  In the RESTART case, don't take the rdata off the
> > list and after the transport remote port is deleted and
> > exchanges are reset, re-login to the remote port.
> > 
> > Note that the new RESTART state also corrects a problem we
> > had when re-discovering a port that had moved to DELETE state.
> > In that case, a new rdata was created, but the old rdata
> > would do an exchange manager reset affecting the FC_ID
> > for both the new rdata and old rdata.  With the new state,
> > the new port isn't logged into until after any old exchanges
> > are reset.
> > 
> > Signed-off-by: Joe Eykholt <jeykholt@...co.com>
> > Signed-off-by: Robert Love <robert.w.love@...el.com>
> > Signed-off-by: James Bottomley <James.Bottomley@...e.de>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@...e.de>
> > 
> > ---
> >  drivers/scsi/libfc/fc_rport.c |   69 ++++++++++++++++++++++++++++++------------
> >  include/scsi/libfc.h          |    1 
> >  2 files changed, 51 insertions(+), 19 deletions(-)
> > 
> > --- a/drivers/scsi/libfc/fc_rport.c
> > +++ b/drivers/scsi/libfc/fc_rport.c
> > @@ -86,6 +86,7 @@ static const char *fc_rport_state_names[
> >  	[RPORT_ST_LOGO] = "LOGO",
> >  	[RPORT_ST_ADISC] = "ADISC",
> >  	[RPORT_ST_DELETE] = "Delete",
> > +	[RPORT_ST_RESTART] = "Restart",
> >  };
> >  
> >  /**
> > @@ -99,8 +100,7 @@ static struct fc_rport_priv *fc_rport_lo
> >  	struct fc_rport_priv *rdata;
> >  
> >  	list_for_each_entry(rdata, &lport->disc.rports, peers)
> > -		if (rdata->ids.port_id == port_id &&
> > -		    rdata->rp_state != RPORT_ST_DELETE)
> > +		if (rdata->ids.port_id == port_id)
> >  			return rdata;
> >  	return NULL;
> >  }
> > @@ -235,6 +235,7 @@ static void fc_rport_work(struct work_st
> >  	struct fc_rport_operations *rport_ops;
> >  	struct fc_rport_identifiers ids;
> >  	struct fc_rport *rport;
> > +	int restart = 0;
> >  
> >  	mutex_lock(&rdata->rp_mutex);
> >  	event = rdata->event;
> > @@ -287,8 +288,19 @@ static void fc_rport_work(struct work_st
> >  		mutex_unlock(&rdata->rp_mutex);
> >  
> >  		if (port_id != FC_FID_DIR_SERV) {
> > +			/*
> > +			 * We must drop rp_mutex before taking disc_mutex.
> > +			 * Re-evaluate state to allow for restart.
> > +			 * A transition to RESTART state must only happen
> > +			 * while disc_mutex is held and rdata is on the list.
> > +			 */
> >  			mutex_lock(&lport->disc.disc_mutex);
> > -			list_del(&rdata->peers);
> > +			mutex_lock(&rdata->rp_mutex);
> > +			if (rdata->rp_state == RPORT_ST_RESTART)
> > +				restart = 1;
> > +			else
> > +				list_del(&rdata->peers);
> 
> There is a follow-up patch that adds this line at this point:
> 
>                          rdata->event = RPORT_EV_NONE;
> 
> If this patch is integrated, that one should be integrated
> as well.  That patch is:
> 
> commit 5543c72e2bbb30e5ba5938b18ec26617b8b3fb04
> Author: Abhijeet Joglekar <abjoglek@...co.com>
> Date:   Thu Dec 10 09:59:20 2009 -0800
> 
> [SCSI] libfc: remote port gets stuck in restart state without really restarting

Thank you so much for the information, I have queued up this patch as
well for the .32 tree.

thanks for the review,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/