linux-kernel - Re: [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices


Message-ID: <1452547979.22112.42.camel@localhost.localdomain>
Date:	Mon, 11 Jan 2016 16:32:59 -0500
From:	Ewan Milne <emilne@...hat.com>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
Cc:	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
	gregkh@...uxfoundation.org, martin.petersen@...cle.com,
	hare@...e.com
Subject: Re: [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and
 simultaneously removing devices

On Mon, 2016-01-11 at 11:15 -0800, James Bottomley wrote:
> On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> > From: "Ewan D. Milne" <emilne@...hat.com>
> > 
> > The klist traversal used by the reading of /proc/scsi/scsi is not interlocked
> > against device removal.  It takes a reference on the containing object, but
> > this does not prevent the device from being removed from the list.  Thus, we
> > get errors and eventually panic, as shown in the traces below.  Fix this by
> > keeping a klist iterator in the seq_file private data.
> > 
> > The problem can be easily reproduced by repeatedly increasing scsi_debug's
> > max_luns to 30 and then deleting the devices via sysfs, while simultaneously
> > accessing /proc/scsi/scsi.
> >     
> > From a patch originally developed by David Jeffery <djeffery@...hat.com>
> 
> OK, so it looks like this is a bug in the klist system.  When a
> starting point is used, there should be a check to see if it's still
> active; otherwise the whole thing is racy.  If it's fixed in klist, the
> fix works for everyone, not just SCSI.
> 
> How about this?  It causes the iterator to start at the beginning if
> the node has been deleted.  That will produce double output during some
> of your test, but I think that's OK given that this is a rare race.
> 
> James

I'm running with your change now, and it does appear to fix the problem.
I guess the question is whether this behavior would trip up any other
klist users; for /proc/scsi/scsi it is probably not a problem.  The
worst that might happen is that userspace tools that parse the output
would get duplicate entries.

-Ewan

> ---
> 
> diff --git a/lib/klist.c b/lib/klist.c
> index d74cf7a..0507fa5 100644
> --- a/lib/klist.c
> +++ b/lib/klist.c
> @@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *k, struct klist_iter *i,
>  			  struct klist_node *n)
>  {
>  	i->i_klist = k;
> -	i->i_cur = n;
> -	if (n)
> -		kref_get(&n->n_ref);
> +	i->i_cur = NULL;
> +	if (n && kref_get_unless_zero(&n->n_ref))
> +		i->i_cur = n;
>  }
>  EXPORT_SYMBOL_GPL(klist_iter_init_node);
>
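
For readers following the diff above, here is a minimal userspace sketch of
the "get unless zero" pattern it relies on.  This is not the kernel code:
struct ref, ref_get_unless_zero(), ref_put(), struct node, struct iter and
iter_init_node() are hypothetical stand-ins for kref, kref_get_unless_zero(),
kref_put(), klist_node, klist_iter and klist_iter_init_node(), built on C11
atomics purely for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kref. */
struct ref {
	atomic_int count;
};

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only if the object is still live (count > 0). */
static bool ref_get_unless_zero(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool ref_put(struct ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);
	return old == 1;
}

/* Hypothetical stand-ins for klist_node and klist_iter. */
struct node {
	struct ref n_ref;
};

struct iter {
	struct node *cur;
};

/*
 * Same shape as the patched function: start at n only if a reference
 * could still be taken, otherwise leave cur NULL so that the walk
 * begins from the head of the list.
 */
static void iter_init_node(struct iter *i, struct node *n)
{
	i->cur = NULL;
	if (n && ref_get_unless_zero(&n->n_ref))
		i->cur = n;
}
```

An unconditional increment would take the count of an already-released node
from 0 back to 1, leaving the iterator holding a node that is no longer on
the list.  With the conditional version the caller sees cur == NULL and
restarts from the beginning, which is where the possible duplicate output
discussed above comes from.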