Date:	Thu, 29 Oct 2015 14:22:32 -0700
From:	Greg KH <greg@...ah.com>
To:	Roman Gushchin <klamm@...dex-team.ru>
Cc:	Neil Brown <neilb@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Shaohua Li <shli@...nel.org>,
	"linux-raid@...r.kernel.org" <linux-raid@...r.kernel.org>,
	"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

On Thu, Oct 29, 2015 at 05:15:48PM +0300, Roman Gushchin wrote:
> 29.10.2015, 03:35, "Neil Brown" <neilb@...e.de>:
> > On Wed, Oct 28 2015, Roman Gushchin wrote:
> >
> >>  After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
> >>  __find_stripe() is called under conf->hash_locks + hash.
> >>  But handle_stripe_clean_event() calls remove_hash() under
> >>  conf->device_lock.
> >>
> >>  Under some circumstances the hash chain can become circular,
> >>  and we get an infinite loop in __find_stripe() with interrupts
> >>  disabled and the hash lock held. This leads to a hard lockup on
> >>  multiple CPUs, followed by a system crash.
> >>
> >>  I was able to reproduce this behavior on raid6 over 6 SSDs.
> >>  The devices_handle_discard_safely option should be set to enable trim
> >>  support. The following script was used:
> >>
> >>  for i in `seq 1 32`; do
> >>      dd if=/dev/zero of=large$i bs=10M count=100 &
> >>  done
> >>
> >>  Signed-off-by: Roman Gushchin <klamm@...dex-team.ru>
> >>  Cc: Neil Brown <neilb@...e.de>
> >>  Cc: Shaohua Li <shli@...nel.org>
> >>  Cc: linux-raid@...r.kernel.org
> >>  Cc: <stable@...r.kernel.org> # 3.10 - 3.19
> >
> > Hi Roman,
> >  thanks for reporting this and providing a fix.
> >
> > I'm a bit confused by that stable range: 3.10 - 3.19
> >
> > The commit you identify as introducing the bug was added in 3.13, so
> > presumably 3.10, 3.11, 3.12 are not affected.
> 
> Sure, that's my mistake. The correct range is 3.13 - 3.19. Sorry.
> 
> > Also the bug is still present in mainline, so 4.0, 4.1, 4.2 are also
> > affected, though the patch needs to be revised a bit for 4.1 and later.
> 
> Yes, exactly, but things are a bit more complicated in mainline.
> I'll try to prepare a patch for mainline in a couple of days.

We can't do anything with a patch that is not already in Linus's tree,
which is why this isn't even in my patch queue anymore.  Please resend
this once the fix is in Linus's tree, with the git commit id it has
there, and we will be glad to queue it up.

thanks,

greg k-h
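
The locking rule at issue in the quoted patch description can be sketched
in userspace C. This is only an illustration of the general pattern, not
the kernel code and not the patch under discussion: the struct layout, the
bucket and lock counts, and every function name below are hypothetical,
and pthread mutexes stand in for the kernel's spinlocks. The point it
shows is that a hash chain walked under a per-bucket hash lock must also
be modified under that same lock; holding some other lock (here
device_lock) while unlinking does not exclude concurrent walkers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_HASH  8   /* hypothetical number of hash buckets */
#define NR_LOCKS 4   /* hypothetical number of hash locks */

struct stripe {
    unsigned long sector;
    struct stripe *next;                   /* singly linked hash chain */
};

struct conf {
    struct stripe *buckets[NR_HASH];
    pthread_mutex_t hash_locks[NR_LOCKS];  /* protect the hash chains */
    pthread_mutex_t device_lock;           /* protects other state, not the chains */
};

static unsigned bucket_of(unsigned long sector)
{
    return (unsigned)(sector % NR_HASH);
}

/* Every access to a given chain goes through the same hash lock. */
static pthread_mutex_t *lock_of(struct conf *c, unsigned long sector)
{
    return &c->hash_locks[bucket_of(sector) % NR_LOCKS];
}

void conf_init(struct conf *c)
{
    for (int i = 0; i < NR_HASH; i++)
        c->buckets[i] = NULL;
    for (int i = 0; i < NR_LOCKS; i++)
        pthread_mutex_init(&c->hash_locks[i], NULL);
    pthread_mutex_init(&c->device_lock, NULL);
}

void insert_stripe(struct conf *c, struct stripe *sh)
{
    assert(c != NULL && sh != NULL);
    pthread_mutex_t *l = lock_of(c, sh->sector);

    pthread_mutex_lock(l);
    sh->next = c->buckets[bucket_of(sh->sector)];
    c->buckets[bucket_of(sh->sector)] = sh;
    pthread_mutex_unlock(l);
}

/* Lookup walks the chain while holding the hash lock for that chain. */
struct stripe *find_stripe(struct conf *c, unsigned long sector)
{
    pthread_mutex_t *l = lock_of(c, sector);
    struct stripe *sh;

    pthread_mutex_lock(l);
    for (sh = c->buckets[bucket_of(sector)]; sh != NULL; sh = sh->next)
        if (sh->sector == sector)
            break;
    pthread_mutex_unlock(l);
    return sh;
}

/*
 * Removal takes the same hash lock that find_stripe() takes. Unlinking
 * while holding only device_lock would let the chain change underneath
 * a concurrent walker.
 */
void remove_stripe(struct conf *c, struct stripe *sh)
{
    pthread_mutex_t *l = lock_of(c, sh->sector);
    struct stripe **pp;

    pthread_mutex_lock(l);
    for (pp = &c->buckets[bucket_of(sh->sector)]; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == sh) {
            *pp = sh->next;
            sh->next = NULL;
            break;
        }
    }
    pthread_mutex_unlock(l);
}
```

In this sketch device_lock is initialised but deliberately never taken
around chain updates, to keep the two locking domains visibly separate.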