Message-ID: <1431959798.19977.4.camel@memnix.com>
Date:	Mon, 18 May 2015 10:36:38 -0400
From:	Abelardo Ricart III <aricart@...nix.com>
To:	Brandon Smith <freedom@...rdencode.com>
Cc:	Mike Snitzer <snitzer@...hat.com>, dm-devel@...hat.com,
	mpatocka@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: Regression: Disk corruption with dm-crypt and kernels >= 4.0
On Fri, 2015-05-15 at 08:04 -0700, Brandon Smith wrote:
> On 2015-05-01 (Fri) at 19:42:15 -0400, Abelardo Ricart III wrote:
> > > > The patchset in question was tested quite heavily so this is a
> > > > surprising report.  I'm noticing you are opting in to dm-crypt discard
> > > > support.  Have you tested without discards enabled?
> > > 
> > > I've disabled discards universally and rebuilt a vanilla kernel. After
> > > running my heavy read-write-sync scripts, everything seems to be working
> > > fine now. I suppose this could be something that used to fail silently
> > > before, but now produces bad behavior? I seem to remember having
> > > something in my message log about "discards not supported on this
> > > device" when running with it enabled before.
> > 
> > Forgive me, but I spoke too soon. The corruption and libata errors are
> > still there, as was evidenced when I went to reboot and got treated to an
> > eyeful of "read-only filesystem" and ata errors.
> > 
> > So no, disabling discards unfortunately did nothing to help.
> 
> I've been experiencing the same problem.  Vanilla 4.0 series kernels,
> dm-crypt, with/or without discards, on a ThinkPad X1 Carbon with a
> LiteOn LGT-256M6G SSD.   
> 
> After some googling around, I found some chatter relating to changes
> in NCQ on SSDs in 4.0. I've been running w/o NCQ for a full kernel build
> so far without issue. Perhaps there's been some change in the interaction
> between dm-crypt and NCQ?
> 
> Abelardo, can you try w/o NCQ and see if that helps your situation?
> 
> Best,
> 
> --Brandon
I've been running with NCQ disabled and stress testing for a while, and the
issue is indeed gone. Thanks for the workaround!

So it seems the issue is somehow related to the combination of NCQ, dm-crypt,
and possibly (some?) SSDs.
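
For anyone who wants to confirm that NCQ is really off before stress testing:
the thread does not say how NCQ was disabled, so the following is only a
sketch under assumptions. Two common ways are booting with the kernel
parameter libata.force=noncq, or writing 1 to
/sys/block/<dev>/device/queue_depth. The helper name ncq_status and its
sysfs_block parameter are made up for illustration. It reads the queue depth
of each sd* device; a depth of 1 means no commands are being queued.

```python
from pathlib import Path


def ncq_status(sysfs_block="/sys/block"):
    """Return {device name: queue depth} for every sd* block device.

    A queue depth of 1 means NCQ is effectively disabled for that device.
    The sysfs_block argument is a hypothetical hook so the function can be
    pointed at a test directory instead of the real sysfs tree.
    """
    status = {}
    for dev in sorted(Path(sysfs_block).glob("sd*")):
        depth_file = dev / "device" / "queue_depth"
        if depth_file.is_file():
            status[dev.name] = int(depth_file.read_text().strip())
    return status


if __name__ == "__main__":
    for name, depth in ncq_status().items():
        state = "NCQ off" if depth == 1 else "NCQ possibly active"
        print(f"{name}: queue_depth={depth} ({state})")
```

A depth greater than 1 only shows that queuing is allowed; whether the drive
actually uses NCQ still depends on the device and controller.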