Message-ID: <20131210152701.GE1543@quack.suse.cz>
Date: Tue, 10 Dec 2013 16:27:01 +0100
From: Jan Kara <jack@...e.cz>
To: George Spelvin <linux@...izon.com>
Cc: jack@...e.cz, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
tytso@....edu, viro@...IV.linux.org.uk
Subject: Re: 3.11.4: kernel BUG at fs/buffer.c:1268
On Tue 10-12-13 04:35:28, George Spelvin wrote:
> One of those additional WARN_ON tests tripped, hooray!
> And it turned out to be in the ext4 metadata checksumming. To be
> precise, ext4_block_bitmap_csum_set() returned with irqs disabled,
> and kaboom.
Ha, great. Thanks for the persistence in testing.
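The bracketing technique referred to above (record the interrupt state
before a suspect call, then warn if the call returns with interrupts newly
disabled) can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions: the model_* names, CHECK_IRQ_BALANCE and the
two helper functions are hypothetical and are not the debugging patch used
in this thread. In the kernel the check itself would be something along the
lines of WARN_ON(irqs_disabled()) placed after the call.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the CPU's interrupt-enable state. */
static bool model_irqs_off = false;
/* Number of imbalance warnings emitted so far. */
static int model_warnings = 0;

static bool model_irqs_disabled(void)
{
    return model_irqs_off;
}

/*
 * Run `call`, then warn if interrupts were enabled on entry but are
 * disabled on return. Callers that entered with interrupts already
 * disabled are not flagged.
 */
#define CHECK_IRQ_BALANCE(call)                                      \
    do {                                                             \
        bool was_off_ = model_irqs_disabled();                       \
        call;                                                        \
        if (!was_off_ && model_irqs_disabled()) {                    \
            model_warnings++;                                        \
            fprintf(stderr,                                          \
                    "WARN: %s returned with irqs disabled\n",        \
                    #call);                                          \
        }                                                            \
    } while (0)

/* Hypothetical helper that disables and then re-enables "interrupts". */
static void balanced_helper(void)
{
    model_irqs_off = true;
    model_irqs_off = false;
}

/* Hypothetical helper that forgets to re-enable "interrupts". */
static void leaky_helper(void)
{
    model_irqs_off = true;
}
```

Wrapping successive calls on the suspect path with such a check narrows
the search to the first call that trips it; here only
CHECK_IRQ_BALANCE(leaky_helper()) would produce a warning.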
> Since I have this experimental feature turned on and most people don't,
> this explains why I'm finding it and World+Dog aren't.
>
> I appear to be the designated finder of ext4 metadata_csum bugs, so tytso
> notified on general principles. I dropped the generic linux-fsdevel
> list from the Cc: list.
>
> But looking at the code, it just calls into the linux-crypto layer and
> Tim Chen's SSE CRC32C implementation which uses kernel_fpu_begin()
> and kernel_fpu_end() if the block is large enough.
Yup, that code was also my last hope but I can't say I see any problem in
there either.
> I was going to add Herbert Xu and Tim Chen and all those mailing
> lists, but looking at the code, it sure *looks* like they're Doing The
> Right Thing, so I'm holding off for a bit.
>
> I'm not quite sure where to pass the buck on this one.
>
> Relevant platform info:
> - Intel i7-2700K processor, with SSE4.2 and thus the CRC32C instruction.
> - # CONFIG_PREEMPT_NONE is not set
> - CONFIG_PREEMPT_VOLUNTARY=y
> - # CONFIG_PREEMPT is not set
> - CONFIG_PREEMPT_COUNT=y
> - CONFIG_DEBUG_ATOMIC_SLEEP=y
> - CONFIG_DEBUG_BUGVERBOSE=y
>
...
>
> === Discussion ===
> desc.shash.tfm is filled in from sbi->s_chksum_driver, which is filled in at
> ext4_fill_super() time by crypto_alloc_shash("crc32c", 0, 0).
>
> Thus, shash->update should turn into a call to crypto/crc32c.c:chksum_update(),
> which calls lib/crc32.c:__crc32c_le().
>
> Now, I happen to be running an i7-2700k which has sse4_2, and thus calls
> into the x86 specific code, and apparently for large blocks it uses PCLMULQDQ,
> which requires kernel_fpu_begin/end.
>
> At least that makes some degree of sense. The low-level code, though,
> uses those functions in such a simple way that I can't see how it could
> fail to unlock at the end.
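As a reference point for the generic (non-accelerated) path discussed
above, here is a minimal bit-at-a-time CRC32C sketch in plain C. It is not
the kernel's lib/crc32.c code and not the SSE4.2/PCLMULQDQ path; the
function names and the split into a raw update step plus a wrapper that
applies the usual initial and final inversion are my own assumptions for
illustration. It can serve as a known-good value to compare against when
checking what an accelerated implementation returns.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the Castagnoli polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Raw update step: folds `len` bytes into `crc` with no pre- or
 * post-inversion, so calls can be chained over several buffers.
 * (Hypothetical name; processes one bit per inner-loop iteration.)
 */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_REFLECTED : 0u);
    }
    return crc;
}

/*
 * One-shot CRC32C with the conventional all-ones initial value and
 * final bitwise inversion.
 */
static uint32_t crc32c(const void *buf, size_t len)
{
    return ~crc32c_update(~0u, buf, len);
}
```

With this sketch, crc32c("123456789", 9) yields 0xE3069283, the standard
check value for CRC-32C.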
Hum, can you try disabling the HW-accelerated CRC32C implementation
(CRYPTO_CRC32C_INTEL)? If the problem disappears, we'll know the problem is
somewhere in the HW support code...
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR