Message-Id: <AF3403B9-9791-4A77-A6FB-3281B8EA7A31@dilger.ca>
Date: Thu, 1 Mar 2018 13:20:16 -0700
From: Andreas Dilger <adilger@...ger.ca>
To: Theodore Ts'o <tytso@....edu>
Cc: Adrian Hunter <adrian.hunter@...el.com>,
Dmitry Osipenko <digetx@...il.com>,
Ulf Hansson <ulf.hansson@...aro.org>,
linux-mmc <linux-mmc@...r.kernel.org>,
linux-block <linux-block@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Bough Chen <haibo.chen@....com>,
Alex Lemberg <alex.lemberg@...disk.com>,
Mateusz Nowak <mateusz.nowak@...el.com>,
Yuliy Izrailov <Yuliy.Izrailov@...disk.com>,
Jaehoon Chung <jh80.chung@...sung.com>,
Dong Aisheng <dongas86@...il.com>,
Das Asutosh <asutoshd@...eaurora.org>,
Zhangfei Gao <zhangfei.gao@...il.com>,
Sahitya Tummala <stummala@...eaurora.org>,
Harjani Ritesh <riteshh@...eaurora.org>,
Venu Byravarasu <vbyravarasu@...dia.com>,
Linus Walleij <linus.walleij@...aro.org>,
Shawn Lin <shawn.lin@...k-chips.com>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
Christoph Hellwig <hch@....de>,
Thierry Reding <treding@...dia.com>,
Krishna Reddy <vdumpa@...dia.com>,
linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
On Mar 1, 2018, at 9:04 AM, Theodore Ts'o <tytso@....edu> wrote:
> This doesn't seem to make sense; the PC is where we are currently
> executing, and LR is the "Link Register" where the flow of control
> will be returning after the current function returns, right? Well,
> dx_probe should *not* be returning to __wait_on_bit(). So this just
> seems.... weird.
>
> Ignoring the LR register, this stack trace looks sane... I can't see
> which pointer could be NULL and getting dereferenced, though. How
> easily can you reproduce the problem? Can you either (a) translate
> the PC into a line number, or better yet, if you can reproduce, add a
> series of BUG_ON's so we can see what's going on?
>
> + BUG_ON(frame);
I think you mean:

	BUG_ON(frame == NULL);

or

	BUG_ON(!frame);

> memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
> frame->bh = ext4_read_dirblock(dir, 0, INDEX);
> if (IS_ERR(frame->bh))
> return (struct dx_frame *) frame->bh;
>
> + BUG_ON(frame->bh);
> + BUG_ON(frame->bh->b_data);
Same here:

	BUG_ON(frame->bh == NULL);
	BUG_ON(frame->bh->b_data == NULL);

This is why I don't like implicit "is NULL" or "is non-zero" usage. Lustre
used to require "== NULL" or "!= NULL" to avoid bugs like this, but had to
abandon that because of upstream code style.
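
To make the inversion concrete, here is a minimal userspace sketch. It is not
kernel code: WOULD_BUG(), struct buffer, and the check_*() helpers are
hypothetical stand-ins invented for illustration, and WOULD_BUG() only reports
whether BUG_ON() would have fired instead of actually oopsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BUG_ON(): evaluates to true when the kernel
 * macro would have triggered, i.e. when the condition is non-zero. */
#define WOULD_BUG(cond) ((cond) ? true : false)

/* Hypothetical stand-in for struct buffer_head; only b_data is modelled. */
struct buffer {
	char *b_data;
};

/* As posted: BUG_ON(bh) fires when the pointer is valid (non-NULL). */
bool check_as_posted(const struct buffer *bh)
{
	return WOULD_BUG(bh);
}

/* Corrected: BUG_ON(bh == NULL) fires only when the pointer is NULL. */
bool check_corrected(const struct buffer *bh)
{
	return WOULD_BUG(bh == NULL);
}

/* Exercise both variants with a valid pointer and with NULL. */
void self_check(void)
{
	struct buffer b = { .b_data = NULL };

	assert(check_as_posted(&b));       /* valid pointer trips the check */
	assert(!check_as_posted(NULL));    /* NULL slips through unnoticed */
	assert(!check_corrected(&b));      /* valid pointer passes */
	assert(check_corrected(NULL));     /* NULL is caught */
}
```

Calling self_check() from a small main() should pass silently. With the
as-posted form, every successful ext4_read_dirblock() would hit the BUG_ON,
while a NULL pointer would go on to be dereferenced.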
> root = (struct dx_root *) frame->bh->b_data;
> if (root->info.hash_version != DX_HASH_TEA &&
> root->info.hash_version != DX_HASH_HALF_MD4 &&
> root->info.hash_version != DX_HASH_LEGACY) {
>
> These are "could never" happen scenarios from looking at the code, but
> that will help explain what is going on.
>
> If this is reliably only happening with mq, the only way I could see
> that is if something is returning an error when it previously wasn't.
> This isn't a problem we're seeing with any of our testing, though.
>
> Cheers,
>
> - Ted
>
Cheers, Andreas