Message-ID: <alpine.LSU.2.11.1507311207160.11122@eggly.anvils>
Date: Fri, 31 Jul 2015 12:42:30 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: "J. Bruce Fields" <bfields@...ldses.org>,
Dominique Martinet <dominique.martinet@....fr>,
Hugh Dickins <hughd@...gle.com>,
Al Viro <viro@...iv.linux.org.uk>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: v4.2-rc dcache regression, probably 75a6f82a0d10
On Fri, 31 Jul 2015, Linus Torvalds wrote:
> On Fri, Jul 31, 2015 at 10:46 AM, Hugh Dickins <hughd@...gle.com> wrote:
> >
> > Sounds like a dcache problem, and 75a6f82a0d10 seemed the only
> > likely candidate, so I experimented with reverting it yesterday,
> > and ran successfully for 24 hours.
>
> Hmm. Sounds odd. Are you running nfsd? That would explain why it
> happens on ext4 but not tmpfs: ext4 has a get_parent method that can
> get a disconnected entry, while tmpfs does not.
>
> That said, your load doesn't sound like it would actually ever trigger
> this, unless you just didn't mention that you also end up using that
> filesystem over nfs on another machine.
No, no nfsd nor any kind of network filesystem stuff going on.

Right, I never looked to see what DCACHE_DISCONNECTED is actually
about, just rushed ahead and tried running with the revert.
>
> So leave it running a while longer, but maybe it's 4bf46a272647 like
> Dominique suspects. Although I don't see how that could trigger
> anything either..
I restarted with a slightly different version of the load this
morning, which has sometimes shown the issue more easily - I thought
it better to restart with a variant than persist with a run that
might have settled into a protected pattern. We'll see what that
shows later on.
It will indeed be odd if it confirms that reverting the DCACHE_DISCONNECTED
commit is good. I agree that Dominique's 4bf46a272647 now seems more
likely, if still unlikely; but that was included in v4.1, and I saw
no problem with v4.1 once the rmap_walk() skip was fixed.
There may be some completely unrelated commit which alters the
timing enough to expose or mask whatever is the guilty commit.
Or something may be occasionally corrupting dentry->d_flags.
Hugh