Message-ID: <877f4cr7ew.fsf@notabene.neil.brown.name>
Date: Mon, 27 Feb 2017 08:03:51 +1100
From: NeilBrown <neilb@...e.com>
To: James Bottomley <James.Bottomley@...senPartnership.com>,
Jeff Layton <jlayton@...hat.com>,
linux-mm <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Cc: lsf-pc <lsf-pc@...ts.linuxfoundation.org>,
Neil Brown <neilb@...e.de>,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-block@...r.kernel.org
Subject: Re: [LSF/MM TOPIC] do we really need PG_error at all?
On Sun, Feb 26 2017, James Bottomley wrote:
> [added linux-scsi and linux-block because this is part of our error
> handling as well]
> On Sun, 2017-02-26 at 09:42 -0500, Jeff Layton wrote:
>> Proposing this as a LSF/MM TOPIC, but it may turn out to be me just
>> not understanding the semantics here.
>>
>> As I was looking into -ENOSPC handling in cephfs, I noticed that
>> PG_error is only ever tested in one place [1]
>> __filemap_fdatawait_range, which does this:
>>
>>     if (TestClearPageError(page))
>>             ret = -EIO;
>>
>> This error code will override any AS_* error that was set in the
>> mapping. Which makes me wonder...why don't we just set this error in
>> the mapping and not bother with a per-page flag? Could we potentially
>> free up a page flag by eliminating this?
>
> Note that currently the AS_* codes are only set for write errors, not
> for reads, and we have no mapping error handling at all for swap pages,
> but I'm sure this is fixable.
How is a read error different from a failure to set PG_uptodate?
Does PG_error suppress retries?
>
> From the I/O layer point of view we take great pains to try to pinpoint
> the error exactly to the sector. We reflect this up by setting the
> PG_error flag on the page where the error occurred. If we only set the
> error on the mapping, we lose that granularity, because the mapping is
> mostly at the file level (or VMA level for anon pages).
Are you saying that the IO layer finds the page in the bi_io_vec and
explicitly sets PG_error, rather than just passing an error indication
to bi_end_io?  That would seem wrong, as the page may not be in the
page cache.  So I guess I misunderstand you.
>
> So I think the question for filesystem people from us would be do you
> care about this accuracy? If it's OK just to know an error occurred
> somewhere in this file, then perhaps we don't need it.
I had always assumed that a bio would either succeed or fail, and that
no finer granularity could be available.
I think the question here is: Do filesystems need the pagecache to
record which pages have seen an IO error?
I think that for write errors, there is no value in recording
block-oriented error status - only file-oriented status.
For read errors, it might help to avoid indefinite read retries, but
I don't know the code well enough to be sure whether this is an issue.
NeilBrown
>
> James
>
>> The main argument I could see for keeping it is that removing it
>> might subtly change the behavior of sync_file_range if you have tasks
>> syncing different ranges in a file concurrently. I'm not sure if that
>> would break any guarantees though.
>>
>> Even if we do need it, I think we might need some cleanup here
>> anyway. A lot of readpage operations end up setting that flag when
>> they hit an error. Isn't it wrong to return an error on fsync, just
>> because we had a read error somewhere in the file in a range that was
>> never dirtied?
>>
>> --
>> [1]: there is another place in f2fs, but it's more or less equivalent
>> to the call site in __filemap_fdatawait_range.
>>