linux-kernel - Re: [PATCH] afs: Fix ENOSPC, EDQUOT and other errors to fail a write rather than retrying

Message-ID: <YYNR2t+RmtFd+bT/@casper.infradead.org>
Date:   Thu, 4 Nov 2021 03:22:02 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     David Howells <dhowells@...hat.com>
Cc:     marc.dionne@...istor.com, Jeffrey E Altman <jaltman@...istor.com>,
        linux-afs@...ts.infradead.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, Jeff Layton <jlayton@...nel.org>
Subject: Re: [PATCH] afs: Fix ENOSPC, EDQUOT and other errors to fail a write
 rather than retrying

On Wed, Nov 03, 2021 at 11:43:20PM +0000, David Howells wrote:
> Currently, at the completion of a storage RPC from writepages, the errors
> ENOSPC, EDQUOT, ENOKEY, EACCES, EPERM, EKEYREJECTED and EKEYREVOKED cause
> the pages involved to be redirtied and the write to be retried by the VM at
> a future time.
> 
> However, this is probably not the right thing to do, and, instead, the
> writes should be discarded so that the system doesn't get blocked (though
> unmounting will discard the uncommitted writes anyway).

umm.  I'm not sure that throwing away the write is the best answer
for some of these errors.  Our whole story around error handling in
filesystems, the page cache and the VFS is pretty sad, but I don't think
that this is the right approach.

Ideally, we'd hold onto the writes in the page cache until the problem
is resolved (e.g. for ENOSPC / EDQUOT, until the user has deleted some
files), then retry the writes.

We should definitely stop the user dirtying more pages on this mount,
or at least throttle processes which are dirtying new pages (e.g. in
folio_mark_dirty()), which implies a check of the superblock.  That would
last until the ENOSPC is cleared up, at which time writeback can resume.
Of course, the server won't necessarily notify us when it is cleared up
(because it might be due to a different client filling the storage), so
we might need to periodically re-attempt writeback so that we know whether
ENOSPC has been resolved.
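
To make the policy above concrete, here is a rough userspace toy model in
Python.  It is not kernel code and not what the patch does.  ToyMount,
mark_dirty(), writeback() and the server_free counter are all names made
up for illustration, and the "server" is just an integer of free pages.
It only shows the three ideas together: keep dirty pages on ENOSPC,
refuse (or throttle) new dirtying while the error is outstanding, and
find out that space is back by re-attempting writeback.

```python
import errno


class ToyMount:
    """Toy model of one mount's writeback state (hypothetical, not kernel code)."""

    def __init__(self, server_free_pages):
        self.server_free = server_free_pages  # free pages on the simulated server
        self.dirty = []                       # dirty pages held in the "page cache"
        self.wb_error = 0                     # sticky error, like a per-superblock flag

    def mark_dirty(self, page):
        """Dirty a page unless writeback is stalled on ENOSPC/EDQUOT."""
        if self.wb_error in (errno.ENOSPC, errno.EDQUOT):
            return False  # a real implementation would throttle or fail the caller
        self.dirty.append(page)
        return True

    def _store(self, page):
        """Simulated storage RPC: returns 0 on success or an errno value."""
        if self.server_free == 0:
            return errno.ENOSPC
        self.server_free -= 1
        return 0

    def writeback(self):
        """Flush dirty pages in order; on error keep the rest instead of discarding."""
        while self.dirty:
            err = self._store(self.dirty[0])
            if err:
                self.wb_error = err  # remember why writeback stalled
                return err
            self.dirty.pop(0)
        self.wb_error = 0            # everything flushed, dirtying may resume
        return 0
```

In this model the periodic re-attempt is simply another call to
writeback() from a timer: if the server has space again the call drains
the queue and clears wb_error, otherwise the pages stay where they are.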