linux-kernel - Re: [RFC PATCH 0/3] Rework the gfs2 read and page fault locking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20191125091741.firh7stqcpniwvga@box>
Date:   Mon, 25 Nov 2019 12:17:41 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Andreas Gruenbacher <agruenba@...hat.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Steven Whitehouse <swhiteho@...hat.com>,
        Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Johannes Weiner <hannes@...xchg.org>, cluster-devel@...hat.com,
        Ronnie Sahlberg <lsahlber@...hat.com>,
        Steve French <sfrench@...ba.org>,
        Bob Peterson <rpeterso@...hat.com>
Subject: Re: [RFC PATCH 0/3] Rework the gfs2 read and page fault locking

On Sat, Nov 23, 2019 at 12:53:21AM +0100, Andreas Gruenbacher wrote:
> Hello,
> 
> this patch series moves the glock lock taking in gfs2 from the
> ->readpage and ->readpages inode operations to the ->read_iter file and
> ->fault vm operations.  To achieve that, we add flags to the
> generic_file_read_iter and filemap_fault generic helpers.
> 
> This proposal was triggered by the following discussion:
> 
> https://lore.kernel.org/linux-fsdevel/157225677483.3442.4227193290486305330.stgit@buzz/
> 
> In that thread, Linus argued that filesystems should make sure the inode
> size is sufficiently up-to-date before calling the generic helpers, and
> that filesystems can do it themselves if they want more than that.
> That's surely doable.  However, implementing those operations properly
> at the filesystem level quickly becomes complicated when it gets to
> things like readahead.  In addition, those slightly modified copies of
> those helpers would surely diverge from their originals over time, and
> maintaining them properly would become hard.  So I hope the relatively
> small changes to make the original helpers slightly more flexible will
> be acceptable instead.
> 
> With the IOCB_CACHED flag added by one of the patches in this series,
> the code that Konstantin's initial patch adds to
> generic_file_buffered_read could be made conditional on the IOCB_CACHED
> flag being cleared.  That way, it won't misfire on filesystems that
> allow a stale inode size.  (I'm not sure if any filesystems other than
> gfs2 are actually affected.)
> 
> Some additional explanation:
> 
> The cache consistency model of filesystems like gfs2 is such that if
> pages are found in an inode's address space, those pages as well as the
> inode size are up to date and can be used without taking any filesystem
> locks.  If a page is not cached, filesystem locks must be taken before
> the page can be read; this will also bring the inode size up to date.
> 
> Thus far, gfs2 has taken the filesystem locks inside the ->readpage and
> ->readpages address space operations.  A better approach seems to be to
> take those locks earlier, in the ->read_iter file and ->fault vm
> operations.  This would also avoid a lock inversion in ->readpages.
> 
> We obviously want to avoid taking the filesystem locks unnecessarily
> when the pages we are looking for are already cached; otherwise, we
> would cripple performance.  So we need to check if those pages are
> present first.  That's actually exactly what the generic_file_read_iter
> and filemap_fault helpers do already anyway, except that they will call
> into ->readpage and ->readpages when they find pages missing.  Instead
> of that, we'd like those helpers to return with an error code that
> allows us to retry the operation after taking the filesystem locks.

Do you see IOCB_CACHED/FAULT_FLAG_CACHED semantics being usable for
anyting beyond gfs2?

-- 
 Kirill A. Shutemov