linux-kernel - Re: [PATCH v2] mm : sync ra->ra_pages with bdi->ra

Message-ID: <20200821115744.GP17456@casper.infradead.org>
Date:   Fri, 21 Aug 2020 12:57:44 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Zhaoyang Huang <huangzhaoyang@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Minchan Kim <minchan@...nel.org>,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>,
        "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>, chunyan.zhang@...soc.com,
        Baolin Wang <baolin.wang7@...il.com>
Subject: Re: [PATCH v2] mm : sync ra->ra_pages with bdi->ra_pages

On Fri, Aug 21, 2020 at 05:31:52PM +0800, Zhaoyang Huang wrote:
> This patch has been verified on an Android system and reduces
> UNINTERRUPTIBLE_SLEEP_BLOCKIO, which used to be caused by a wrong
> ra->ra_pages, by 15%.

Wait, what?  Readahead doesn't sleep on the pages it's requesting.
Unless ... your file access pattern is random, so you end up submitting
a readahead I/O that's bigger than needed, so it takes longer for the
page you actually wanted to be returned.  I know we have the
MMAP_LOTSAMISS logic, but that's not really enough.
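
For readers unfamiliar with the miss-counting logic mentioned above, here is a
rough userspace model of it.  This is a sketch, not the kernel source: the
struct and function names (ra_model, should_read_around) are made up for
illustration, and the threshold of 100 is what I recall mm/filemap.c using
at the time.

```c
/* Userspace model only; names here are hypothetical, not kernel API. */
#define MMAP_LOTSAMISS 100	/* assumed threshold, as in mm/filemap.c */

struct ra_model {
	unsigned int mmap_miss;	/* page-cache misses seen on mmap faults */
};

/*
 * Called on each page-cache miss for an mmap fault.
 * Returns 1 if read-around should still be attempted, 0 once the file
 * has missed so often that read-around is considered harmful.
 */
static int should_read_around(struct ra_model *ra)
{
	/* Cap the counter so it cannot grow without bound. */
	if (ra->mmap_miss < MMAP_LOTSAMISS * 10)
		ra->mmap_miss++;

	/* Far more misses than hits: stop bothering with read-around. */
	if (ra->mmap_miss > MMAP_LOTSAMISS)
		return 0;

	return 1;
}
```

Note that this is an on/off switch only: until the threshold is crossed,
every miss still submits a full-sized window, which is why it does not
help with an oversized ra_pages.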

OK, assuming this problem is really about sync mmap (i.e. executables),
this makes a bit more sense.  I think the real problem is here:

        ra->start = max_t(long, 0, offset - ra->ra_pages / 2);
        ra->size = ra->ra_pages;
        ra->async_size = ra->ra_pages / 4;
        ra_submit(ra, mapping, file);

which actually skips all the logic we have in ondemand_readahead()
for adjusting the readahead size.  Ugh, this is a mess.

I think a quick fix to your problem will be just replacing ra->ra_pages
with bdi->ra_pages in do_sync_mmap_readahead() and leaving ra->ra_pages
alone everywhere else.
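
Roughly, the quick fix would compute the window like this.  This is a
standalone userspace sketch under stated assumptions, not a patch: the
types bdi_model and ra_state_model and the function sync_mmap_window()
are hypothetical stand-ins for struct backing_dev_info, struct
file_ra_state and the tail of do_sync_mmap_readahead(), and the final
ra_submit() call is left out.

```c
/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct bdi_model {
	unsigned long ra_pages;		/* current device readahead limit */
};

struct ra_state_model {
	unsigned long start;		/* first page of the window */
	unsigned long size;		/* window size in pages */
	unsigned long async_size;	/* pages read ahead asynchronously */
	unsigned long ra_pages;		/* per-file copy, may be stale */
};

/*
 * Same dumb read-around as before, but sized from bdi->ra_pages so a
 * change to the device setting takes effect immediately.
 * ra->ra_pages is deliberately left alone.
 */
static void sync_mmap_window(struct ra_state_model *ra,
			     const struct bdi_model *bdi,
			     unsigned long offset)
{
	unsigned long ra_pages = bdi->ra_pages;
	long start = (long)offset - (long)(ra_pages / 2);

	ra->start = start > 0 ? (unsigned long)start : 0;
	ra->size = ra_pages;
	ra->async_size = ra_pages / 4;
}
```

With a stale ra->ra_pages of 512 and bdi->ra_pages lowered to 32, a fault
at page offset 100 gives a window starting at page 84 with size 32 and
async_size 8, instead of a 512-page read starting at page 0.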

We need a smarter readahead algorithm for mmap'ed files, and I don't have
time to work on it right now.  So let's stick to the same dumb algorithm,
but make it responsive to bdi ra_pages being reset.