Date:	Wed, 15 Oct 2014 16:07:42 -0700
From:	Jamie Liu <jamieliu@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Johannes Weiner <hannes@...xchg.org>, Mel Gorman <mgorman@...e.de>,
	Greg Thelen <gthelen@...gle.com>,
	Hugh Dickins <hughd@...gle.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: vmscan: count only dirty pages as congested

wait_iff_congested() only waits if ZONE_CONGESTED is set (and at least
one BDI is still congested). Modulo concurrent changes to BDI
congestion status:

After this change, the probability that a given shrink_inactive_list()
sets ZONE_CONGESTED increases monotonically with the fraction of dirty
pages on the LRU, reaching 100% when all dirty pages are backed by a
write-congested BDI. This is in line with what appears to be intended,
judging by the comment:

/*
 * Tag a zone as congested if all the dirty pages scanned were
 * backed by a congested BDI and wait_iff_congested will stall.
 */
if (nr_dirty && nr_dirty == nr_congested)
	set_bit(ZONE_CONGESTED, &zone->flags);
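
And on the consumer side, wait_iff_congested() gates on that bit
roughly as follows (a simplified sketch for illustration, not the
verbatim mm/backing-dev.c; nr_bdi_congested[] and congestion_wait()
are the era's real symbols, but the body is paraphrased):

long wait_iff_congested(struct zone *zone, int sync, long timeout)
{
	/*
	 * No BDI is congested for this sync class, or reclaim never
	 * tagged this zone: yield if needed instead of sleeping.
	 */
	if (atomic_read(&nr_bdi_congested[sync]) == 0 ||
	    !test_bit(ZONE_CONGESTED, &zone->flags)) {
		cond_resched();
		return 0;
	}

	/* Otherwise sleep until a BDI uncongests or we time out. */
	return congestion_wait(sync, timeout);
}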

Before this change, the probability that a given
shrink_inactive_list() sets ZONE_CONGESTED varies erratically. Because
the ZONE_CONGESTED condition is nr_dirty && nr_dirty == nr_congested,
the probability peaks when the fraction of dirty pages is equal to the
fraction of file pages backed by congested BDIs. So under some
circumstances, an increase in the fraction of dirty pages or in the
fraction of congested pages can actually result in a *decreased*
probability that reclaim will stall for writeback congestion, and vice
versa, which is both counterintuitive and counterproductive.
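
To make that concrete, here is a toy userspace model of the test (all
numbers are hypothetical, chosen only to contrast the two counting
rules):

#include <stdio.h>
#include <stdbool.h>

/*
 * One shrink_inactive_list() pass, split into three page populations.
 * The old rule counted every mapped page on a write-congested BDI
 * toward nr_congested; the new rule counts only the dirty ones.
 */
static bool tags_zone_congested(int dirty_on_congested_bdi,
				int clean_on_congested_bdi,
				int dirty_elsewhere, bool old_rule)
{
	int nr_dirty = dirty_on_congested_bdi + dirty_elsewhere;
	int nr_congested = dirty_on_congested_bdi +
			   (old_rule ? clean_on_congested_bdi : 0);

	return nr_dirty && nr_dirty == nr_congested;
}

int main(void)
{
	/*
	 * 40 dirty pages, all on write-congested BDIs, plus 20 clean
	 * pages on the same BDIs. Every dirty page faces a congested
	 * queue, yet the old rule compares 40 != 60 and never sets
	 * ZONE_CONGESTED; the new rule compares 40 == 40 and does.
	 */
	printf("old rule: %d\n", tags_zone_congested(40, 20, 0, true));
	printf("new rule: %d\n", tags_zone_congested(40, 20, 0, false));
	return 0;
}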

On Wed, Oct 15, 2014 at 1:05 PM, Andrew Morton
<akpm@...ux-foundation.org> wrote:
> On Wed, 15 Oct 2014 12:58:35 -0700 Jamie Liu <jamieliu@...gle.com> wrote:
>
>> shrink_page_list() counts all pages with a mapping, including clean
>> pages, toward nr_congested if they're on a write-congested BDI.
>> shrink_inactive_list() then sets ZONE_CONGESTED if nr_dirty ==
>> nr_congested. Fix this apples-to-oranges comparison by only counting
>> pages for nr_congested if they count for nr_dirty.
>>
>> ...
>>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -875,7 +875,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>                * end of the LRU a second time.
>>                */
>>               mapping = page_mapping(page);
>> -             if ((mapping && bdi_write_congested(mapping->backing_dev_info)) ||
>> +             if (((dirty || writeback) && mapping &&
>> +                  bdi_write_congested(mapping->backing_dev_info)) ||
>>                   (writeback && PageReclaim(page)))
>>                       nr_congested++;
>
> What are the observed runtime effects of this change?
