lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Sat, 25 May 2019 10:42:03 +0800
From:   Yang Shi <yang.shi@...ux.alibaba.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     ying.huang@...el.com, hannes@...xchg.org, mhocko@...e.com,
        mgorman@...hsingularity.net, kirill.shutemov@...ux.intel.com,
        josef@...icpanda.com, hughd@...gle.com, shakeelb@...gle.com,
        akpm@...ux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [v4 PATCH 2/2] mm: vmscan: correct some vmscan counters for



On 5/24/19 2:00 PM, Yang Shi wrote:
>
>
> On 5/24/19 1:51 PM, Hillf Danton wrote:
>> On Fri, 24 May 2019 09:27:02 +0800 Yang Shi wrote:
>>> On 5/23/19 11:51 PM, Hillf Danton wrote:
>>>> On Thu, 23 May 2019 10:27:38 +0800 Yang Shi wrote:
>>>>> @ -1642,14 +1650,14 @@ static unsigned long 
>>>>> isolate_lru_pages(unsigned long nr_to_scan,
>>>>>        unsigned long nr_zone_taken[MAX_NR_ZONES] = { 0 };
>>>>>        unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
>>>>>        unsigned long skipped = 0;
>>>>> -    unsigned long scan, total_scan, nr_pages;
>>>>> +    unsigned long scan, total_scan;
>>>>> +    unsigned long nr_pages;
>>>> Change for no earn:)
>>> Aha, yes.
>>>
>>>>>        LIST_HEAD(pages_skipped);
>>>>>        isolate_mode_t mode = (sc->may_unmap ? 0 : ISOLATE_UNMAPPED);
>>>>> +    total_scan = 0;
>>>>>        scan = 0;
>>>>> -    for (total_scan = 0;
>>>>> -         scan < nr_to_scan && nr_taken < nr_to_scan && 
>>>>> !list_empty(src);
>>>>> -         total_scan++) {
>>>>> +    while (scan < nr_to_scan && !list_empty(src)) {
>>>>>            struct page *page;
>>>> AFAICS scan currently prevents us from looping for ever, while 
>>>> nr_taken bails
>>>> us out once we get what's expected, so I doubt it makes much sense 
>>>> to cut
>>>> nr_taken off.
>>> It is because "scan < nr_to_scan && nr_taken >= nr_to_scan" is
>>> impossible now with the units fixed.
>>>
>> With the units fixed, nr_taken is no longer checked.
>
> It is because scan would be always >= nr_taken.
>
>>
>>>>>            page = lru_to_page(src);
>>>>> @@ -1657,9 +1665,12 @@ static unsigned long 
>>>>> isolate_lru_pages(unsigned long nr_to_scan,
>>>>>            VM_BUG_ON_PAGE(!PageLRU(page), page);
>>>>> +        nr_pages = 1 << compound_order(page);
>>>>> +        total_scan += nr_pages;
>>>>> +
>>>>>            if (page_zonenum(page) > sc->reclaim_idx) {
>>>>>                list_move(&page->lru, &pages_skipped);
>>>>> -            nr_skipped[page_zonenum(page)]++;
>>>>> +            nr_skipped[page_zonenum(page)] += nr_pages;
>>>>>                continue;
>>>>>            }
>>>>> @@ -1669,10 +1680,9 @@ static unsigned long 
>>>>> isolate_lru_pages(unsigned long nr_to_scan,
>>>>>             * ineligible pages.  This causes the VM to not reclaim 
>>>>> any
>>>>>             * pages, triggering a premature OOM.
>>>>>             */
>>>>> -        scan++;
>>>>> +        scan += nr_pages;
>>>> The comment looks to defy the change if we fail to add a huge page to
>>>> the dst list; otherwise nr_taken knows how to do the right thing. What
>>>> I prefer is to let scan to do one thing a time.
>>> I don't get your point. Do you mean the comment "Do not count skipped
>>> pages because that makes the function return with no isolated pages if
>>> the LRU mostly contains ineligible pages."? I'm supposed the comment is
>>> used to explain why not count skipped page.
>>>
>> Well consider the case where there is a huge page in the second place
>> reversely on the src list along with other 20 regular pages, and we are
>> not able to add the huge page to the dst list. Currently we can go on 
>> and
>> try to scan other pages, provided nr_to_scan is 32; with the units 
>> fixed,
>> however, scan goes over nr_to_scan, leaving us no chance to scan any 
>> page
>> that may be not busy. I wonder that triggers a premature OOM, because I
>> think scan means the number of list nodes we try to isolate, and
>> nr_taken the number of regular pages successfully isolated.
>
> Yes, good point. I think I just need roll back to what v3 did here to 
> get scan accounted for each case separately to avoid the possible 
> over-account.

By rethinking the code, I think "scan" here still should mean the number 
of base pages. If the case you mentioned happens, the right behavior 
should be to raise priority to give another round of scan.

And, vmscan uses sync isolation (mode = (sc->may_unmap ? 0 : 
ISOLATE_UNMAPPED)), it returns -EBUSY only when the page is freed 
somewhere else, so this should not cause premature OOM.

>
>>>>>            switch (__isolate_lru_page(page, mode)) {
>>>>>            case 0:
>>>>> -            nr_pages = hpage_nr_pages(page);
>>>>>                nr_taken += nr_pages;
>>>>>                nr_zone_taken[page_zonenum(page)] += nr_pages;
>>>>>                list_move(&page->lru, dst);
>>>>> -- 
>>>>> 1.8.3.1
>> Best Regards
>> Hillf
>

Powered by blists - more mailing lists