lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 18 Mar 2019 13:38:01 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Pankaj Suryawanshi <pankaj.suryawanshi@...fochips.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Michal Hocko <mhocko@...nel.org>,
        "aneesh.kumar@...ux.ibm.com" <aneesh.kumar@...ux.ibm.com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "minchan@...nel.org" <minchan@...nel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "khandual@...ux.vnet.ibm.com" <khandual@...ux.vnet.ibm.com>
Subject: Re: [External] Re: vmscan: Reclaim unevictable pages

On 18.03.2019 12:59, Pankaj Suryawanshi wrote:
>   
> From: Kirill Tkhai <ktkhai@...tuozzo.com>
> Sent: 18 March 2019 15:17:56
> To: Pankaj Suryawanshi; Vlastimil Babka; Michal Hocko; aneesh.kumar@...ux.ibm.com
> Cc: linux-kernel@...r.kernel.org; minchan@...nel.org; linux-mm@...ck.org; khandual@...ux.vnet.ibm.com; hillf.zj@...baba-inc.com
> Subject: Re: [External] Re: vmscan: Reclaim unevictable pages

Also, please, avoid irritating quoting like below ^^^. They just distract attention.
 
> On 18.03.2019 12:43, Pankaj Suryawanshi wrote:
>> Hi Kirill Tkhai,
>>
> 
> Please, do not top posting:  https://kernelnewbies.org/mailinglistguidelines
> 
> Okay.
> 
> mailinglistguidelines - Linux Kernel Newbies
> kernelnewbies.org
> Set of FAQs for kernelnewbies mailing list. If you are new to this list please read this page before you go on your quest for squeezing all the knowledge from fellow members.

And this spew ^^^.

>> Please see mm/vmscan.c in which it first added to list and than throw the error :
>> --------------------------------------------------------------------------------------------------
>> keep:
>>                  list_add(&page->lru, &ret_pages);
>>                  VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
>> ---------------------------------------------------------------------------------------------------
>>
>> Before throwing error, pages are added to list, this is under iteration of shrink_page_list().
> 
> I say about about the list, which is passed to shrink_page_list() as first argument.
> Did you mean candidate list which is passed to shrink_page_list().
> 
> shrink_inactive_list()
> {
>         isolate_lru_pages(&page_list); // <-- you can't obtain unevictable pages here.
>         shrink_page_list(&page_list);
> }
> 
> below is the overview of flow of calls for your information.
> 
> cma_alloc() ->
> alloc_contig_range() ->
> start_isolate_page_range() ->
> __alloc_contig_migrate_range() ->
> isolate_migratepages_range() ->
> reclaim_clean_pages_from_list() ->
> shrink_page_list()

Hm, isolate_migratepages_range() can take unevictable pages,
but then with your patch we just skip them in shrink_page_list().
Without your patch we bump into bug on.

These both look incorrect for me. Let's wait someone who familiar
with this logic.
  
>> From: Kirill Tkhai <ktkhai@...tuozzo.com>
>> Sent: 18 March 2019 15:03:15
>> To: Pankaj Suryawanshi; Vlastimil Babka; Michal Hocko; aneesh.kumar@...ux.ibm.com
>> Cc: linux-kernel@...r.kernel.org; minchan@...nel.org; linux-mm@...ck.org; khandual@...ux.vnet.ibm.com; hillf.zj@...baba-inc.com
>> Subject: Re: [External] Re: vmscan: Reclaim unevictable pages
>>   
>>
>> Hi, Pankaj,
>>
>> On 18.03.2019 12:09, Pankaj Suryawanshi wrote:
>>>
>>> Hello
>>>
>>> shrink_page_list() returns , number of pages reclaimed, when pages is unevictable it returns VM_BUG_ON_PAGE(PageLRU(page) || PageUnevicatble(page),page);
>>
>> the general idea is shrink_page_list() can't iterate PageUnevictable() pages.
>> PageUnevictable() pages are never being added to lists, which shrink_page_list()
>> uses for iteration. Also, a page can't be marked as PageUnevictable(), when
>> it's attached to a shrinkable list.
>>
>> So, the problem should be somewhere outside shrink_page_list().
>>
>> I won't suggest you something about CMA, since I haven't dived in that code.
>>
>>> We can add the unevictable pages in reclaim list in shrink_page_list(), return total number of reclaim pages including unevictable pages, let the caller handle unevictable pages.
>>>
>>> I think the problem is shrink_page_list is awkard. If page is unevictable it goto activate_locked->keep_locked->keep lables, keep lable list_add the unevictable pages and throw the VM_BUG instead of passing it to caller while it relies on caller for non-reclaimed-non-unevictable    page's putback.
>>> I think we can make it consistent so that shrink_page_list could return non-reclaimed pages via page_list and caller can handle it. As an advance, it could try to migrate mlocked pages without retrial.
>>>
>>>
>>> Below is the issue of CMA_ALLOC of large size buffer : (Kernel version - 4.14.65 (On Android pie [ARM])).
>>>
>>> [   24.718792] page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page))
>>> [   24.726949] page->mem_cgroup:bd008c00
>>> [   24.730693] ------------[ cut here ]------------
>>> [   24.735304] kernel BUG at mm/vmscan.c:1350!
>>> [   24.739478] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
>>>
>>>
>>> Below is the patch which solved this issue :
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index be56e2e..12ac353 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -998,7 +998,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>                 sc->nr_scanned++;
>>>  
>>>                 if (unlikely(!page_evictable(page)))
>>> -                       goto activate_locked;
>>> +                      goto cull_mlocked;
>>>  
>>>                 if (!sc->may_unmap && page_mapped(page))
>>>                         goto keep_locked;
>>> @@ -1331,7 +1331,12 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>                 } else
>>>                         list_add(&page->lru, &free_pages);
>>>                 continue;
>>> -
>>> +cull_mlocked:
>>> +                if (PageSwapCache(page))
>>> +                        try_to_free_swap(page);
>>> +                unlock_page(page);
>>> +                list_add(&page->lru, &ret_pages);
>>> +                continue;
>>>  activate_locked:
>>>                 /* Not a candidate for swapping, so reclaim swap space. */
>>>                 if (PageSwapCache(page) && (mem_cgroup_swap_full(page) ||
>>>
>>>
>>>
>>>
>>> It fixes the below issue.
>>>
>>> 1. Large size buffer allocation using cma_alloc successful with unevictable pages.
>>>
>>> cma_alloc of current kernel will fail due to unevictable page
>>>
>>> Please let me know if anything i am missing.
>>>
>>> Regards,
>>> Pankaj
>>>    
>>> From: Vlastimil Babka <vbabka@...e.cz>
>>> Sent: 18 March 2019 14:12:50
>>> To: Pankaj Suryawanshi; Kirill Tkhai; Michal Hocko; aneesh.kumar@...ux.ibm.com
>>> Cc: linux-kernel@...r.kernel.org; minchan@...nel.org; linux-mm@...ck.org; khandual@...ux.vnet.ibm.com; hillf.zj@...baba-inc.com
>>> Subject: Re: [External] Re: vmscan: Reclaim unevictable pages
>>>   
>>>
>>> On 3/15/19 11:11 AM, Pankaj Suryawanshi wrote:
>>>>
>>>> [ cc Aneesh kumar, Anshuman, Hillf, Vlastimil]
>>>
>>> Can you send a proper patch with changelog explaining the change? I
>>> don't know the context of this thread.
>>>
>>>> From: Pankaj Suryawanshi
>>>> Sent: 15 March 2019 11:35:05
>>>> To: Kirill Tkhai; Michal Hocko
>>>> Cc: linux-kernel@...r.kernel.org; minchan@...nel.org; linux-mm@...ck.org
>>>> Subject: Re: Re: [External] Re: vmscan: Reclaim unevictable pages
>>>>
>>>>
>>>>
>>>> [ cc linux-mm ]
>>>>
>>>>
>>>> From: Pankaj Suryawanshi
>>>> Sent: 14 March 2019 19:14:40
>>>> To: Kirill Tkhai; Michal Hocko
>>>> Cc: linux-kernel@...r.kernel.org; minchan@...nel.org
>>>> Subject: Re: Re: [External] Re: vmscan: Reclaim unevictable pages
>>>>
>>>>
>>>>
>>>> Hello ,
>>>>
>>>> Please ignore the curly braces, they are just for debugging.
>>>>
>>>> Below is the updated patch.
>>>>
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index be56e2e..12ac353 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -998,7 +998,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>>                  sc->nr_scanned++;
>>>>
>>>>                  if (unlikely(!page_evictable(page)))
>>>> -                       goto activate_locked;
>>>> +                      goto cull_mlocked;
>>>>
>>>>                  if (!sc->may_unmap && page_mapped(page))
>>>>                          goto keep_locked;
>>>> @@ -1331,7 +1331,12 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>>                  } else
>>>>                          list_add(&page->lru, &free_pages);
>>>>                  continue;
>>>> -
>>>> +cull_mlocked:
>>>> +                if (PageSwapCache(page))
>>>> +                        try_to_free_swap(page);
>>>> +                unlock_page(page);
>>>> +                list_add(&page->lru, &ret_pages);
>>>> +                continue;
>>>>   activate_locked:
>>>>                  /* Not a candidate for swapping, so reclaim swap space. */
>>>>                  if (PageSwapCache(page) && (mem_cgroup_swap_full(page) ||
>>>>
>>>>
>>>>
>>>> Regards,
>>>> Pankaj
>>>>
>>>>
>>>> From: Kirill Tkhai <ktkhai@...tuozzo.com>
>>>> Sent: 14 March 2019 14:55:34
>>>> To: Pankaj Suryawanshi; Michal Hocko
>>>> Cc: linux-kernel@...r.kernel.org; minchan@...nel.org
>>>> Subject: Re: Re: [External] Re: vmscan: Reclaim unevictable pages
>>>>
>>>>
>>>> On 14.03.2019 11:52, Pankaj Suryawanshi wrote:
>>>>>
>>>>> I am using kernel version 4.14.65 (on Android pie [ARM]).
>>>>>
>>>>> No additional patches applied on top of vanilla.(Core MM).
>>>>>
>>>>> If  I change in the vmscan.c as below patch, it will work.
>>>>
>>>> Sorry, but 4.14.65 does not have braces around trylock_page(),
>>>> like in your patch below.
>>>>
>>>> See        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/mm/vmscan.c?h=v4.14.65
>>>>
>>>> [...]
>>>>
>>>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>>>> index be56e2e..2e51edc 100644
>>>>>> --- a/mm/vmscan.c
>>>>>> +++ b/mm/vmscan.c
>>>>>> @@ -990,15 +990,17 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>>>>                   page = lru_to_page(page_list);
>>>>>>                   list_del(&page->lru);
>>>>>>
>>>>>>                  if (!trylock_page(page)) {
>>>>>>                           goto keep;
>>>>>>                  }
>>>>
>>>> ************************************************************************************************************************************************************* eInfochips Business Disclaimer: This e-mail message and all attachments transmitted with it are   intended  solely for the use of the addressee and may contain legally privileged and confidential information. If the reader of this message is not the intended recipient, or an employee or agent responsible for delivering this message to the intended recipient,   you  are hereby notified that any dissemination, distribution, copying, or other use of this message or its attachments is strictly prohibited. If you have received this message in error, please notify the sender immediately by replying to this message and   please  delete it from your computer. Any views expressed in this message are those of the individual sender unless otherwise stated. Company has taken enough precautions to prevent the spread of viruses. However the company accepts no liability for any damage   caused  by any virus transmitted by this email. *************************************************************************************************************************************************************
>>>>
>>>
>>>      
>>>
>>      
>>
>     
> 

Powered by blists - more mailing lists