lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4ede47fd-d01e-6d9c-288a-2ec97b5392af@redhat.com>
Date:   Thu, 4 Mar 2021 11:32:29 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Oscar Salvador <osalvador@...e.de>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/2] mm: Make alloc_contig_range handle free hugetlb
 pages

>> I think we should not swallo such return values in
>> isolate_or_dissolve_huge_page() and instead properly report esp. -ENOMEM
>> properly up this callchain now. Otherwise we'll end up retrying / reporting
>> -EBUSY, which is misleading.
> 
> I am not sure I follow you here.
> So, atm, alloc_and_dissolve_huge_page can either generate -ENOMEM or
> -EBUSY wrt. error codes.
> -ENOMEM when we cannot allocate a page, and -EBUSY when we raced with
> someone.
> You mean to only report ENOMEM down the chain?

Yes, fatal errors.

> 
>>  From isolate_migratepages_range()/isolate_migratepages_block() we'll keep
>> reporting "pfn > 0".
>>
>> a) In isolate_migratepages_range() we'll keep iterating over pageblocks
>> although we should just fail with -ENOMEM right away.
>>
>> b) In __alloc_contig_migrate_range() we'll keep retrying up to 5 times
>> although we should just fail with -ENOMEM. We end up returning "-EBUSY"
>> after retrying.
>>
>> c) In alloc_contig_range() we'll continue trying to isolate although we
>> should just return -ENOMEM.
> 
> Yes, "fatal" errors get masked, and hence we treat everything as "things
> are busy, let us try again", and this is rather unforunate.
> 
>> I think we have should start returning proper errors from
>> isolate_migratepages_range()/isolate_migratepages_block() on critical issues
>> (-EINTR, -ENOMEM) instead of going via "!pfn vs. pfn" and retrying on "pfn".
>>
>> So we should then fail with -ENOMEM during isolate_migratepages_range()
>> cleanly, just as we would do when we get -ENOMEM during migrate_pages().
> 
> I guess we could rework the interface and make isolate_migratepages_range and
> isolate_migratepages_block to report the right thing.
> I yet have to check that this does not mess up a lot with the compaction
> interface.
> 
> But overall I agree with your point here, and I am willing to to tackle
> this.
> 
> The question is whether we want to do this as part of this series, or
> after this series gets picked up.
> IMHO, we could do it after, as a follow-up, unless you feel strong about
> it.
> 
> What do you think?

I think this is now the second fatal error we can have (-EINTR, 
-ENOMEM), thus the current interface (return "NULL" on fatal errros) no 
longer works properly.

No strong opinion about fixing this up on top - could be it's cleaner to 
just do it right away.

-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ