Message-ID: <26cb01ff-a094-79f4-7ceb-291e5e053c58@suse.cz>
Date: Mon, 22 Oct 2018 15:15:38 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Michal Hocko <mhocko@...nel.org>,
"Kirill A. Shutemov" <kirill@...temov.name>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>,
David Rientjes <rientjes@...gle.com>,
Andrea Arcangeli <andrea@...nel.org>,
Zi Yan <zi.yan@...rutgers.edu>,
Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>,
linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mm, thp: consolidate THP gfp handling into
alloc_hugepage_direct_gfpmask
On 9/26/18 4:22 PM, Michal Hocko wrote:
> On Wed 26-09-18 16:17:08, Michal Hocko wrote:
>> On Wed 26-09-18 16:30:39, Kirill A. Shutemov wrote:
>>> On Tue, Sep 25, 2018 at 02:03:26PM +0200, Michal Hocko wrote:
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index c3bc7e9c9a2a..c0bcede31930 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -629,21 +629,40 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
>>>> * available
>>>> * never: never stall for any thp allocation
>>>> */
>>>> -static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma)
>>>> +static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma, unsigned long addr)
>>>> {
>>>> const bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE);
>>>> + gfp_t this_node = 0;
>>>> +
>>>> +#ifdef CONFIG_NUMA
>>>> + struct mempolicy *pol;
>>>> + /*
>>>> + * __GFP_THISNODE is used only when __GFP_DIRECT_RECLAIM is not
>>>> + * specified, to express a general desire to stay on the current
>>>> + * node for optimistic allocation attempts. If the defrag mode
>>>> + * and/or madvise hint requires the direct reclaim then we prefer
>>>> + * to fallback to other node rather than node reclaim because that
>>>> + * can lead to excessive reclaim even though there is free memory
>>>> + * on other nodes. We expect that NUMA preferences are specified
>>>> + * by memory policies.
>>>> + */
>>>> + pol = get_vma_policy(vma, addr);
>>>> + if (pol->mode != MPOL_BIND)
>>>> + this_node = __GFP_THISNODE;
>>>> + mpol_cond_put(pol);
>>>> +#endif
>>>
>>> I'm not very good with NUMA policies. Could you explain in more detail how
>>> the code above is equivalent to the code below?
>>
>> MPOL_PREFERRED is handled by policy_node() before we call __alloc_pages_nodemask.
>> __GFP_THISNODE is applied only when we are not using
>> __GFP_DIRECT_RECLAIM, which is now handled in
>> alloc_hugepage_direct_gfpmask.
>> Lastly, MPOL_BIND wasn't handled explicitly, but in the end the removed
>> late check would remove __GFP_THISNODE for it as well. So in the end we
>> are doing the same thing, unless I miss something.
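
To restate that rule in a compilable form, here is a small user-space
sketch; the MOCK_* bits and the thp_gfp() helper are illustrative
stand-ins rather than the kernel's symbols, and the function only mimics
the intent of the patched alloc_hugepage_direct_gfpmask() as described
above (node-local only for optimistic attempts, never for MPOL_BIND):

/*
 * User-space illustration, not kernel code: mock gfp bits and policy modes
 * showing the rule discussed above -- the THISNODE bit is only ORed in for
 * optimistic attempts (no direct reclaim) and only when the VMA policy is
 * not MPOL_BIND.
 */
#include <stdio.h>

#define MOCK_DIRECT_RECLAIM	0x1u	/* stands in for __GFP_DIRECT_RECLAIM */
#define MOCK_THISNODE		0x2u	/* stands in for __GFP_THISNODE */

enum mpol_mode { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND };

/* mirrors the intent of the patched alloc_hugepage_direct_gfpmask() */
static unsigned int thp_gfp(unsigned int base, enum mpol_mode mode)
{
	unsigned int this_node = 0;

	if (mode != MPOL_BIND)
		this_node = MOCK_THISNODE;

	/* stay node-local only for attempts that will not stall in reclaim */
	if (base & MOCK_DIRECT_RECLAIM)
		return base;
	return base | this_node;
}

int main(void)
{
	printf("optimistic, default policy: %#x\n", thp_gfp(0, MPOL_DEFAULT));
	printf("optimistic, MPOL_BIND:      %#x\n", thp_gfp(0, MPOL_BIND));
	printf("direct reclaim, default:    %#x\n",
	       thp_gfp(MOCK_DIRECT_RECLAIM, MPOL_DEFAULT));
	return 0;
}

Built with a plain C compiler, only the first case ends up with the
THISNODE bit set.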
>
> Forgot to add. One notable exception would be that the previous code
> would allow hitting
> WARN_ON_ONCE(policy->mode == MPOL_BIND && (gfp & __GFP_THISNODE));
> in policy_node if the requested node (e.g. the cpu-local one) was outside
> of the mbind nodemask. This is not possible now. We haven't heard about
> any such warning yet, so it is unlikely to happen in practice, though.
I don't think the previous code could hit the warning, as the hugepage
path that would add __GFP_THISNODE didn't call policy_node() (which
contains the warning) at all. IIRC an early version of your patch did hit
the warning though, which is why you added the MPOL_BIND policy check.
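
For reference, here is a standalone sketch of the conflict that
WARN_ON_ONCE() guards against (mock types, not the kernel source; the
real check sits in policy_node() as quoted above): MPOL_BIND restricts
the allocation to the bind nodemask while __GFP_THISNODE pins it to the
requesting CPU's node, so if that node is outside the mask the two
constraints cannot both be satisfied.

/*
 * Standalone sketch (mock types, not kernel source) of the combination the
 * quoted WARN_ON_ONCE() is meant to catch: MPOL_BIND together with a
 * THISNODE request whose local node lies outside the bind nodemask.
 */
#include <stdio.h>
#include <stdbool.h>

#define MOCK_THISNODE 0x2u			/* stands in for __GFP_THISNODE */

enum mpol_mode { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND };

struct mock_policy {
	enum mpol_mode mode;
	unsigned long nodemask;			/* bit n set => node n allowed */
};

/* mirrors the condition in the quoted WARN_ON_ONCE() */
static bool would_warn(const struct mock_policy *pol, unsigned int gfp)
{
	return pol->mode == MPOL_BIND && (gfp & MOCK_THISNODE);
}

int main(void)
{
	struct mock_policy bind_to_node1 = { MPOL_BIND, 1ul << 1 };
	int local_node = 0;			/* CPU-local node outside the mask */

	printf("local node %d allowed by mask: %d\n",
	       local_node, !!(bind_to_node1.nodemask & (1ul << local_node)));
	printf("MPOL_BIND + THISNODE would warn: %d\n",
	       would_warn(&bind_to_node1, MOCK_THISNODE));
	return 0;
}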