Message-ID: <45f1bffe-8a0b-2969-32d4-e24a911a647d@redhat.com>
Date: Fri, 19 Feb 2021 10:30:12 +0100
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...e.com>, Minchan Kim <minchan@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
linux-mm <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>, joaodias@...gle.com
Subject: Re: [PATCH] mm: be more verbose for alloc_contig_range faliures

On 19.02.21 10:28, Michal Hocko wrote:
> On Thu 18-02-21 08:19:50, Minchan Kim wrote:
>> On Thu, Feb 18, 2021 at 10:43:21AM +0100, David Hildenbrand wrote:
>>> On 18.02.21 10:35, Michal Hocko wrote:
>>>> On Thu 18-02-21 10:02:43, David Hildenbrand wrote:
>>>>> On 18.02.21 09:56, Michal Hocko wrote:
>>>>>> On Wed 17-02-21 08:36:03, Minchan Kim wrote:
>>>>>>> alloc_contig_range is usually used on a cma area or the movable zone.
>>>>>>> It's critical if page migration fails on those areas, so
>>>>>>> dump more debugging messages, like memory_hotplug does, unless the
>>>>>>> user specifies __GFP_NOWARN.
>>>>>>
>>>>>> I agree with David that this has a potential to generate a lot of output
>>>>>> and it is not really clear whether it is worth it. Page isolation code
>>>>>> already has REPORT_FAILURE mode which is currently used only for the memory
>>>>>> hotplug because this was just too noisy from the CMA path - d381c54760dc
>>>>>> ("mm: only report isolation failures when offlining memory").
>>>>>>
>>>>>> Maybe migration failures are less likely, but still.
>>>>>
>>>>> Side note: I really dislike that uncontrolled error reporting on memory
>>>>> offlining path we have enabled as default. Yeah, it might be useful for
>>>>> ZONE_MOVABLE in some cases, but otherwise it's just noise.
>>>>>
>>>>> Just do a "sudo stress-ng --memhotplug 1" and see the log getting flooded
>>>>
>>>> Anyway we can discuss this in a separate thread but I think this is not
>>>> a representative workload.
>>>
>>> Sure, but the essence is "this is noise", and we'll have more noise on
>>> alloc_contig_range() as we see these calls more frequently. There should be
>>> an explicit way to enable such *debug* messages.
>>
>> alloc_contig_range already has gfp_mask and it respects __GFP_NOWARN.
>> Why shouldn't people use it if they don't care about the failure?
>> Semantically, it makes sense to me.
>
> Well, alloc_contig_range doesn't really have to implement all the gfp
> flags. This is a matter of practicality. alloc_contig_range is quite
> different from the page allocator because it is to be expected that it
> can fail the request. This is a very optimistic allocation request. That
> would suggest that complaining about allocation failures is rather
> noisy.
>
> Now I do understand that some users would like to see why those
> allocations have failed. The question is whether that information is
> generally useful or it is more of a debugging aid. The amount of
> information is also an important aspect. It would be rather unfortunate
> to dump thousands of pages just because they cannot be migrated.
>
> I do not have a strong opinion here. We can make all alloc_contig_range
> users use __GFP_NOWARN by default and only skip the flag from the cma
> allocator but I am slowly leaning towards (ab)using dynamic debugging
> infrastructure for this.

Just so I understand what you are referring to - trace points?
--
Thanks,
David / dhildenb