[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <92938a5c-ad19-4571-04f3-03530f687f61@redhat.com>
Date: Wed, 1 Apr 2020 18:15:59 +0200
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...nel.org>,
Daniel Jordan <daniel.m.jordan@...cle.com>
Cc: Shile Zhang <shile.zhang@...ux.alibaba.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Pavel Tatashin <pasha.tatashin@...een.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] mm: fix tick timer stall during deferred page init
On 01.04.20 18:12, Michal Hocko wrote:
> On Wed 01-04-20 12:09:29, Daniel Jordan wrote:
>> On Wed, Apr 01, 2020 at 06:00:48PM +0200, Michal Hocko wrote:
>>> On Wed 01-04-20 17:50:22, David Hildenbrand wrote:
>>>> On 01.04.20 17:42, Michal Hocko wrote:
>>>>> I am sorry but I have completely missed this patch.
>>>>>
>>>>> On Wed 11-03-20 20:38:48, Shile Zhang wrote:
>>>>>> When 'CONFIG_DEFERRED_STRUCT_PAGE_INIT' is set, 'pgdatinit' kthread will
>>>>>> initialise the deferred pages with local interrupts disabled. It is
>>>>>> introduced by commit 3a2d7fa8a3d5 ("mm: disable interrupts while
>>>>>> initializing deferred pages").
>>>>>>
>>>>>> On machine with NCPUS <= 2, the 'pgdatinit' kthread could be bound to
>>>>>> the boot CPU, which could caused the tick timer long time stall, system
>>>>>> jiffies not be updated in time.
>>>>>>
>>>>>> The dmesg shown that:
>>>>>>
>>>>>> [ 0.197975] node 0 initialised, 32170688 pages in 1ms
>>>>>>
>>>>>> Obviously, 1ms is unreasonable.
>>>>>>
>>>>>> Now, fix it by restore in the pending interrupts for every 32*1204 pages
>>>>>> (128MB) initialized, give the chance to update the systemd jiffies.
>>>>>> The reasonable demsg shown likes:
>>>>>>
>>>>>> [ 1.069306] node 0 initialised, 32203456 pages in 894ms
>>>>>>
>>>>>> Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages").
>>>>>
>>>>> I dislike this solution TBH. It effectivelly conserves the current code
>>>>> and just works around the problem. Why do we hold the IRQ lock here in
>>>>> the first place? This is an early init code and a very limited code is
>>>>> running at this stage. Certainly nothing memory hotplug related which
>>>>> should be the only path really interested in the resize lock AFAIR.
>>>>
>>>> Yeah, I don't think ACPI and friends are up yet.
>>>
>>> Just to save somebody time to check. The deferred initialization blocks
>>> the further boot until all workders are done - see page_alloc_init_late
>>> (kernel_init path).
>>
>> Ha, I just finished following all the hotplug paths to check this out, and as
>> you all know there are a *lot* :-) Well at least we're in agreement.
>
> Good to have it double checked!
>
>>>>> This needs a double checking but I strongly believe that the lock can be
>>>>> simply dropped in this path.
>>
>> This is what my fix does, it limits the time the resize lock is held.
>
> Just remove it from the deferred intialization and add a comment that we
> deliberately not taking the lock here because abc
>
We should just double check what
commit 3a2d7fa8a3d5ae740bd0c21d933acc6220857ed0
Author: Pavel Tatashin <pasha.tatashin@...cle.com>
Date: Thu Apr 5 16:22:27 2018 -0700
mm: disable interrupts while initializing deferred pages
Vlastimil Babka reported about a window issue during which when deferred
pages are initialized, and the current version of on-demand
initialization is finished, allocations may fail. While this is highly
unlikely scenario, since this kind of allocation request must be large,
and must come from interrupt handler, we still want to cover it.
We solve this by initializing deferred pages with interrupts disabled,
and holding node_size_lock spin lock while pages in the node are being
initialized. The on-demand deferred page initialization that comes
later will use the same lock, and thus synchronize with
deferred_init_memmap().
It is unlikely for threads that initialize deferred pages to be
interrupted. They run soon after smp_init(), but before modules are
initialized, and long before user space programs. This is why there is
no adverse effect of having these threads running with interrupts
disabled.
tried to solve does not apply.
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists