linux-kernel - Re: [PATCH v3] mm: fix tick timer stall during deferred page init

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <427ce795-0274-56e5-16e4-7be00c7145f7@linux.alibaba.com>
Date:   Fri, 27 Mar 2020 12:39:18 +0800
From:   Shile Zhang <shile.zhang@...ux.alibaba.com>
To:     Pavel Tatashin <pasha.tatashin@...een.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] mm: fix tick timer stall during deferred page init



On 2020/3/27 03:36, Pavel Tatashin wrote:
> I agree with Daniel, we should look into approach where
> pgdat_resize_lock is taken only for the duration of updating tracking
> values such as pgdat->first_deferred_pfn (perhaps we would need to add
> another tracker that would show chunks that are currently being worked
> on).
>
> The vast duration of struct page initialization process should happen
> outside of this lock, and only be taken when we update globally seen
> data structures: lists, tracking variables. This way we can solve
> several problems: 1. allow interrupt threads to grow zones if
> required. 2. keep jiffies happy. 3. allow future scaling when we will
> add inner node threads to initialize struct pages (i.e. ktasks from
> Daniel).

It make sense, looking forward to the inner node parallel init.

@Daniel
Is there schedule about ktasks?

Thanks!

>
> Pasha
>
> On Thu, Mar 26, 2020 at 2:58 PM Daniel Jordan
> <daniel.m.jordan@...cle.com> wrote:
>> On Thu, Mar 19, 2020 at 03:05:12PM -0400, Daniel Jordan wrote:
>>> Regardless,
>>> Reviewed-by: Daniel Jordan <daniel.m.jordan@...cle.com>
>> Darn, I spoke too soon.
>>
>> On a two-socket Xeon, smaller values of TICK_PAGE_COUNT caused the deferred
>> init timestamp to grow by over 25%.  This was with pgdatinit0 bound to the
>> timer interrupt CPU to make sure the issue always reproduces.
>>
>>                 TICK_PAGE_COUNT     node 0 deferred
>>                                     init time (ms)
>>                 ---------------     ---------------
>>                            4096                 610
>>                            8192                 587
>>                           16384                 487
>>                           32768                 480    // used in the patch
>>
>> Instead of trying to find a constant that lets the timer interrupt run often
>> enough, I think a better way forward is to reconsider how we handle the resize
>> lock.  I plan to prototype something and reply back with what I get.

Yes, the time spend of pages init depends on the CPU frequency,
and the jiffies update controlled by HZ, so it's hard to find a general 
constant.

It seems we need a bigger refactors about the lock to get a better solution.