lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <17b6a3a3-bd7d-f57e-8762-96258b16247a@intel.com>
Date:   Tue, 10 Aug 2021 11:56:12 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Andi Kleen <ak@...ux.intel.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Borislav Petkov <bp@...en8.de>,
        Andy Lutomirski <luto@...nel.org>,
        Sean Christopherson <seanjc@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Joerg Roedel <jroedel@...e.de>
Cc:     Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Tom Lendacky <thomas.lendacky@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Ingo Molnar <mingo@...hat.com>,
        Varad Gautam <varad.gautam@...e.com>,
        Dario Faggioli <dfaggioli@...e.com>, x86@...nel.org,
        linux-mm@...ck.org, linux-coco@...ts.linux.dev,
        linux-kernel@...r.kernel.org,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH 1/5] mm: Add support for unaccepted memory

On 8/10/21 11:30 AM, Andi Kleen wrote:
>> So, this is right in the fast path of the page allocator.  It's a
>> one-time thing per 2M page, so it's not permanent.
>>
>> *But* there's both a global spinlock and a firmware call hidden in
>> clear_page_offline().  That's *GOT* to hurt if you were, for instance,
>> running a benchmark while this code path is being tickled.  Not just to
>>
>> That could be just downright catastrophic for scalability, albeit
>> temporarily
> 
> This would be only a short blib at initialization until the system
> reaches steady state. So yes it would be temporary, but very short at that.

But it can't be *that* short or we wouldn't be going to all this trouble
in the first place.  This can't simultaneously be both bad enough that
this series exists, but minor enough that nobody will notice or care at
runtime.

In general, I'd rather have a system which is running userspace, slowly,
than one where I'm waiting for the kernel.  The trade-off being made is
a *good* trade-off for me.  But, not everyone is going to agree with me.

This also begs the question of how folks know when this "blip" is over.
 Do we have a counter for offline pages?  Is there any way to force page
acceptance?  Or, are we just stuck allocating a bunch of memory to warm
up the system?

How do folks who care about these new blips avoid them?

Again, I don't particularly care about how this affects the
benchmarkers.  But, I do care that they're going to hound us when these
blips start impacting their 99th percentile tail latencies.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ