Date:   Thu, 1 Jun 2017 10:09:09 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Mike Rapoport <rppt@...ux.vnet.ibm.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Arnd Bergmann <arnd@...db.de>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Pavel Emelyanov <xemul@...tuozzo.com>,
        linux-mm <linux-mm@...ck.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

On Thu 01-06-17 09:53:02, Mike Rapoport wrote:
> On Tue, May 30, 2017 at 04:39:41PM +0200, Michal Hocko wrote:
> > On Tue 30-05-17 16:04:56, Andrea Arcangeli wrote:
> > > 
> > > While UFFDIO_COPY is certainly not a major slowdown, it is likely
> > > measurable at the microbenchmark level because it adds a kernel
> > > enter/exit to every 4k memcpy. It's not hard to imagine that being
> > > measurable. How that impacts the total precopy time I don't know;
> > > it would need to be benchmarked to be sure.
> > 
> > Yes, please!
> 
> I've run a simple test (below) that fills 1G of memory either with memcpy
> or with ioctl(UFFDIO_COPY) in 4K chunks.
> The machine I used has two "Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz" and
> 128G of RAM.
> I've averaged the elapsed time reported by /usr/bin/time over 100 runs,
> and here is what I got:
> 
> memcpy with THP on: 0.3278 sec
> memcpy with THP off: 0.5295 sec
> UFFDIO_COPY: 0.44 sec

I assume that the standard deviation is small?
 
> That said, for the CRIU use case UFFDIO_COPY seems faster than disabling
> THP and then doing memcpy.

That is a bit surprising. I didn't think that the userfault syscall
(ioctl) could be faster than a regular #PF, but the fact that
__mcopy_atomic bypasses the page fault path and can be optimized for
the anon case suggests that we save some cycles on each page, so the
cumulative savings can become visible.
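For reference, the test program referred to above as "(below)" is not
reproduced in this excerpt. A minimal sketch of what the UFFDIO_COPY
variant of such a fill loop might look like follows; the fill pattern,
the absence of timing code, and the error handling style are
assumptions of this sketch, not details taken from the original test:

#include <linux/userfaultfd.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

#define AREA_SIZE (1UL << 30)   /* 1G destination area */
#define CHUNK     4096UL        /* copy granularity, one base page */

int main(void)
{
        /* Open the userfaultfd and negotiate the API version. */
        int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
        if (uffd < 0) {
                perror("userfaultfd");
                return 1;
        }

        struct uffdio_api api = { .api = UFFD_API };
        if (ioctl(uffd, UFFDIO_API, &api)) {
                perror("UFFDIO_API");
                return 1;
        }

        /* Anonymous destination; nothing is populated yet. The "THP off"
         * memcpy variant would instead madvise(dst, AREA_SIZE,
         * MADV_NOHUGEPAGE) here and then memcpy() into dst directly. */
        char *dst = mmap(NULL, AREA_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (dst == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /* Register the whole range for missing-page handling so that
         * UFFDIO_COPY is allowed to populate it. */
        struct uffdio_register reg = {
                .range = { .start = (unsigned long)dst, .len = AREA_SIZE },
                .mode = UFFDIO_REGISTER_MODE_MISSING,
        };
        if (ioctl(uffd, UFFDIO_REGISTER, &reg)) {
                perror("UFFDIO_REGISTER");
                return 1;
        }

        /* A single 4K source page, reused for every copy. */
        static char src[CHUNK];
        memset(src, 0xa5, CHUNK);       /* arbitrary fill pattern */

        /* Fill 1G in 4K chunks. Each iteration is one kernel enter/exit
         * via ioctl(UFFDIO_COPY), which is the per-page cost under
         * discussion; the memcpy variants instead take a #PF on first
         * touch of each page (or huge page, with THP on). */
        for (unsigned long off = 0; off < AREA_SIZE; off += CHUNK) {
                struct uffdio_copy copy = {
                        .dst = (unsigned long)dst + off,
                        .src = (unsigned long)src,
                        .len = CHUNK,
                        .mode = 0,
                };
                if (ioctl(uffd, UFFDIO_COPY, &copy)) {
                        perror("UFFDIO_COPY");
                        return 1;
                }
        }

        close(uffd);
        return 0;
}

The memcpy variants would skip the userfaultfd setup entirely and just
write dst in 4K chunks, with or without the MADV_NOHUGEPAGE call noted
above to disable THP for the area.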

-- 
Michal Hocko
SUSE Labs
