linux-kernel - Re: [HMM 12/15] mm/migrate: new memory migration helper for use with device memory v4

Message-ID: <ca12b033-8ec5-84b0-c2aa-ea829e1194fa@nvidia.com>
Date:   Fri, 14 Jul 2017 12:43:51 -0700
From:   Evgeny Baskakov <ebaskakov@...dia.com>
To:     Jerome Glisse <jglisse@...hat.com>
CC:     "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        John Hubbard <jhubbard@...dia.com>,
        David Nellans <dnellans@...dia.com>,
        Mark Hairgrove <mhairgrove@...dia.com>,
        Sherry Cheung <SCheung@...dia.com>,
        Subhash Gutti <sgutti@...dia.com>
Subject: Re: [HMM 12/15] mm/migrate: new memory migration helper for use with
 device memory v4

On 7/13/17 1:16 PM, Jerome Glisse wrote:

> ...
>

Hi Jerome,

I have hit another kind of hang. Briefly, if a not-yet-allocated page
faults on the CPU during migration to device memory, any subsequent
migration of that page will fail. This can be triggered when a CPU page
fault happens immediately after migrate_vma() starts unmapping the
pages to migrate.

Please find attached a reproducer based on the sample driver. In the
hmm_test() function, an HMM_DMIRROR_MIGRATE request is triggered from a
separate thread for not-yet-allocated pages (coming from malloc). At the
same time, an HMM_DMIRROR_READ request is made for the same pages. This
results in a sporadic app-side hang, because a random number of pages
never migrate to device memory.

Note that if the pages are touched (initialized with data) before the
requests are issued, everything works as expected: all HMM_DMIRROR_READ
and HMM_DMIRROR_MIGRATE requests eventually succeed. See the comments in
the hmm_test() function.
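
For readers without the attachment, the shape of the race described above
can be sketched roughly as follows. This is not the attached reproducer:
the two thread bodies are hypothetical stubs standing in for the
HMM_DMIRROR_MIGRATE and HMM_DMIRROR_READ requests (whose ioctl argument
layout is not shown in this message), and run_race(), race_args and
SKETCH_PAGE_SIZE are names invented for illustration. With stubs in place
the sketch cannot hang; it only shows how the two requests overlap on the
same buffer and where the "touched first" case differs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, for illustration only */

struct race_args {
    void *buf;        /* buffer shared by both requests */
    size_t npages;
    int migrate_done; /* written only by the migrate thread */
    int read_done;    /* written only by the read thread */
};

/* Stub: the real test would issue HMM_DMIRROR_MIGRATE on a->buf here. */
static void *migrate_thread(void *p)
{
    struct race_args *a = p;
    a->migrate_done = 1;
    return NULL;
}

/* Stub: the real test would issue HMM_DMIRROR_READ on a->buf here,
 * which is what makes the CPU side fault on the same pages. */
static void *read_thread(void *p)
{
    struct race_args *a = p;
    a->read_done = 1;
    return NULL;
}

/* Runs both requests concurrently on one malloc'ed buffer.
 * touch_first != 0 initializes the pages beforehand (the working case);
 * touch_first == 0 leaves them untouched (the case reported to hang).
 * Returns 0 if both stubs completed, -1 on any failure. */
int run_race(size_t npages, int touch_first)
{
    struct race_args a = {0};
    pthread_t m, r;

    assert(npages > 0);
    a.npages = npages;
    a.buf = malloc(npages * SKETCH_PAGE_SIZE);
    if (!a.buf)
        return -1;
    if (touch_first)
        memset(a.buf, 0, npages * SKETCH_PAGE_SIZE);

    if (pthread_create(&m, NULL, migrate_thread, &a) != 0) {
        free(a.buf);
        return -1;
    }
    if (pthread_create(&r, NULL, read_thread, &a) != 0) {
        pthread_join(m, NULL);
        free(a.buf);
        return -1;
    }
    pthread_join(m, NULL);
    pthread_join(r, NULL);
    free(a.buf);
    return (a.migrate_done && a.read_done) ? 0 : -1;
}
```

Calling run_race(n, 0) corresponds to the failing scenario and
run_race(n, 1) to the working one; the actual behavior depends entirely
on the real driver requests that the stubs replace.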

Thanks!

-- 
Evgeny Baskakov
NVIDIA

Download attachment "sanity_rmem004_repeated_faults_threaded_notallocated.tgz" of type "application/x-gzip" (5786 bytes)