Message-ID: <ce3a91db-0fa0-8dda-492d-2ddd281070a7@oracle.com>
Date: Fri, 13 Oct 2017 10:18:34 -0600
From: Khalid Aziz <khalid.aziz@...cle.com>
To: Anthony Yznaga <anthony.yznaga@...cle.com>
Cc: David Miller <davem@...emloft.net>, dave.hansen@...ux.intel.com,
corbet@....net, Bob Picco <bob.picco@...cle.com>,
STEVEN_SISTARE <steven.sistare@...cle.com>,
Pasha Tatashin <pasha.tatashin@...cle.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Rob Gardner <rob.gardner@...cle.com>, mingo@...nel.org,
Nitin Gupta <nitin.m.gupta@...cle.com>,
kirill.shutemov@...ux.intel.com,
Tom Hromatka <tom.hromatka@...cle.com>,
Eric Saint Etienne <eric.saint.etienne@...cle.com>,
Allen Pais <allen.pais@...cle.com>, cmetcalf@...lanox.com,
akpm@...ux-foundation.org, geert@...ux-m68k.org, pmladek@...e.com,
tklauser@...tanz.ch, Atish Patra <atish.patra@...cle.com>,
Shannon Nelson <shannon.nelson@...cle.com>,
Vijay Kumar <vijay.ac.kumar@...cle.com>, peterz@...radead.org,
mhocko@...e.com, jack@...e.cz, lstoakes@...il.com,
punit.agrawal@....com, hughd@...gle.com, thomas.tai@...cle.com,
paul.gortmaker@...driver.com, ross.zwisler@...ux.intel.com,
dave.jiang@...el.com, willy@...radead.org, ying.huang@...el.com,
zhongjiang@...wei.com, minchan@...nel.org,
imbrenda@...ux.vnet.ibm.com, aneesh.kumar@...ux.vnet.ibm.com,
aarcange@...hat.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, sparclinux@...r.kernel.org,
linux-mm@...ck.org, Khalid Aziz <khalid@...ehiking.org>
Subject: Re: [PATCH v8 9/9] sparc64: Add support for ADI (Application Data
Integrity)
On 10/13/2017 08:14 AM, Khalid Aziz wrote:
> On 10/12/2017 02:27 PM, Anthony Yznaga wrote:
>>
>>> On Oct 12, 2017, at 7:44 AM, Khalid Aziz <khalid.aziz@...cle.com> wrote:
>>>
>>>
>>> On 10/06/2017 04:12 PM, Anthony Yznaga wrote:
>>>>> On Sep 25, 2017, at 9:49 AM, Khalid Aziz <khalid.aziz@...cle.com>
>>>>> wrote:
>>>>>
>>>>> This patch extends mprotect to enable ADI (TSTATE.mcde),
>>>>> enable/disable
>>>>> MCD (Memory Corruption Detection) on selected memory ranges, enable
>>>>> TTE.mcd in PTEs, return ADI parameters to userspace and
>>>>> save/restore ADI
>>>>> version tags on page swap out/in or migration. ADI is not enabled by
>>>> I still don't believe migration is properly supported. Your
>>>> implementation is relying on a fault happening on a page while its
>>>> migration is in progress so that do_swap_page() will be called, but
>>>> I don't see how do_swap_page() will be called if a fault does not
>>>> happen until after the migration has completed.
>>>
>>> User pages are on LRU list and for the mapped pages on LRU list,
>>> migrate_pages() ultimately calls try_to_unmap_one and makes a
>>> migration swap entry for the page being migrated. This forces a page
>>> fault upon access on the destination node and the page is swapped
>>> back in from swap cache. The fault is forced by the migration swap
>>> entry, rather than fault being an accidental event. If page fault
>>> happens on the destination node while migration is in progress,
>>> do_swap_page() waits until migration is done. Please take a look at
>>> the code in __unmap_and_move().
>>
>> I looked at the code again, and I now believe ADI tags are never
>> restored for migrated pages. Here's why:
>>
>
> I will take a look at it again. I have run extensive tests migrating
> pages of a process across multiple NUMA nodes over and over again and
> ADI tags were never lost, so this does work. I won't rule out the
> possibility of having missed a code path where tags are not restored and
> I will look for it.
Anthony,
I just ran my migration test again. The test:
- mallocs 16 GB of memory
- assigns a rotating ADI tag to every 64 bytes of the malloc'd buffer
- writes a pattern to the entire buffer
- verifies the pattern it wrote using ADI-tagged addresses
While this test was running, I had a script migrate the test program's
pages between two NUMA nodes every 30 seconds using the migratepages
command. I did not see an ADI tag mismatch over multiple runs of this
test. This test shows migration is working.
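For readers without sparc64 hardware, the shape of the test described above can be sketched as a portable simulation. This is not the actual test program. It models the hardware's per-block version tags with a plain shadow array, and every name in it (sim_region, sim_migrate, rotating_tag and so on) is hypothetical. It assumes a 64-byte ADI block and a 4-bit tag carried in virtual address bits 63:60, and it rotates through tags 1..14 on the assumption that 0 and 15 are reserved. The keep_tags flag models the two outcomes under discussion: a migration that carries tags to the destination page and one that loses them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ADI_BLKSZ     64u  /* assumed ADI block size in bytes */
#define ADI_TAG_SHIFT 60u  /* assumed: tag lives in VA bits 63:60 */
#define ADI_TAG_MASK  UINT64_C(0xF)

/* Hypothetical model of a tagged region: data plus one shadow tag per block. */
struct sim_region {
    uint8_t *data;
    uint8_t *tags;
    size_t len;            /* must be a multiple of ADI_BLKSZ */
};

/* Rotate through tags 1..14 (assumption: 0 and 15 are reserved). */
static unsigned rotating_tag(size_t blk) { return (unsigned)(blk % 14u) + 1u; }

/* Place a version tag in the upper four bits of an address. */
static uint64_t tagged_addr(uint64_t addr, unsigned tag)
{
    addr &= ~(ADI_TAG_MASK << ADI_TAG_SHIFT);
    return addr | (((uint64_t)tag & ADI_TAG_MASK) << ADI_TAG_SHIFT);
}

/* Extract the version tag from a tagged address. */
static unsigned addr_tag(uint64_t addr)
{
    return (unsigned)((addr >> ADI_TAG_SHIFT) & ADI_TAG_MASK);
}

/* Deterministic pattern so the verifier can recompute expected bytes. */
static uint8_t pattern_byte(size_t off) { return (uint8_t)(off * 31u + 7u); }

static struct sim_region sim_alloc(size_t len)
{
    struct sim_region r;
    assert(len % ADI_BLKSZ == 0);
    r.len = len;
    r.data = calloc(len, 1);
    r.tags = calloc(len / ADI_BLKSZ, 1);
    assert(r.data && r.tags);
    return r;
}

/* Assign a rotating tag to each block and write the pattern. */
static void sim_tag_and_fill(struct sim_region *r)
{
    for (size_t off = 0; off < r->len; off++) {
        if (off % ADI_BLKSZ == 0)
            r->tags[off / ADI_BLKSZ] = (uint8_t)rotating_tag(off / ADI_BLKSZ);
        r->data[off] = pattern_byte(off);
    }
}

/* Copy the region to a new "node"; tags are carried only if keep_tags is set. */
static struct sim_region sim_migrate(const struct sim_region *src, int keep_tags)
{
    struct sim_region dst = sim_alloc(src->len);
    memcpy(dst.data, src->data, src->len);
    if (keep_tags)
        memcpy(dst.tags, src->tags, src->len / ADI_BLKSZ);
    return dst;
}

/* Count blocks whose stored tag or data no longer match what was written. */
static size_t sim_verify(const struct sim_region *r)
{
    size_t bad = 0;
    for (size_t blk = 0; blk < r->len / ADI_BLKSZ; blk++) {
        size_t off = blk * ADI_BLKSZ;
        uint64_t va = tagged_addr((uint64_t)(uintptr_t)(r->data + off),
                                  rotating_tag(blk));
        int ok = addr_tag(va) == r->tags[blk];
        for (size_t i = 0; ok && i < ADI_BLKSZ; i++)
            ok = r->data[off + i] == pattern_byte(off + i);
        bad += !ok;
    }
    return bad;
}
```

A caller would allocate a region with sim_alloc(), run sim_tag_and_fill(), migrate it with sim_migrate(), and check that sim_verify() returns 0. On real hardware the mismatch would surface as an MCD exception on access through the tagged address rather than as a count.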
Can you give me a test that shows the failure you think we should see?
I will debug it.
Thanks,
Khalid