Message-ID: <beb4dfb5-e9d2-a76c-f965-28cff5e4658b@redhat.com>
Date:   Mon, 8 Feb 2021 09:21:42 +0100
From:   David Hildenbrand <david@...hat.com>
To:     "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>,
        Matthew Wilcox <willy@...radead.org>
Cc:     "Wangzhou (B)" <wangzhou1@...ilicon.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "jgg@...pe.ca" <jgg@...pe.ca>,
        "kevin.tian@...el.com" <kevin.tian@...el.com>,
        "jean-philippe@...aro.org" <jean-philippe@...aro.org>,
        "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "Liguozhu (Kenneth)" <liguozhu@...ilicon.com>,
        "zhangfei.gao@...aro.org" <zhangfei.gao@...aro.org>,
        "chensihang (A)" <chensihang1@...ilicon.com>
Subject: Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory
 pin

On 08.02.21 03:27, Song Bao Hua (Barry Song) wrote:
> 
> 
>> -----Original Message-----
>> From: owner-linux-mm@...ck.org [mailto:owner-linux-mm@...ck.org] On Behalf Of
>> Matthew Wilcox
>> Sent: Monday, February 8, 2021 2:31 PM
>> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>
>> Cc: Wangzhou (B) <wangzhou1@...ilicon.com>; linux-kernel@...r.kernel.org;
>> iommu@...ts.linux-foundation.org; linux-mm@...ck.org;
>> linux-arm-kernel@...ts.infradead.org; linux-api@...r.kernel.org; Andrew
>> Morton <akpm@...ux-foundation.org>; Alexander Viro <viro@...iv.linux.org.uk>;
>> gregkh@...uxfoundation.org; jgg@...pe.ca; kevin.tian@...el.com;
>> jean-philippe@...aro.org; eric.auger@...hat.com; Liguozhu (Kenneth)
>> <liguozhu@...ilicon.com>; zhangfei.gao@...aro.org; chensihang (A)
>> <chensihang1@...ilicon.com>
>> Subject: Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory
>> pin
>>
>> On Sun, Feb 07, 2021 at 10:24:28PM +0000, Song Bao Hua (Barry Song) wrote:
>>>>> In high-performance I/O cases, accelerators might want to perform
>>>>> I/O on memory without I/O page faults, which can dramatically
>>>>> increase latency. Current memory-related APIs cannot meet this
>>>>> requirement: e.g., mlock only prevents memory from being swapped out
>>>>> to a backing device; page migration can still trigger I/O page faults.
>>>>
>>>> Well ... we have two requirements.  The application wants to not take
>>>> page faults.  The system wants to move the application to a different
>>>> NUMA node in order to optimise overall performance.  Why should the
>>>> application's desires take precedence over the kernel's desires?  And why
>>>> should it be done this way rather than by the sysadmin using numactl to
>>>> lock the application to a particular node?
>>>
>>> NUMA balancing is just one of many reasons for page migration. Even a
>>> single alloc_pages() call can cause memory migration within a single
>>> NUMA node or on a UMA system.
>>>
>>> The other reasons for page migration include but are not limited to:
>>> * memory move due to CMA
>>> * memory move due to huge pages creation
>>>
>>> We can hardly ask users to disable compaction, CMA and huge pages
>>> across the whole system.
>>
>> You're dodging the question.  Should the CMA allocation fail because
>> another application is using SVA?
>>
>> I would say no.
> 
> I would say no as well.
> 
> When the IOMMU is enabled, CMA has almost only one user: the IOMMU
> driver, since other drivers rely on the IOMMU to use non-contiguous
> memory even though they still call dma_alloc_coherent().
> 
> In the IOMMU driver, dma_alloc_coherent() is called during
> initialization and there are no new allocations afterwards, so it
> wouldn't affect SVA performance at runtime. Even if there are new
> allocations, CMA will fall back to general alloc_pages(), and IOMMU
> drivers mostly allocate small amounts of memory for command queues.
> 
> So I would say general compound pages and huge pages, especially
> transparent huge pages, would be bigger concerns than CMA for
> internal page migration within one NUMA node.
> 
> Unlike CMA, general alloc_pages() can get memory by moving
> pages other than those pinned.
> 
> And there is no guarantee we can always bind the memory of
> SVA applications to a single NUMA node, so NUMA balancing is
> still a concern.
> 
> But I agree we need a way to make CMA succeed while userspace
> pages are pinned. Since pinning is widespread across many drivers, I
> assume there is a way to handle this. Otherwise, APIs like
> V4L2_MEMORY_USERPTR[1] could make CMA fail, as there is no
> guarantee that userspace will allocate unmovable memory, and
> there is no guarantee that the fallback path, alloc_pages(), can
> succeed when allocating large amounts of memory.
> 

Long-term pinnings cannot go onto CMA-reserved memory, and there is
similar work to also fix ZONE_MOVABLE in that regard.

https://lkml.kernel.org/r/20210125194751.1275316-1-pasha.tatashin@soleen.com

This is one of the reasons I detest long-term pinning of pages where it
could be avoided. Take VFIO and RDMA as examples: they currently can't
work without long-term pinning.

What I read here: "DMA performance will be affected severely". That does
not sound like a compelling argument for long-term pinnings to me.
Please find another way to achieve the same goal without long-term
pinnings controlled by user space, e.g., by controlling when migration
actually happens.

For example, CMA/alloc_contig_range()/memory unplug are corner cases
that happen rarely; you shouldn't have to worry about them messing with
your DMA performance.

-- 
Thanks,

David / dhildenb
