linux-kernel - Re: [RFC] Make use of non-dynamic dmabuf in RDMA

Message-ID: <f049576e-4f36-6407-883d-24fac47c4491@nvidia.com>
Date:   Tue, 24 Aug 2021 23:47:51 -0700
From:   John Hubbard <jhubbard@...dia.com>
To:     Christian König <christian.koenig@....com>,
        Jason Gunthorpe <jgg@...pe.ca>
Cc:     Gal Pressman <galpress@...zon.com>,
        Daniel Vetter <daniel@...ll.ch>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Doug Ledford <dledford@...hat.com>,
        "open list:DMA BUFFER SHARING FRAMEWORK" 
        <linux-media@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-rdma <linux-rdma@...r.kernel.org>,
        Oded Gabbay <ogabbay@...ana.ai>,
        Tomer Tayar <ttayar@...ana.ai>,
        Yossi Leybovich <sleybo@...zon.com>,
        Alexander Matushevsky <matua@...zon.com>,
        Leon Romanovsky <leonro@...dia.com>,
        Jianxin Xiong <jianxin.xiong@...el.com>
Subject: Re: [RFC] Make use of non-dynamic dmabuf in RDMA

On 8/24/21 11:17 PM, Christian König wrote:
...
>> I think it depends on the user, if the user creates memory which is
>> permanently located on the GPU then it should be pinnable in this way
>> without forced migration. But if the memory is inherently migratable
>> then it just cannot be pinned in the GPU at all as we can't
>> indefinitely block migration from happening, e.g. if the CPU touches it
>> later or something.
> 
> Yes, that's exactly the point. GPUs especially have a great variety of setups.
> 
> For example, we have APUs where the local memory is just stolen system memory and all buffers must be
> migratable because you might need all of this stolen memory for scanout or page tables. In this
> case P2P only makes sense to avoid the migration overhead in the first place.
> 
> Then you have dGPUs where only a fraction of the VRAM is accessible from the PCIe bus. Here you also
> absolutely don't want to pin any buffers because that can easily crash when we need to migrate
> something into the visible window for CPU access.
> 
> The only real case where you could do P2P with buffer pinning is those compute boards where we
> know that everything is always accessible to everybody and we will never need to migrate anything.
> But even then you want some mechanism like cgroups to take care of limiting this. Otherwise any
> runaway process can bring down your whole system.
> 
> The key question, at least for me as GPU maintainer, is whether we are going to see modern compute boards
> together with old non-ODP setups. Since those compute boards are usually used with new hardware
> (like PCIe v4 for example), the answer I think is most likely "no".
> 

That is a really good point. Times have changed and I guess ODP is on most (all?) of
the new InfiniBand products now, and maybe we don't need to worry so much about
providing first-class support for non-ODP setups.

I've got to drag my brain into 2021+! :)

thanks,
-- 
John Hubbard
NVIDIA