Message-ID: <f049576e-4f36-6407-883d-24fac47c4491@nvidia.com>
Date: Tue, 24 Aug 2021 23:47:51 -0700
From: John Hubbard <jhubbard@...dia.com>
To: Christian König <christian.koenig@....com>,
Jason Gunthorpe <jgg@...pe.ca>
Cc: Gal Pressman <galpress@...zon.com>,
Daniel Vetter <daniel@...ll.ch>,
Sumit Semwal <sumit.semwal@...aro.org>,
Doug Ledford <dledford@...hat.com>,
"open list:DMA BUFFER SHARING FRAMEWORK"
<linux-media@...r.kernel.org>,
dri-devel <dri-devel@...ts.freedesktop.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-rdma <linux-rdma@...r.kernel.org>,
Oded Gabbay <ogabbay@...ana.ai>,
Tomer Tayar <ttayar@...ana.ai>,
Yossi Leybovich <sleybo@...zon.com>,
Alexander Matushevsky <matua@...zon.com>,
Leon Romanovsky <leonro@...dia.com>,
Jianxin Xiong <jianxin.xiong@...el.com>
Subject: Re: [RFC] Make use of non-dynamic dmabuf in RDMA
On 8/24/21 11:17 PM, Christian König wrote:
...
>> I think it depends on the user, if the user creates memory which is
>> permanently located on the GPU then it should be pinnable in this way
>> without force migration. But if the memory is inherently migratable
>> then it just cannot be pinned in the GPU at all as we can't
>> indefinitely block migration from happening, e.g. if the CPU touches it
>> later or something.
>
> Yes, exactly that's the point. Especially GPUs have a great variety of setups.
>
> For example we have APUs where the local memory is just stolen system memory and all buffers must be
> migratable because you might need all of this stolen memory for scanout or page tables. In this
> case P2P only makes sense to avoid the migration overhead in the first place.
>
> Then you have dGPUs where only a fraction of the VRAM is accessible from the PCIe bus. Here you also
> absolutely don't want to pin any buffers because that can easily crash when we need to migrate
> something into the visible window for CPU access.
>
> The only real option where you could do P2P with buffer pinning are those compute boards where we
> know that everything is always accessible to everybody and we will never need to migrate anything.
> But even then you want some mechanism like cgroups to take care of limiting this. Otherwise any
> runaway process can bring down your whole system.
>
> Key question at least for me as GPU maintainer is if we are going to see modern compute boards
> together with old non-ODP setups. Since those compute boards are usually used with new hardware
> (like PCIe v4 for example) the answer I think is most likely "no".
>
That is a really good point. Times have changed and I guess ODP is on most (all?) of
the new InfiniBand products now, and maybe we don't need to worry so much about
providing first-class support for non-ODP setups.

I've got to drag my brain into 2021+! :)

thanks,
--
John Hubbard
NVIDIA