Message-ID: <ZL6my550xedf/L0Z@ziepe.ca>
Date: Mon, 24 Jul 2023 13:28:59 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc: Mina Almasry <almasrymina@...gle.com>, brouer@...hat.com,
Christian König <christian.koenig@....com>,
Hari Ramakrishnan <rharix@...gle.com>,
David Ahern <dsahern@...nel.org>,
Samiullah Khawaja <skhawaja@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Christoph Hellwig <hch@....de>,
John Hubbard <jhubbard@...dia.com>,
Dan Williams <dan.j.williams@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
Yunsheng Lin <linyunsheng@...wei.com>, davem@...emloft.net,
pabeni@...hat.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, Lorenzo Bianconi <lorenzo@...nel.org>,
Yisen Zhuang <yisen.zhuang@...wei.com>,
Salil Mehta <salil.mehta@...wei.com>,
Eric Dumazet <edumazet@...gle.com>,
Sunil Goutham <sgoutham@...vell.com>,
Geetha sowjanya <gakula@...vell.com>,
Subbaraya Sundeep <sbhatta@...vell.com>,
hariprasad <hkelam@...vell.com>, Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leon@...nel.org>, Felix Fietkau <nbd@....name>,
Ryder Lee <ryder.lee@...iatek.com>,
Shayne Chen <shayne.chen@...iatek.com>,
Sean Wang <sean.wang@...iatek.com>, Kalle Valo <kvalo@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
linux-rdma@...r.kernel.org, linux-wireless@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-mediatek@...ts.infradead.org,
Jonathan Lemon <jonathan.lemon@...il.com>, logang@...tatee.com,
Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5]
page_pool: remove PP_FLAG_PAGE_FRAG flag)
On Mon, Jul 24, 2023 at 04:56:27PM +0200, Jesper Dangaard Brouer wrote:
> These massive throughput numbers are important, because they *exceed*
> the physical host RAM/DIMM memory speeds.
That's right, this HW is all designed to use the high memory bandwidth
of the parallel GPUs. The CPU must not be involved in the data
movement.
If you look at the reference block diagrams for a DGX-H100 you can see
that each GPU is directly wired to a 400 Gb/s NIC, and there is
even software that allows the GPU to directly operate its attached
NIC interface.
Each of the 8 GPUs in the block diagram has 800 Gb/s full duplex RDMA,
and 7200 Gb/s full duplex on the NVLink interconnect directly
connected to it.
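
To put the per-GPU figures above in system-wide terms, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted in this message (8 GPUs, 800 Gb/s RDMA and 7200 Gb/s NVLink per GPU) and takes them as quoted, without splitting by direction. The helper names are made up for illustration, and no host DRAM bandwidth is assumed, since none is given here.

```python
# Figures quoted in the message; everything else is plain arithmetic.
GPUS = 8
RDMA_GBIT_PER_GPU = 800      # Gb/s per GPU, as quoted (full duplex)
NVLINK_GBIT_PER_GPU = 7200   # Gb/s per GPU, as quoted (full duplex)


def aggregate(per_gpu_gbit, gpus=GPUS):
    """Sum a per-GPU rate over all GPUs in the box (hypothetical helper)."""
    return per_gpu_gbit * gpus


def gbit_to_gbyte(gbit_per_s):
    """Convert gigabits per second to gigabytes per second."""
    return gbit_per_s / 8


if __name__ == "__main__":
    for name, rate in (("RDMA", RDMA_GBIT_PER_GPU),
                       ("NVLink", NVLINK_GBIT_PER_GPU)):
        total = aggregate(rate)
        print(f"{name}: {total} Gb/s aggregate = {gbit_to_gbyte(total)} GB/s")
```

Any data path that bounces through host RAM would have to sustain the RDMA aggregate as both a write and a read on the DIMMs, which is the concern raised in the quoted text.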
The GPU is the center of all the interconnect in these systems.
Jason