[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c583be1b-17db-1ed3-0f5a-bd119edc8bfe@deltatee.com>
Date: Thu, 6 Dec 2018 13:11:28 -0700
From: Logan Gunthorpe <logang@...tatee.com>
To: Dave Hansen <dave.hansen@...el.com>,
Jerome Glisse <jglisse@...hat.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
"Rafael J . Wysocki" <rafael@...nel.org>,
Matthew Wilcox <willy@...radead.org>,
Ross Zwisler <ross.zwisler@...ux.intel.com>,
Keith Busch <keith.busch@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Haggai Eran <haggaie@...lanox.com>,
Balbir Singh <bsingharora@...il.com>,
"Aneesh Kumar K . V" <aneesh.kumar@...ux.ibm.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Felix Kuehling <felix.kuehling@....com>,
Philip Yang <Philip.Yang@....com>,
Christian König <christian.koenig@....com>,
Paul Blinzer <Paul.Blinzer@....com>,
John Hubbard <jhubbard@...dia.com>,
Ralph Campbell <rcampbell@...dia.com>,
Michal Hocko <mhocko@...nel.org>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Mark Hairgrove <mhairgrove@...dia.com>,
Vivek Kini <vkini@...dia.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Dave Airlie <airlied@...hat.com>,
Ben Skeggs <bskeggs@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Rik van Riel <riel@...riel.com>,
Ben Woodard <woodard@...hat.com>, linux-acpi@...r.kernel.org
Subject: Re: [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind()
On 2018-12-06 12:31 p.m., Dave Hansen wrote:
> On 12/6/18 11:20 AM, Jerome Glisse wrote:
>>>> For case 1 you can pre-parse stuff but this can be done by helper library
>>> How would that work? Would each user/container/whatever do this once?
>>> Where would they keep the pre-parsed stuff? How do they manage their
>>> cache if the topology changes?
>> Short answer i don't expect a cache, i expect that each program will have
>> a init function that query the topology and update the application codes
>> accordingly.
>
> My concern with having folks do per-program parsing, *and* having a huge
> amount of data to parse makes it unusable. The largest systems will
> literally have hundreds of thousands of objects in /sysfs, even in a
> single directory. That makes readdir() basically impossible, and makes
> even open() (if you already know the path you want somehow) hard to do fast.
Is this actually realistic? I find it hard to imagine an actual hardware
bus that can have even thousands of devices under a single node, let
alone hundreds of thousands. At some point the laws of physics apply.
For example, in present hardware, the most ports a single PCI switch can
have these days is under one hundred. I'd imagine any such large systems
would have a hierarchy of devices (ie. layers of switch-like devices)
which implies the existing sysfs bus/devices should have a path through
it without navigating a directory with that unreasonable a number of
objects in it. HMS, on the other hand, has all possible initiators
(,etc) under a single directory.
The caveat to this is, that to find an initial starting point in the bus
hierarchy you might have to go through /sys/dev/{block|char} or
/sys/class which may have directories with a large number of objects.
Though, such a system would necessarily have a similarly large number of
objects in /dev which means means you will probably never get around the
readdir/open bottleneck you mention... and, thus, this doesn't seem
overly realistic to me.
Logan
Powered by blists - more mailing lists