[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7fc0b153-9eea-af2c-cd42-c66a2d4087bc@amd.com>
Date: Wed, 15 Jan 2025 16:01:54 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: Dan Williams <dan.j.williams@...el.com>, alejandro.lucero-palau@....com,
linux-cxl@...r.kernel.org, netdev@...r.kernel.org, martin.habets@...inx.com,
edward.cree@....com, davem@...emloft.net, kuba@...nel.org,
pabeni@...hat.com, edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v8 01/27] cxl: add type2 device basic support
On 1/14/25 22:52, Dan Williams wrote:
> Alejandro Lucero Palau wrote:
>> On 1/8/25 14:32, Alejandro Lucero Palau wrote:
>>> On 1/8/25 01:33, Dan Williams wrote:
>>>> Dan Williams wrote:
>>>>> alejandro.lucero-palau@ wrote:
>>>>>> From: Alejandro Lucero <alucerop@....com>
>>>>>>
>>>>>> Differentiate CXL memory expanders (type 3) from CXL device
>>>>>> accelerators
>>>>>> (type 2) with a new function for initializing cxl_dev_state.
>>>>>>
>>>>>> Create accessors to cxl_dev_state to be used by accel drivers.
>>>>>>
>>>>>> Based on previous work by Dan Williams [1]
>>>>>>
>>>>>> Link: [1]
>>>>>> https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
>>>>>> Signed-off-by: Alejandro Lucero <alucerop@....com>
>>>>>> Co-developed-by: Dan Williams <dan.j.williams@...el.com>
>>>>>> Reviewed-by: Dave Jiang <dave.jiang@...el.com>
>>>>>> Reviewed-by: Fan Ni <fan.ni@...sung.com>
>>>>> This patch causes
>>>> Whoops, forgot to complete this thought. Someting in this series causes:
>>>>
>>>> depmod: ERROR: Cycle detected: ecdh_generic
>>>> depmod: ERROR: Cycle detected: tpm
>>>> depmod: ERROR: Cycle detected: cxl_mock -> cxl_core -> cxl_mock
>>>> depmod: ERROR: Cycle detected: encrypted_keys
>>>> depmod: ERROR: Found 2 modules in dependency cycles!
>>>>
>>>> I think the non CXL ones are false likely triggered by the CXL causing
>>>> depmod to exit early.
>>>>
>>>> Given cxl-test is unfamiliar territory to many submitters I always offer
>>>> to fix up the breakage. I came up with the below incremental patch to
>>>> fold in that also addresses my other feedback.
>>>>
>>>> Now the depmod error is something Alison saw too, and while I can also
>>>> see it on patch1 if I do:
>>>>
>>>> - apply whole series
>>>> - build => see the error
>>>> - rollback patch1
>>>> - build => see the error
>>>>
>>>> ...a subsequent build the error goes away, so I think that transient
>>>> behavior is a quirk of how cxl-test is built, but some later patch in
>>>> that series makes the failure permanent.
>>>>
>>>> In any event I figured that out after creating the below fixup and
>>>> realizing that it does not fix the cxl-test build issue:
>>>
>>> Ok. but it is a good way of showing what you had in your mind about
>>> the suggested changes.
>>>
>>> I'll use it for v10.
>>>
>>> Thanks
>>>
>> Hi Dan,
>>
>>
>> There's a problem with this approach and it is the need of the driver
>> having access to internal cxl structs like cxl_dev_state.
> Apologies for stepping away for a few days, there was a chance to get
> the long pending DAX page reference count series into v6.14 that I
> needed to devote some review cycles.
No worries.
>> Your patch does not cover it but for an accel driver that struct needs
>> to be allocated before using the new cxl_dev_state_init.
> Why does the cxl core need to wrap malloc on behalf of the driver?
>
>> I think we reached an agreement in initial discussions about avoiding
>> this need through an API for accel drivers indirectly doing whatever is
>> needed regarding internal CXL structs. Initially it was stated this
>> being necessary for avoiding drivers doing wrong things but Jonathan
>> pointed out the main concern being changing those internal structs in
>> the future could benefit from this approach. Whatever the reason, that
>> was the assumption.
> I think there is a benefit from a driver being able to do someting like:
>
> struct my_cxl_accelerator_context {
> ...
> struct cxl_dev_state cxlds;
> ...
> };
>
> Even if the rule is that direct consumption of 'struct cxl_dev_state'
> outside of the cxl core is unwanted.
>
> C does not make this easy, so it is either make the definition of
> 'struct cxl_dev_state' public to CXL accelerator drivers so that they
> know the size, or add an allocation API that takes in the extra size
> that accelerator needs to allocate the core CXL context.
>
> Unless and until we run into a real life problem of accelerator drivers
> misusing 'struct cxl_dev_state' I think I prefer the explicit approach
> of make the data structure embeddable and only require the core to do
> the initialization.
Ok then. I'll make cxl_dev_state public in v10 and do the allocation by
the driver.
>> I could add a function for accel drivers doing the allocation as with
>> current v9 code, and then using your changes for having common code.
> Let me go look at what you have there, but the design principle of the
> CXL core is a library and enabling (but not requiring) users to have a container_of()
> relationship between the core context and their local context feels the
> most maintainable at this point.
>
>> Also, I completely agree with merging the serial and dvsec
>> initializations through arguments to cxl_dev_state_init, but we need the
>> cxl_set_resource function for accel drivers. The current code for adding
>> resources with memdev is relying on mbox commands, and although we could
>> change that code for supporting accel drivers without an mbox, I would
>> say the function/code added is simple enough for not requiring that
>> effort. Note my goal is for an accel device without an mbox, but we will
>> see devices with one in the future, so I bet for leaving any change
>> there to that moment.
> ...but the way it was adding those resources was just wrong. This also
> collides with some of the changes proposed for DCD partition management.
> I needs a rethink in terms of a data structure and API to describe the
> DPA address map of a CXL.mem capable device to the core and the core
> should not be hard-coding a memory-expander-class device definition to
> that layout.
I think you say is wrong because you did not look at patch 8 where the
resources are requested based on the parent resource (dpa).
After seeing Ira's DCD patchset regarding the resources allocation
changes, and your comment there, I think I know that you have in mind.
But for Type2 the way resources information is obtained changes, at
least for the case of one without mailbox. In our case we are hardcoding
the resource definitions, although in the future (and likely other
drivers) we will use an internal/firmware path for obtaining the
information. So we have two cases:
1) accel driver with mailbox: an additional API function should allow
such accel driver to obtain the info or trigger the resource allocation
based on that command.
2) accel driver without mailbox: a function for allocating the resources
based on hardcoded or driver-dependent dynamically-obtained data.
The current patchset is supporting the second one, and with the linked
use case, the first one should be delayed until some accel driver
requires it.
I can adapt the current API for using the resource array exposed by
DCD's patches, and useĀ add_dpa_res function instead of current patchset
code.
> I am imagining something similar to the way that resource range
> resources are transmitted to platform device registration.
I guess you mean to haveĀ static/dynamic resource array with __init flag
for freeing the data after initialization, a CXL core function for
processing the array and calling add_dpa_res and aware of DCD patch
needs, and the resources linked to the device and released automatically
when unloading the driver.
Powered by blists - more mailing lists