lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <96586036-bc44-463e-ab49-44c07a6e6c26@amd.com>
Date: Fri, 19 Sep 2025 13:51:22 +0100
From: Alejandro Lucero Palau <alucerop@....com>
To: alejandro.lucero-palau@....com, linux-cxl@...r.kernel.org,
 netdev@...r.kernel.org, dan.j.williams@...el.com, edward.cree@....com,
 davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
 edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v18 00/20] Type2 device basic support


On 9/18/25 10:17, alejandro.lucero-palau@....com wrote:
> From: Alejandro Lucero <alucerop@....com>
>
> First of all, the patchset should be applied on the described base
> commit then applying Terry's v11 about CXL error handling plus last four
> pathces from Dan's for-6.18/cxl-probe-order branch.
>
> Secondly, this is another try being aware it will not be the last since
> there are main aspects to agree on. The most important thing is to decide
> how to solve the problem of type2 stability under CXL core events. Let me
> start then defining that problem listing the events or situations pointed
> out but, I think, not clearly defined and therefore creating confusion, at
> least to me.
>
> We have different situations to be aware of:
>
>
> 1) CXL topology not there or nor properly configured yet.
>
> 2) accelerator relying on pcie instead of CXL.io
>
> 3) potential removal of cxl_mem, cxl_acpi or cxl_port
>
> 4) cxl initialization failing due to dynamic modules dependencies
>
> 5) CXL errors
>
>
> Dan's patches from the cxl-probe-order branch will hopefully fix the last
> situation. I have tested this and it works as expected: type2 driver
> initialization can not start because module dependencies. This solves
> #4.


FWIW, I did not realize that those patches solve partially #3, as 
cxl_mem and cxl_port has now dependencies and can not be removed. 
However, cxl_acpi is still remaining as a potential issue.


> Using Terry's patchset, specifically pcie_is_cxl function, solves #2.
>
> Regarding #5, I think Terry's patchset introduces the proper handling for
> this, or at least some initial work which will surely require adjustments,
> and of course Type2 using it, which is not covered in v18 and I prefer
> to add in a followup work.
>
> About #3, the only way to be protected is partially during initialization
> with cxl_acquire (patch 8), and afer initialization with a callback to the
> driver when cxl objects are removed (callback given when creating cxl
> region at patch 16, used by sfc driver in patch 18). Initially, cxl_acquire
> was implemented with other goal (next point) but as it can give
> protection during initialization, I kept it. About the callback, Dan
> does not like it, and Jonathan not keen of it. I think we agreed the
> right solution is those modules should not be allowed to be removed if
> there are dependencies, and it requires work in the cxl core for
> support that as a follow-up work. Therefore, or someone gives another
> idea about how to handle this now, temporarily, without that proper
> solution, or I should delay this full patchset until that is done.
>
> Then we have #1 which I admit is the most confusing (at least to me).
> If we can not solve the problem of the proper initialization based on the
> probe() calls for those cxl devices to work with, then an explanation
> about this case is needed and, I guess, to document it.
>
> AFAIK, the BIOS will perform a good bunch of CXL initialization (BTW, I
> think we should discuss this as well at some point for having same
> expectations about what and how things are done, and also when) then the
> kernel CXL initialization will perform its own bunch based on what the
> BIOS is given. That implies CXL Root ports, downstream/upstream cxl
> ports to be register, switches, ... . If I am not wrong, that depends on
> subsys_initcall level, and therefore earlier than any accelerator driver
> initialization. Am I right assuming once those modules are done the
> kernel cxl topology/infrastructure is ready to deal with an accelerator
> initializing its cxl functionality? If not, what is the problem or
> problems? Is this due to longer than expected hardware initialization by
> the kernel? if so, could it not be left to the BIOS somehow? is this due
> to some asynchronous initialization impossible to avoid or be certain
> of? If so, can we document it?
>
> I understand with CXL could/will come complex topologies where maybe
> initialization by a single host is not possible without synchronizing
> with other hosts or CXL infrastructure. Is this what is all this about?
>
>
> v18 changes:
>
>    patch 1: minor changes and fixing docs generation (Jonathan, Dan)
>   
>    patch4: merged with v17 patch5
>
>    patch 5: merging v17 patches 6 and 7
>
>    patch 6: adding helpers for clarity
>
>    patch 9:
> 	- minor changes (Dave)
> 	- simplifying flags check (Dan)
>
>    patch 10: minor changes (Jonathan)
>
>    patch 11:
> 	- minor changes (Dave)
> 	- fix mess (Jonathan, Dave)
>
>    patch 18: minor changes (Jonathan, Dan)
>    
> v17 changes: (Dan Williams review)
>   - use devm for cxl_dev_state allocation
>   - using current cxl struct for checking capability registers found by
>     the driver.
>   - simplify dpa initialization without a mailbox not supporting pmem
>   - add cxl_acquire_endpoint for protection during initialization
>   - add callback/action to cxl_create_region for a driver notified about cxl
>     core kernel modules removal.
>   - add sfc function to disable CXL-based PIO buffers if such a callback
>     is invoked.
>   - Always manage a Type2 created region as private not allowing DAX.
>
> v16 changes:
>   - rebase against rc4 (Dave Jiang)
>   - remove duplicate line (Ben Cheatham)
>
> v15 changes:
>   - remove reference to unused header file (Jonathan Cameron)
>   - add proper kernel docs to exported functions (Alison Schofield)
>   - using an array to map the enums to strings (Alison Schofield)
>   - clarify comment when using bitmap_subset (Jonathan Cameron)
>   - specify link to type2 support in all patches (Alison Schofield)
>
>    Patches changed (minor): 4, 11
>
> v14 changes:
>   - static null initialization of bitmaps (Jonathan Cameron)
>   - Fixing cxl tests (Alison Schofield)
>   - Fixing robot compilation problems
>
>    Patches changed (minor): 1, 4, 6, 13
>
> v13 changes:
>   - using names for headers checking more consistent (Jonathan Cameron)
>   - using helper for caps bit setting (Jonathan Cameron)
>   - provide generic function for reporting missing capabilities (Jonathan Cameron)
>   - rename cxl_pci_setup_memdev_regs to cxl_pci_accel_setup_memdev_regs (Jonathan Cameron)
>   - cxl_dpa_info size to be set by the Type2 driver (Jonathan Cameron)
>   - avoiding rc variable when possible (Jonathan Cameron)
>   - fix spelling (Simon Horman)
>   - use scoped_guard (Dave Jiang)
>   - use enum instead of bool (Dave Jiang)
>   - dropping patch with hardware symbols
>
> v12 changes:
>   - use new macro cxl_dev_state_create in pci driver (Ben Cheatham)
>   - add public/private sections in now exported cxl_dev_state struct (Ben
>     Cheatham)
>   - fix cxl/pci.h regarding file name for checking if defined
>   - Clarify capabilities found vs expected in error message. (Ben
>     Cheatham)
>   - Clarify new CXL_DECODER_F flag (Ben Cheatham)
>   - Fix changes about cxl memdev creation support moving code to the
>     proper patch. (Ben Cheatham)
>   - Avoid debug and function duplications (Ben Cheatham)
>
> v11 changes:
>   - Dropping the use of cxl_memdev_state and going back to using
>     cxl_dev_state.
>   - Using a helper for an accel driver to allocate its own cxl-related
>     struct embedding cxl_dev_state.
>   - Exporting the required structs in include/cxl/cxl.h for an accel
>     driver being able to know the cxl_dev_state size required in the
>     previously mentioned helper for allocation.
>   - Avoid using any struct for dpa initialization by the accel driver
>     adding a specific function for creating dpa partitions by accel
>     drivers without a mailbox.
>
> v10 changes:
>   - Using cxl_memdev_state instead of cxl_dev_state for type2 which has a
>     memory after all and facilitates the setup.
>   - Adapt core for using cxl_memdev_state allowing accel drivers to work
>     with them without further awareness of internal cxl structs.
>   - Using last DPA changes for creating DPA partitions with accel driver
>     hardcoding mds values when no mailbox.
>   - capabilities not a new field but built up when current register maps
>     is performed and returned to the caller for checking.
>   - HPA free space supporting interleaving.
>   - DPA free space droping max-min for a simple alloc size.
>
> v9 changes:
>   - adding forward definitions (Jonathan Cameron)
>   - using set_bit instead of bitmap_set (Jonathan Cameron)
>   - fix rebase problem (Jonathan Cameron)
>   - Improve error path (Jonathan Cameron)
>   - fix build problems with cxl region dependency (robot)
>   - fix error path (Simon Horman)
>
> v8 changes:
>   - Change error path labeling inside sfc cxl code (Edward Cree)
>   - Properly handling checks and error in sfc cxl code (Simon Horman)
>   - Fix bug when checking resource_size (Simon Horman)
>   - Avoid bisect problems reordering patches (Edward Cree)
>   - Fix buffer allocation size in sfc (Simon Horman)
>
> v7 changes:
>
>   - fixing kernel test robot complains
>   - fix type with Type3 mandatory capabilities (Zhi Wang)
>   - optimize code in cxl_request_resource (Kalesh Anakkur Purayil)
>   - add sanity check when dealing with resources arithmetics (Fan Ni)
>   - fix typos and blank lines (Fan Ni)
>   - keep previous log errors/warnings in sfc driver (Martin Habets)
>   - add WARN_ON_ONCE if region given is NULL
>
> v6 changes:
>
>   - update sfc mcdi_pcol.h with full hardware changes most not related to
>     this patchset. This is an automatic file created from hardware design
>     changes and not touched by software. It is updated from time to time
>     and it required update for the sfc driver CXL support.
>   - remove CXL capabilities definitions not used by the patchset or
>     previous kernel code. (Dave Jiang, Jonathan Cameron)
>   - Use bitmap_subset instead of reinventing the wheel ... (Ben Cheatham)
>   - Use cxl_accel_memdev for new device_type created (Ben Cheatham)
>   - Fix construct_region use of rwsem (Zhi Wang)
>   - Obtain region range instead of region params (Allison Schofield, Dave
>     Jiang)
>
> v5 changes:
>
>   - Fix SFC configuration based on kernel CXL configuration
>   - Add subset check for capabilities.
>   - fix region creation when HDM decoders programmed by firmware/BIOS (Ben
>     Cheatham)
>   - Add option for creating dax region based on driver decission (Ben
>     Cheatham)
>   - Using sfc probe_data struct for keeping sfc cxl data
>
> v4 changes:
>
>   - Use bitmap for capabilities new field (Jonathan Cameron)
>   - Use cxl_mem attributes for sysfs based on device type (Dave Jian)
>   - Add conditional cxl sfc compilation relying on kernel CXL config (kernel test robot)
>   - Add sfc changes in different patches for facilitating backport (Jonathan Cameron)
>   - Remove patch for dealing with cxl modules dependencies and using sfc kconfig plus
>     MODULE_SOFTDEP instead.
>
> v3 changes:
>
>   - cxl_dev_state not defined as opaque but only manipulated by accel drivers
>     through accessors.
>   - accessors names not identified as only for accel drivers.
>   - move pci code from pci driver (drivers/cxl/pci.c) to generic pci code
>     (drivers/cxl/core/pci.c).
>   - capabilities field from u8 to u32 and initialised by CXL regs discovering
>     code.
>   - add capabilities check and removing current check by CXL regs discovering
>     code.
>   - Not fail if CXL Device Registers not found. Not mandatory for Type2.
>   - add timeout in acquire_endpoint for solving a race with the endpoint port
>     creation.
>   - handle EPROBE_DEFER by sfc driver.
>   - Limiting interleave ways to 1 for accel driver HPA/DPA requests.
>   - factoring out interleave ways and granularity helpers from type2 region
>     creation patch.
>   - restricting region_creation for type2 to one endpoint decoder.
>
> v2 changes:
>
> I have removed the introduction about the concerns with BIOS/UEFI after the
> discussion leading to confirm the need of the functionality implemented, at
> least is some scenarios.
>
> There are two main changes from the RFC:
>
> 1) Following concerns about drivers using CXL core without restrictions, the CXL
> struct to work with is opaque to those drivers, therefore functions are
> implemented for modifying or reading those structs indirectly.
>
> 2) The driver for using the added functionality is not a test driver but a real
> one: the SFC ethernet network driver. It uses the CXL region mapped for PIO
> buffers instead of regions inside PCIe BARs.
>
> RFC:
>
> Current CXL kernel code is focused on supporting Type3 CXL devices, aka memory
> expanders. Type2 CXL devices, aka device accelerators, share some functionalities
> but require some special handling.
>
> First of all, Type2 are by definition specific to drivers doing something and not just
> a memory expander, so it is expected to work with the CXL specifics. This implies the CXL
> setup needs to be done by such a driver instead of by a generic CXL PCI driver
> as for memory expanders. Most of such setup needs to use current CXL core code
> and therefore needs to be accessible to those vendor drivers. This is accomplished
> exporting opaque CXL structs and adding and exporting functions for working with
> those structs indirectly.
>
> Some of the patches are based on a patchset sent by Dan Williams [1] which was just
> partially integrated, most related to making things ready for Type2 but none
> related to specific Type2 support. Those patches based on Dan´s work have Dan´s
> signing as co-developer, and a link to the original patch.
>
> A final note about CXL.cache is needed. This patchset does not cover it at all,
> although the emulated Type2 device advertises it. From the kernel point of view
> supporting CXL.cache will imply to be sure the CXL path supports what the Type2
> device needs. A device accelerator will likely be connected to a Root Switch,
> but other configurations can not be discarded. Therefore the kernel will need to
> check not just HPA, DPA, interleave and granularity, but also the available
> CXL.cache support and resources in each switch in the CXL path to the Type2
> device. I expect to contribute to this support in the following months, and
> it would be good to discuss about it when possible.
>
> [1] https://lore.kernel.org/linux-cxl/98b1f61a-e6c2-71d4-c368-50d958501b0c@intel.com/T/
>
> Alejandro Lucero (20):
>    cxl: Add type2 device basic support
>    sfc: add cxl support
>    cxl: Move pci generic code
>    cxl: allow Type2 drivers to map cxl component regs
>    cxl: Support dpa initialization without a mailbox
>    cxl: Prepare memdev creation for type2
>    sfc: create type2 cxl memdev
>    cx/memdev: Indicate probe deferral
>    cxl: Define a driver interface for HPA free space enumeration
>    sfc: get root decoder
>    cxl: Define a driver interface for DPA allocation
>    sfc: get endpoint decoder
>    cxl: Make region type based on endpoint type
>    cxl/region: Factor out interleave ways setup
>    cxl/region: Factor out interleave granularity setup
>    cxl: Allow region creation by type2 drivers
>    cxl: Avoid dax creation for accelerators
>    sfc: create cxl region
>    cxl: Add function for obtaining region range
>    sfc: support pio mapping based on cxl
>
>   .../driver-api/cxl/theory-of-operation.rst    |   3 +
>   drivers/cxl/core/core.h                       |   9 +-
>   drivers/cxl/core/hdm.c                        |  83 ++++
>   drivers/cxl/core/mbox.c                       |  63 +--
>   drivers/cxl/core/memdev.c                     | 154 +++++-
>   drivers/cxl/core/pci.c                        |  63 +++
>   drivers/cxl/core/port.c                       |   3 +-
>   drivers/cxl/core/region.c                     | 442 ++++++++++++++++--
>   drivers/cxl/core/regs.c                       |   2 +-
>   drivers/cxl/cxl.h                             | 125 +----
>   drivers/cxl/cxlmem.h                          |  94 +---
>   drivers/cxl/cxlpci.h                          |  21 +-
>   drivers/cxl/mem.c                             |  53 ++-
>   drivers/cxl/pci.c                             |  86 +---
>   drivers/cxl/port.c                            |   5 +-
>   drivers/net/ethernet/sfc/Kconfig              |  10 +
>   drivers/net/ethernet/sfc/Makefile             |   1 +
>   drivers/net/ethernet/sfc/ef10.c               |  62 ++-
>   drivers/net/ethernet/sfc/efx.c                |  15 +-
>   drivers/net/ethernet/sfc/efx.h                |   1 +
>   drivers/net/ethernet/sfc/efx_cxl.c            | 191 ++++++++
>   drivers/net/ethernet/sfc/efx_cxl.h            |  40 ++
>   drivers/net/ethernet/sfc/net_driver.h         |  12 +
>   drivers/net/ethernet/sfc/nic.h                |   3 +
>   include/cxl/cxl.h                             | 295 ++++++++++++
>   include/cxl/pci.h                             |  40 ++
>   tools/testing/cxl/Kbuild                      |   1 -
>   tools/testing/cxl/test/mem.c                  |   3 +-
>   tools/testing/cxl/test/mock.c                 |  17 -
>   29 files changed, 1449 insertions(+), 448 deletions(-)
>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>   create mode 100644 include/cxl/cxl.h
>   create mode 100644 include/cxl/pci.h
>
>
> base-commit: f11a5f89910a7ae970fbce4fdc02d86a8ba8570f
> prerequisite-patch-id: 44c914dd079e40d716f3f2d91653247eca731594
> prerequisite-patch-id: b13ca5c11c44a736563477d67b1dceadfe3ea19e
> prerequisite-patch-id: d0d82965bbea8a2b5ea2f763f19de4dfaa8479c3
> prerequisite-patch-id: dd0f24b3bdb938f2f123bc26b31cd5fe659e05eb
> prerequisite-patch-id: 2ea41ec399f2360a84e86e97a8f940a62561931a
> prerequisite-patch-id: 367b61b5a313db6324f9cf917d46df580f3bbd3b
> prerequisite-patch-id: 1805332a9f191bc3547927d96de5926356dac03c
> prerequisite-patch-id: 40657fd517f8e835a091c07e93d6abc08f85d395
> prerequisite-patch-id: 901eb0d91816499446964b2a9089db59656da08d
> prerequisite-patch-id: 79856c0199d6872fd2f76a5829dba7fa46f225d6
> prerequisite-patch-id: 6f3503e59a3d745e5ecff4aaed668e2d32da7e4b
> prerequisite-patch-id: e9dc88f1b91dce5dc3d46ff2b5bf184aba06439d
> prerequisite-patch-id: 196fe106100aad619d5be7266959bbeef29b7c8b
> prerequisite-patch-id: 7e719ed404f664ee8d9b98d56f58326f55ea2175
> prerequisite-patch-id: 560f95992e13a08279034d5f77aacc9e971332dd
> prerequisite-patch-id: 8656445ee654056695ff2894e28c8f1014df919e
> prerequisite-patch-id: 001d831149eb8f9ae17b394e4bcd06d844dd39d9
> prerequisite-patch-id: 421368aa5eac2af63ef2dc427af2ec11ad45c925
> prerequisite-patch-id: 18fd00d4743711d835ad546cfbb558d9f97dcdfc
> prerequisite-patch-id: d89bf9e6d3ea5d332ec2c8e441f1fe6d84e726d3
> prerequisite-patch-id: 3a6953d11b803abeb437558f3893a3b6a08acdbb
> prerequisite-patch-id: 0dd42a82e73765950bd069d421d555ded8bfeb25
> prerequisite-patch-id: da6e0df31ad0d5a945e0a0d29204ba75f0c97344
> prerequisite-patch-id: ed7d9c768af2ac4e6ce87df2efd0ec359856c6e5
> prerequisite-patch-id: ed7f4dce80b4f80ccafb57efcd6189a6e14c9208
> prerequisite-patch-id: ccadb682c5edc3babaef5fe7ecb76ee5daa27ea4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ