lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <691e3542c1b2a_1a37510046@dwillia2-mobl4.notmuch>
Date: Wed, 19 Nov 2025 13:23:14 -0800
From: <dan.j.williams@...el.com>
To: Terry Bowman <terry.bowman@....com>, <dave@...olabs.net>,
	<jonathan.cameron@...wei.com>, <dave.jiang@...el.com>,
	<alison.schofield@...el.com>, <dan.j.williams@...el.com>,
	<bhelgaas@...gle.com>, <shiju.jose@...wei.com>, <ming.li@...omail.com>,
	<Smita.KoralahalliChannabasappa@....com>, <rrichter@....com>,
	<dan.carpenter@...aro.org>, <PradeepVineshReddy.Kodamati@....com>,
	<lukas@...ner.de>, <Benjamin.Cheatham@....com>,
	<sathyanarayanan.kuppuswamy@...ux.intel.com>, <linux-cxl@...r.kernel.org>,
	<alucerop@....com>, <ira.weiny@...el.com>
CC: <linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>,
	<terry.bowman@....com>
Subject: Re: [RESEND v13 12/25] cxl/pci: Unify CXL trace logging for CXL
 Endpoints and CXL Ports

Terry Bowman wrote:
> CXL currently has separate trace routines for CXL Port errors and CXL
> Endpoint errors. This is inconvenient for the user because they must enable
> 2 sets of trace routines. Make updates to the trace logging such that a
> single trace routine logs both CXL Endpoint and CXL Port protocol errors.

No, this is not inconvient, this is required for compatible evolution of
tracepoints. The change in this patch breaks compatibility as it
violates the expectation the type and order of TP_ARGS does not change
from one kernel to next.

> Keep the trace log fields 'memdev' and 'host'. While these are not accurate
> for non-Endpoints the fields will remain as-is to prevent breaking
> userspace RAS trace consumers.
> 
> Add serial number parameter to the trace logging. This is used for EPs
> and 0 is provided for CXL port devices without a serial number.
> 
> Leave the correctable and uncorrectable trace routines' TP_STRUCT__entry()
> unchanged with respect to member data types and order.
> 
> Below is output of correctable and uncorrectable protocol error logging.
> CXL Root Port and CXL Endpoint examples are included below.
> 
> Root Port:
> cxl_aer_correctable_error: memdev=0000:0c:00.0 host=pci0000:0c serial: 0 status='CRC Threshold Hit'
> cxl_aer_uncorrectable_error: memdev=0000:0c:00.0 host=pci0000:0c serial: 0 status: 'Cache Byte Enable Parity Error' first_error: 'Cache Byte Enable Parity Error'

A root port is not a "memdev", another awkward side effect of trying to
combine 2 trace points with different use cases.

So a NAK from me for this change (unless there is an strong reason for
Linux to inflict the compat breakage), please keep the separate
tracepoints they are for distinctly different use cases. A memdev
protocol error is contained to that memdev, a port protocol error
implicates every CXL.cachemem descendant of that port.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ