lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <zcgn4xw2xghyna2eysavujbzbiydyki7p7upzzv7one5mdyjy6@sj7f75kc4vwu>
Date: Mon, 13 Oct 2025 15:31:10 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Charan Teja Kalla <charan.kalla@....qualcomm.com>, joro@...tes.org,
        will@...nel.org, saravanak@...gle.com, conor+dt@...nel.org,
        robh@...nel.org, mchehab@...nel.org, bod@...nel.org,
        krzk+dt@...nel.org, abhinav.kumar@...ux.dev,
        vikash.garodia@....qualcomm.com, dikshita.agarwal@....qualcomm.com,
        Konrad Dybcio <konrad.dybcio@....qualcomm.com>,
        bjorn.andersson@....qualcomm.com, linux-media@...r.kernel.org,
        linux-arm-msm@...r.kernel.org, devicetree@...r.kernel.org,
        linux-kernel@...r.kernel.org, iommu@...ts.linux.dev
Subject: Re: [RFC PATCH 0/3] Introduce iommu-map-masked for platform devices

On Mon, Oct 13, 2025 at 12:20:54PM +0100, Robin Murphy wrote:
> On 2025-10-09 7:25 pm, Dmitry Baryshkov wrote:
> > On Thu, Oct 09, 2025 at 06:03:29PM +0100, Robin Murphy wrote:
> > > On 2025-10-09 2:19 pm, Dmitry Baryshkov wrote:
> > > > On Thu, Oct 09, 2025 at 11:46:55AM +0100, Robin Murphy wrote:
> > > > > On 2025-10-08 8:10 pm, Charan Teja Kalla wrote:
> > > > > > 
> > > > > > On 9/29/2025 3:50 PM, Robin Murphy wrote:
> > > > > > > > USECASE [1]:
> > > > > > > > -----------
> > > > > > > > Video IP, 32bit, have 2 hardware sub blocks(or can be called as
> > > > > > > > functions) called as pixel and nonpixel blocks, that does decode and
> > > > > > > > encode of the video stream. These sub blocks are __configured__ to
> > > > > > > > generate different stream IDs.
> > > > > > > 
> > > > > > > So please clarify why you can't:
> > > > > > > 
> > > > > > > a) Describe the sub-blocks as individual child nodes each with their own
> > > > > > > distinct "iommus" property
> > > > > > > 
> > > > > > 
> > > > > > Thanks Robin for your time. Sorry for late reply as I really didn't have
> > > > > > concrete answer for this question.
> > > > > > 
> > > > > > First let me clarify the word "sub blocks" -- This is just the logical
> > > > > > separation with no separate address space to really able to define them
> > > > > > as sub devices. Think of it like a single video IP with 2 dma
> > > > > > engines(used for pixel and non-pixel purpose).
> > > > > > 
> > > > > > I should agree that the child-nodes in the device tree is the easy one
> > > > > > and infact, it is how being used in downstream.
> > > > > > 
> > > > > > For upstream -- Since there is no real address space to interact with
> > > > > > these sub-blocks(or logical blocks), does it really qualify to define as
> > > > > > child nodes in the device tree? I see there is some push back[1].
> > > > > 
> > > > > Who says you need an address space? Child nodes without "reg" properties,
> > > > > referenced by name, compatible or phandle, exist all over the place for all
> > > > > manner of reasons. If there are distinct logical functions with their own
> > > > > distinct hardware properties, then I would say having child nodes to
> > > > > describe and associate those properties with their respective functions is
> > > > > entirely natural and appropriate. The first example that comes to mind of
> > > > > where this is a well-established practice is PMICs - to pick one at random:
> > > > > Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.yaml
> > > > 
> > > > Logical function, that's correct. And also note, for PMICs that practice
> > > > has bitten us back. For PM8008 we switched back to a non-subdevice
> > > > representation.
> > > > 
> > > > > For bonus irony, you can't take the other approaches without inherently
> > > > > *introducing* a notional address space in the form of your logical function
> > > > > IDs anyway.
> > > > > 
> > > > > >      > or:
> > > > > > > 
> > > > > > > b) Use standard "iommu-map" which already supports mapping a masked
> > > > > > > input ID to an arbitrary IOMMU specifier
> > > > > > > 
> > > > > > 
> > > > > > I think clients is also required to program non-zero smr mask, where as
> > > > > > iommu-map just maps the id to an IOMMU specifier(sid). Please LMK if I
> > > > > > am unable to catch your thought here.
> > > > > An IOMMU specifier is whatever the target IOMMU node's #iommu-cells says it
> > > > > is. The fact that Linux's parsing code only works properly for #iommu-cells
> > > > > = 1 is not really a DT binding problem (other than it stemming from a loose
> > > > > assumption stated in the PCI binding's use of the property).
> > > > 
> > > > I really don't like the idea of extending the #iommu-cells. The ARM SMMU
> > > > has only one cell, which is correct even for our platforms. The fact
> > > > that we need to identify different IOMMU SIDs (and handle them in a
> > > > differnt ways) is internal to the video device (and several other
> > > > devices). There is nothing to be handled on the ARM SMMU side.
> > > 
> > > Huh? So if you prefer not to change anything, are you suggesting this series
> > > doesn't need to exist at all? Now I'm thoroughly confused...
> > 
> > Hmm. We need changes, but I don't feel like adding the FUNCTION_ID to
> > #iommu-cells is the best idea.
> 
> What? No, any function ID would be an *input* to a map, not part of the
> output specifier; indeed it should never go anywhere near the IOMMU, I don't
> think anyone suggested that.

It was Bryan, https://lore.kernel.org/linux-arm-msm/9bae595a-597e-46e6-8eb2-44424fe21db6@linaro.org

> 
> > > If you want to use SMR masks, then you absolutely need #iommu-cells = 2,
> > > because that is the SMMU binding for using SMR masks. It would definitely
> > 
> > I'm sorry. Yes, we have #iommu-cells = <2>.
> > 
> > > not be OK to have some magic property trying to smuggle
> > > IOMMU-driver-specific data contrary to what the IOMMU node itself says. As
> > > for iommu-map, I don't see what would be objectionable about improving the
> > > parsing to respect a real #iommu-cells value rather than hard-coding an
> > > assumption. Yes, we'd probably need to forbid entries with length > 1
> > > targeting IOMMUs with #iommu-cells > 1, since the notion of a linear
> > 
> > This will break e.g. PCIe on Qualcomm platforms:
> > 
> >                          iommu-map = <0x0   &apps_smmu 0x1400 0x1>,
> >                                      <0x100 &apps_smmu 0x1401 0x1>;
> > 
> > 
> > But this seems unlogical anyway wrt. apps_smmu having #iommu-cells =
> > <2>. It depends on ARM SMMU ignoring the second cell when it's not
> > present.
> 
> Urgh, yes, that's just broken already :(
> 
> At least they all seem to be a sufficiently consistent pattern that a
> targeted workaround to detect old DTBs looks feasible (I'm thinking, if
> iommu-map size % 4 == 0 and cells n*4 + 3 are all 1 and cells n*4 + 1 are
> all the same phandle to an IOMMU with #iommu-cells == 2, then parse as if
> #iommu-cells == 1)

How do we handle the case of #iommu-cells = <2>? I.e. what should be the
"fixed" representation of the map above? Should we have usual cells and
one extra "length" just for the sake of it?

               iommu-map = <0x0   &apps_smmu 0x1400 0x0 0x1>,
                           <0x100 &apps_smmu 0x1401 0x0 0x1>;


I really like the idea of fixing iommu-map as that would remove the need
for other properties, but 

> 
> > > relationship between the input ID and the output specifier falls apart when
> > > the specifier is complex, but that seems simple enough to implement and
> > > document (even if it's too fiddly to describe in the schema itself), and
> > > still certainly no worse than having another property that *is* just
> > > iommu-map with implicit length = 1.
> > > 
> > > And if you want individual StreamIDs for logical functions to be attachable
> > > to distinct contexts then those functions absolutely must be visible to the
> > > IOMMU layer and the SMMU driver as independent devices with their own unique
> > > properties, which means either they come that way from the DT as of_platform
> > > devices in the first place, or you implement a full bus_type abstraction
> > 
> > Not necessarily. Tegra display driver creates a device for each context
> > on its own.
> No, the *display* driver does not; the host1x bus driver does, which is the
> point I was making - that has a proper bus abstraction tied into the IOMMU
> layer, such that the devices are correctly configured long before the actual
> DRM driver(s) get anywhere near them.

Ack. I agree. it's drivers/gpu/host1x/context, not drivers/gpu/drm/

> 
> > In fact, using OF to create context devices is _less_
> > robust, because now the driver needs to sync, checking that there is a
> > subdevice, that it has probed, etc. Using manually created devices seems
> > better from my POV.
> 
> Huh? A simple call to of_platform_populate() is somehow less robust than
> open-coding much of the same logic that of_platform_populate() does plus a
> bunch of hackery to try to fake up an of_node to make the new device appear
> to own the appropriate properties?
> 
> Having entire sub-*drivers* for child devices or not is an orthogonal issue
> regardless of whichever way they are created.

I was (again) looking at host1x. It doesn't fake of_node (nor does it
have actual OF nodes). Instead it just mapps IOMMUs directly to the
context devices. Compare this to misc/fastrpc.c, which has subdevices
and drivers to map contexts. The latter one looks less robust.

And from DT perspective compare:

		fastrpc {
			compatible = "qcom,fastrpc";
			#address-cells = <1>;
			#size-cells = <0>;

			compute-cb@3 {
				compatible = "qcom,fastrpc-compute-cb";
				reg = <3>;
				iommus = <&apps_smmu 0x1803 0x0>;
			};

			compute-cb@4 {
				compatible = "qcom,fastrpc-compute-cb";
				reg = <4>;
				iommus = <&apps_smmu 0x1804 0x0>;
			};

			compute-cb@5 {
				compatible = "qcom,fastrpc-compute-cb";
				reg = <5>;
				iommus = <&apps_smmu 0x1805 0x0>;
			};
		};

VS (note, it doesn't have 'length', it can be added back with no issues):

		fastrpc {
			compatible = "qcom,fastrpc";
			#address-cells = <1>;
			#size-cells = <0>;

			iommu-map = <3 &apps_smmu 0x1803 0x0>,
				    <4 &apps_smmu 0x1804 0x0>,
				    <5 &apps_smmu 0x1805 0x0>;
		};


I think the latter is more compact, and more robust.

Note, to make a complete example, it should be probably something like
(sc7280, cdsp, note duplicate IDs in the map, again, I omitted length):

	       fastrpc {
			compatible = "qcom,fastrpc";

			iommu-map = <1 &apps_smmu 0x11a1 0x0420>,
				    <1 &apps_smmu 0x1181 0x0420>,
				    <2 &apps_smmu 0x11a2 0x0420>,
				    <2 &apps_smmu 0x1182 0x0420>,
				    <3 &apps_smmu 0x11a3 0x0420>,
				    <3 &apps_smmu 0x1183 0x0420>;
			dma-coherent;
		};


> > > which will have to be hooked up to the IOMMU layer. You cannot make IOMMU
> > > configuration "internal" to the actual client driver which is only allowed
> > > to bind *after* said IOMMU configuration has already been made.
> > 
> > I'm not sure I follow this, I'm sorry.
> I mean IOMMU configuration is designed to happen at device_add() time, and
> client drivers must not assume otherwise (the mechanisms for handling IOMMU
> drivers registering "late" from modules are internal details that can and
> will change). If you're under the impression that a straightforward platform
> driver for the video codec itself would be able to invoke IOMMU
> configuration for the video codec platform device (without unacceptable
> levels of hackery) then you are mistaken, sorry.
> 
> Again, to be able to assign StreamIDs to different contexts, those StreamIDs
> must uniquely belong to different struct devices. Thus in terms of how you
> get to those struct devices from a DT representation, either they come from
> distinct DT nodes with standard "iommus" properties that the generic
> of_platform code can create and configure accordingly, or you're doing a
> non-trivial amount of work to implement your own bus layer like
> host1x_context_bus to manage your own type of sub-device. There is no valid
> middle ground of trying to stuff driver-specific knowledge of arbitrarily
> made-up function IDs into the generic platform bus code.


I'd totally prefer something like:

	video-codec@...bar {
		compatible = "qcom,video";

		iommus = <&apps_smmu 0x1234 0xca>;
		iommu-maps = <PIXEL &apps_smmu 0xabcdef 0xac>,
			     <SECURE_PIXEL &apps_smmu 0x898989 0xac>,
			     <SECURE_BITSTREAM &apps_smmu 0x898998 0xac>;
	};

-- 
With best wishes
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ