linux-kernel - Re: [PATCH v2] dt-bindings: media: Add sram-size Property for Wave5

Message-ID: <20240205192003.3qns6cxqurqnnj7c@udba0500997>
Date: Mon, 5 Feb 2024 13:20:03 -0600
From: Brandon Brnich <b-brnich@...com>
To: Nishanth Menon <nm@...com>
CC: Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
        Nas Chung <nas.chung@...psnmedia.com>,
        Jackson Lee <jackson.lee@...psnmedia.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Conor Dooley <conor+dt@...nel.org>,
        <linux-media@...r.kernel.org>,
        <devicetree@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>,
        Vignesh Raghavendra <vigneshr@...com>,
        Darren Etheridge <detheridge@...com>
Subject: Re: [PATCH v2] dt-bindings: media: Add sram-size Property for Wave5

On 08:12-20240205, Nishanth Menon wrote:
> On 17:08-20240202, Krzysztof Kozlowski wrote:
> > On 02/02/2024 16:58, Nishanth Menon wrote:
> > > On 14:56-20240202, Krzysztof Kozlowski wrote:
> > >> On 02/02/2024 13:52, Nishanth Menon wrote:
> > >>> On 11:47-20240202, Krzysztof Kozlowski wrote:
> > >>>> On 01/02/2024 19:42, Brandon Brnich wrote:
> > >>>>> Wave521c has capability to use SRAM carveout to store reference data with
> > >>>>> purpose of reducing memory bandwidth. To properly use this pool, the driver
> > >>>>> expects to have an sram and sram-size node. Without sram-size node, driver
> > >>>>> will default value to zero, making sram node irrelevant.
> > >>>>
> > >>>> I am sorry, but what driver expects should not be rationale for new
> > >>>> property. This justification suggests clearly it is not a property for DT.
> > >>>>
> > >>>
> > >>> Yup, the argumentation in the commit message is from the wrong
> > >>> perspective. bindings are OS agnostic hardware description, and what
> > >>> driver does with the description is driver's problem.
> > >>>
> > >>> I will at least paraphrase my understanding:
> > >>> In this case, however, the hardware block will limp along with
> > >>> the usage of DDR (as is the current description), due to the
> > >>> latencies involved for DDR accesses. However, the hardware block
> > >>> has capability to use a substantially lower latency SRAM to provide
> > >>> proper performance and hence for example, deal with higher resolution
> > >>> data streams. This SRAM is instantiated at SoC level rather than
> > >>> embedded within the hardware block itself.
> > >>
> > >> That sounds like OS policy. Why would different boards with the same
> > >> component have this set differently? Based on amount of available
> > >> memory? This, I believe, is runtime configuration because it might
> > >> depend on user-space you run. Based on purpose (e.g. optimize for
> > >> decoding or general usage)? Again, run-time because same hardware board
> > >> can be used for different purposes.
> > >>
> > > 
> > > Why is this OS policy? It is a hardware capability.
> > 
> > How amount of SRAM size is hardware capability? Each hardware can work
> > probably with 1, 2 or 100 pages.
> > 
> > > Traditionally
> > > many similar hardware blocks would have allocated local SRAM for
> > > worst case inside the hardware block itself and don't need to use
> > > DDR in the first place. However, for this hardware block, it has
> > > capability to use some part of one of the many SRAM blocks in the SoC,
> > > not be shared for some part of the system - so from a hardware
> > > description perspective, we will need to call that out as to which
> > > SRAM is available for the hardware block.
> > 
> > Just because more than one device wants some memory, does not mean this
> > is hardware property. Drivers can ask how much memory available there
> > is. OS knows how many users of memory there is, so knows how much to
> > allocate for each device.
> > 
> > > 
> > > Why would different boards need this differently? simply because
> > > different cameras have different resolution and framerates - and you
> > > dont want to pay the worst case sram penalty for all product
> > > configuration.
> > 
> > Choice of resolution and framerate is runtime choice or use-case
> > dependent, not board level configuration, therefore amount of SRAM size
> > to use is as well.
> 
> I am arguing this is similar to what we have for remote-procs. Yes,
> usecases usage come to a conclusion that sram size X is needed. Sure.
> Lets even argue that sram = <&sram> has X+100 bytes available, so as
> far as allocator is concerned, it can allocate. But how does the driver
> know that 1k of that sram is already used by a remote core or some other
> function?
> 
> This is no different from a remote proc usecase, following which, I
> wonder if "memory-region" is the right way to describe this usage? That
> would be a very specific bucket that is made available to the driver.
> And I'd say sram and memory-region are two mutually exclusive option?

Wouldn't this just be a static allocation of the SRAM then? I believe
the correct way to do this is highlighted in Rob's response [0]. This is
also something we have done in the past [1], but I thought dynamic
allocation was the preferred method so that the VPU didn't hog a portion
of SRAM. That seems wasteful in cases where the VPU is not being used.

The device itself has the capability of doing runtime allocation before
any decoder/encoder stream instances are opened. All of these opened
streams will share this one allocated pool, meaning the first open stream
allocates and the rest share. Because of this, the goal is to allocate
enough to meet the maximum use case of the VPU (4K60), or the maximum
case supported by the SoC itself if the SoC is unable to handle 4K60.
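
To make the sharing model above concrete, here is a rough userspace
sketch in C. It is not the wave5 driver code: the struct and the
vpu_sram_get()/vpu_sram_put() names are hypothetical, and malloc()/free()
only stand in for whatever SRAM allocator the kernel side would use
(for example the genalloc API). The first open allocates a pool sized
for the worst case, later opens share it, and the last close frees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool state; not a structure from the wave5 driver. */
struct vpu_sram {
	void *buf;          /* backing memory (stand-in for an SRAM carveout) */
	size_t size;        /* bytes reserved for the worst case, e.g. 4K60 */
	unsigned int users; /* number of open encoder/decoder instances */
};

/* Stream open: the first caller allocates, later callers share the pool. */
static int vpu_sram_get(struct vpu_sram *s, size_t worst_case_size)
{
	if (s->users == 0) {
		s->buf = malloc(worst_case_size);
		if (!s->buf)
			return -1;
		s->size = worst_case_size;
	}
	s->users++;
	return 0;
}

/* Stream close: the last caller hands the memory back. */
static void vpu_sram_put(struct vpu_sram *s)
{
	assert(s->users > 0);
	if (--s->users == 0) {
		free(s->buf);
		s->buf = NULL;
		s->size = 0;
	}
}
```

Note that the size requested by the second and later opens is ignored
in this sketch, which is why the first allocation has to cover the
maximum use case up front.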

Is there a preferred method for dynamic SRAM allocation? I understand
the point that framerate and resolution are a runtime choice, but these
properties are not guaranteed to be known when streams are being opened.

If static SRAM allocation is the correct way to go, then this series
can be ignored and I will add a section to the device tree and remove
the check for the parameter in the driver, as that will now be a bug.

[0]: https://patchwork.kernel.org/project/linux-media/patch/20240201184238.2542695-1-b-brnich@ti.com/#25696671
[1]: https://patchwork.kernel.org/project/linux-media/patch/20231108-wave5-v14-rebased-v14-8-0b4af1258656@collabora.com/
 
> > 
> > > 
> > > Further, Linux is not the only thing that runs on these SoCs.. these are
> > > mixed systems with autonomous operations of uC cores who may or maynot
> > > (typically not) even need to communicate with MPU to state which part of
> > > resource they are hogging (hence the board level definition).
> > 
> > OK that could be the case but I could also say choice of RTOS or any
> > other is also board-independent.
> -- 
> Regards,
> Nishanth Menon
> Key (0xDDB5849D1736249D) / Fingerprint: F8A2 8693 54EB 8232 17A3  1A34 DDB5 849D 1736 249D

Thanks,
Brandon