[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cc8fd552-c6cb-46d6-a781-28d004b49d52@gmail.com>
Date: Thu, 27 Jun 2024 13:10:38 +0200
From: Alex Bee <knaerzche@...il.com>
To: Jonas Karlman <jonas@...boo.se>,
Detlev Casanova <detlev.casanova@...labora.com>
Cc: Ezequiel Garcia <ezequiel@...guardiasur.com.ar>,
Mauro Carvalho Chehab <mchehab@...nel.org>, Rob Herring <robh@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley
<conor+dt@...nel.org>, Heiko Stuebner <heiko@...ech.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Sebastian Reichel <sebastian.reichel@...labora.com>,
Dragan Simic <dsimic@...jaro.org>, Diederik de Haas <didi.debian@...ow.org>,
Andy Yan <andy.yan@...k-chips.com>,
Boris Brezillon <boris.brezillon@...labora.com>,
Hans Verkuil <hverkuil-cisco@...all.nl>,
Daniel Almeida <daniel.almeida@...labora.com>,
Paul Kocialkowski <paul.kocialkowski@...tlin.com>,
Nicolas Dufresne <nicolas.dufresne@...labora.com>,
Benjamin Gaignard <benjamin.gaignard@...labora.com>,
linux-media@...r.kernel.org, linux-rockchip@...ts.infradead.org,
devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-staging@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 2/4] media: rockchip: Introduce the rkvdec2 driver
Am 27.06.24 um 11:41 schrieb Jonas Karlman:
> Hi Alex,
>
> On 2024-06-26 11:12, Alex Bee wrote:
>> Hi Detlev,
>>
>> Am 25.06.24 um 18:56 schrieb Detlev Casanova:
>>> Hi Alex,
>>>
>>> On Sunday, June 23, 2024 5:33:28 A.M. EDT you wrote:
>>>> Hi Detlev,
>>>>
>>>> Am 20.06.24 um 16:19 schrieb Detlev Casanova:
>>>>> This driver supports the second generation of the Rockchip Video
>>>>> decoder, also known as vdpu34x.
>>>>> It is currently only used on the RK3588(s) SoC.
>>>>>
>>>>> There are 2 decoders on the RK3588 SoC that can work in pair to decode
>>>>> 8K video at 30 FPS but currently, only using one core at a time is
>>>>> supported.
>>>>>
>>>>> Scheduling requests between the two cores will be implemented later.
>>>>>
>>>>> The core supports H264, HEVC, VP9 and AVS2 decoding but this driver
>>>>> currently only supports H264.
>>>>>
>>>>> The driver is based on rkvdec and they may share some code in the
>>>>> future.
>>>>> The decision to make a different driver is mainly because rkvdec2 has
>>>>> more features and can work with multiple cores.
>>>>>
>>>>> The registers are mapped in a struct in RAM using bitfields. It is IO
>>>>> copied to the HW when all values are configured.
>>>>> The decision to use such a struct instead of writing buffers one by one
>>>>>
>>>>> is based on the following reasons:
>>>>> - Rockchip cores are known to misbehave when registers are not written
>>>>>
>>>>> in address order,
>>>>>
>>>>> - Those cores also need the software to write all registers, even if
>>>>>
>>>>> they are written their default values or are not related to the task
>>>>> (this core will not start decoding some H264 frames if some VP9
>>>>> registers are not written to 0)
>>>>>
>>>>> - In the future, to support multiple cores, the scheduler could be
>>>>>
>>>>> optimized by storing the precomputed registers values and copy them
>>>>> to the HW as soos as a core becomes available.
>>>>>
>>>>> This makes the code more readable and may bring performance improvements
>>>>> in future features.
>>>>>
>>>>> Signed-off-by: Detlev Casanova <detlev.casanova@...labora.com>
>>>>> ---
>>>>>
>>>>> drivers/staging/media/Kconfig | 1 +
>>>>> drivers/staging/media/Makefile | 1 +
>>>>> drivers/staging/media/rkvdec2/Kconfig | 15 +
>>>>> drivers/staging/media/rkvdec2/Makefile | 3 +
>>>>> drivers/staging/media/rkvdec2/TODO | 9 +
>>>>> drivers/staging/media/rkvdec2/rkvdec2-h264.c | 739 +++++++++++
>>>>> drivers/staging/media/rkvdec2/rkvdec2-regs.h | 345 +++++
>>>>> drivers/staging/media/rkvdec2/rkvdec2.c | 1253 ++++++++++++++++++
>>>>> drivers/staging/media/rkvdec2/rkvdec2.h | 130 ++
>>>>> 9 files changed, 2496 insertions(+)
>>>>> create mode 100644 drivers/staging/media/rkvdec2/Kconfig
>>>>> create mode 100644 drivers/staging/media/rkvdec2/Makefile
>>>>> create mode 100644 drivers/staging/media/rkvdec2/TODO
>>>>> create mode 100644 drivers/staging/media/rkvdec2/rkvdec2-h264.c
>>>>> create mode 100644 drivers/staging/media/rkvdec2/rkvdec2-regs.h
>>>>> create mode 100644 drivers/staging/media/rkvdec2/rkvdec2.c
>>>>> create mode 100644 drivers/staging/media/rkvdec2/rkvdec2.h
>>>> ...
>>>>
>>>>> +static inline void rkvdec2_memcpy_toio(void __iomem *dst, void *src,
>>>>> size_t len) +{
>>>>> +#ifdef CONFIG_ARM64
>>>>> + __iowrite32_copy(dst, src, len);
>>>>> +#elif defined(CONFIG_ARM)
>>>> I guess that can get an "#else" since memcpy_toio exists for all archs.
>>>>
>>>>> + memcpy_toio(dst, src, len);
>>>>> +#endif
>>>>> +}
>>>>> +
>>>> ...
>>>>
>>>>> + /* Set timeout threshold */
>>>>> + if (pixels < RKVDEC2_1080P_PIXELS)
>>>>> + regs->common.timeout_threshold = RKVDEC2_TIMEOUT_1080p;
>>>>> + else if (pixels < RKVDEC2_4K_PIXELS)
>>>>> + regs->common.timeout_threshold = RKVDEC2_TIMEOUT_4K;
>>>>> + else if (pixels < RKVDEC2_8K_PIXELS)
>>>>> + regs->common.timeout_threshold = RKVDEC2_TIMEOUT_8K;
>>>>> +
>>>> Did you test if it works with anything > 8K? If so, you propably want to
>>>> make the check above
>>>>
>>>> + else
>>>> + regs->common.timeout_threshold = RKVDEC2_TIMEOUT_8K;
>>>>
>>>> Otherwise the timeout may not be set/contain invalid values from any former
>>>> stream.
>>> That's right, but it would be set to 0 because of the memset.
>>> RKVDEC2_TIMEOUT_8K might not be enough for bigger frame sizes, so I'll set it
>>> to the maximum value (0xffffffff) when frames are bigger than 8K and also adapt
>>> the watchdog time: RKVDEC2_TIMEOUT_8K is around 100 ms, but 0xffffffff is arnoud
>>> 5.3 seconds (reg032/axi_clock_freq)
>>>
>>> I'll do more tests with this as well.
>>>
>>>> ...
>>>>
>>>>> +
>>>>> +static const struct rkvdec2_coded_fmt_desc rkvdec2_coded_fmts[] = {
>>>>> + {
>>>>> + .fourcc = V4L2_PIX_FMT_H264_SLICE,
>>>>> + .frmsize = {
>>>>> + .min_width = 16,
>>>>> + .max_width = 65520,
>>>>> + .step_width = 16,
>>>>> + .min_height = 16,
>>>>> + .max_height = 65520,
>>>>> + .step_height = 16,
>>>>> + },
>>>>> + .ctrls = &rkvdec2_h264_ctrls,
>>>>> + .ops = &rkvdec2_h264_fmt_ops,
>>>>> + .num_decoded_fmts =
>>> ARRAY_SIZE(rkvdec2_h264_decoded_fmts),
>>>>> + .decoded_fmts = rkvdec2_h264_decoded_fmts,
>>>>> + .subsystem_flags =
>>> VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF,
>>>>> + },
>>>>> +};
>>>>> +
>>>> Note, that this is also given to userspace (VIDIOC_ENUM_FRAMESIZES) and
>>>> this is already incorrect in the old rkvdec driver (and hantro): From
>>>> userspace perspective we do not have a restriction in
>>>> step_width/step_width, as we are aligning any given width/height to HW
>>>> requirements in the driver - what we should give to userspace is
>>>> fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS; fsize->stepwise.min_height =
>>>> 1; fsize->stepwise.min_width = 1; fsize->stepwise.max_height = 65520;
>>>> fsize->stepwise.max_width = 65520;
>>> Is fsize->stepwise.min_height = 1; and fsize->stepwise.min_width = 1 correct ?
>>> Or do you mean fsize->stepwise.step_height = 1; and fsize->stepwise.setp_width
>>> = 1 ?
>>>
>>> It would give this instead:
>>>
>>> .frmsize = {
>>> .min_width = 16,
>>> .max_width = 65520,
>>> .step_width = 1,
>>> .min_height = 16,
>>> .max_height = 65520,
>>> .step_height = 1,
>>> },
>>>
>>> and .vidioc_enum_framesizes sets fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS;
>> You can't adapt this here, because this .frmsize is also given to the
>> v4l2_apply_frmsize_constraints helper, which does the actual alignment to
>> HW requirements and requires the HW step_with and step_height.
>> IIRC, we also align framesizes which are below minimum HW requirement, at
>> least in rkvdec1 driver and it looks a lot like this is done here the same:
>> so this should be .min_height = 1 and .min_width = 1. (I remember because
>> there are VP9 conformance tests with very small framesizes). And yes, it
>> looks like you've had to set .step_width and .step_height to 1 for
>> V4L2_FRMSIZE_TYPE_CONTINUOUS, not sure why that is required.
>>
>> So, imho, the final rkvdec2_enum_framesizes should look like
>>
>> +static int rkvdec2_enum_framesizes(struct file *file, void *priv,
>> + struct v4l2_frmsizeenum *fsize)
>> ....
>> + fmt = rkvdec2_find_coded_fmt_desc(fsize->pixel_format);
>> + if (!fmt)
>> + return -EINVAL;
>> +
>> + fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS;
>> + fsize->stepwise.min_height = 1;
>> + fsize->stepwise.max_height = fmt->frmsize.max_height;
>> + fsize->stepwise.min_width = 1;
>> + fsize->stepwise.max_width = fmt->frmsize.max_width;
>> + fsize->stepwise.min_width = 1;
>> + fsize->stepwise.step_height = 1;
>> + fsize->stepwise.step_width = 1;
>> + return 0;
>> +}
>>
>> Note: Not even build tested :)
>> Jonas: maybe you can add a fixup patch to your rkvdec patches as well.
> Thanks, will take a closer look and include something for rkvdec high10
> v6 later today/tomorrow.
Great, thanks.
To back up the "IIRC" part of my previous reply about .min_width and
.min_height: v4l2_apply_frmsize_constraints does clamp_roundup on width and
height, which means it sets height = hw_min_height and/or width =
hw_min_width if the given width/height is smaller. Thus userspace doesn't
need to care about width/height as long as it's 1 <=
hw_max_height/hw_max_width.
Alex
> Regards,
> Jonas
>
>> Regards,
>>
>> Alex
>>
>>>> I guess this new driver should be an
>>>> opportunity to fix that and distinguish between internal and external
>>>> frame size requirements and the .vidioc_enum_framesizes callback should
>>>> adapted accordingly. Regards, Alex
>>> Detlev.
Powered by blists - more mailing lists