Message-ID: <87frcmxlnl.fsf@AUSNATLYNCH.amd.com>
Date: Tue, 16 Sep 2025 14:06:06 -0500
From: Nathan Lynch <nathan.lynch@....com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>
CC: Vinod Koul <vkoul@...nel.org>, Wei Huang <wei.huang2@....com>, "Mario
Limonciello" <mario.limonciello@....com>, Bjorn Helgaas
<bhelgaas@...gle.com>, <linux-pci@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <dmaengine@...r.kernel.org>, Kees Cook
<kees@...nel.org>
Subject: Re: [PATCH RFC 03/13] dmaengine: sdxi: Add descriptor encoding and unit tests

Jonathan Cameron <jonathan.cameron@...wei.com> writes:
> On Mon, 15 Sep 2025 14:30:23 -0500
> Nathan Lynch <nathan.lynch@....com> wrote:
>
> +CC Kees given I refer to a prior discussion Kees helped out with
> and this is a different related case.
>
>> Jonathan Cameron <jonathan.cameron@...wei.com> writes:
>> > On Fri, 05 Sep 2025 13:48:26 -0500
>> > Nathan Lynch via B4 Relay <devnull+nathan.lynch.amd.com@...nel.org> wrote:
>> >> +++ b/drivers/dma/sdxi/descriptor.c
>> >
>> >> +enum {
>> >> + SDXI_PACKING_QUIRKS = QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST,
>> >> +};
>> >> +
>> >> +#define sdxi_desc_field(_high, _low, _member) \
>> >> + PACKED_FIELD(_high, _low, struct sdxi_desc_unpacked, _member)
>> >> +#define sdxi_desc_flag(_bit, _member) \
>> >> + sdxi_desc_field(_bit, _bit, _member)
>> >> +
>> >> +static const struct packed_field_u16 common_descriptor_fields[] = {
>> >> + sdxi_desc_flag(0, vl),
>> >> + sdxi_desc_flag(1, se),
>> >> + sdxi_desc_flag(2, fe),
>> >> + sdxi_desc_flag(3, ch),
>> >> + sdxi_desc_flag(4, csr),
>> >> + sdxi_desc_flag(5, rb),
>> >> + sdxi_desc_field(15, 8, subtype),
>> >> + sdxi_desc_field(26, 16, type),
>> >> + sdxi_desc_flag(448, np),
>> >> + sdxi_desc_field(511, 453, csb_ptr),
>> >
>> > I'm not immediately seeing the advantage of dealing with unpacking in here
>> > when patch 2 introduced a bunch of field defines that can be used directly
>> > in the tests.
>>
>> My idea is to use the bitfield macros (GENMASK etc) for the real code
>> that encodes descriptors while using the packing API in the tests for
>> those functions.
>>
>> By limiting what's shared between the real code and the tests I get more
>> confidence in both. If both the driver code and the tests rely on the
>> bitfield macros, and then upon adding a new descriptor field I
>> mistranslate the bit numbering from the spec, that error is more likely
>> to propagate to the tests undetected than if the test code relies on a
>> separate mechanism for decoding descriptors.
>
> That's a fair reason. Perhaps add a comment just above the first
> instance of this or top of file to express that?

OK. Looks like sdxi_desc_unpack() and the related field description
structure could be moved to the test code too.

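For illustration only, here is a minimal standalone sketch of the "two
independent mechanisms" idea. The EX_* macros and ex_* functions are
hypothetical stand-ins, not the real SDXI_DSC_* definitions or the kernel
packing API. The encoder works from shifted masks, while the check decodes
with the (high, low) bit offsets as they would be copied from the spec (the
same offsets as in the quoted field table), so a mistranslated mask should
show up as a mismatch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask definitions standing in for the driver's bitfield macros. */
#define EX_DSC_VL            (UINT32_C(1) << 0)
#define EX_DSC_FE            (UINT32_C(1) << 2)
#define EX_DSC_SUBTYPE_SHIFT 8
#define EX_DSC_SUBTYPE_MASK  (UINT32_C(0xff) << EX_DSC_SUBTYPE_SHIFT)
#define EX_DSC_TYPE_SHIFT    16
#define EX_DSC_TYPE_MASK     (UINT32_C(0x7ff) << EX_DSC_TYPE_SHIFT)

/* "Driver" side: build the opcode word from masks and shifts. */
static uint32_t ex_encode_opcode(unsigned int vl, unsigned int fe,
				 unsigned int subtype, unsigned int type)
{
	return (vl ? EX_DSC_VL : 0) |
	       (fe ? EX_DSC_FE : 0) |
	       (((uint32_t)subtype << EX_DSC_SUBTYPE_SHIFT) & EX_DSC_SUBTYPE_MASK) |
	       (((uint32_t)type << EX_DSC_TYPE_SHIFT) & EX_DSC_TYPE_MASK);
}

/*
 * "Test" side: extract bits [high:low] using only the bit offsets,
 * sharing nothing with the mask definitions above. Assumes high >= low.
 */
static uint32_t ex_extract(uint32_t word, unsigned int high, unsigned int low)
{
	unsigned int width = high - low + 1;
	uint32_t mask = width >= 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;

	return (word >> low) & mask;
}

/* Decode with spec-style offsets and compare against what was encoded. */
static void ex_check_decode_matches_spec(void)
{
	uint32_t op = ex_encode_opcode(1, 1, 0x12, 0x345);

	assert(ex_extract(op, 0, 0) == 1);       /* vl */
	assert(ex_extract(op, 1, 1) == 0);       /* se, never set */
	assert(ex_extract(op, 2, 2) == 1);       /* fe */
	assert(ex_extract(op, 15, 8) == 0x12);   /* subtype */
	assert(ex_extract(op, 26, 16) == 0x345); /* type */
}
```

If, say, EX_DSC_TYPE_SHIFT were mistyped as 17, the last assertion would
fail, which is the kind of error a test built on the same macros could miss.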
>> I find the packing API quite convenient to use for the SDXI descriptor
>> tests since the spec defines the fields in terms of bit offsets that can
>> be directly copied to a packed_field_ array.
>>
>>
>> >> +};
>
>> >> + u64 csb_ptr;
>> >> + u32 opcode;
>> >> +
>> >> + opcode = (FIELD_PREP(SDXI_DSC_VL, 1) |
>> >> + FIELD_PREP(SDXI_DSC_FE, 1) |
>> >> + FIELD_PREP(SDXI_DSC_SUBTYPE, SDXI_DSC_OP_SUBTYPE_CXT_STOP) |
>> >> + FIELD_PREP(SDXI_DSC_TYPE, SDXI_DSC_OP_TYPE_ADMIN));
>> >> +
>> >> + cxt_start = params->range.cxt_start;
>> >> + cxt_end = params->range.cxt_end;
>> >> +
>> >> + csb_ptr = FIELD_PREP(SDXI_DSC_NP, 1);
>> >> +
>> >> + desc_clear(desc);
>> >
>> > Not particularly important, but I'd be tempted to combine these with
>> >
>> > 	*desc = (struct sdxi_desc) {
>> > 		.cxt_stop = {
>> > 			.opcode = cpu_to_le32(opcode),
>> > 			.cxt_start = cpu_to_le16(cxt_start),
>> > 			.cxt_end = cpu_to_le16(cxt_end),
>> > 			.csb_ptr = cpu_to_le64(csb_ptr),
>> > 		},
>> > 	};
>> >
>> > To me that more clearly shows what is set and that the
>> > rest is zeroed.
>>
>> Maybe I prefer your version too. Just mentioning in case it's not clear:
>> cxt_stop is a union member with the same size as the enclosing struct
>> sdxi_desc. Each member of struct sdxi_desc's interior anonymous union is
>> intended to completely overlay the entire object.
>>
>> The reason for the preceding desc_clear() is that the designated
>> initializer construct does not necessarily zero padding bytes in the
>> object. Now, there *shouldn't* be any padding bytes in SDXI descriptors
>> as I've defined them, so I'm hoping the redundant stores are discarded
>> in the generated code. But I haven't checked this.
>
> So, this one is 'fun' (and I can hopefully find the references)
> The C spec has had some updates that might cover this though
> I'm not sure and too lazy to figure it out today. Anyhow,
> that doesn't help anyway as we care about older compilers.
>
> So we cheat and just check the compiler does fill them ;)
>
> Via a reply Kees sent on a discussion of the somewhat related {}
> https://lore.kernel.org/linux-iio/202505090942.48EBF01B@keescook/
>
> https://elixir.bootlin.com/linux/v6.17-rc6/source/lib/tests/stackinit_kunit.c
>
> I think the relevant one is __dynamic_all which is used with various hole sizes
> and with both bare structures and unions.
>
> +CC Kees who might have time to shout if I have this particular case
> wrong ;)

Thanks for the references. When making this decision I consulted:

https://gustedt.wordpress.com/2012/10/24/c11-defects-initialization-of-padding/

and

https://interrupt.memfault.com/blog/c-struct-padding-initialization

But we seem to agree that it's a moot point for this code if I make the
changes discussed below.

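A rough standalone sketch of why it becomes moot, under stated assumptions:
struct ex_cxt_stop and union ex_desc below are hypothetical 16-byte layouts
loosely modeled on the fields discussed here, not the real SDXI descriptor
definitions, and the endianness conversions are omitted. Once the layout is
asserted at compile time to contain no padding, every byte of the object
belongs to a named member, and members omitted from a designated initializer
are zero-initialized, so a separate clearing step has nothing left to do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 4 + 2 + 2 + 8 bytes, naturally aligned, no holes. */
struct ex_cxt_stop {
	uint32_t opcode;
	uint16_t cxt_start;
	uint16_t cxt_end;
	uint64_t csb_ptr;
};

/* Each union member is meant to overlay the whole descriptor. */
union ex_desc {
	uint8_t raw[16];
	struct ex_cxt_stop cxt_stop;
};

/* Fail the build if the compiler inserted padding or the sizes diverge. */
static_assert(sizeof(struct ex_cxt_stop) ==
	      sizeof(uint32_t) + 2 * sizeof(uint16_t) + sizeof(uint64_t),
	      "ex_cxt_stop must not contain padding");
static_assert(sizeof(union ex_desc) == sizeof(struct ex_cxt_stop),
	      "cxt_stop must cover the entire descriptor");

/*
 * Overwrite the whole descriptor with one compound literal. csb_ptr is
 * deliberately left out to show that omitted members end up zero.
 */
static void ex_encode_cxt_stop(union ex_desc *desc, uint32_t opcode,
			       uint16_t cxt_start, uint16_t cxt_end)
{
	*desc = (union ex_desc) {
		.cxt_stop = {
			.opcode = opcode,
			.cxt_start = cxt_start,
			.cxt_end = cxt_end,
		},
	};
}
```

In this sketch the compile-time size checks do the job that the runtime
clear was guarding against; whether __packed is also wanted on the real
structs is a separate question.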
>> And it looks like I neglected to mark all the descriptor structs __packed,
>> oops.
>>
>> I think I can add the __packed to struct sdxi_desc et al, use your
>> suggested initializer, and discard desc_clear().
>
> That would indeed work.
>
>>
>>
>> >> + desc->cxt_stop = (struct sdxi_dsc_cxt_stop) {
>> >> + .opcode = cpu_to_le32(opcode),
>> >> + .cxt_start = cpu_to_le16(cxt_start),
>> >> + .cxt_end = cpu_to_le16(cxt_end),
>> >> + .csb_ptr = cpu_to_le64(csb_ptr),
>> >> + };