linux-kernel - Re: [PATCH v2] drm: add modifiers for Apple GPU layouts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAPj87rMpSbaOe0qEU8x-VFCGOvoWpyRURr=0bJ80=cdkTQiAbQ@mail.gmail.com>
Date: Wed, 26 Feb 2025 13:55:40 +0000
From: Daniel Stone <daniel@...ishbar.org>
To: Alyssa Rosenzweig <alyssa@...enzweig.io>
Cc: David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>, 
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>, 
	Thomas Zimmermann <tzimmermann@...e.de>, dri-devel@...ts.freedesktop.org, 
	linux-kernel@...r.kernel.org, asahi@...ts.linux.dev
Subject: Re: [PATCH v2] drm: add modifiers for Apple GPU layouts

Hey,

On Tue, 25 Feb 2025 at 22:18, Alyssa Rosenzweig <alyssa@...enzweig.io> wrote:
> > > These layouts are notably not used for interchange across hardware
> > > blocks (e.g. with the display controller). There are other layouts for
> > > that but we don't support them either in userspace or kernelspace yet
> > > (even downstream), so we don't add modifiers here.
> >
> > Yeah, when those have users with good definitions matching these, we
> > can add them here, even if they are downstream. Anything the GPU would
> > share out of context, or the codec, would be good for this.
>
> Sure. The mentioned "other layouts" are for scanout (GPU->display), and
> apparently the GPU can render but not sample them... You can imagine
> about how well that would go without a vendor extension + compositor
> patches......
>
> (Currently we use linear framebuffers for scanout and avoid that rat's
> nest.)

Heh, impressive. Are those the twiddled-but-not-tiled ones?

> > > +/*
> > > + * Apple GPU-tiled layout.
> > > + *
> > > + * GPU-tiled images are divided into tiles. Tiles are always 16KiB, with
> > > + * dimensions depending on the base-format. Within a tile, pixels are fully
> > > + * interleaved (Morton order). Tiles themselves are raster-order.
> > > + *
> > > + * Images must be 16-byte aligned.
> > > + *
> > > + * For more information see
> > > + * https://docs.mesa3d.org/drivers/asahi.html#image-layouts
> >
> > This writeup is really nice. I would prefer it was inlined here though
> > (similar to AFBC), with Mesa then referring to the kernel, but tbf
> > Vivante does have a similar external reference.
>
> Thanks :-) I wasn't sure which way would be better. Most of the
> complexity in that writeup relates to mipmapping and arrayed images,
> which I don't think WSI hits...?

Yeah, anything that wouldn't be exported out of a GPU API context
doesn't need to be in here!

> Also, the Mesa docs are easier for me
> to update, support tables and LaTeX, have other bits of hw writeups on
> the same page, etc... so they feel like a better home for the unabridged
> version.  And since Vivante did this, I figured it was ok.
>
> If people feel strongly I can of course inline it, it's just not clear
> to me that dumping all that info into the header here is actually
> desired. (And there's even more info Marge queued ...
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33743/diffs?commit_id=5ddb57ceb46d42096a34cbef1df6b4109ac2b511
> )

I don't feel strongly about this btw, was just thinking out loud, and
also lets you move asahi.html around (e.g. split into subpages)
without having to worry about breaking old links.

> > The one thing it's missing is an explicit description of how stride is
> > computed and used. I can see the 'obvious' way to do it for this and
> > compression, but it would be good to make it explicit, given some
> > misadventures in the past. CCS is probably the gold standard to refer
> > to here.
>
> Ah, right -- thanks for raising that! I'll add this for v3. Indeed, I
> picked the "obvious" way, given said misadventures with AFBC ;)
>
> This is the relevant Mesa code:
>
>    /*
>     * For WSI purposes, we need to associate a stride with all layouts.
>     * In the hardware, only strided linear images have an associated
>     * stride, there is no natural stride associated with twiddled
>     * images.  However, various clients assert that the stride is valid
>     * for the image if it were linear (even if it is in fact not
>     * linear). In those cases, by convention we use the minimum valid
>     * such stride.
>     */
>    static inline uint32_t
>    ail_get_wsi_stride_B(const struct ail_layout *layout, unsigned level)
>    {
>       assert(level == 0 && "Mipmaps cannot be shared as WSI");
>
>       if (layout->tiling == AIL_TILING_LINEAR)
>          return ail_get_linear_stride_B(layout, level);
>       else
>          return util_format_get_stride(layout->format, layout->width_px);
>    }
>
> I can either copy that comment here, or to make that logic more explicit:
>
>     Producers must set the stride to the image width in
>     pixels times the bytes per pixel. This is a software convention, the
>     hardware does not use this stride.

Hrm. I guess more in keeping with how it's used in other APIs, as well
as more kind of future-proof in the sense of not needing possibly
gen-specific rounding everywhere, would be to pass (n_tiles_h(buf) *
tile_size_bytes) / n_rows_in_tile(format). That gives you the same
information as you get for other users of stride, and might help make
things more explicit for small tiles as well? Plus would apply pretty
obviously to compression.

I know it seems pretty inconsequential, but it does help for utils
which e.g. dump and copy content. And makes sure no-one can make some
dumb assumptions and get it wrong.

Cheers,
Daniel