linux-kernel - Re: [PATCH v2] drm: omapdrm: Export correct scatterlist for TILER backed BOs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <16416f38-2d00-be53-23b1-894064beeaaf@gmail.com>
Date:   Tue, 16 Nov 2021 13:12:20 +0200
From:   Ivaylo Dimitrov <ivo.g.dimitrov.75@...il.com>
To:     Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
Cc:     matthijsvanduin@...il.com, airlied@...ux.ie, daniel@...ll.ch,
        linux-kernel@...r.kernel.org, linux-omap@...r.kernel.org,
        Laurent Pinchart <laurent.pinchart@...asonboard.com>,
        Merlijn Wajer <merlijn@...zup.org>,
        Carl Philipp Klemm <philipp@...s.xyz>
Subject: Re: [PATCH v2] drm: omapdrm: Export correct scatterlist for TILER
 backed BOs

On 16.11.21 г. 12:20 ч., Tomi Valkeinen wrote:
> On 16/11/2021 10:27, Ivaylo Dimitrov wrote:
>> Hi,
>>
>> On 16.11.21 г. 8:42 ч., Tomi Valkeinen wrote:
>>> On 15/11/2021 19:15, Ivaylo Dimitrov wrote:
>>>> Hi,
>>>>
>>>> On 15.11.21 г. 17:37 ч., Tomi Valkeinen wrote:
>>>>> On 15/11/2021 15:55, Ivaylo Dimitrov wrote:
>>>>>> Hi Tomi,
>>>>>>
>>>>>> On 15.11.21 г. 10:42 ч., Tomi Valkeinen wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 13/11/2021 11:53, Ivaylo Dimitrov wrote:
>>>>>>>> Memory of BOs backed by TILER is not contiguous, but 
>>>>>>>> omap_gem_map_dma_buf()
>>>>>>>> exports it like it is. This leads to (possibly) invalid memory 
>>>>>>>> accesses if
>>>>>>>> another device imports such a BO.
>>>>>>>
>>>>>>> This is one reason why TILER hasn't been officially supported. 
>>>>>>> But the above is not exactly right, or at least not the whole truth.
>>>>>>>
>>>>>>
>>>>>> Definitely, not only these BOs lie about their memory layout, they 
>>>>>> lie about size and alignment as well. I have 2 more patches here 
>>>>>> (one is to align TILER memory on page, as proposed by Matthijs in 
>>>>>> the other mail, the other to set the correct size when exporting 
>>>>>> TILER BO), but I wanted to hear from you first, like, what is the 
>>>>>> general trend :) .
>>>>>
>>>>> My thoughts here are that the current code doesn't work in 
>>>>> practice, so if you get it fixed, it's great =).
>>>>>
>>>>>> Also, I have another patch in mind, that will enable exporting of 
>>>>>> buffers that are not TILER backed, but are not CMA backed either. 
>>>>>> SGX for example does not need CMA memory to render to.
>>>>>
>>>>> What do you mean with this? DSS needs contiguous memory, so the 
>>>>> memory has to be 1) physically contiguous, 2) mapped with DMM or 3) 
>>>>> mapped with TILER. There's no reason for the driver to export 
>>>>> non-contiguous memory.
>>>>>
>>>>
>>>> DSS yes, but, omapdrm is used to allocate non-scanout buffers as 
>>>> well, which do not need to be (and in practice are not) contiguous. 
>>>> GPU (or anyone with MMU) can render on them (DRI buffers for 
>>>> example) and later on those buffers can be copied (blit) to the 
>>>> framebuffer. Yes, not zero-copy, but if you're doing compositing, 
>>>> there is no option anyway.
>>>>
>>>> Exactly this is done by omap-video driver for example. GBM BOs are 
>>>> allocated through omapdrm as well.
>>>
>>> That is not correct and shouldn't be done. omapdrm is not a generic 
>>> memory allocator. We have real generic allocators, so those should be 
>>> used. Or, if the buffer is only used for a single device, the buffer 
>>> should be allocated from that device's driver.
>>>
>>
>> Yes, I saw the comment in kernel headers that dumb buffers should not 
>> be used for rendering 
>> (https://elixir.bootlin.com/linux/latest/source/include/drm/drm_drv.h#L361). 
>> This makes no sense to me at all, but maybe I am missing the point.
> 
> I believe that comments refers to another issue: a dumb buffer from may 
> not be usable for rendering. It's only guaranteed to be 
> readable/writable by the CPU.
> 
> What I'm talking about is that a driver must support memory allocations 
> for buffers that the device handled by the driver can use. In many cases 
> that allocated buffer also works with other devices, and thus dmabuf 
> export/import can be used. But a driver supporting memory allocations 
> for buffers that the device itself cannot use is just wrong.
> 
>> Also, it could be that the implementation of omap-video and/or PVR 
>> userspace blobs is against the specs, but I see omap-video calling 
>> DRM_IOCTL_OMAP_GEM_NEW for DRI buffers without OMAP_BO_SCANOUT and 
>> libdbm.so calling DRM_IOCTL_MODE_CREATE_DUMB to create buffers then 
>> used for rendering.
> 
> I think neither of those are exactly material to be used as examples on 
> how to do things. And there's lots of history there. We didn't have 
> generic allocators back then.
> 
>> This is not an issue on omap4 an later, because when export of that 
>> buffer is requested, omapdrm uses DMM and exports a single scatterlist 
>> entry, IIUC.
>>
>> But, on omap3, given there is no DMM, export is simply refused. I 
>> don't see that as a consistent behaviour - we shall either a) export 
>> non-scanout buffers (scattered ones) using whatever is supported (DMM 
>> and single scatterlist entry on omap4 (and later), multiple-entry 
>> scatterlist on omap3) or b) always require OMAP_BO_SCANOUT for BOs to 
>> be exported and refuse to export if no such flag is set. I would say 
>> b) is not a good option which leaves a) only.
> 
> I think we should always require OMAP_BO_SCANOUT, or rather, drop the 
> flag totally and always expect the buffer to be a scanout buffer. The 
> only use for DSS is scanout, and those are the only buffers that omapdrm 
> needs to support. But that would be breaking the uAPI, so I think we 
> just have to support what we do now.
> 
>> BTW, I think DMM is not really needed unless userspace requests 
>> mmap(), in theory we can provide userspace with view through DMM but 
>> give device drivers multiple entry scatterlist, potentially saving DMM 
>> space.
> 
> The userspace (CPU) doesn't need the DMM, the CPU has an MMU. I thought 
> we already skip the DMM when mapping to the userspace. But in TILER case 
> we always need TILER, even with the CPU.
> 
>> I hope I made it clearer now why I think this feature shall be 
>> implemented.
> 
> I think it's just adding more wrong on top of the old wrong =).
> 
> Also, if we need DMM/TILER allocations for other devices than DSS (but 
> so far this hasn't been mentioned), then I think the DMM/TILER 
> functionality should be separated from omapdrm and moved to (I think) 
> dma-heap.
> 

I see. Ok, it is not that much of an issue for me to carry one 
out-of-tree patch if needed.

>>>>>> 2. Set exp_info.size = omap_gem_mmap_size(obj); when exporting a 
>>>>>> BO. That way importer knows the real BO memory size (including 
>>>>>> alignment etc) so he will be able to calculate the number of pages 
>>>>>> he needs to map the scatterlist.
>>>>>
>>>>> Can you elaborate what this means?
>>>>>
>>>>
>>>> When we align to page, we shall report the size including the 
>>>> alignment, no? Or, it is the importer that shall take care to 
>>>> calculate BO size( including the alignment) based on scatterlist if 
>>>> he needs to?
>>>
>>> I'm not sure... But I guess the export size should include the 
>>> alignment.
>>>
>>
>> My understanding as well. Will sent that change as a part of page 
>> alignment patch.
>>
>>> Hmm... I haven't had enough coffee yet, but how does this go... Let's 
>>> say we have a tiled fb, and the width gets expanded to a page. What 
>>> happens to reads/writes that happen outside the fb, but still within 
>>> the page? Those should cause an error or do nothing, but is it 
>>> possible that they go through TILER and get mapped to some real 
>>> memory location?
>>>
>>
>> I lack the details here, but reading through TRM leaves me with the 
>> impression that TILER smallest unit is a tile, and every tile is 
>> backed by a real memory page (4KiB), so outside read-writes will end 
>> up in memory that's there but unused and will do nothing.
>>
>> omap_gem_new() calls tiler_align(), which in turn seems to return 
>> page-aligned size, so I think there is no issue here.
> 
> Maybe, but, consider this example, with numbers totally out of thin air: 
> We have a fb with the width of 32 pixels, so 128 bytes. If we have tiles 
> which cover 32 x 32 pixels (so 4096 bytes with 4 bpp), we need one tile 
> to cover the width. But we have all the rest of the page mapped, so 3968 
> bytes that are not covered with a tile (or rather, we haven't configured 
> that tile, or maybe the tile contains old configuration).
> 
> I could be totally wrong here, as I don't remember the details. But I do 
> think that it's very easy to get this wrong, creating memory corruptions 
> and/or security violations.
> 

I see what you mean and I think you are right. What about configuring 
(if we use your made-up values) those 31 'unused' TILER entries to point 
to the same page tile 1 points to? If we need more than one tile (say N) 
to cover the width, apply the same logic, like, tile N+1 points to page 
1, N+2 to page 2 and so on. Or, allocate one extra page and setup all 
'unused' tiles to point to it? Both will guarantee no other memory can 
be accessed, IIUC, though I prefer the 'overlap' workaround.

Also, we can have a single page used to back all the 'unused' tiler 
entries, for all the TILER BOs, but with such an attributes that it is 
not actually accessible (not sure how such a page shall be allocated 
though), so if someone tries to access memory outside of the allowed 
region and hits that page, we'll either have SEGFAULT or OOPS.

Ivo