Date:   Thu, 13 Aug 2020 22:59:02 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     Robin Murphy <robin.murphy@....com>
Cc:     lkml <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        Liam Mark <lmark@...eaurora.org>,
        "Andrew F . Davis" <afd@...com>, Laura Abbott <labbott@...nel.org>,
        Hridya Valsaraju <hridya@...gle.com>,
        linux-media@...r.kernel.org
Subject: Re: [RFC][PATCH 2/2] dma-heap: Add a system-uncached heap

On Mon, Aug 3, 2020 at 4:06 AM Robin Murphy <robin.murphy@....com> wrote:
>
> On 2020-07-29 06:16, John Stultz wrote:
> > This adds a heap that allocates non-contiguous buffers that are
> > marked as writecombined, so they are not cached by the CPU.
> >
...
> > +     ret = sg_alloc_table(new_table, table->nents, GFP_KERNEL);
> > +     if (ret) {
> > +             kfree(new_table);
> > +             return ERR_PTR(-ENOMEM);
> > +     }
> > +
> > +     new_sg = new_table->sgl;
> > +     for_each_sg(table->sgl, sg, table->nents, i) {
>
> Consider using the new sgtable helpers that are just about to land - in
> this case, for_each_sgtable_sg().

Ack! Thanks for the suggestion!


> > +             memcpy(new_sg, sg, sizeof(*sg));
> > +             new_sg->dma_address = 0;
>
> This seems a little bit hairy, as in theory a consumer could still treat
> a nonzero DMA length as the address being valid. Rather than copying the
> whole entry then trying to undo parts of that, maybe just:
>
>         sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset);
>
> ?

Sounds good.
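
So with the sgtable helper and sg_set_page(), the copy loop would end
up roughly like this (untested sketch, assuming the new helpers land
as proposed):

	new_sg = new_table->sgl;
	for_each_sgtable_sg(table, sg, i) {
		/* take page/length/offset, leave dma_address unset */
		sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset);
		new_sg = sg_next(new_sg);
	}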


> > +static struct sg_table *dma_heap_map_dma_buf(struct dma_buf_attachment *attachment,
> > +                                          enum dma_data_direction direction)
> > +{
> > +     struct dma_heap_attachment *a = attachment->priv;
> > +     struct sg_table *table = a->table;
> > +
> > +     if (!dma_map_sg_attrs(attachment->dev, table->sgl, table->nents, direction,
> > +                           DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WRITE_COMBINE))
>
> dma_map_sgtable()
>
> Also, DMA_ATTR_WRITE_COMBINE is meaningless for streaming DMA.
>

Hrm. Ok, my grasp of the "streaming" vs "consistent" definitions may be
slightly off. While we do map and unmap the buffers, the point of this
heap is that the allocated memory stays uncached/coherent, so we avoid
the cache sync overhead on each map/unmap, which I thought was closer
to the "consistent" definition.

But maybe if the mapping and unmapping part is really the key
difference, then ok.

Either way, from my testing, you're right that DMA_ATTR_WRITE_COMBINE
doesn't make any difference in behavior.
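
So I'll drop the attr and switch to dma_map_sgtable(); the map hook
would then end up something like this (untested sketch):

static struct sg_table *dma_heap_map_dma_buf(struct dma_buf_attachment *attachment,
					     enum dma_data_direction direction)
{
	struct dma_heap_attachment *a = attachment->priv;
	struct sg_table *table = a->table;
	int ret;

	/* Pages are write-combined, so skip the per-map cache sync */
	ret = dma_map_sgtable(attachment->dev, table, direction,
			      DMA_ATTR_SKIP_CPU_SYNC);
	if (ret)
		return ERR_PTR(ret);

	return table;
}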

> > +     pgprot = pgprot_writecombine(PAGE_KERNEL);
> > +
> > +     for_each_sg(table->sgl, sg, table->nents, i) {
>
> for_each_sg_page()

Ack.
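
For the vmap path I'm thinking something along these lines (untested
sketch; "npages" here just stands for the buffer's page count):

	struct sg_page_iter piter;
	pgprot_t pgprot = pgprot_writecombine(PAGE_KERNEL);
	struct page **pages, **tmp;
	void *vaddr;

	tmp = pages = vmalloc(sizeof(struct page *) * npages);
	if (!pages)
		return ERR_PTR(-ENOMEM);

	/* collect the backing pages, then map them write-combined */
	for_each_sg_page(table->sgl, &piter, table->nents, 0)
		*tmp++ = sg_page_iter_page(&piter);

	vaddr = vmap(pages, npages, VM_MAP, pgprot);
	vfree(pages);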

> > +     /*
> > +      * XXX This is hackish. While the buffer will be uncached, we need
> > +      * to initially flush the cpu cache, since the __GFP_ZERO on the
> > +      * allocation means the zeroing was done by the cpu and thus it is likely
> > +      * cached. Map & flush it out now so we don't get corruption later on.
> > +      *
> > +      * Ideally we could do this without using the heap device as a dummy dev.
> > +      */
> > +     dma_map_sg_attrs(dma_heap_get_dev(heap), table->sgl, table->nents,
> > +                      DMA_BIDIRECTIONAL, DMA_ATTR_WRITE_COMBINE);
>
> Again, DMA_ATTR_WRITE_COMBINE is meaningless here.
>
> > +     dma_sync_sg_for_device(dma_heap_get_dev(heap), table->sgl, table->nents,
> > +                            DMA_BIDIRECTIONAL);
>
> This doesn't do anything that the map hasn't already just done.

Good point!
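
So the initial flush can just be the map itself, something like
(sketch, error handling omitted):

	/*
	 * One-time flush of the zeroed (and thus likely cached) pages;
	 * the map already does the cache maintenance, so the extra
	 * dma_sync_sg_for_device() call goes away.
	 */
	dma_map_sgtable(dma_heap_get_dev(heap), table, DMA_BIDIRECTIONAL, 0);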

> > +     }
> > +     dma_heap_get_dev(heap->heap)->dma_mask = &dummy_mask;
> > +     dma_set_mask(dma_heap_get_dev(heap->heap), DMA_BIT_MASK(64));
>
> Much as I'd hate to encourage using dma_coerce_mask_and_coherent(), I'm
> not sure this is really any better :/

Sounds fair.
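
For now I'll probably just switch it over to something like:

	/* sketch: set streaming & coherent masks instead of poking dma_mask */
	dma_coerce_mask_and_coherent(dma_heap_get_dev(heap->heap), DMA_BIT_MASK(64));

unless a cleaner way of setting up the dummy device's mask comes to mind.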

Thanks so much for the review, I really appreciate the feedback!
(Sorry I was a little slow to respond. The merge window has really
been something this cycle.. :P)

I'll get this all integrated and resend the patch here shortly.

thanks
-john
