lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 2 Apr 2022 21:07:17 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Sven Peter <sven@...npeter.dev>
Cc:     Arnd Bergmann <arnd@...db.de>, Hector Martin <marcan@...can.st>,
        Alyssa Rosenzweig <alyssa@...enzweig.io>,
        Rob Herring <robh+dt@...nel.org>,
        Keith Busch <kbusch@...nel.org>, "axboe@...com" <axboe@...com>,
        "hch@....de" <hch@....de>, "sagi@...mberg.me" <sagi@...mberg.me>,
        Marc Zyngier <maz@...nel.org>,
        DTML <devicetree@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-nvme@...ts.infradead.org
Subject: Re: [PATCH 4/9] soc: apple: Add SART driver

On Sat, Apr 2, 2022 at 2:38 PM Sven Peter <sven@...npeter.dev> wrote:
> On Mon, Mar 21, 2022, at 18:07, Arnd Bergmann wrote:
> > On Mon, Mar 21, 2022 at 5:50 PM Sven Peter <sven@...npeter.dev> wrote:
> >> The NVMe co-processor on the Apple M1 uses a DMA address filter called
> >> SART for some DMA transactions. This adds a simple driver used to
> >> configure the memory regions from which DMA transactions are allowed.
> >>
> >> Co-developed-by: Hector Martin <marcan@...can.st>
> >> Signed-off-by: Hector Martin <marcan@...can.st>
> >> Signed-off-by: Sven Peter <sven@...npeter.dev>
> >
> > Can you add some explanation about why this uses a custom interface
> > instead of hooking into the dma_map_ops?
>
> Sure.
> In a perfect world this would just be an IOMMU implementation but since
> SART can't create any real IOVA space using pagetables it doesn't fit
> inside that subsytem.
>
> In a slightly less perfect world I could just implement dma_map_ops here
> but that won't work either because not all DMA buffers of the NVMe
> device have to go through SART and those allocations happen
> inside the same device and would use the same dma_map_ops.
>
> The NVMe controller has two separate DMA filters:
>
>    - NVMMU, which must be set up for any command that uses PRPs and
>      ensures that the DMA transactions only touch the pages listed
>      inside the PRP structure. NVMMU itself is tightly coupled
>      to the NVMe controller: The list of allowed pages is configured
>      based on command's tag id and even commands that require no DMA
>      transactions must be listed inside NVMMU before they are started.
>    - SART, which must be set up for some shared memory buffers (e.g.
>      log messages from the NVMe firmware) and for some NVMe debug
>      commands that don't use PRPs.
>      SART is only loosely coupled to the NVMe controller and could
>      also be used together with other devices. It's also the only
>      thing that changed between M1 and M1 Pro/Max/Ultra and that's
>      why I decided to separate it from the NVMe driver.
>
> I'll add this explanation to the commit message.

Ok, thanks.

> >> +static void sart2_get_entry(struct apple_sart *sart, int index, u8 *flags,
> >> +                           phys_addr_t *paddr, size_t *size)
> >> +{
> >> +       u32 cfg = readl_relaxed(sart->regs + APPLE_SART2_CONFIG(index));
> >> +       u32 paddr_ = readl_relaxed(sart->regs + APPLE_SART2_PADDR(index));
> >
> > Why do you use the _relaxed() accessors here and elsewhere in the driver?
>
> This device itself doesn't do any DMA transactions so it needs no memory
> synchronization barriers. Only the consumer (i.e. rtkit and nvme) read/write
> from/to these buffers (multiple times) and they have the required barriers
> in place whenever they are used.
>
> These buffers so far are only allocated at probe time though so even using
> the normal writel/readl here won't hurt performance at all. I can just use
> those if you prefer or alternatively add a comment why _relaxed is fine here.
>
> This is a bit similar to the discussion for the pinctrl series last year [1].

I think it's better to only use the _relaxed version where it actually helps,
with a comment about it, and use the normal version elsewhere, in
particular in functions that you have copied from the normal nvme driver.
I had tried to compare some of your code with the other version and
was rather confused by that.

        Arnd

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ