linux-kernel - Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

Message-ID: <7984ef64-fb6d-52f6-26bf-00a685a3efc5@codeaurora.org>
Date:   Mon, 17 Jul 2017 17:58:31 +0530
From:   Sricharan R <sricharan@...eaurora.org>
To:     Rob Clark <robdclark@...il.com>, Will Deacon <will.deacon@....com>
Cc:     Vivek Gautam <vivek.gautam@...eaurora.org>,
        Stephen Boyd <sboyd@...eaurora.org>,
        Joerg Roedel <joro@...tes.org>,
        Robin Murphy <robin.murphy@....com>,
        Rob Herring <robh+dt@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-clk <linux-clk@...r.kernel.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        Stanimir Varbanov <stanimir.varbanov@...aro.org>,
        Archit Taneja <architt@...eaurora.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe,
 add/remove device

Hi,

On 7/17/2017 5:16 PM, Sricharan R wrote:
> Hi,
> 
> On 7/15/2017 1:09 AM, Rob Clark wrote:
>> On Fri, Jul 14, 2017 at 3:36 PM, Will Deacon <will.deacon@....com> wrote:
>>> On Fri, Jul 14, 2017 at 03:34:42PM -0400, Rob Clark wrote:
>>>> On Fri, Jul 14, 2017 at 3:01 PM, Will Deacon <will.deacon@....com> wrote:
>>>>> On Fri, Jul 14, 2017 at 02:25:45PM -0400, Rob Clark wrote:
>>>>>> On Fri, Jul 14, 2017 at 2:06 PM, Will Deacon <will.deacon@....com> wrote:
>>>>>>> On Fri, Jul 14, 2017 at 01:42:13PM -0400, Rob Clark wrote:
>>>>>>>> On Fri, Jul 14, 2017 at 1:07 PM, Will Deacon <will.deacon@....com> wrote:
>>>>>>>>> On Thu, Jul 13, 2017 at 10:55:10AM -0400, Rob Clark wrote:
>>>>>>>>>> On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricharan@...eaurora.org> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> On 7/13/2017 5:20 PM, Rob Clark wrote:
>>>>>>>>>>>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricharan@...eaurora.org> wrote:
>>>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>>>>>>>>>>>> Hi Stephen,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>>>>>>>>>>>> On 07/06, Vivek Gautam wrote:
>>>>>>>>>>>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>>>>>>>>>>>>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>>>>>>>>>>>>>>>>                    size_t size)
>>>>>>>>>>>>>>>>   {
>>>>>>>>>>>>>>>> -    struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>>>>>>>>>>>>> +    struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>>>>>>>>>>>>> +    struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>>>>>>>>>>>> +    size_t ret;
>>>>>>>>>>>>>>>>         if (!ops)
>>>>>>>>>>>>>>>>           return 0;
>>>>>>>>>>>>>>>>   -    return ops->unmap(ops, iova, size);
>>>>>>>>>>>>>>>> +    pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>>>>>>>>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>>>>>>>>>>>> to recall that being a problem before.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's something which was dropped in the following patch merged in master:
>>>>>>>>>>>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Looks like we don't  need locks here anymore?
>>>>>>>>>>>>>
>>>>>>>>>>>>>  Apart from the locking, wonder why a explicit pm_runtime is needed
>>>>>>>>>>>>>  from unmap. Somehow looks like some path in the master using that
>>>>>>>>>>>>>  should have enabled the pm ?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, there are a bunch of scenarios where unmap can happen with
>>>>>>>>>>>> disabled master (but not in atomic context).  On the gpu side we
>>>>>>>>>>>> opportunistically keep a buffer mapping until the buffer is freed
>>>>>>>>>>>> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>>>>>>>>>>>> an exported dmabuf while some other driver holds a reference to it
>>>>>>>>>>>> (which can be dropped when the v4l2 device is suspended).
>>>>>>>>>>>>
>>>>>>>>>>>> Since unmap triggers tbl flush which touches iommu regs, the iommu
>>>>>>>>>>>> driver *definitely* needs a pm_runtime_get_sync().
>>>>>>>>>>>
>>>>>>>>>>>  Ok, with that being the case, there are two things here,
>>>>>>>>>>>
>>>>>>>>>>>  1) If the device links are still intact at these places where unmap is called,
>>>>>>>>>>>     then pm_runtime from the master would setup the all the clocks. That would
>>>>>>>>>>>     avoid reintroducing the locking indirectly here.
>>>>>>>>>>>
>>>>>>>>>>>  2) If not, then doing it here is the only way. But for both cases, since
>>>>>>>>>>>     the unmap can be called from atomic context, resume handler here should
>>>>>>>>>>>     avoid doing clk_prepare_enable , instead move the clk_prepare to the init.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I do kinda like the approach Marek suggested.. of deferring the tlb
>>>>>>>>>> flush until resume.  I'm wondering if we could combine that with
>>>>>>>>>> putting the mmu in a stalled state when we suspend (and not resume the
>>>>>>>>>> mmu until after the pending tlb flush)?
>>>>>>>>>
>>>>>>>>> I'm not sure that a stalled state is what we're after here, because we need
>>>>>>>>> to take care to prevent any table walks if we've freed the underlying pages.
>>>>>>>>> What we could try to do is disable the SMMU (put into global bypass) and
>>>>>>>>> invalidate the TLB when performing a suspend operation, then we just ignore
>>>>>>>>> invalidation whilst the clocks are stopped and, on resume, enable the SMMU
>>>>>>>>> again.
>>>>>>>>
>>>>>>>> wouldn't stalled just block any memory transactions by device(s) using
>>>>>>>> the context bank?  Putting it in bypass isn't really a good thing if
>>>>>>>> there is any chance the device can sneak in a memory access before
>>>>>>>> we've taking it back out of bypass (ie. makes gpu a giant userspace
>>>>>>>> controlled root hole).
>>>>>>>
>>>>>>> If it doesn't deadlock, then yes, it will stall transactions. However, that
>>>>>>> doesn't mean it necessarily prevents page table walks.
>>>>>>
>>>>>> btw, I guess the concern about pagetable walk is that the unmap could
>>>>>> have removed some sub-level of the pt that the tlb walk would hit?
>>>>>> Would deferring freeing those pages help?
>>>>>
>>>>> Could do, but it sounds like a lot of complication that I think we can fix
>>>>> by making the suspend operation put the SMMU into a "clean" state.
>>>>>
>>>>>>> Instead of bypass, we
>>>>>>> could configure all the streams to terminate, but this race still worries me
>>>>>>> somewhat. I thought that the SMMU would only be suspended if all of its
>>>>>>> masters were suspended, so if the GPU wants to come out of suspend then the
>>>>>>> SMMU should be resumed first.
>>>>>>
>>>>>> I believe this should be true.. on the gpu side, I'm mostly trying to
>>>>>> avoid having to power the gpu back on to free buffers.  (On the v4l2
>>>>>> side, somewhere in the core videobuf code would also need to be made
>>>>>> to wrap it's dma_unmap_sg() with pm_runtime_get/put()..)
>>>>>
>>>>> Right, and we shouldn't have to resume it if we suspend it in a clean state,
>>>>> with the TLBs invalidated.
>>>>>
>>>>
>>>> I guess if the device_link() stuff ensured the attached device
>>>> (gpu/etc) was suspended before suspending the iommu, then I guess I
>>>> can't see how temporarily putting the iommu in bypass would be a
>>>> problem.  I haven't looked at the device_link() stuff too closely, but
>>>> iommu being resumed first and suspended last seems like the only thing
>>>> that would make sense.  I'm mostly just nervous about iommu in bypass
>>>> vs gpu since userspace has so much control over what address gpu
>>>> writes to / reads from, so getting it wrong w/ the iommu would be a
>>>> rather bad thing ;-)
>>>
>>> Right, but we can also configure it to terminate if you don't want bypass.
>>>
>>
> 
>  But one thing here is that, with device links in the picture, iommu suspend/resume
>  is called along with the master. That means we can end up cleaning even
>  active entries on the suspend path, if suspend is going to
>  put the smmu into a clean state every time. So if the masters are following
>  the pm_runtime sequence before a dma_map/unmap operation, that seems better.
> 

 Also, for the use case where a master such as the GPU calls unmap while it is already
 suspended, following Marek's approach of checking the smmu state in unmap and
 deferring the TLB flush till resume seems the correct way. All of the above
 holds if we want to use device_link.
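
 To make the deferred-flush idea concrete, here is a minimal userspace sketch in C.
 It is only a model, not arm-smmu driver code: struct smmu_model and the model_*()
 functions are hypothetical names invented for illustration, the "powered" flag
 stands in for the runtime PM state, and the counter stands in for the TLB
 invalidation register writes. In a real driver the state check would need to be
 race-free against a concurrent suspend (pm_runtime_get_if_in_use() is one helper
 that might serve); that part is an assumption and is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in arm-smmu.c. */
struct smmu_model {
    bool powered;        /* stands in for "runtime-active, clocks on" */
    bool flush_pending;  /* an unmap happened while the SMMU was suspended */
    int  hw_flushes;     /* counts simulated TLB invalidations */
};

/* Simulated register access: only legal while the SMMU is powered. */
static void model_tlb_flush(struct smmu_model *s)
{
    assert(s->powered);
    s->hw_flushes++;
    s->flush_pending = false;
}

/*
 * unmap: the page tables live in memory and are updated either way;
 * only the register access is skipped when the SMMU is suspended.
 */
static void model_unmap(struct smmu_model *s)
{
    if (s->powered)
        model_tlb_flush(s);
    else
        s->flush_pending = true;
}

static void model_suspend(struct smmu_model *s)
{
    s->powered = false;
}

/* resume: replay a deferred invalidation before any master issues traffic. */
static void model_resume(struct smmu_model *s)
{
    s->powered = true;
    if (s->flush_pending)
        model_tlb_flush(s);
}
```

 In this model any number of unmaps issued while suspended collapse into a single
 invalidation on resume, and the assert documents the rule being protected: no
 register access while the clocks are off.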

Regards,
 Sricharan

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
