Message-ID: <CAAFQd5CnhE6KBEeG=LKKqb2Ry324Wye=CQCuLjD_2Fd5JzpgcQ@mail.gmail.com>
Date: Mon, 24 Jan 2022 14:37:19 +0900
From: Tomasz Figa <tfiga@...omium.org>
To: Dafna Hirschfeld <dafna.hirschfeld@...labora.com>
Cc: Joerg Roedel <joro@...tes.org>, iommu@...ts.linux-foundation.org,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"open list:ARM/Rockchip SoC..." <linux-rockchip@...ts.infradead.org>,
Heiko Stuebner <heiko@...ech.de>,
Daniel Kurtz <djkurtz@...omium.org>,
Collabora Kernel ML <kernel@...labora.com>
Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
state is coherent
Hi Dafna,
On Fri, Dec 10, 2021 at 12:18 AM Dafna Hirschfeld
<dafna.hirschfeld@...labora.com> wrote:
>
>
>
> On 23.03.15 10:38, Tomasz Figa wrote:
> > Sorry, I had to dig my way out of my backlog.
> >
> > On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@...tes.org> wrote:
> >> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
> >>> Even though the code uses the dt_lock spin lock to serialize mapping
> >>> operations from different threads, it does not protect against IOMMU
> >>> accesses that might already be taking place and thus altering the
> >>> state of the IOTLB. This means that the current mapping code, which
> >>> first zaps the page table and only then updates it with the new
> >>> mapping, is prone to the mentioned race.
> >>
> >> Could you elaborate a bit on the race and why it is sufficient to zap
> >> only the first and the last iova? From the description and the comments
> >> in the patch this is not clear to me.
> >
> > Let's start with why it's sufficient to zap only the first and last iova.
> >
> > While unmapping, the driver zaps all iovas belonging to the mapping,
> > so the page tables not used by any mapping won't be cached. Now when
> > the driver creates a mapping it might end up occupying several page
> > tables. However, since the mapping area is virtually contiguous, only
> > the first and last page table can be shared with different mappings.
> > This means that only the first and last iovas can already be cached. In
> > fact, we could detect whether the first and last page tables are shared
> > and not zap at all, but this wouldn't really optimize much. Why
> > invalidating one iova is enough to invalidate the whole page table is
> > unclear to me as well, but it seems to be the correct way on this
> > hardware.
>
> Hi,
> It seems to me that each mapping actually needs exactly one page table,
> since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
> makes sure that an iova mapping fits into exactly one page table,
> the mapping size being at most 4M.
>
> This actually means that if rk_dte_get_page_table does not allocate a
> new page table, but returns one that is already partially used by
> previous mappings, then two page tables might be required; but I think
> the iova allocation somehow makes sure that this will not be the case.
Yes, it was exactly for that case. Note that the zap operation is
per IO page, not per page table, and there is some prefetching going on
in the TLB of this IOMMU, so neighboring mappings can interfere with
each other.
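
To put rough numbers on that: with 4 KiB IOMMU pages and 1024 PTEs per
page table, one DTE covers 4 MiB of IOVA space, so only the page tables
at the two ends of a mapping can be shared with neighbors. A minimal
sketch of that arithmetic (the constants are assumed here, not taken
from the driver):

        #include <stdint.h>
        #include <stdio.h>

        #define SPAGE_SIZE  4096u                  /* 4 KiB IOMMU page */
        #define PT_SPAN     (1024u * SPAGE_SIZE)   /* one DTE/page table covers 4 MiB */

        /* Index of the page table (DTE) that translates a given iova. */
        static unsigned int dte_index(uint32_t iova)
        {
                return iova / PT_SPAN;             /* equivalently, iova >> 22 */
        }

        int main(void)
        {
                uint32_t iova = 0x003fe000, size = 0x4000; /* crosses a 4 MiB boundary */
                unsigned int first = dte_index(iova);
                unsigned int last = dte_index(iova + size - 1);

                /*
                 * Every DTE strictly between 'first' and 'last' is covered
                 * by this mapping alone; only the two boundary page tables
                 * can also hold PTEs of neighboring mappings, hence zapping
                 * only the first and last iova.
                 */
                printf("mapping spans DTEs %u..%u\n", first, last);
                return 0;
        }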
>
> If that were the case then the code would be buggy, because it means
> that the loop in rk_iommu_map_iova would write beyond the page table
> returned by rk_dte_get_page_table (into memory we didn't allocate).
Sorry, I don't see how it could write beyond the page table. Could you
give me an example?
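
In case it helps, here is how I read the loop bounds (a sketch with
assumed constants, not the literal driver code):

        #include <stdint.h>

        #define SPAGE_SIZE      4096u
        #define NUM_PT_ENTRIES  1024u

        /* PTE index within its page table for a given iova (bits 21:12). */
        static unsigned int pte_index(uint32_t iova)
        {
                return (iova >> 12) & (NUM_PT_ENTRIES - 1);
        }

        /*
         * rk_iommu_map_iova() writes pte_addr[0] .. pte_addr[pte_count - 1],
         * where pte_addr points at page_table + pte_index(iova) and
         * pte_count = size / SPAGE_SIZE. An overrun of the page table could
         * only happen if
         *
         *         pte_index(iova) + size / SPAGE_SIZE > NUM_PT_ENTRIES,
         *
         * i.e. if the mapping crossed a 4 MiB boundary, which the
         * pgsize_bitmap and the iova allocator's alignment should rule out.
         */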
>
> So it seems to me that the call 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);',
> as done before this patch, should be kept, but moved from
> rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now.
>
> Thanks,
> Dafna
>
> >
> > As for the race, it's also kind of explained by the above. The already
> > running hardware can trigger page table look-ups in the IOMMU, and thus
> > caching of the page table, between our zapping and updating of its
> > contents. With this patch, zapping is performed after updating the
> > page table, so the race is gone.
> >
> > Best regards,
> > Tomasz
> >
> > From: Joerg Roedel <joro@...tes.org>
> > Date: Mon, 23 Mar 2015 13:29:10 +0100
> > Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
> >
> > Hi Tomasz,
> >
> > On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
> >> While unmapping, the driver zaps all iovas belonging to the mapping,
> >> so the page tables not used by any mapping won't be cached. Now when
> >> the driver creates a mapping it might end up occupying several page
> >> tables. However, since the mapping area is virtually contiguous, only
> >> the first and last page table can be shared with different mappings.
> >> This means that only the first and last iovas can already be cached. In
> >> fact, we could detect whether the first and last page tables are shared
> >> and not zap at all, but this wouldn't really optimize much. Why
> >> invalidating one iova is enough to invalidate the whole page table is
> >> unclear to me as well, but it seems to be the correct way on this
> >> hardware.
> >>
> >> As for the race, it's also kind of explained by the above. The already
> >> running hardware can trigger page table look-ups in the IOMMU, and thus
> >> caching of the page table, between our zapping and updating of its
> >> contents. With this patch, zapping is performed after updating the
> >> page table, so the race is gone.
> >
> > Okay, this makes sense. Can you add this information to the patch
> > changelog and resend please?
> >
> > Thanks,
> >
> > Joerg
> >
> >
> > From: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@...lic.gmane.org>
> > Date: Mon, 9 Feb 2015 20:19:21 +0900
> > Subject: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
> >
> > Even though the code uses the dt_lock spin lock to serialize mapping
> > operations from different threads, it does not protect against IOMMU
> > accesses that might already be taking place and thus altering the
> > state of the IOTLB. This means that the current mapping code, which
> > first zaps the page table and only then updates it with the new
> > mapping, is prone to the mentioned race.
> >
> > In addition, the current code assumes that mappings are always at
> > least 4 MiB (which translates to 1024 PTEs) and so would always
> > occupy entire page tables. This is not true for mappings created by
> > the V4L2 Videobuf2 DMA contig allocator.
> >
> > This patch changes the mapping code to always zap the page table
> > after it is updated, which avoids the aforementioned race, and also
> > zaps the last page of the mapping to make sure that stale data is
> > not cached from an already existing mapping.
> >
> > Signed-off-by: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@...lic.gmane.org>
> > Reviewed-by: Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@...lic.gmane.org>
> > ---
> > drivers/iommu/rockchip-iommu.c | 23 +++++++++++++++++------
> > 1 file changed, 17 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> > index 6a8b1ec..b06fe76 100644
> > --- a/drivers/iommu/rockchip-iommu.c
> > +++ b/drivers/iommu/rockchip-iommu.c
> > @@ -544,6 +544,15 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
> > spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> > }
> >
> > +static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain,
> > + dma_addr_t iova, size_t size)
> > +{
> > + rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> > + if (size > SPAGE_SIZE)
> > + rk_iommu_zap_iova(rk_domain, iova + size - SPAGE_SIZE,
> > + SPAGE_SIZE);
> > +}
> > +
> > static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> > dma_addr_t iova)
> > {
> > @@ -568,12 +577,6 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> > rk_table_flush(page_table, NUM_PT_ENTRIES);
> > rk_table_flush(dte_addr, 1);
> >
> > - /*
> > - * Zap the first iova of newly allocated page table so iommu evicts
> > - * old cached value of new dte from the iotlb.
> > - */
> > - rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> > -
> > done:
> > pt_phys = rk_dte_pt_address(dte);
> > return (u32 *)phys_to_virt(pt_phys);
> > @@ -623,6 +626,14 @@ static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
> >
> > rk_table_flush(pte_addr, pte_count);
> >
> > + /*
> > + * Zap the first and last iova to evict from iotlb any previously
> > + * mapped cachelines holding stale values for its dte and pte.
> > + * We only zap the first and last iova, since only they could have
> > + * dte or pte shared with an existing mapping.
> > + */
> > + rk_iommu_zap_iova_first_last(rk_domain, iova, size);
> > +
> > return 0;
> > unwind:
> > /* Unmap the range of iovas that we just mapped */
> >
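
To summarize for anyone skimming the archive: the essence of the patch
is only the ordering of the IOTLB zap relative to the page table
update. A simplified sketch with stand-in helpers (not the literal
driver functions):

        /* Stand-ins for the real driver operations. */
        static void write_ptes(void) { /* fill in the new PTEs */ }
        static void flush_ptes(void) { /* make the PTEs visible to the IOMMU */ }
        static void zap_iotlb(void)  { /* invalidate the first and last iova */ }

        static void map_before_patch(void)
        {
                zap_iotlb();    /* BUG: the already running hardware may walk  */
                write_ptes();   /* and re-cache the still stale table between  */
                flush_ptes();   /* the zap and the update.                     */
        }

        static void map_after_patch(void)
        {
                write_ptes();   /* update the page table first,                */
                flush_ptes();   /* make the update visible,                    */
                zap_iotlb();    /* then evict whatever the IOMMU cached, so a  */
                                /* later walk can only see the new entries.    */
        }

        int main(void)
        {
                map_before_patch();     /* racy ordering removed by the patch */
                map_after_patch();      /* ordering introduced by the patch   */
                return 0;
        }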