[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <MWHPR11MB188688A4C436B318AB0CDCB98CBD9@MWHPR11MB1886.namprd11.prod.outlook.com>
Date: Mon, 25 Jan 2021 08:16:32 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Lu Baolu <baolu.lu@...ux.intel.com>,
Joerg Roedel <joro@...tes.org>, "Will Deacon" <will@...nel.org>
CC: "Raj, Ashok" <ashok.raj@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
"Liu, Yi L" <yi.l.liu@...el.com>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 1/3] iommu/vt-d: Add rate limited information when PRQ
overflows
> From: Lu Baolu <baolu.lu@...ux.intel.com>
> Sent: Monday, January 25, 2021 2:29 PM
>
> Hi Kevin,
>
> On 2021/1/22 14:38, Tian, Kevin wrote:
> >> From: Lu Baolu <baolu.lu@...ux.intel.com>
> >> Sent: Thursday, January 21, 2021 9:45 AM
> >>
> >> So that the uses could get chances to know what happened.
> >>
> >> Suggested-by: Ashok Raj <ashok.raj@...el.com>
> >> Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
> >> ---
> >> drivers/iommu/intel/svm.c | 10 ++++++++--
> >> 1 file changed, 8 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> >> index 033b25886e57..f49fe715477b 100644
> >> --- a/drivers/iommu/intel/svm.c
> >> +++ b/drivers/iommu/intel/svm.c
> >> @@ -895,6 +895,7 @@ static irqreturn_t prq_event_thread(int irq, void
> *d)
> >> struct intel_iommu *iommu = d;
> >> struct intel_svm *svm = NULL;
> >> int head, tail, handled = 0;
> >> + struct page_req_dsc *req;
> >>
> >> /* Clear PPR bit before reading head/tail registers, to
> >> * ensure that we get a new interrupt if needed. */
> >> @@ -904,7 +905,6 @@ static irqreturn_t prq_event_thread(int irq, void
> *d)
> >> head = dmar_readq(iommu->reg + DMAR_PQH_REG) &
> >> PRQ_RING_MASK;
> >> while (head != tail) {
> >> struct vm_area_struct *vma;
> >> - struct page_req_dsc *req;
> >> struct qi_desc resp;
> >> int result;
> >> vm_fault_t ret;
> >> @@ -1042,8 +1042,14 @@ static irqreturn_t prq_event_thread(int irq,
> void
> >> *d)
> >> * Clear the page request overflow bit and wake up all threads that
> >> * are waiting for the completion of this handling.
> >> */
> >> - if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO)
> >> + if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
> >> + head = dmar_readq(iommu->reg + DMAR_PQH_REG) &
> >> PRQ_RING_MASK;
> >> + req = &iommu->prq[head / sizeof(*req)];
> >> + pr_warn_ratelimited("IOMMU: %s: Page request overflow:
> >> HEAD: %08llx %08llx",
> >> + iommu->name, ((unsigned long long
> >> *)req)[0],
> >> + ((unsigned long long *)req)[1]);
> >> writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG);
> >> + }
> >>
> >
> > Not about rate limiting but I think we may have a problem in above
> > logic. It is incorrect to always clear PRO when it's set w/o first checking
> > whether the overflow condition has been cleared. This code assumes
> > that if an overflow condition occurs it must have been cleared by earlier
> > loop when hitting this check. However since this code runs in a threaded
> > context, the overflow condition could occur even after you reset the head
> > to the tail (under some extreme condition). To be sane I think we'd better
> > read both head/tail again after seeing a pending PRO here and only clear
> > PRO when it becomes a false indicator based on latest head/tail.
> >
>
> Yes, agreed. We can check the head and tail and clear the overflow bit
> until the queue is empty. The finial code looks like:
>
> /*
> * Clear the page request overflow bit and wake up all threads that
> * are waiting for the completion of this handling.
> */
> if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
> head = dmar_readq(iommu->reg + DMAR_PQH_REG) &
> PRQ_RING_MASK;
> tail = dmar_readq(iommu->reg + DMAR_PQT_REG) &
> PRQ_RING_MASK;
> if (head == tail) {
> req = &iommu->prq[head / sizeof(*req)];
> pr_warn_ratelimited("IOMMU: %s: Page request
> overflow cleared: HEAD: %08llx %08llx",
> iommu->name, ((unsigned
> long long *)req)[0],
> ((unsigned long long
> *)req)[1]);
> writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG);
> }
> }
>
> Thought?
>
Just a small comment. Is it useful to also print a warning in the true
overflow condition which has to wait for next interrupt to be cleared?
Thanks
Kevin
Powered by blists - more mailing lists