Message-Id: <36C91A76-8C6D-47F0-8229-099779634A53@gridcentric.ca>
Date: Thu, 1 Aug 2013 09:30:08 -0400
From: Andres Lagar-Cavilla <andreslc@...dcentric.ca>
To: David Vrabel <david.vrabel@...rix.com>
Cc: Andres Lagar-Cavilla <andreslc@...dcentric.ca>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Andres Lagar-Cavilla <andres@...arcavilla.com>,
<boris.ostrovsky@...cle.com>
Subject: Re: [PATCH] Xen: Fix retry calls into PRIVCMD_MMAPBATCH*.
On Aug 1, 2013, at 8:04 AM, David Vrabel <david.vrabel@...rix.com> wrote:
> On 01/08/13 12:49, Andres Lagar-Cavilla wrote:
>> On Aug 1, 2013, at 7:23 AM, David Vrabel <david.vrabel@...rix.com> wrote:
>>
>>> On 01/08/13 04:30, Andres Lagar-Cavilla wrote:
>>>> -- Resend as I haven't seen this hit the lists. Maybe some smtp misconfig. Apologies. Also expanded cc --
>>>>
>>>> When a foreign mapper attempts to map guest frames that are paged out,
>>>> the mapper receives an ENOENT response and will have to try again
>>>> while a helper process pages the target frame back in.
>>>>
>>>> Gating checks on PRIVCMD_MMAPBATCH* ioctl args were preventing retries
>>>> of mapping calls.
>>>
>>> This breaks the auto_translated_physmap case as it will allocate another
>>> set of empty pages and leak the previous set.
>>
>> David,
>> I am not able to follow you here. Under what circumstances will another
>> set of empty pages be allocated? And where? Are we talking about page table pages?
>
> ....
>     vma = find_vma(mm, m.addr);
>     if (!vma ||
>         vma->vm_ops != &privcmd_vm_ops ||
>         (m.addr != vma->vm_start) ||
>         ((m.addr + (nr_pages << PAGE_SHIFT)) != vma->vm_end) ||
>         !privcmd_enforce_singleshot_mapping(vma)) {
>         up_write(&mm->mmap_sem);
>         ret = -EINVAL;
>         goto out;
>     }
>     if (xen_feature(XENFEAT_auto_translated_physmap)) {
>         ret = alloc_empty_pages(vma, m.num);
>
> Here.
Right, right, right. Excellent observation, thanks. I forward-ported this from 3.4 and it slipped through the cracks. OK, V2 coming.
>
>         if (ret < 0) {
>             up_write(&mm->mmap_sem);
>             goto out;
>         }
>     }
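
To make the leak David points out concrete, here is a rough userspace model. It is not the V2 patch and not kernel code: struct model_vma, ensure_backing_pages() and release_backing_pages() are hypothetical names invented for illustration, and rejecting a retry whose size differs from the first call is an assumption of this sketch. The only idea it shows is that a retried call should reuse the backing pages recorded by the first call instead of allocating a second set.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-VMA state; not the kernel's struct. */
struct model_vma {
    void **pages;       /* backing array from an earlier call, or NULL */
    int    nr_pages;    /* number of entries in pages */
    int    allocations; /* how many times we allocated, for illustration */
};

/*
 * Allocate the backing array only on the first call for this VMA.
 * A retry with the same size reuses the existing array, so nothing
 * is allocated twice and nothing is leaked.
 * Returns 0 on success or a negative errno value on failure.
 */
int ensure_backing_pages(struct model_vma *vma, int num)
{
    if (num <= 0)
        return -EINVAL;

    if (vma->pages != NULL) {
        /* Retry path: only accept the same size (assumption). */
        return vma->nr_pages == num ? 0 : -EINVAL;
    }

    vma->pages = calloc((size_t)num, sizeof(*vma->pages));
    if (vma->pages == NULL)
        return -ENOMEM;

    vma->nr_pages = num;
    vma->allocations++;
    return 0;
}

/* Release the backing array, as would happen when the VMA goes away. */
void release_backing_pages(struct model_vma *vma)
{
    free(vma->pages);
    vma->pages = NULL;
    vma->nr_pages = 0;
}
```

Calling ensure_backing_pages() twice with the same size on a zero-initialized struct model_vma leaves the allocations counter at 1, which is the behaviour a retried mapping call needs.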
>
>
>>> This privcmd_enforce_singleshot_mapping() stuff seems very odd anyway.
>>> Does anyone know what it was for originally? It would be preferable if
>>> we could update the mappings with a new set of foreign MFNs without
>>> having to tear down the VMA and recreate a new VMA.
>>
>> I believe it's mostly historical. I agree with you in principle, but recreating VMAs is super-cheap.
>
> Tearing them down is not cheap as each page requires a trap-and-emulate
> to clear the PTE (see ptep_get_and_clear_full() in zap_pte_range()).
You need to tell the hypervisor to drop the ref on the mapped page. So you'd need a hypercall (arguably a multicall) to do that, which is not free. Then you'd need privcmd and libxc to agree on reusing the vma, which has very low value in itself: it is just a piece of metadata. And you still need to deal with cleaning up the mapped refs when the mapping process crashes.
So that is a whole lot of new complexity for small value, imho.
Probably that's the whole point of the singleshot: don't forget you have something mapped in there. Because if you do, you might leak the ref forever.
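
For what it's worth, the "map once, then refuse" idea can be modelled in a few lines of userspace C11. This is only a sketch of the concept, not the kernel's privcmd_enforce_singleshot_mapping(): struct oneshot_vma, claim_singleshot() and the sentinel variable are hypothetical names, and the use of a compare-and-exchange on a private-data pointer is an assumption made for the model.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA with one private-data slot. */
struct oneshot_vma {
    _Atomic(void *) private_data; /* NULL until the VMA is claimed */
};

/* Its address serves as the "already mapped" sentinel value. */
static int claimed_marker;

/*
 * Try to claim the VMA for a mapping. Exactly one caller can move
 * private_data from NULL to the sentinel; every later attempt sees
 * a non-NULL value and is refused, so whatever was mapped the first
 * time cannot be silently overwritten and forgotten.
 */
bool claim_singleshot(struct oneshot_vma *vma)
{
    void *expected = NULL;

    return atomic_compare_exchange_strong(&vma->private_data,
                                          &expected,
                                          (void *)&claimed_marker);
}
```

The first claim_singleshot() call on a freshly initialized struct oneshot_vma succeeds and any later call fails, which mirrors why a plain retry of the ioctl on the same vma is rejected.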
Andres
>
> David