[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <01bcbbe2-5560-ea42-4d75-6ab50c3060d4@linux.intel.com>
Date: Fri, 9 Sep 2016 16:19:15 +0800
From: Xiao Guangrong <guangrong.xiao@...ux.intel.com>
To: Dave Hansen <dave.hansen@...el.com>, pbonzini@...hat.com,
akpm@...ux-foundation.org, mhocko@...e.com,
dan.j.williams@...el.com
Cc: gleb@...nel.org, mtosatti@...hat.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, stefanha@...hat.com,
yuhuang@...hat.com, linux-mm@...ck.org,
ross.zwisler@...ux.intel.com
Subject: Re: [PATCH] Fix region lost in /proc/self/smaps
On 09/08/2016 10:05 PM, Dave Hansen wrote:
> On 09/07/2016 08:36 PM, Xiao Guangrong wrote:>> The user will see two
> VMAs in their output:
>>>
>>> A: 0x1000->0x2000
>>> C: 0x1000->0x3000
>>>
>>> Will it confuse them to see the same virtual address range twice? Or is
>>> there something preventing that happening that I'm missing?
>>>
>>
>> You are right. Nothing can prevent it.
>>
>> However, it is not easy to handle the case that the new VMA overlays
>> with the old VMA
>> already got by userspace. I think we have some choices:
>> 1: One way is completely skipping the new VMA region as current kernel
>> code does but i
>> do not think this is good as the later VMAs will be dropped.
>>
>> 2: show the un-overlayed portion of new VMA. In your case, we just show
>> the region
>> (0x2000 -> 0x3000), however, it can not work well if the VMA is a new
>> created
>> region with different attributions.
>>
>> 3: completely show the new VMA as this patch does.
>>
>> Which one do you prefer?
>
> I'd be willing to bet that #3 will break *somebody's* tooling.
> Addresses going backwards is certainly screwy. Imagine somebody using
> smaps to search for address holes and doing hole_size=0x1000-0x2000.
>
> #1 can lies about there being no mapping in place where there there may
> have _always_ been a mapping and is very similar to the bug you were
> originally fixing. I think that throws it out.
>
> #2 is our best bet, I think. It's unfortunately also the most code.
> It's also a bit of a fib because it'll show a mapping that never
> actually existed, but I think this is OK. I'm not sure what the
> downside is that you're referring to, though. Can you explain?
Yes. I was talking the case as follows:
1: read() #1: prints vma-A(0x1000 -> 0x2000)
2: unmap vma-A(0x1000 -> 0x2000)
3: create vma-B(0x80 -> 0x3000) on other file with different permission
(w, r, x)
4: read #2: prints vma-B(0x2000 -> 0x3000)
Then userspace will get just a portion of vma-B. well, maybe it is not too bad. :)
How about this changes:
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 187d84e..10ca648 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -147,7 +147,7 @@ m_next_vma(struct proc_maps_private *priv, struct vm_area_struct *vma)
static void m_cache_vma(struct seq_file *m, struct vm_area_struct *vma)
{
if (m->count < m->size) /* vma is copied successfully */
- m->version = m_next_vma(m->private, vma) ? vma->vm_start : -1UL;
+ m->version = m_next_vma(m->private, vma) ? vma->vm_end : -1UL;
}
static void *m_start(struct seq_file *m, loff_t *ppos)
@@ -176,14 +176,14 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
if (last_addr) {
vma = find_vma(mm, last_addr);
- if (vma && (vma = m_next_vma(priv, vma)))
+ if (vma)
return vma;
}
m->version = 0;
if (pos < mm->map_count) {
for (vma = mm->mmap; pos; pos--) {
- m->version = vma->vm_start;
+ m->version = vma->vm_end;
vma = vma->vm_next;
}
return vma;
@@ -293,7 +293,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
vm_flags_t flags = vma->vm_flags;
unsigned long ino = 0;
unsigned long long pgoff = 0;
- unsigned long start, end;
+ unsigned long end, start = m->version;
dev_t dev = 0;
const char *name = NULL;
@@ -304,8 +304,13 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
}
+ /*
+ * the region [0, m->version) has already been handled, do not
+ * handle it doubly.
+ */
+ start = max(vma->vm_start, start);
+
/* We don't show the stack guard page in /proc/maps */
- start = vma->vm_start;
if (stack_guard_page_start(vma, start))
start += PAGE_SIZE;
end = vma->vm_end;
Powered by blists - more mailing lists