linux-kernel - Re: [PATCH v3 2.6.39-rc1-tip 12/26] 12: uprobes: slot allocation for uprobes

Message-ID: <20110419062654.GB10698@linux.vnet.ibm.com>
Date:	Tue, 19 Apr 2011 11:56:54 +0530
From:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>,
	James Morris <jmorris@...ei.org>
Cc:	Ingo Molnar <mingo@...e.hu>, Steven Rostedt <rostedt@...dmis.org>,
	Linux-mm <linux-mm@...ck.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jonathan Corbet <corbet@....net>,
	Christoph Hellwig <hch@...radead.org>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	SystemTap <systemtap@...rces.redhat.com>,
	Jim Keniston <jkenisto@...ux.vnet.ibm.com>,
	Roland McGrath <roland@...k.frob.com>,
	Andi Kleen <andi@...stfloor.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 2.6.39-rc1-tip 12/26] 12: uprobes: slot allocation
 for uprobes

* Peter Zijlstra <peterz@...radead.org> [2011-04-18 18:46:11]:

> On Fri, 2011-04-01 at 20:04 +0530, Srikar Dronamraju wrote:
> > Every task is allocated a fixed slot. When a probe is hit, the original
> > instruction corresponding to the probe hit is copied to per-task fixed
> > slot. Currently we allocate one page of slots for each mm. Bitmaps are
> > used to know which slots are free. Each slot is 128 bytes so
> > that it's cache aligned.
> > 
> > TODO: On massively threaded processes (or if a huge number of processes
> > share the same mm), there is a possibility of running out of slots.
> > One alternative could be to extend the slots as and when slots are required.
> 
> As long as you're single stepping things and not using boosted probes
> you can fully serialize the slot usage. Claim a slot on trap and release
> the slot on finish. Claiming can wait on a free slot since you already
> have the whole SLEEPY thing.
> 

Yes, that's certainly one approach, but it makes every breakpoint hit
contend for a spinlock. (In fact, we would have to change it to a mutex,
as you rightly pointed out, so that threads can wait when no slot is
free.) Assuming a 4K page, we would be taxing applications that have
fewer than 32 threads (which is probably the default case). If we
continue with the current approach, we would only need to add
additional page(s) for apps that have more than 32 threads, and only
when more than 32 __live__ threads have actually hit a breakpoint.
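
To make the numbers above concrete: a 4K page divided into 128-byte slots
gives 4096 / 128 = 32 slots, tracked by a bitmap. What follows is only a
minimal user-space sketch of that bookkeeping, not the code from the
patch. The struct and function names (xol_area_sketch, xol_claim_slot,
xol_release_slot, xol_slot_offset) are hypothetical, and the sketch does
no locking at all; in the kernel the bitmap would need the spinlock or
mutex discussed above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define XOL_PAGE_SIZE  4096u   /* assumed 4K page, as in the mail */
#define XOL_SLOT_BYTES 128u    /* slot size from the patch description */
#define XOL_NSLOTS     (XOL_PAGE_SIZE / XOL_SLOT_BYTES)   /* 32 slots */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define XOL_NWORDS     ((XOL_NSLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the per-mm area; not the patch's struct. */
struct xol_area_sketch {
	unsigned long bitmap[XOL_NWORDS];   /* bit set = slot in use */
};

/* Claim the first free slot; return its index, or -1 if the page is full. */
static int xol_claim_slot(struct xol_area_sketch *area)
{
	for (unsigned int slot = 0; slot < XOL_NSLOTS; slot++) {
		unsigned long mask = 1UL << (slot % BITS_PER_WORD);
		unsigned long *word = &area->bitmap[slot / BITS_PER_WORD];

		if (!(*word & mask)) {
			*word |= mask;
			return (int)slot;
		}
	}
	return -1;
}

/* Mark a previously claimed slot as free again. */
static void xol_release_slot(struct xol_area_sketch *area, int slot)
{
	assert(slot >= 0 && (unsigned int)slot < XOL_NSLOTS);
	area->bitmap[slot / BITS_PER_WORD] &= ~(1UL << (slot % BITS_PER_WORD));
}

/* Byte offset of a slot from the start of the page. */
static size_t xol_slot_offset(int slot)
{
	return (size_t)slot * XOL_SLOT_BYTES;
}
```

With a fixed per-task slot, xol_claim_slot() would run once per thread, on
its first breakpoint hit. In the fully serialized scheme it would run on
every trap, and a return of -1 would mean waiting for a release rather
than adding a page.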

> 
> > +static int xol_add_vma(struct uprobes_xol_area *area)
> > +{
> > +	struct vm_area_struct *vma;
> > +	struct mm_struct *mm;
> > +	struct file *file;
> > +	unsigned long addr;
> > +	int ret = -ENOMEM;
> > +
> > +	mm = get_task_mm(current);
> > +	if (!mm)
> > +		return -ESRCH;
> > +
> > +	down_write(&mm->mmap_sem);
> > +	if (mm->uprobes_xol_area) {
> > +		ret = -EALREADY;
> > +		goto fail;
> > +	}
> > +
> > +	/*
> > +	 * Find the end of the top mapping and skip a page.
> > +	 * If there is no space for PAGE_SIZE above
> > +	 * that, mmap will ignore our address hint.
> > +	 *
> > +	 * We allocate a "fake" unlinked shmem file because
> > +	 * anonymous memory might not be granted execute
> > +	 * permission when the selinux security hooks have
> > +	 * their way.
> > +	 */
> 
> That just annoys me, so we're working around some stupid sekurity crap,
> executable anonymous maps are perfectly fine, also what do JITs do?

Yes, we are working around the selinux security hooks, but do we have a
choice?

James, can you please comment on this?

> 
> > +	vma = rb_entry(rb_last(&mm->mm_rb), struct vm_area_struct, vm_rb);
> > +	addr = vma->vm_end + PAGE_SIZE;
> > +	file = shmem_file_setup("uprobes/xol", PAGE_SIZE, VM_NORESERVE);
> > +	if (!file) {
> > +		printk(KERN_ERR "uprobes_xol failed to setup shmem_file "
> > +			"while allocating vma for pid/tgid %d/%d for "
> > +			"single-stepping out of line.\n",
> > +			current->pid, current->tgid);
> > +		goto fail;
> > +	}
> > +	addr = do_mmap_pgoff(file, addr, PAGE_SIZE, PROT_EXEC, MAP_PRIVATE, 0);
> > +	fput(file);
> > +