lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 27 Jul 2018 09:47:23 +0530
From:   Ravi Bangoria <ravi.bangoria@...ux.ibm.com>
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     srikar@...ux.vnet.ibm.com, rostedt@...dmis.org,
        mhiramat@...nel.org, peterz@...radead.org, mingo@...hat.com,
        acme@...nel.org, alexander.shishkin@...ux.intel.com,
        jolsa@...hat.com, namhyung@...nel.org,
        linux-kernel@...r.kernel.org, ananth@...ux.vnet.ibm.com,
        alexis.berlemont@...il.com, naveen.n.rao@...ux.vnet.ibm.com,
        linux-arm-kernel@...ts.infradead.org, linux-mips@...ux-mips.org,
        linux@...linux.org.uk, ralf@...ux-mips.org, paul.burton@...s.com,
        Ravi Bangoria <ravi.bangoria@...ux.ibm.com>
Subject: Re: [PATCH v6 5/6] Uprobes/sdt: Prevent multiple reference counter
 for same uprobe

Hi Oleg,

On 07/25/2018 04:38 PM, Oleg Nesterov wrote:
> No, I can't understand this patch...
> 
> On 07/16, Ravi Bangoria wrote:
>>
>> --- a/kernel/events/uprobes.c
>> +++ b/kernel/events/uprobes.c
>> @@ -63,6 +63,8 @@ static struct percpu_rw_semaphore dup_mmap_sem;
>>
>>  /* Have a copy of original instruction */
>>  #define UPROBE_COPY_INSN	0
>> +/* Reference counter offset is reloaded with non-zero value. */
>> +#define REF_CTR_OFF_RELOADED	1
>>
>>  struct uprobe {
>>  	struct rb_node		rb_node;	/* node in the rb tree */
>> @@ -476,9 +478,23 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>>  		return ret;
>>
>>  	ret = verify_opcode(old_page, vaddr, &opcode);
>> -	if (ret <= 0)
>> +	if (ret < 0)
>>  		goto put_old;
> 
> I agree, "ret <= 0" wasn't nice even before this change, but "ret < 0" looks
> worse because this is simply not possible.


Ok. Any better idea?
I think if we don't track all mms patched by uprobe, we have to rely
on current instruction.


> 
>> +	/*
>> +	 * If instruction is already patched but reference counter offset
>> +	 * has been reloaded to non-zero value, increment the reference
>> +	 * counter and return.
>> +	 */
>> +	if (ret == 0) {
>> +		if (is_register &&
>> +		    test_bit(REF_CTR_OFF_RELOADED, &uprobe->flags)) {
>> +			WARN_ON(!uprobe->ref_ctr_offset);
>> +			ret = update_ref_ctr(uprobe, mm, true);
>> +		}
>> +		goto put_old;
>> +	}
> 
> So we need to force update_ref_ctr(true) in case when uprobe_register_refctr()
> detects the already registered uprobe with ref_ctr_offset == 0, and then it calls
> register_for_each_vma().
> 
> Why this can't race with uprobe_mmap() ?
> 
> uprobe_mmap() can do install_breakpoint() right after REF_CTR_OFF_RELOADED was set,
> then register_for_each_vma() will find this vma and do install_breakpoint() too.
> If ref_ctr_vma was already mmaped, the counter will be incremented twice, no?


Hmm right. Probably, I can fix this race by using some lock, but I don't
know if it's good to do that inside uprobe_mmap().


> 
>> @@ -971,6 +1011,7 @@ register_for_each_vma(struct uprobe *uprobe, struct uprobe_consumer *new)
>>  	bool is_register = !!new;
>>  	struct map_info *info;
>>  	int err = 0;
>> +	bool installed = false;
>>
>>  	percpu_down_write(&dup_mmap_sem);
>>  	info = build_map_info(uprobe->inode->i_mapping,
>> @@ -1000,8 +1041,10 @@ register_for_each_vma(struct uprobe *uprobe, struct uprobe_consumer *new)
>>  		if (is_register) {
>>  			/* consult only the "caller", new consumer. */
>>  			if (consumer_filter(new,
>> -					UPROBE_FILTER_REGISTER, mm))
>> +					UPROBE_FILTER_REGISTER, mm)) {
>>  				err = install_breakpoint(uprobe, mm, vma, info->vaddr);
>> +				installed = true;
>> +			}
>>  		} else if (test_bit(MMF_HAS_UPROBES, &mm->flags)) {
>>  			if (!filter_chain(uprobe,
>>  					UPROBE_FILTER_UNREGISTER, mm))
>> @@ -1016,6 +1059,8 @@ register_for_each_vma(struct uprobe *uprobe, struct uprobe_consumer *new)
>>  	}
>>   out:
>>  	percpu_up_write(&dup_mmap_sem);
>> +	if (installed)
>> +		clear_bit(REF_CTR_OFF_RELOADED, &uprobe->flags);
> 
> I simply can't understand this "bool installed"....


That boolean is needed because consumer_filter() returns false when this
function gets called first time from uprobe_register(). And consumer_filter
returns true when it gets called by uprobe_apply(). If I make it
unconditional, there is no effect of setting REF_CTR_OFF_RELOADED bit. So
this boolean is needed.

Though, there is one more issue I found. Let's say there are two processes
running and user probes on both of them using uprobe_register() using, let's
say systemtap. Now, some other guy does uprobe_register_refctr() using
'perf -p PID' on same uprobe but he is interested in only one process. Here,
we will increment the reference counter only in the "PID" process and we will
clear REF_CTR_OFF_RELOADED flag. Later, some other guy does 'perf -a' which
will call uprobe_register_refctr() for both the target but we will fail to
increment the counter in "non-PID" process because we had already clear
REF_CTR_OFF_RELOADED.

I have a solution for this. Idea is, if reference counter is reloaded, save
of all mms for which consumer_filter() denied to updated when being called
from register_for_each_vma(). Use this list of mms as checklist next time
onwards. I don't know if it's good to do that or not.


> 
> shouldn't we clear REF_CTR_OFF_RELOADED unconditionally after register_for_each_vma()?
> 


No, because uprobe_register() is simply NOP and breakpoint is actually
installed by uprobe_apply().


> 
> 
> Also. Suppose we have a registered uprobe with ref_ctr_offset == 0. Then you add and
> remove uprobe with ref_ctr_offset != 0. But afaics uprobe->ref_ctr_offset is never
> cleared, so another uprobe with a different ref_ctr_offset != 0 will hit pr_warn/-EINVAL
> in alloc_uprobe() and find_old_trace_uprobe() added by the previous patch can't detect
> this case?


This is a valid concern. So, this point is forcing me to make it a consumer
property. But if I do that, all optimization done by uprobe_perf_open() and
uprobe_perf_close() needs to be reverted, which I don't want to.


> 
> Plus it seems that we can have the unbalanced update_ref_ctr(false), at least in case
> when __uprobe_register() with REF_CTR_OFF_RELOADED set fails before it patches all mm's.
> If/when the 1st uprobe with ref_ctr_offset == 0 goes away, remove_breakpoint() will dec
> the counter even if wasn't incremented.


Hmm incase of failure, this could be possible.


> 
> Quite possibly I am totally confused, but this patch wrong in many ways...

No, you are right.

Please let me know if you have any better approach.

Thanks for the review :)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ