linux-kernel - Re: Memory barrier needed with wake_up

Message-ID: <87oa40koh3.fsf@linux.intel.com>
Date:   Wed, 07 Sep 2016 13:12:40 +0300
From:   Felipe Balbi <felipe.balbi@...ux.intel.com>
To:     Alan Stern <stern@...land.harvard.edu>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Ingo Molnar <mingo@...hat.com>,
        USB list <linux-usb@...r.kernel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>,
        Will Deacon <will.deacon@....com>
Subject: Re: Memory barrier needed with wake_up_process()?


Hi,

Alan Stern <stern@...land.harvard.edu> writes:
> On Tue, 6 Sep 2016, Peter Zijlstra wrote:
>
>> On Tue, Sep 06, 2016 at 01:49:37PM +0200, Peter Zijlstra wrote:
>> > On Tue, Sep 06, 2016 at 02:43:39PM +0300, Felipe Balbi wrote:
>> 
>> > > My fear now, however, is that changing smp_[rw]mb() to smp_mb() just
>> > > adds extra overhead which makes the problem much, much less likely to
>> > > happen. Does that sound plausible to you?
>> > 
>> > I did consider that, but I've not sufficiently grokked the code to rule
>> > out actual fail. So let me stare at this a bit more.
>> 
>> OK, so I'm really not seeing it, we've got:
>> 
>> while (bh->state != FULL) {
>>         for (;;) {
>>                 set_current_state(INTERRUPTIBLE); /* MB after */
>>                 if (signal_pending(current))
>>                         return -EINTR;
>>                 if (common->thread_wakeup_needed)
>>                         break;
>>                 schedule(); /* MB */
>>         }
>>         __set_current_state(RUNNING);
>>         common->thread_wakeup_needed = 0;
>>         smp_rmb(); /* NOP */
>> }
>> 
>> 
>> VS.
>> 
>> 
>> spin_lock(&common->lock); /* MB */
>> bh->state = FULL;
>> smp_wmb(); /* NOP */
>> common->thread_wakeup_needed = 1;
>> wake_up_process(common->thread_task); /* MB before */
>> spin_unlock(&common->lock);
>> 
>> 
>> 
>> (the MB annotations specific to x86, not true in general)
>> 
>> 
>> If we observe thread_wakeup_needed, we must also observe bh->state.
>> 
>> And the sleep/wakeup ordering is also correct, we either see
>> thread_wakeup_needed and continue, or we see task->state == RUNNING
>> (from the wakeup) and NO-OP schedule(). The MB from set_current_state()
>> then matches with the MB from wake_up_process() to ensure we must see
>> thread_wakeup_needed.
>> 
>> Or, we go to sleep and get woken up, at which point the same happens.
>> Since the waking CPU gets the task back on its RQ, the happens-before
>> chain includes the waking CPU's state along with the state of the task
>> itself before it went to sleep.
>> 
>> At which point we're back where we started, once we see
>> thread_wakeup_needed we must then also see bh->state (and all state
>> prior to that on the waking CPU).
>> 
>> 
>> 
>> There's enough cruft in the while-sleep loop to force reload bh->state.
>> 
>> Load/store tearing cannot be a problem because all values are single
>> bytes (the variables are multi bytes, but all values used only affect
>> the LSB).
>> 
>> Colour me puzzled.
>
> Felipe, can you please try this patch on an unmodified tree?  If the 
> problem still occurs, what shows up in the kernel log?
>
> Alan Stern
>
>
>
> Index: usb-4.x/drivers/usb/gadget/function/f_mass_storage.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/function/f_mass_storage.c
> +++ usb-4.x/drivers/usb/gadget/function/f_mass_storage.c
> @@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb
>  	spin_lock(&common->lock);
>  	bh->outreq_busy = 0;
>  	bh->state = BUF_STATE_FULL;
> +	if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN)
> +		INFO(common, "compl: bh %p state %d\n", bh, bh->state);
>  	wakeup_thread(common);
>  	spin_unlock(&common->lock);
>  }
> @@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_c
>  		rc = sleep_thread(common, true);
>  		if (rc)
>  			return rc;
> +		INFO(common, "next: bh %p state %d\n", bh, bh->state);
>  	}
>  	smp_rmb();
>  	rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO;
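
For reference, the visibility claim in the quoted analysis ("if we observe
thread_wakeup_needed, we must also observe bh->state") can be modelled in
user space. The following is only a sketch under stated assumptions: it uses
C11 atomics and pthreads rather than kernel primitives, a release store
stands in for smp_wmb() plus the flag write, an acquire load stands in for
the flag read, and the waiter spins instead of sleeping, so the
sleep/wakeup half of the argument is not modelled at all. The names
run_handoff(), completer() and wait_for_buffer() are made up for this
illustration and do not exist in f_mass_storage.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Illustrative values only; they mirror the naming used in the driver. */
enum { BUF_STATE_EMPTY = 0, BUF_STATE_FULL = 1 };

static int bh_state;                    /* plain data, like bh->state */
static atomic_int thread_wakeup_needed; /* flag, like common->thread_wakeup_needed */

/* Completion side: publish the data first, then the flag. */
static void *completer(void *arg)
{
	(void)arg;
	bh_state = BUF_STATE_FULL;
	/* Release store: models smp_wmb() followed by the flag write. */
	atomic_store_explicit(&thread_wakeup_needed, 1, memory_order_release);
	return NULL;
}

/* Waiting side: spin (instead of sleeping) until the flag is observed. */
static int wait_for_buffer(void)
{
	/* Acquire load: once the flag is seen, bh_state must be visible too. */
	while (!atomic_load_explicit(&thread_wakeup_needed, memory_order_acquire))
		;
	atomic_store_explicit(&thread_wakeup_needed, 0, memory_order_relaxed);
	return bh_state;
}

/* Run one handoff and return the buffer state the waiter observed. */
int run_handoff(void)
{
	pthread_t t;
	int rc, seen;

	bh_state = BUF_STATE_EMPTY;
	atomic_store(&thread_wakeup_needed, 0);
	rc = pthread_create(&t, NULL, completer, NULL);
	assert(rc == 0);
	seen = wait_for_buffer();
	pthread_join(t, NULL);
	return seen;
}
```

With this pairing, run_handoff() should always return BUF_STATE_FULL. It
says nothing about whether the kernel code has the same guarantee on every
architecture; it only restates the message-passing pattern being discussed.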

I've replaced INFO() with trace_printk() (which is what I have been using
anyway):

diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
index 2505117e88e8..dbc6a380b38b 100644
--- a/drivers/usb/gadget/function/f_mass_storage.c
+++ b/drivers/usb/gadget/function/f_mass_storage.c
@@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb_ep *ep, struct usb_request *req)
 	spin_lock(&common->lock);
 	bh->outreq_busy = 0;
 	bh->state = BUF_STATE_FULL;
+	if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN)
+		trace_printk("compl: bh %p state %d\n", bh, bh->state);
 	wakeup_thread(common);
 	spin_unlock(&common->lock);
 }
@@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_common *common)
 		rc = sleep_thread(common, true);
 		if (rc)
 			return rc;
+		trace_printk("next: bh %p state %d\n", bh, bh->state);
 	}
 	smp_rmb();
 	rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO;

But I can't reproduce it as reliably as before. I'll keep the thing running
in an infinite loop, which will stop only when the interrupt count for the
UDC (dwc3 in this case) stops increasing.
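
A stall check of that kind could be sketched roughly as below. This is not
the script actually in use here, just an illustration: it assumes the usual
/proc/interrupts layout (an "NN:" prefix, then one counter per CPU, then
the controller and device names), and the helpers sum_irq_line() and
irq_total() are hypothetical names invented for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line, e.g.
 *   " 42:   100   23   PCI-MSI 327680-edge   dwc3"
 * Returns 0 if the line has no "NN:" prefix.
 */
unsigned long sum_irq_line(const char *line)
{
	const char *p = strchr(line, ':');
	unsigned long total = 0;

	if (!p)
		return 0;
	p++;
	for (;;) {
		char *end;
		unsigned long v;

		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;	/* reached the controller/device name columns */
		v = strtoul(p, &end, 10);
		/* A counter must be followed by whitespace or end of line. */
		if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
			break;
		total += v;
		p = end;
	}
	return total;
}

/*
 * Find the first line in `path` that mentions `name` and store its summed
 * count in *out. Returns 0 on success, -1 if the file cannot be opened or
 * no line matches.
 */
int irq_total(const char *path, const char *name, unsigned long *out)
{
	char buf[4096];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(buf, sizeof(buf), f)) {
		if (strstr(buf, name)) {
			*out = sum_irq_line(buf);
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

A test loop would call irq_total("/proc/interrupts", "dwc3", &now) between
iterations and stop once the value no longer differs from the previous
sample.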

-- 
balbi
