Date:	Thu, 02 Oct 2014 09:05:42 -0400
From:	Peter Hurley <peter@...leysoftware.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Fengguang Wu <fengguang.wu@...el.com>,
	Jet Chen <jet.chen@...el.com>, Su Tao <tao.su@...el.com>,
	Yuanhan Liu <yuanhan.liu@...el.com>, LKP <lkp@...org>,
	linux-kernel@...r.kernel.org, Marcel Holtmann <marcel@...tmann.org>
Subject: Re: [rfcomm_run] WARNING: CPU: 1 PID: 79 at kernel/sched/core.c:7156
 __might_sleep()

On 10/02/2014 08:54 AM, Peter Zijlstra wrote:
> On Thu, Oct 02, 2014 at 08:38:46AM -0400, Peter Hurley wrote:
>> On 10/02/2014 08:31 AM, Peter Zijlstra wrote:
>>> On Thu, Oct 02, 2014 at 01:09:27PM +0200, Peter Zijlstra wrote:
>>>> On Tue, Sep 30, 2014 at 04:02:28PM +0800, Fengguang Wu wrote:
>>>>> Hi Peter,
>>>>>
>>>>> We may have found an rfcomm bug (maintainers CCed) exposed by your debug patch:
>>>>>
>>>>> [    1.861895] NET: Registered protocol family 5
>>>>> [    1.862978] Bluetooth: RFCOMM TTY layer initialized
>>>>> [    1.863099] ------------[ cut here ]------------
>>>>> [    1.863105] WARNING: CPU: 1 PID: 79 at kernel/sched/core.c:7156 __might_sleep+0x17d/0x1a1()
>>>>> [    1.863112] do not call blocking ops when !TASK_RUNNING; state=1 set at [<c14dc381>] rfcomm_run+0xdf/0x130e
>>>>> [    1.863591]  [<c1058b73>] ? kthread_stop+0x53/0x53
>>>>> [    1.864906]  [<c155a411>] dump_stack+0x48/0x60
>>>>> [    1.866298]  [<c14dc381>] ? rfcomm_run+0xdf/0x130e
>>>>
>>>> Ha yes, rfcomm_run is a complete buggy mess indeed. Lemme go see what I
>>>> can make of it.
>>>
>>> ---
>>> Subject: rfcomm: Fix broken wait construct
>>>
>>> rfcomm_run() is a tad broken in that it has a nested wait loop. One
>>> cannot rely on p->state for the outer wait because the inner wait will
>>> overwrite it.
>>>
>>> While at it, rename rfcomm_schedule to rfcomm_wake, since that is what
>>> it actually does.
>>
>> rfcomm_schedule() as in schedule_work(), which is how it's used.
> 
> Not really, all it does is wake the rfcomm_thread. The thread then does
> a linear walk of all known sessions looking for work -- which is clearly
> suboptimal as well, but I didn't feel like fixing that.
> 
> Also, the current implementation already disagrees with you, all it
> basically does is call wake_up_process(), which is a big clue right
> there.

You're thinking of it from the point of view of the scheduler, so to you
it should be named what it does.

However, from the users' point of view, it's an abstraction of work
dispatching; the fact that a kthread (which needs waking) does the work
is irrelevant.

Consider what happens if the kthread is later converted to work_structs:
your now-renamed rfcomm_wake() would end up calling schedule_work().
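
To make that concrete, here is a rough userspace sketch, not kernel code.
The backend_* names, the counters and session_event() are made up for
illustration; the two backends only stand in for wake_up_process() and
schedule_work(). It assumes nothing about how rfcomm is actually
structured beyond what is discussed above.

```c
#include <assert.h>

/* Userspace model only; the backend_* names are hypothetical. */
typedef void (*dispatch_fn)(void);

int thread_wakeups;     /* times the thread-style backend ran */
int work_items_queued;  /* times the workqueue-style backend ran */

/* Stands in for wake_up_process() on the rfcomm kthread. */
void backend_wake_thread(void) { thread_wakeups++; }

/* Stands in for schedule_work() on a work_struct. */
void backend_queue_work(void) { work_items_queued++; }

/* Implementation detail, invisible to callers. */
dispatch_fn rfcomm_backend = backend_wake_thread;

/* The name callers see: "there is work, get it done somehow". */
void rfcomm_schedule(void) { rfcomm_backend(); }

/* A caller: it neither knows nor cares which backend is active. */
void session_event(void) { rfcomm_schedule(); }

/* Swap the backend; session_event() stays unchanged. */
void demo(void)
{
	session_event();
	assert(thread_wakeups == 1 && work_items_queued == 0);

	rfcomm_backend = backend_queue_work;
	session_event();
	assert(thread_wakeups == 1 && work_items_queued == 1);
}
```

The caller is identical before and after the backend swap, which is the
sense in which the "schedule" name describes the interface rather than
the current mechanism.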

Regards,
Peter Hurley