linux-kernel - Re: [DRIVER SUBMISSION] DRBD wants to go mainline

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070723211947.GD6477@mail.linbit.com>
Date:	Mon, 23 Jul 2007 23:19:47 +0200
From:	Lars Ellenberg <lars.ellenberg@...bit.com>
To:	Satyam Sharma <satyam.sharma@...il.com>
Cc:	Kyle Moffett <mrlinuxman@....com>, Jens Axboe <axboe@...nel.dk>,
	Andrew Morton <akpm@...l.org>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [DRIVER SUBMISSION] DRBD wants to go mainline

On Mon, Jul 23, 2007 at 07:10:58PM +0530, Satyam Sharma wrote:
> On 7/23/07, Lars Ellenberg <lars.ellenberg@...bit.com> wrote:
> >On Sun, Jul 22, 2007 at 09:32:02PM -0400, Kyle Moffett wrote:
> >[...]
> >> Don't use signals between kernel threads, use proper primitives like
> >> notifiers and waitqueues, which means you should also probably switch 
> >away
> >> from kernel_thread() to the kthread_*() APIs.  Also you should fix this
> >> FIXME or remove it if it no longer applies:-D.
> >
> >right.
> >but how to I tell a network thread in tcp_recvmsg to stop early,
> >without using signals?
> 
> 
> Yup, kthreads API cannot handle (properly stop) kernel threads that want
> to sleep on possibly-blocking-forever-till-signalled-functions such as
> tcp_recvmsg or skb_recv_datagram etc etc.
> 
> There are two workarounds:
> 1. Use sk_rcvtimeo and related while-continue logic
> 2. force_sig(SIGKILL) to your kernel thread just before kthread_stop
>   (note that you don't need to allow / write code to handle / etc signals
>   in your kthread code -- force_sig will work automatically)

this is not only at stop time.
for example our "drbd_asender" thread
does receive as well as send, and the sending
latency is crucial to performance, while the recv
will not timeout for the next few seconds.

> >> +/* THINK maybe we actually want to use the default "event/%s" worker 
> >threads
> >> + * or similar in linux 2.6, which uses per cpu data and threads.
> >> + *
> >> + * To be general, this might need a spin_lock member.
> >> + * For now, please use the mdev->req_lock to protect list_head,
> >> + * see drbd_queue_work below.
> >> + */
> >> +struct drbd_work_queue {
> >> +       struct list_head q;
> >> +       struct semaphore s; /* producers up it, worker down()s it */
> >> +       spinlock_t q_lock;  /* to protect the list. */
> >> +};
> >>
> >> Umm, how about fixing this to actually use proper workqueues or something
> >> instead of this open-coded mess?
> >
> >unlikely to happen "right now".
> >but it is on our todo list...
> 
> It should be easier to do it now (if you defer it for later, the code will
> only grow more and more complex). Also, removing this gunk from
> your driver will clearly make it smaller, and easier for us to review :-)

and will poison the generic work queues with stuff that might block
somewhere deep in the tcp stack. and where we are not able to cancel it.
not exactly desirable, either.
but maybe I am missing something?

	Lars
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/