Message-ID: <20121228143200.GB24229@redhat.com>
Date: Fri, 28 Dec 2012 15:32:00 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Andrey Vagin <avagin@...nvz.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, criu@...nvz.org,
linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
David Howells <dhowells@...hat.com>,
Dave Jones <davej@...hat.com>,
Michael Kerrisk <mtk.manpages@...il.com>,
Pavel Emelyanov <xemul@...allels.com>,
Cyrill Gorcunov <gorcunov@...nvz.org>
Subject: Re: [PATCH 3/3] signalfd: add ability to read siginfo-s without
dequeuing signals (v3)
On 12/28, Andrey Vagin wrote:
>
> pread(fd, buf, size, pos) with non-zero pos returns siginfo-s
> without dequeuing signals.
>
> A sequence number and a queue are encoded in pos.
>
> pos = seq + SFD_*_OFFSET
>
> seq is a sequence number of a signal in a queue.
>
> SFD_PER_THREAD_QUEUE_OFFSET - read signals from a per-thread queue.
> SFD_SHARED_QUEUE_OFFSET - read signals from a shared (process wide) queue.
>
> This functionality is required for checkpointing pending signals.
>
> v2: llseek() can't be used here, because peek_offset/f_pos/whatever
> has to be shared with all processes which have this file opened.
>
> Suppose that the task forks after sys_signalfd(). Now if parent or child
> does llseek this affects them both. This is insane because signalfd is
> "strange" to say the least: fork/dup/etc inherits signalfd_ctx but not
> the "source" of the data. // Oleg Nesterov
I think we should cc Linus.
This patch adds a hack and it makes signalfd even more strange.
Yes, this hack was suggested by me because I couldn't suggest anything
better. But if Linus dislikes this user-visible API it would be better
to get his nack right now.
> +static ssize_t signalfd_peek(struct signalfd_ctx *ctx,
> + siginfo_t *info, loff_t *ppos)
> +{
> + struct sigpending *pending;
> + struct sigqueue *q;
> + loff_t seq;
> + int ret = 0;
> +
> + spin_lock_irq(&current->sighand->siglock);
> +
> + if (*ppos >= SFD_SHARED_QUEUE_OFFSET) {
> + pending = &current->signal->shared_pending;
> + seq = *ppos - SFD_SHARED_QUEUE_OFFSET;
> + } else {
> + pending = &current->pending;
> + seq = *ppos - SFD_PER_THREAD_QUEUE_OFFSET;
> + }
You can do this outside of spin_lock_irq().
And I think it would be better to check SFD_PRIVATE_QUEUE_OFFSET too,
although this is not strictly necessary. Otherwise this code assumes
that sys_pread() checks pos >= 0 and that SFD_PRIVATE_QUEUE_OFFSET == 1.
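To illustrate both remarks, here is a minimal userspace sketch of the position decoding. It is plain arithmetic on pos, so it needs no lock, and it rejects positions below the per-thread offset explicitly instead of relying on the caller. The numeric offset values, the sfd_pos struct and the sfd_decode_pos() helper are assumptions made for this illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, for illustration only; the quoted hunks do not show them. */
#define SFD_PER_THREAD_QUEUE_OFFSET ((int64_t)1)
#define SFD_SHARED_QUEUE_OFFSET     ((int64_t)1 << 62)

enum sfd_queue { SFD_QUEUE_PER_THREAD, SFD_QUEUE_SHARED };

/* Hypothetical result type: which queue to walk and which entry to return. */
struct sfd_pos {
	enum sfd_queue queue;
	int64_t seq;
};

/*
 * Hypothetical helper: split pos into (queue, seq).
 * Returns false for positions below the per-thread offset, so the caller
 * does not have to assume that pos >= 0 or that the lowest offset is 1.
 */
static bool sfd_decode_pos(int64_t pos, struct sfd_pos *out)
{
	assert(out != NULL);

	if (pos >= SFD_SHARED_QUEUE_OFFSET) {
		out->queue = SFD_QUEUE_SHARED;
		out->seq = pos - SFD_SHARED_QUEUE_OFFSET;
		return true;
	}
	if (pos >= SFD_PER_THREAD_QUEUE_OFFSET) {
		out->queue = SFD_QUEUE_PER_THREAD;
		out->seq = pos - SFD_PER_THREAD_QUEUE_OFFSET;
		return true;
	}
	return false;	/* pos is zero or negative: not a peek position */
}
```

In the kernel function the same split could happen before spin_lock_irq(), with only the list walk left under the lock.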
> + list_for_each_entry(q, &pending->list, list) {
> + if (sigismember(&ctx->sigmask, q->info.si_signo))
> + continue;
> +
> + if (seq-- == 0) {
> + copy_siginfo(info, &q->info);
> + ret = info->si_signo;
> + break;
> + }
> + }
> +
> + spin_unlock_irq(&current->sighand->siglock);
> +
> + if (ret)
> + (*ppos)++;
We could update *ppos unconditionally, but I won't argue.
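For reference, the peek semantics of the quoted loop can be modelled in userspace roughly as follows. This is only a sketch: the pending_sig struct, the array standing in for the pending list, and both helper names are hypothetical, and the mask is a plain bitmask of accepted signals, which is a simplification and not necessarily how ctx->sigmask is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued siginfo; only the signal number matters. */
struct pending_sig {
	int signo;
};

/*
 * Return the signal number of the seq-th queued entry accepted by `wanted`
 * (bit signo-1 set means the signal is accepted), or 0 if there is no such
 * entry. Nothing is removed from the queue.
 */
static int peek_nth(const struct pending_sig *queue, size_t len,
		    uint64_t wanted, int64_t seq)
{
	assert(queue != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		int signo = queue[i].signo;

		if (signo < 1 || signo > 64)
			continue;	/* outside the range this sketch models */
		if (!(wanted & (UINT64_C(1) << (signo - 1))))
			continue;	/* not accepted: does not count towards seq */
		if (seq-- == 0)
			return signo;
	}
	return 0;
}

/* Peek at index *pos and advance it only on success, like the quoted patch. */
static int peek_and_advance(const struct pending_sig *queue, size_t len,
			    uint64_t wanted, int64_t *pos)
{
	int ret = peek_nth(queue, len, wanted, *pos);

	if (ret)
		(*pos)++;
	return ret;
}
```

With the conditional increment a reader that hits the end of the queue keeps its position and can retry later; an unconditional increment would move past the end instead.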
> @@ -338,6 +379,7 @@ SYSCALL_DEFINE4(signalfd4, int, ufd, sigset_t __user *, user_mask,
> }
>
> file->f_flags |= flags & SFD_RAW;
> + file->f_mode |= FMODE_PREAD;
Again, either this is not needed or the code was broken by the previous
patch. Given that 2/3 passes O_RDWR to anon_inode_getfile() I think
FMODE_PREAD should already be set. Note OPEN_FMODE(flags) in
anon_inode_getfile().
Oleg.