[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190215045214.GA13123@nautica>
Date: Fri, 15 Feb 2019 05:52:14 +0100
From: Dominique Martinet <asmadeus@...ewreck.org>
To: Tom Herbert <tom@...ntonium.net>
Cc: Tom Herbert <tom@...bertland.com>,
David Miller <davem@...emloft.net>,
Doron Roberts-Kedes <doronrk@...com>,
Dave Watson <davejwatson@...com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages
Tom Herbert wrote on Thu, Feb 14, 2019:
> On Thu, Feb 14, 2019 at 7:31 PM Dominique Martinet
> <asmadeus@...ewreck.org> wrote:
> > Yes, the parser fails with -ENOMEM ; that is not handled gracefully at
> > all: from a user point of view, the connection just hangs (recvmsg never
> > returns), without so much as a message in dmesg either.
>
> That's by design. Here is the description in kcm.txt:
>
> "When a TCP socket is attached to a KCM multiplexor data ready
> (POLLIN) and write space available (POLLOUT) events are handled by the
> multiplexor. If there is a state change (disconnection) or other error
> on a TCP socket, an error is posted on the TCP socket so that a
> POLLERR event happens and KCM discontinues using the socket. When the
> application gets the error notification for a TCP socket, it should
> unattach the socket from KCM and then handle the error condition (the
> typical response is to close the socket and create a new connection if
> necessary)."
Sigh, that's what I get for relying on pieces of code found on the
internet.
It does make "trivial" kcm sockets difficult to use though, the old
ganesha code I adapted to kcm was the horrible (naive?) kind spawning
one thread per connection and just waiting on read(), so making it just
create a kcm socket from the tcp one and wait on recvmsg() until an
error pops up does not seem an option.
That being said I'm a bit confused, I thought part of the point of kcm
was the multiplexing so a simple server could just keep attaching tcp
sockets to a single kcm socket and only have a single trivial loop
reading from the kcm socket ; but I guess there's no harm in also
looking for POLLERR on the tcp sockets... It would still need to close
them once clients disconnect I guess, this really only affects my naive
server.
> > strparser might be able to retry somehow.
>
> We could retry on -ENOMEM since it is potentially a transient error,
Yes, I think we should aim to retry on -ENOMEM; I agree all errors are
not meant to be retried on but there already are different cases based
on what the parse cb returned; but that can be a different patch.
> but generally for errors (like connection is simply broken) it seems
> like it's working as intended. I suppose we could add a socket option
> to fail the KCM socket when all attached lower sockets have failed,
> but I not sure that would be a significant benefit for additional
> complexity.
Right, I agree it's probably not worth doing, now I'm aware of this I
can probably motivate myself to change this old code to use epoll
properly.
I think it could make sense to have this option for simpler programs,
but as things stand I guess we can require kcm users to do this much,
thanks for the explanation, and sorry for having failed to notice it.
With all that said I guess my patch should work correctly then, I'll try
to find some time to check the error does come back up the tcp socket in
my reproducer but I have no reason to believe it doesn't.
I'd like to see some retry on ENOMEM before this is merged though, so
while I'm there I'll resend this with a second patch doing that
retry,.. I think just not setting strp->interrupted and not reporting
the error up might be enough? Will have to try either way.
Thanks for the feedback,
--
Dominique
Powered by blists - more mailing lists