Message-ID: <656f77b1-caf6-ea3c-6d32-54637f70a629@suse.de>
Date: Mon, 3 Jul 2023 15:46:35 +0200
From: Hannes Reinecke <hare@...e.de>
To: Sagi Grimberg <sagi@...mberg.me>, David Howells <dhowells@...hat.com>
Cc: Keith Busch <kbusch@...nel.org>, Christoph Hellwig <hch@....de>,
linux-nvme@...ts.infradead.org, Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
netdev@...r.kernel.org
Subject: Re: [PATCHv6 0/5] net/tls: fixes for NVMe-over-TLS
On 7/3/23 15:42, Sagi Grimberg wrote:
>
>>>> Hannes Reinecke <hare@...e.de> wrote:
>>>>
>>>>>> 'discover' and 'connect' works, but when I'm trying to transfer data
>>>>>> (eg by doing a 'mkfs.xfs') the whole thing crashes horribly in
>>>>>> sock_sendmsg() as it's trying to access invalid pages :-(
>>>>
>>>> Can you be more specific about the crash?
>>>
>>> Hannes,
>>>
>>> See:
>>> [PATCH net] nvme-tcp: Fix comma-related oops
>>
>> Ah, right. That solves _that_ issue.
>>
>> But now I'm deadlocking on the tls_rx_reader_lock() (patched as to
>> your suggestion). Investigating.
>
> Are you sure it is a deadlock? or maybe you returned EAGAIN and nvme-tcp
> does not interpret this as a transient status and simply returns from
> io_work?
>
>> But it brought up yet another can of worms: what _exactly_ is the
>> return value of ->read_sock()?
>>
>> There are currently two conflicting use-cases:
>> -> Ignore the return value, and assume errors etc are signalled
>>    via 'desc.error'.
>>      net/strparser/strparser.c
>>      drivers/infiniband/sw/siw
>>      drivers/scsi/iscsi_tcp.c
>> -> use the return value of ->read_sock(), ignoring 'desc.error':
>>      drivers/nvme/host/tcp.c
>>      net/ipv4/tcp.c
>> So which one is it?
>> Needless to say, implementations following the second style do not
>> set 'desc.error', causing any errors there to be ignored for callers
>> from the first style...
>
> I don't think ignoring the return value of read_sock makes sense because
> it can fail outside of the recv_actor failures.
>
Oh, but it's not read_actor which is expected to set desc.error.
Have a look at 'strp_read_sock()':

	/* sk should be locked here, so okay to do read_sock */
	sock->ops->read_sock(strp->sk, &desc, strp_recv);
	desc.error = strp->cb.read_sock_done(strp, desc.error);

It's the ->read_sock() callback which is expected to set desc.error.
> But to be on the safe side, perhaps you can both return an error and set
> desc.error?
>
But why? We can easily make ->read_sock() a void function, then it's
obvious that you can't check the return value.

Cheers,

Hannes