lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 20 Oct 2020 09:55:30 -0700
From:   Lee Duncan <lduncan@...e.com>
To:     Mike Christie <michael.christie@...cle.com>,
        linux-scsi@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, open-iscsi@...glegroups.com,
        martin.petersen@...cle.com, mchristi@...hat.com, hare@...e.com
Subject: Re: [PATCH v2 1/1] scsi: libiscsi: fix NOP race condition

On 10/8/20 1:54 PM, Mike Christie wrote:
> On 10/8/20 12:11 PM, Mike Christie wrote:
>> On 9/25/20 1:41 PM, lduncan@...e.com wrote:
>>> From: Lee Duncan <lduncan@...e.com>
>>>
>>> iSCSI NOPs are sometimes "lost", mistakenly sent to the
>>> user-land iscsid daemon instead of handled in the kernel,
>>> as they should be, resulting in a message from the daemon like:
>>>
>>>> iscsid: Got nop in, but kernel supports nop handling.
>>>
>>> This can occur because of the forward- and back-locks
>>> in the kernel iSCSI code, and the fact that an iSCSI NOP
>>> response can be processed before processing of the NOP send
>>> is complete. This can result in "conn->ping_task" being NULL
>>> in iscsi_nop_out_rsp(), when the pointer is actually in
>>> the process of being set.
>>>
>>> To work around this, we add a new state to the "ping_task"
>>> pointer. In addition to NULL (not assigned) and a pointer
>>> (assigned), we add the state "being set", which is signaled
>>> with an INVALID pointer (using "-1").
>>>
>>> Signed-off-by: Lee Duncan <lduncan@...e.com>
>>> ---
>>>  drivers/scsi/libiscsi.c | 13 ++++++++++---
>>>  include/scsi/libiscsi.h |  3 +++
>>>  2 files changed, 13 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>>> index 1e9c3171fa9f..cade108c33b6 100644
>>> --- a/drivers/scsi/libiscsi.c
>>> +++ b/drivers/scsi/libiscsi.c
>>> @@ -738,6 +738,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>>>  						   task->conn->session->age);
>>>  	}
>>>  
>>> +	if (unlikely(READ_ONCE(conn->ping_task) == INVALID_SCSI_TASK))
>>> +		WRITE_ONCE(conn->ping_task, task);
>>> +
>>>  	if (!ihost->workq) {
>>>  		if (iscsi_prep_mgmt_task(conn, task))
>>>  			goto free_task;
>>
>> I think the API gets a little weird now where in some cases
>> __iscsi_conn_send_pdu checks the opcode to see what type of request
>> it is but above we the caller sets the ping_task.
>>
>> For login, tmfs and passthrough, we assume the __iscsi_conn_send_pdu
>> has sent or cleaned up everything. I think it might be nicer to just
>> have __iscsi_conn_send_pdu set the ping_task field before doing the
>> xmit/queue call. It would then work similar to the conn->login_task
>> case where that function knows about that special task too.
>>
>> So in __iscsi_conn_send_pdu add a "if (opcode == ISCSI_OP_NOOP_OUT)",
>> and check if it's a nop we need to track. If so set conn->ping_task.
>>
> Ignore this. It won't work nicely either. To figure out if the nop is
> our internal transport test ping vs a userspace ping that also needs
> a reply, we would need to do something like you did above so there is
> no point.
> 

Hi Mike:

I've read this a few times, and I'm still no sure I'm parsing it correctly.

Are you saying that my original patch submission is ok, or are you
saying there's nothing we can do and we're up the proverbial creek?
-- 
Lee Duncan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ