lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <88334658-072b-4b90-a949-9c74ef93cfd1@huawei.com>
Date: Tue, 15 Jul 2025 11:05:07 +0800
From: Li Lingfeng <lilingfeng3@...wei.com>
To: Hou Tao <houtao@...weicloud.com>, Mike Christie
	<michael.christie@...cle.com>
CC: <lduncan@...e.com>, <cleech@...hat.com>, <njavali@...vell.com>,
	<mrangankar@...vell.com>, <GR-QLogic-Storage-Upstream@...vell.com>,
	<martin.petersen@...cle.com>, <linux-scsi@...r.kernel.org>,
	<jejb@...ux.ibm.com>, yangerkun <yangerkun@...wei.com>, "yukuai (C)"
	<yukuai3@...wei.com>, "zhangyi (F)" <yi.zhang@...wei.com>,
	<James.Bottomley@...senPartnership.com>, <open-iscsi@...glegroups.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Li Lingfeng
	<lilingfeng@...weicloud.com>
Subject: Re: [PATCH 1/6] scsi: iscsi: Fix HW conn removal use after free

Hi all,

在 2025/7/3 17:40, Hou Tao 写道:
> Hi Mike & Lingfeng:
>
> On 7/3/2025 9:35 AM, Li Lingfeng wrote:
>> Friendly ping...
>>
>> Thanks
>>
>> 在 2025/6/19 20:57, Li Lingfeng 写道:
>>> Hi Mike,
>>>
>>> Thanks for this patch addressing the UAF issue. I have some questions
>>> when
>>> analyzing this patch.
>>>
>>> You mention that iscsi_remove_conn() frees the connection, making the
>>> extra
>>> iscsi_put_conn() cause UAF. However, looking at iscsi_remove_conn():
>>>
>>> iscsi_remove_conn
>>>   device_del(&conn->dev)
>>>    put_device(parent) // Only parent gets put_device
>>>
>>> This doesn't appear to free conn directly - only device_del() + parent
>>> reference drop. Typically, conn is freed via put_device() on
>>> &conn->dev when
>>> its refcount reaches zero.
>>>
>>> Could you briefly clarify how iscsi_remove_conn() ultimately triggers
>>> the
>>> freeing, and in what scenario the subsequent iscsi_put_conn() leads
>>> to UAF?
>>>
>>> Thanks again for the fix.
>>>
>>> Lingfeng
>>>
>>> 在 2022/6/17 6:27, Mike Christie 写道:
>>>> If qla4xxx doesn't remove the connection before the session, the iSCSI
>>>> class tries to remove the connection for it. We were doing a
>>>> iscsi_put_conn() in the iter function which is not needed and will
>>>> result
>>>> in a use after free because iscsi_remove_conn() will free the
>>>> connection.
>>>>
>>>> Reviewed-by: Lee Duncan <lduncan@...e.com>
>>>> Signed-off-by: Mike Christie <michael.christie@...cle.com>
>>>> ---
>>>>    drivers/scsi/scsi_transport_iscsi.c | 2 --
>>>>    1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/drivers/scsi/scsi_transport_iscsi.c
>>>> b/drivers/scsi/scsi_transport_iscsi.c
>>>> index 2c0dd64159b0..e6084e158cc0 100644
>>>> --- a/drivers/scsi/scsi_transport_iscsi.c
>>>> +++ b/drivers/scsi/scsi_transport_iscsi.c
>>>> @@ -2138,8 +2138,6 @@ static int iscsi_iter_destroy_conn_fn(struct
>>>> device *dev, void *data)
>>>>            return 0;
>>>>          iscsi_remove_conn(iscsi_dev_to_conn(dev));
>>>> -    iscsi_put_conn(iscsi_dev_to_conn(dev));
>>>> -
>>>>        return 0;
>>>>    }
>> .
> I didn't follow the patch either. If I understand correctly, the
> invocation of  iscsi_put_conn() in iscsi_iter_destory_conn_fn() is used
> to free the initial reference counter of iscsi_cls_conn. For non-qla4xxx
> cases, the ->destroy_conn() callback (e.g., iscsi_conn_teardown) will
> call iscsi_remove_conn() and iscsi_put_conn() to remove the connection
> from the children list of session and free the connection at last.
> However for qla4xxx, it is not the case. The ->destroy_conn() callback
> of qla4xxx will keep the connection in the session conn_list and doesn't
> use iscsi_put_conn() to free the initial reference counter. Therefore,
> it seems necessary to keep the iscsi_put_conn() in the
> iscsi_iter_destroy_conn_fn(), otherwise, there will be memory leak problem.
This patch indeed caused a memory leak. I reproduced the leak issue and
confirmed that reintroducing the removed iscsi_put_conn() prevents the
leak in the same scenario.


*kernel base:*
master 8c2e52ebbe885c7eeaabd3b7ddcdc1246fc400d2


*kernel diff:*
1) Since I don't have the device that supports qla4xxx, I forcibly
bypassed the connection check during session destruction;
2) I expanded the memory required by conn by 1024 times to more clearly
observe the memory changes.

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 7b4fe0e6afb2..bb354deeb668 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -940,8 +940,12 @@ static void iscsi_sw_tcp_session_destroy(struct 
iscsi_cls_session *cls_session)
         struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
         struct iscsi_session *session = cls_session->dd_data;

-       if (WARN_ON_ONCE(session->leadconn))
-               return;
+       if (WARN_ON_ONCE(session->leadconn)) {
+               printk("%s skip checking leadconn", __func__);
+       //      return;
+       }
+
+       printk("%s %d\n", __func__, __LINE__);

         iscsi_session_remove(cls_session);
         /*
diff --git a/drivers/scsi/scsi_transport_iscsi.c 
b/drivers/scsi/scsi_transport_iscsi.c
index c75a806496d6..eac353ad536f 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2408,7 +2408,7 @@ iscsi_alloc_conn(struct iscsi_cls_session 
*session, int dd_size, uint32_t cid)
         struct iscsi_transport *transport = session->transport;
         struct iscsi_cls_conn *conn;

-       conn = kzalloc(sizeof(*conn) + dd_size, GFP_KERNEL);
+       conn = kzalloc((sizeof(*conn) + dd_size) * 1024, GFP_KERNEL);
         if (!conn)
                 return NULL;
         if (dd_size)
@@ -2984,6 +2984,7 @@ iscsi_if_destroy_conn(struct iscsi_transport 
*transport, struct iscsi_uevent *ev

         if (transport->destroy_conn)
                 transport->destroy_conn(conn);
+
         return 0;
  }

@@ -3937,13 +3938,20 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct 
nlmsghdr *nlh, uint32_t *group)
                 iscsi_put_endpoint(ep);
                 break;
         case ISCSI_UEVENT_DESTROY_SESSION:
+               printk("%s %d skip checking conns\n", __func__, __LINE__);
                 session = iscsi_session_lookup(ev->u.d_session.sid);
-               if (!session)
+               if (!session) {
+                       printk("%s %d\n", __func__, __LINE__);
                         err = -EINVAL;
-               else if (iscsi_session_has_conns(ev->u.d_session.sid))
+/*
+               } else if (iscsi_session_has_conns(ev->u.d_session.sid)) {
+                       printk("%s %d\n", __func__, __LINE__);
                         err = -EBUSY;
-               else
+*/
+               } else {
+                       printk("%s %d\n", __func__, __LINE__);
                         transport->destroy_session(session);
+               }
                 break;
         case ISCSI_UEVENT_DESTROY_SESSION_ASYNC:
                 session = iscsi_session_lookup(ev->u.d_session.sid);


*open-iscsi base:*
master 5b49cf2a86b1bfce13082f8ddc90b2282feb563d


*open-iscsi diff:*
I forcibly bypassed the standard connection destruction procedure, so
that during the session destruction process, the connection would be
destroyed to trigger iscsi_iter_destroy_conn_fn().

diff --git a/usr/netlink.c b/usr/netlink.c
index 4bcaf8b..970d782 100644
--- a/usr/netlink.c
+++ b/usr/netlink.c
@@ -516,12 +516,15 @@ kcreate_conn(uint64_t transport_handle, uint32_t sid,
  static int
  kdestroy_conn(uint64_t transport_handle, uint32_t sid, uint32_t cid)
  {
-       int rc;
-       struct iscsi_uevent ev;
-       struct iovec iov[2];
-
+//     int rc;
+//     struct iscsi_uevent ev;
+//     struct iovec iov[2];
+       (void)transport_handle;
+       (void)sid;
+       (void)cid;
         log_debug(7, "in %s", __FUNCTION__);
-
+       printf("skip destroy conn\n");
+/*
         memset(&ev, 0, sizeof(struct iscsi_uevent));

         ev.type = ISCSI_UEVENT_DESTROY_CONN;
@@ -534,7 +537,7 @@ kdestroy_conn(uint64_t transport_handle, uint32_t 
sid, uint32_t cid)
         rc = __kipc_call(iov, 2);
         if (rc < 0)
                 return rc;
-
+*/
         return 0;
  }


*test script:*
#!/bin/bash
LOOP_TIMES=10000
LOG_FILE="iscsi_opt.log"
DEVICE_TYPE="VIRTUAL-DISK"

for ((count=1; count<=LOOP_TIMES; count++)); do
     iscsiadm -m node -l &>/dev/null &
     login_pid=$!
     login_start=$(date +%s)
     device_found=0

     while kill -0 $login_pid 2>/dev/null; do
         sleep 0.5
         elapsed=$(( $(date +%s) - login_start ))

         if [ $elapsed -ge 30 ]; then
             if lsscsi | grep -q "$DEVICE_TYPE"; then
                 kill $login_pid 2>/dev/null
                 status="DEVICE_READY (KILLED_TIMEOUT)"
                 device_found=1
                 break
             fi
         fi
     done

     if [ $device_found -eq 0 ]; then
         wait $login_pid
         status="ATTACHED"
     fi

     mem_login=$(grep MemAvailable /proc/meminfo | awk '{print $2}')
     printf "%-6d %-12d %-8s %s\n" $count $mem_login "LOGIN" "$status" | 
tee -a $LOG_FILE

     iscsiadm -m node -u &>/dev/null
     while iscsiadm -m session &>/dev/null; do sleep 0.1; done
     mem_logout=$(grep MemAvailable /proc/meminfo | awk '{print $2}')
     printf "%-6d %-12d %-8s %s\n" $count $mem_logout "LOGOUT" "SUCCESS" 
| tee -a $LOG_FILE
done


*result without iscsi_put_conn():*
1) After a certain number of cycles, the available memory significantly
decreases.
1      53738028     LOGIN    ATTACHED
1      53735616     LOGOUT   SUCCESS
...
100    52453148     LOGIN    ATTACHED
100    52453376     LOGOUT   SUCCESS
...
200    51122008     LOGIN    ATTACHED
200    51122644     LOGOUT   SUCCESS
...
2108   25984180     LOGIN    ATTACHED
2108   25995580     LOGOUT   SUCCESS

2) After removing iSCSI devices, stopping related services, and unloading
kernel modules, the available memory has not returned to the expected
level.
[root@...ora test_ko]# lsscsi
[0:0:0:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sda
[0:0:1:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdb
[0:0:2:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdc
[0:0:3:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sdd
[0:0:4:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sde
[root@...ora test_ko]# service iscsid stop
Redirecting to /bin/systemctl stop iscsid.service
Warning: Stopping iscsid.service, but it can still be activated by:
   iscsid.socket
[root@...ora test_ko]# service tgtd stop
Redirecting to /bin/systemctl stop tgtd.service
[root@...ora test_ko]# rmmod iscsi_tcp
[root@...ora test_ko]# rmmod iscsi_boot_sysfs
[root@...ora test_ko]# rmmod libiscsi_tcp
[root@...ora test_ko]# rmmod libiscsi
[root@...ora test_ko]# rmmod scsi_transport_iscsi
[root@...ora test_ko]# cat /proc/meminfo | grep MemAvailable
MemAvailable:   26063488 kB
[root@...ora test_ko]# cat /proc/meminfo | grep MemAvailable
MemAvailable:   26065000 kB
[root@...ora test_ko]#


*result with iscsi_put_conn():*
After a certain number of cycles, the available memory does not show
significant changes.
1      53823120     LOGIN    ATTACHED
1      53828420     LOGOUT   SUCCESS
...
100    53738576     LOGIN    ATTACHED
100    53747436     LOGOUT   SUCCESS
...
200    53723400     LOGIN    ATTACHED
200    53728696     LOGOUT   SUCCESS
...

I haven't found more details about the UAF issue described in this patch,
but I think the patch's description of the root cause 'iscsi_remove_conn()
will free the connection' is incorrect. Moreover, this fix approach is
inappropriate.
Until the original issue is fully understood, I recommend reverting this
patch.

Thanks,

Lingfeng


> For the use-after-free problem, I think it may be due to the concurrent
> invocation of iscsi_add_conn() and iscsi_session_teardown().
> iscsi_add_conn() has already invoked device_add(), but fails on
> transport_register_device() and is trying to invoke device_del(). At the
> same time iscsi_session_teardown() invokes iscsi_remove_session() and is
> invoking iscsi_iter_destroy_conn_fn for the failed-to-add connection.
> The connection will be put twice: one in iscsi_iter_destroy_conn_fn()
> and another one in iscsi_conn_setup() and leads to use-after-free problem.
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ