Message-ID: <CAFL455ni63jgLha_AypB6hW=w2YQjWzbi9CJo9oK8yG1VM-=6A@mail.gmail.com>
Date:   Wed, 19 Jul 2023 11:59:32 +0200
From:   Maurizio Lombardi <mlombard@...hat.com>
To:     Jirong Feng <jirong.feng@...ystack.cn>
Cc:     nab@...ux-iscsi.org, linux-scsi@...r.kernel.org,
        target-devel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Close connection aborting an out-of-order cmd will hang

Hello,

On Tue, 18 Jul 2023 at 8:52, Jirong Feng <jirong.feng@...ystack.cn> wrote:
>
> Hi,
>
> I recently encountered a hanging issue as follows:

Can you please provide the kernel version?

Thanks,
Maurizio

> [root@...e-6 ~]# ps -aux | grep ' D '
> root      8648  0.4  0.0      0     0 ?        D    Jul12  21:04 [iscsi_np]
> root     17572  0.0  0.0      0     0 ?        D    Jul12   0:09 [kworker/7:3+events]
> root     56555  0.0  0.0 216576  1536 pts/1    S+   14:57   0:00 grep --color=auto  D
> root     59853  0.0  0.0      0     0 ?        D    Jul12   0:04 [iscsi_trx]
>
> the call stack:
> kworker:
> PID: 17572  TASK: ffff862470df0e00  CPU: 7   COMMAND: "kworker/7:3"
>   #0 [ffff0000528afab0] __switch_to at ffff4a49c69e74b8
>   #1 [ffff0000528afad0] __schedule at ffff4a49c72b60f4
>   #2 [ffff0000528afb60] schedule at ffff4a49c72b6754
>   #3 [ffff0000528afb70] schedule_timeout at ffff4a49c72ba980
>   #4 [ffff0000528afc30] wait_for_common at ffff4a49c72b7504
>   #5 [ffff0000528afcb0] wait_for_completion at ffff4a49c72b7594
>   #6 [ffff0000528afcd0] target_put_cmd_and_wait at ffff4a49a3dad38c [target_core_mod]
>   #7 [ffff0000528afd30] core_tmr_abort_task at ffff4a49a3da55c8 [target_core_mod]
>   #8 [ffff0000528afd80] target_tmr_work at ffff4a49a3daa1c8 [target_core_mod]
>   #9 [ffff0000528afdb0] process_one_work at ffff4a49c6a603c0
> #10 [ffff0000528afe00] worker_thread at ffff4a49c6a60640
> #11 [ffff0000528afe60] kthread at ffff4a49c6a67474
>
> iscsi_trx:
> PID: 59853  TASK: ffff8624fe0b5200  CPU: 7   COMMAND: "iscsi_trx"
>   #0 [ffff000095f6fa50] __switch_to at ffff4a49c69e74b8
>   #1 [ffff000095f6fa70] __schedule at ffff4a49c72b60f4
>   #2 [ffff000095f6fb00] schedule at ffff4a49c72b6754
>   #3 [ffff000095f6fb10] schedule_timeout at ffff4a49c72ba870
>   #4 [ffff000095f6fbd0] wait_for_common at ffff4a49c72b7504
>   #5 [ffff000095f6fc50] wait_for_completion_timeout at ffff4a49c72b75d0
>   #6 [ffff000095f6fc70] __transport_wait_for_tasks at ffff4a49a3da9c28 [target_core_mod]
>   #7 [ffff000095f6fcb0] transport_generic_free_cmd at ffff4a49a3da9dd0 [target_core_mod]
>   #8 [ffff000095f6fd20] iscsit_free_cmd at ffff4a49a3fc4464 [iscsi_target_mod]
>   #9 [ffff000095f6fd50] iscsit_close_connection at ffff4a49a3fccf48 [iscsi_target_mod]
> #10 [ffff000095f6fdf0] iscsit_take_action_for_connection_exit at ffff4a49a3fb7614 [iscsi_target_mod]
> #11 [ffff000095f6fe20] iscsi_target_rx_thread at ffff4a49a3fcc064 [iscsi_target_mod]
> #12 [ffff000095f6fe60] kthread at ffff4a49c6a67474
>
> inspect the aborting cmd in kworker:
> crash> struct iscsi_cmd FFFFA62592F4B400
> struct iscsi_cmd {
>    dataout_timer_flags = (unknown: 0),
>    dataout_timeout_retries = 0 '\000',
>    error_recovery_count = 0 '\000',
>    deferred_i_state = ISTATE_NEW_CMD,
>    i_state = ISTATE_DEFERRED_CMD,
>    immediate_cmd = 0 '\000',
>    immediate_data = 0 '\000',
>    iscsi_opcode = 1 '\001',
>    iscsi_response = 0 '\000',
>    logout_reason = 0 '\000',
>    logout_response = 0 '\000',
>    maxcmdsn_inc = 0 '\000',
>    unsolicited_data = 0 '\000',
>    reject_reason = 0 '\000',
>    logout_cid = 0,
>    cmd_flags = ICF_OOO_CMDSN,
>    init_task_tag = 2415919152,
>    targ_xfer_tag = 205,
>    cmd_sn = 2860352639,
>    exp_stat_sn = 2502541166,
>    stat_sn = 0,
>    data_sn = 0,
> ...
>
> So this is an out-of-order cmd (cmd_flags = ICF_OOO_CMDSN). My conclusion
> is that iscsi_trx is waiting for kworker to finish aborting the cmd, while
> kworker is waiting for someone to complete the cmd. That is never going
> to happen, hence the hang.
>
> Could someone please help me confirm this analysis?
>
> Regards,
> Jirong Feng
>
>
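
The circular wait described in the quoted analysis can be illustrated with a small user-space sketch. This is only a toy model in Python, not kernel code: the names `simulate`, `cmd_done`, `abort_done` and `cmd_is_executed` are hypothetical, and timeouts stand in for what would be an indefinite hang. It assumes, as the report suggests, that a command still deferred as out-of-order is never dispatched and therefore never completes.

```python
import threading


def simulate(cmd_is_executed: bool, timeout: float = 0.2) -> dict:
    """Toy model of the two waiters in the report (hypothetical names)."""
    cmd_done = threading.Event()    # set when the command completes
    abort_done = threading.Event()  # set when the abort handler returns
    result = {}

    def abort_worker():
        # Stands in for the kworker: wait for the command to complete.
        result["abort_finished"] = cmd_done.wait(timeout)
        if result["abort_finished"]:
            abort_done.set()

    def close_connection():
        # Stands in for iscsi_trx: wait for the abort to finish.
        result["close_finished"] = abort_done.wait(timeout * 2)

    def executor():
        # Assumption: only a dispatched command ever signals completion.
        if cmd_is_executed:
            cmd_done.set()

    threads = [
        threading.Thread(target=fn)
        for fn in (abort_worker, close_connection, executor)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print("dispatched cmd:  ", simulate(True))
    print("out-of-order cmd:", simulate(False))
```

In the out-of-order case neither waiter makes progress before its timeout expires, which mirrors the two tasks stuck in D state. Whether the real kernel code paths behave this way depends on the kernel version, so this sketch does not replace checking the actual source.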
