Message-Id: <20230726171930.1632710-1-khorenko@virtuozzo.com>
Date: Wed, 26 Jul 2023 20:19:29 +0300
From: Konstantin Khorenko <khorenko@...tuozzo.com>
To: netdev@...r.kernel.org
Cc: Jakub Kicinski <kuba@...nel.org>,
	Manish Chopra <manishc@...vell.com>,
	Ariel Elior <aelior@...vell.com>,
	David Miller <davem@...emloft.net>,
	Sudarsana Kalluru <skalluru@...vell.com>,
	Paolo Abeni <pabeni@...hat.com>,
	Konstantin Khorenko <khorenko@...tuozzo.com>
Subject: [PATCH 0/1] qed: Yet another scheduling while atomic fix

While running an old RHEL7-based kernel we have hit the following BUG
several times:

  BUG: scheduling while atomic: swapper/24/0/0x00000100

   [<ffffffffb41c6199>] schedule+0x29/0x70
   [<ffffffffb41c5512>] schedule_hrtimeout_range_clock+0xb2/0x150
   [<ffffffffb41c55c3>] schedule_hrtimeout_range+0x13/0x20
   [<ffffffffb41c3bcf>] usleep_range+0x4f/0x70
   [<ffffffffc08d3e58>] qed_ptt_acquire+0x38/0x100 [qed]
   [<ffffffffc08eac48>] _qed_get_vport_stats+0x458/0x580 [qed]
   [<ffffffffc08ead8c>] qed_get_vport_stats+0x1c/0xd0 [qed]
   [<ffffffffc08dffd3>] qed_get_protocol_stats+0x93/0x100 [qed]
                        qed_mcp_send_protocol_stats
            case MFW_DRV_MSG_GET_LAN_STATS:
            case MFW_DRV_MSG_GET_FCOE_STATS:
            case MFW_DRV_MSG_GET_ISCSI_STATS:
            case MFW_DRV_MSG_GET_RDMA_STATS:
   [<ffffffffc08e36d8>] qed_mcp_handle_events+0x2d8/0x890 [qed]
                        qed_int_assertion
                        qed_int_attentions
   [<ffffffffc08d9490>] qed_int_sp_dpc+0xa50/0xdc0 [qed]
   [<ffffffffb3aa7623>] tasklet_action+0x83/0x140
   [<ffffffffb41d9125>] __do_softirq+0x125/0x2bb
   [<ffffffffb41d560c>] call_softirq+0x1c/0x30
   [<ffffffffb3a30645>] do_softirq+0x65/0xa0
   [<ffffffffb3aa78d5>] irq_exit+0x105/0x110
   [<ffffffffb41d8996>] do_IRQ+0x56/0xf0

The situation is clear: the tasklet function ends up calling schedule()
(via usleep_range() in qed_ptt_acquire()), but the fix is not trivial.

Checking the mainline code, it seems the same call trace is still
possible on the latest kernel as well, so here is the fix.

There was a similar case recently for the QEDE driver (reading stats
through sysfs), which resulted in the commit:
  42510dffd0e2 ("qed/qede: Fix scheduling while atomic")

I tried to implement the same logic as a fix for my case, but failed:
for this particular QED driver case it is not clear to me which
statistics to collect in delayed work for each particular device, and
collecting ALL possible stats for all devices regardless of device type
seems incorrect.

Taking into account that I do not have access to the hardware at all,
the delayed work approach is nearly impossible for me.

Thus I have taken the idea from patch v3: the caller simply provides
the context:
  https://www.spinics.net/lists/netdev/msg901089.html
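
To illustrate the "context provided by the caller" idea, here is a
minimal user-space sketch. It is NOT the patch itself: struct ptt_pool,
ptt_acquire_context(), the is_atomic flag and the retry count are
hypothetical names chosen for illustration, and the two wait helpers
only count calls where kernel code would use udelay() (atomic context)
or usleep_range() (process context).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ACQUIRE_RETRIES 1000  /* hypothetical retry limit */

/* Hypothetical stand-in for a pool of hardware windows. */
struct ptt_pool {
	int free_entries;  /* entries that can still be handed out */
	int sleeps;        /* how often the sleeping wait was used */
	int spins;         /* how often the busy-wait was used */
};

/* Stand-in for usleep_range(): may schedule, process context only. */
static void sleeping_wait(struct ptt_pool *pool)
{
	pool->sleeps++;
}

/* Stand-in for udelay(): busy-waits, safe in atomic context. */
static void busy_wait(struct ptt_pool *pool)
{
	pool->spins++;
}

/*
 * Try to take an entry from the pool. The caller states whether it
 * runs in atomic context, and the wait primitive is chosen accordingly.
 * Returns true on success, false once the retries are exhausted.
 */
static bool ptt_acquire_context(struct ptt_pool *pool, bool is_atomic)
{
	assert(pool != NULL);

	for (int i = 0; i < ACQUIRE_RETRIES; i++) {
		if (pool->free_entries > 0) {
			pool->free_entries--;
			return true;
		}

		if (is_atomic)
			busy_wait(pool);
		else
			sleeping_wait(pool);
	}

	return false;
}

/* Existing process-context callers keep the old, sleeping behaviour. */
static bool ptt_acquire(struct ptt_pool *pool)
{
	return ptt_acquire_context(pool, false);
}
```

In this sketch only a caller running in a tasklet would pass
is_atomic = true; all other callers keep using the plain wrapper, so
their behaviour does not change.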

At least this solution is technically clear, and hopefully I did not
make any stupid mistakes here.

The patch is COMPILE TESTED ONLY.

I would appreciate it if somebody could test the patch. :)


Konstantin Khorenko (1):
  qed: Fix scheduling in a tasklet while getting stats

 drivers/net/ethernet/qlogic/qed/qed_dev_api.h |  2 ++
 drivers/net/ethernet/qlogic/qed/qed_fcoe.c    | 19 ++++++++++----
 drivers/net/ethernet/qlogic/qed/qed_fcoe.h    |  6 +++--
 drivers/net/ethernet/qlogic/qed/qed_hw.c      | 26 ++++++++++++++++---
 drivers/net/ethernet/qlogic/qed/qed_iscsi.c   | 19 ++++++++++----
 drivers/net/ethernet/qlogic/qed/qed_iscsi.h   |  6 +++--
 drivers/net/ethernet/qlogic/qed/qed_l2.c      | 19 ++++++++++----
 drivers/net/ethernet/qlogic/qed/qed_l2.h      |  3 +++
 drivers/net/ethernet/qlogic/qed/qed_main.c    |  6 ++---
 9 files changed, 80 insertions(+), 26 deletions(-)

-- 
2.31.1

