Message-ID: <364A2E37-B912-4160-ABC8-AC630A8777B2@redhat.com>
Date:   Wed, 07 Dec 2022 06:44:34 -0500
From:   Benjamin Coddington <bcodding@...hat.com>
To:     "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>
Cc:     linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v1 0/3] Stop corrupting socket's task_frag

Hi Dave, Eric, Jakub, Paolo,

I think it makes sense for all three of these to go together through netdev.
If you agree, would you like me to chase down individual ACKs for each
treewide touch?

What can I do from netdev's perspective to move this forward?

Ben

On 21 Nov 2022, at 8:35, Benjamin Coddington wrote:

> The networking code uses flags in sk_allocation to determine whether it can
> use current->task_frag.  However, in-kernel users of sockets may stop setting
> sk_allocation when they convert to the preferred memalloc_nofs_save/restore,
> as SUNRPC has done in commit a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save()
> on all rpciod/xprtiod jobs").
>
> This will cause corruption in current->task_frag when recursing into the
> network layer for those subsystems during page fault or reclaim.  The
> corruption is difficult to diagnose because stack traces may not contain the
> offending subsystem at all.  The corruption is unlikely to show up in
> testing because it requires memory pressure, and so subsystems that
> convert to memalloc_nofs_save/restore are likely to continue to run into
> this issue.
>
> Previous reports and proposed fixes:
> https://lore.kernel.org/netdev/96a18bd00cbc6cb554603cc0d6ef1c551965b078.1663762494.git.gnault@redhat.com/
> https://lore.kernel.org/netdev/b4d8cb09c913d3e34f853736f3f5628abfd7f4b6.1656699567.git.gnault@redhat.com/
> https://lore.kernel.org/linux-nfs/de6d99321d1dcaa2ad456b92b3680aa77c07a747.1665401788.git.gnault@redhat.com/
>
> Guillaume Nault has done all of the hard work tracking this problem down and
> finding the best fix for this issue.  I'm just taking a turn posting another
> fix.
>
> Benjamin Coddington (2):
>   Treewide: Stop corrupting socket's task_frag
>   net: simplify sk_page_frag
>
> Guillaume Nault (1):
>   net: Introduce sk_use_task_frag in struct sock.
>
>  drivers/block/drbd/drbd_receiver.c |  3 +++
>  drivers/block/nbd.c                |  1 +
>  drivers/nvme/host/tcp.c            |  1 +
>  drivers/scsi/iscsi_tcp.c           |  1 +
>  drivers/usb/usbip/usbip_common.c   |  1 +
>  fs/afs/rxrpc.c                     |  1 +
>  fs/cifs/connect.c                  |  1 +
>  fs/dlm/lowcomms.c                  |  2 ++
>  fs/ocfs2/cluster/tcp.c             |  1 +
>  include/net/sock.h                 | 10 ++++++----
>  net/9p/trans_fd.c                  |  1 +
>  net/ceph/messenger.c               |  1 +
>  net/core/sock.c                    |  1 +
>  net/sunrpc/xprtsock.c              |  3 +++
>  14 files changed, 24 insertions(+), 4 deletions(-)
>
> -- 
> 2.31.1
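
For anyone skimming the thread, here is a rough userspace model of the
selection logic the cover letter describes.  It is only a sketch: the struct
layouts, MODEL_BLOCKING_OK, current_task and both helper names are made up
for illustration, and the flag test is a stand-in for the real sk_allocation
check, not a copy of it.  Only sk_allocation, task_frag and sk_use_task_frag
are names from the series; sk_frag is the existing per-socket fragment.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for kernel types; NOT the real definitions. */
struct page_frag { int unused; };

struct task { struct page_frag task_frag; };

struct sock {
	unsigned int sk_allocation;   /* allocation flags (modelled as one bit) */
	bool sk_use_task_frag;        /* field introduced by the series */
	struct page_frag sk_frag;     /* per-socket fragment */
};

/* Hypothetical flag: "this socket's allocations may block". */
#define MODEL_BLOCKING_OK 0x1u

/* Hypothetical stand-in for the kernel's 'current' task. */
struct task current_task;

/*
 * Flag-based selection (assumed shape of the old behaviour): the choice is
 * inferred from sk_allocation, so a socket owner that relies on a
 * memalloc_nofs_save() scope and leaves sk_allocation untouched still gets
 * the per-task fragment.
 */
struct page_frag *frag_from_flags(struct sock *sk)
{
	if (sk->sk_allocation & MODEL_BLOCKING_OK)
		return &current_task.task_frag;
	return &sk->sk_frag;
}

/*
 * Field-based selection (assumed shape after the series): the socket owner
 * states explicitly whether the per-task fragment may be used.
 */
struct page_frag *frag_from_field(struct sock *sk)
{
	if (sk->sk_use_task_frag)
		return &current_task.task_frag;
	return &sk->sk_frag;
}
```

With a socket whose sk_allocation still has MODEL_BLOCKING_OK set but whose
owner cleared sk_use_task_frag, frag_from_flags() hands back the shared
per-task fragment while frag_from_field() hands back the socket's own
sk_frag, which is the behaviour the treewide patch is after.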
