lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+FuTSe=kJSSvcYwCE9-omRF5Snd9AyesZac61PYyAHDStPt=A@mail.gmail.com>
Date:   Thu, 20 Jun 2019 09:41:30 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Neil Horman <nhorman@...driver.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        Matteo Croce <mcroce@...hat.com>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH net] af_packet: Block execution of tasks waiting for
 transmit to complete in AF_PACKET

On Wed, Jun 19, 2019 at 4:26 PM Neil Horman <nhorman@...driver.com> wrote:
>
> When an application is run that:
> a) Sets its scheduler to be SCHED_FIFO
> and
> b) Opens a memory mapped AF_PACKET socket, and sends frames with the
> MSG_DONTWAIT flag cleared, its possible for the application to hang
> forever in the kernel.  This occurs because when waiting, the code in
> tpacket_snd calls schedule, which under normal circumstances allows
> other tasks to run, including ksoftirqd, which in some cases is
> responsible for freeing the transmitted skb (which in AF_PACKET calls a
> destructor that flips the status bit of the transmitted frame back to
> available, allowing the transmitting task to complete).
>
> However, when the calling application is SCHED_FIFO, its priority is
> such that the schedule call immediately places the task back on the cpu,
> preventing ksoftirqd from freeing the skb, which in turn prevents the
> transmitting task from detecting that the transmission is complete.
>
> We can fix this by converting the schedule call to a completion
> mechanism.  By using a completion queue, we force the calling task, when
> it detects there are no more frames to send, to schedule itself off the
> cpu until such time as the last transmitted skb is freed, allowing
> forward progress to be made.
>
> Tested by myself and the reporter, with good results
>
> Appies to the net tree
>
> Signed-off-by: Neil Horman <nhorman@...driver.com>
> Reported-by: Matteo Croce <mcroce@...hat.com>
> CC: "David S. Miller" <davem@...emloft.net>
> ---

This is a complex change for a narrow configuration. Isn't a
SCHED_FIFO process preempting ksoftirqd a potential problem for other
networking workloads as well? And the right configuration to always
increase ksoftirqd priority when increasing another process's
priority? Also, even when ksoftirqd kicks in, isn't some progress
still made on the local_bh_enable reached from schedule()?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ