
Message-ID: <CANn89iL1BMCx3Mbsj3TijR3Srjji95q86px0k98r7JYJbwLzcw@mail.gmail.com>
Date:   Tue, 9 Oct 2018 07:12:09 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Yafang Shao <laoar.shao@...il.com>
Cc:     David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next] tcp: forbid direct reclaim if MSG_DONTWAIT is
 set in send path

On Tue, Oct 9, 2018 at 5:05 AM Yafang Shao <laoar.shao@...il.com> wrote:
>
> By default, sk->sk_allocation is GFP_KERNEL, which means that if there is
> not enough memory, it will do both direct reclaim and background reclaim.
> If the system has a large amount of memory, direct reclaim may cause a
> large latency spike.
>
> When we set MSG_DONTWAIT in send syscalls, we really don't want the call to
> block, so we had better clear __GFP_DIRECT_RECLAIM when allocating the skb in
> the send path. Then it will return immediately if there is not enough memory
> to be allocated, and the application has a chance to do other work
> instead of being blocked here.
>
> Signed-off-by: Yafang Shao <laoar.shao@...il.com>
> ---
>  net/ipv4/tcp.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 43ef83b..fe4f5ce 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1182,6 +1182,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>         bool process_backlog = false;
>         bool zc = false;
>         long timeo;
> +       gfp_t gfp;
>
>         flags = msg->msg_flags;
>
> @@ -1255,6 +1256,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>         /* Ok commence sending. */
>         copied = 0;
>
> +       gfp = flags & MSG_DONTWAIT ? sk->sk_allocation & ~__GFP_DIRECT_RECLAIM :
> +             sk->sk_allocation;
> +
>  restart:
>         mss_now = tcp_send_mss(sk, &size_goal, flags);
>
> @@ -1283,8 +1287,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>                         }
>                         first_skb = tcp_rtx_and_write_queues_empty(sk);
>                         linear = select_size(first_skb, zc);
> -                       skb = sk_stream_alloc_skb(sk, linear, sk->sk_allocation,
> -                                                 first_skb);
> +                       skb = sk_stream_alloc_skb(sk, linear, gfp, first_skb);
>                         if (!skb)
>                                 goto wait_for_memory;


How exactly have you tested this patch?

Most TCP payload is added in page fragments, and you have not
changed the page fragment allocations.

Also, I do not see how an application will get a future notification
that it can retry the failed system call.
How are you really going to deal with this in high-performance applications?

I would rather have a setsockopt() option that can eventually
flip __GFP_DIRECT_RECLAIM in sk->sk_allocation,
so that we do not add all these tests in the fast path, but honestly I do
not see how applications can really make use of this.
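
To make the notification question concrete, here is a minimal user-space
sketch of the usual MSG_DONTWAIT retry pattern. It is an illustration only
and not part of the patch or of this thread: the helper name
send_nowait_retry() and its timeout parameter are hypothetical. The sketch
assumes that poll() reports POLLOUT based on send buffer space and not on
page allocator state, so an EAGAIN caused by a failed memory allocation
could make poll() return immediately and turn this loop into a busy retry.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send up to len bytes without blocking inside send().
 * On EAGAIN/EWOULDBLOCK it waits in poll() for POLLOUT, for at most
 * timeout_ms per wait, and then retries.
 * Returns the number of bytes sent (short if a wait timed out),
 * or -1 on a hard error with errno set by the failing call.
 */
ssize_t send_nowait_retry(int fd, const void *buf, size_t len, int timeout_ms)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = send(fd, p + done, len - done,
                         MSG_DONTWAIT | MSG_NOSIGNAL);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted, just retry */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* hard error */

        /* The only standard notification available: writability. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int r = poll(&pfd, 1, timeout_ms);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            break;                      /* timed out, report a short count */
        /* Error conditions (POLLERR, POLLHUP) surface on the next send(). */
    }
    return (ssize_t)done;
}
```

A caller that gets a short count back has no way to tell "send buffer full"
from "allocation failed", which is the difficulty raised above.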