Message-ID: <CALx6S37uZah89sNgH9wuD1J+_WEhd34Z5zmrnG8Qp-AQ7Ew=Jg@mail.gmail.com>
Date:	Thu, 24 Mar 2016 16:54:03 -0700
From:	Tom Herbert <tom@...bertland.com>
To:	Yann Ylavic <ylavic.dev@...il.com>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>,
	Willy Tarreau <w@....eu>,
	Tolga Ceylan <tolga.ceylan@...il.com>,
	Craig Gallek <cgallek@...gle.com>,
	Josh Snyder <josh@...e406.com>,
	Aaron Conole <aconole@...heb.org>,
	"David S. Miller" <davem@...emloft.net>,
	Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as
 drain mode

On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic <ylavic.dev@...il.com> wrote:
> On Thu, Mar 24, 2016 at 11:49 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
>> On Thu, 2016-03-24 at 23:40 +0100, Yann Ylavic wrote:
>>
>>> FWIW, I find:
>>>
>>>     const struct bpf_insn prog[] = {
>>>         /* BPF_MOV64_REG(BPF_REG_6, BPF_REG_1) */
>>>         { BPF_ALU64 | BPF_MOV | BPF_X, BPF_REG_6, BPF_REG_1, 0, 0 },
>>>         /* BPF_LD_ABS(BPF_W, 0) R0 = (uint32_t)skb[0] */
>>>         { BPF_LD | BPF_ABS | BPF_W, 0, 0, 0, 0 },
>>>         /* BPF_ALU64_IMM(BPF_MOD, BPF_REG_0, mod) */
>>>         { BPF_ALU64 | BPF_MOD | BPF_K, BPF_REG_0, 0, 0, mod },
>>>         /* BPF_EXIT_INSN() */
>>>         { BPF_JMP | BPF_EXIT, 0, 0, 0, 0 }
>>>     };
>>> (and all the way to make it run)
>>>
>>> something quite unintuitive from a web server developer's perspective,
>>> simply to make SO_REUSEPORT work with forked TCP listeners (probably
>>> as it should out of the box)...
>>
>>
>> That is why eBPF has an LLVM backend.
>>
>> Basically you can write your "BPF" program in C, and let LLVM convert it
>> into eBPF.
>
> I'll learn how to do this to get the best performance from the
> server, but having to do so to work around what looks like a defect
> (for simple/default SMP configurations at least, no NUMA or clever
> CPU-affinity or queuing policy involved) seems odd in the first place.
>
I disagree with your assessment that there is a defect. SO_REUSEPORT
is designed to spread packets amongst _equivalent_ connections. In the
server draining case sockets are no longer equivalent, but that is a
special case.

> From this POV, draining the (ending) listeners is already non-obvious
> but might be reasonable; (e)BPF sounds really overkill.
>
Just the opposite, it's a simplification. With BPF we no longer need to add
interfaces for all these special cases. This is an important point,
because the question is going to be raised for any proposed interface
change that could be accomplished with BPF (i.e. adding new interfaces
in the kernel becomes the overkill).

Please try to work with it. As I mentioned, what we may be
missing is some real-world programs that we can direct people to use,
but aside from that I don't think we've seen any arguments that BPF is
overkill or too hard to use for stuff like this.

Tom

> But there are surely plenty of good reasons for it, and I won't be
> able to dispute your technical arguments in any case ;)
>
>>
>> Sure, you can still write BPF manually, as you could write an HTTPS server
>> in assembly.
>
> OK, I'll take your previous proposal :)
