Message-ID: <m2d0fkt5pj.fsf@badgerous.net>
Date:   Sat, 28 Sep 2019 15:29:44 -0700
From:   Alun Evans <alun@...gerous.net>
To:     ebiederm@...ssion.com (Eric W. Biederman)
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 05/27] containers: Open a socket inside a container



On Fri 27 Sep '19 at 07:46 ebiederm@...ssion.com (Eric W. Biederman) wrote:
> 
> Alun Evans <alun@...gerous.net> writes:
>
>> Hi Eric,
>>
>>
>> On Tue, 19 Feb 2019, Eric W. Biederman <ebiederm@...ssion.com> wrote:
>>>
>>> David Howells <dhowells@...hat.com> writes:
>>>
>>> > Provide a system call to open a socket inside of a container, using that
>>> > container's network namespace.  This allows netlink to be used to manage
>>> > the container.
>>> >
>>> > 	fd = container_socket(int container_fd,
>>> > 			      int domain, int type, int protocol);
>>> >
>>>
>>> Nacked-by: "Eric W. Biederman" <ebiederm@...ssion.com>
>>>
>>> Use a namespace file descriptor if you need this.  So far we have not
>>> added this system call as it is just a performance optimization.  And it
>>> has been too niche to matter.
>>>
>>> If that has changed we can add this separately from everything else
>>> you are doing here.
>>
>> I think I've found the niche.
>>
>>
>> I'm trying to use network namespaces from Go.
>
> Yes. Go sucks for this.

Haha... Neither confirm nor deny.

>> Since setns is thread-specific, I'm forced to use this pattern:
>>
>>     runtime.LockOSThread()
>>     defer runtime.UnlockOSThread()
>>     …
>>     err = netns.Set(newns)
>>
>>
>> This is only safe recently:
>> https://github.com/vishvananda/netns/issues/17#issuecomment-367325770
>>
>> - but is still less than ideal performance-wise, as it locks out other
>>   socket operations.
>>
>> The socketat() / socketns() would be ideal:
>>
>>   https://lwn.net/Articles/406684/
>>   https://lwn.net/Articles/407495/
>>   https://lkml.org/lkml/2011/10/3/220
>>
>>
>> One thing that is interesting, the LockOSThread works pretty well for
>> receiving, since I can wrap it around the socket()/bind()/listen() at
>> startup. Then accept() can run outside of the lock.
>>
>> It's creating new outbound tcp connections via socket()/connect() pairs
>> that is the issue.
>
> As I understand it you should be able to write socketat in go something like:
>
>         runtime.LockOSThread()
>         err = netns.Set(newns);
>         fd = socket(...);
>         err = netns.Set(defaultns);
>         runtime.UnlockOSThread()

Yeah, this is currently what I'm having to do. It's painful because,
given the Go runtime model of a single OS netpoller thread, locking the
OS thread to the current goroutine blocks out the other goroutines doing
network I/O.

> I have no real objections to a kernel system call doing that.  It has
> just never risen to the level where it was necessary to optimize
> userspace yet.

Would you be able to accept the patch from this thread with the
container API?

    fd = container_socket(int container_fd,
                          int domain, int type, int protocol);

I think that seems more coherent with the rest of the container world
than a follow up of https://lkml.org/lkml/2011/10/3/220 :

    int socketns(int namespace, int domain, int type, int protocol)


I could also put a patch up if required.


A.


-- 
Alun Evans.
