Message-ID: <m18vi3w7zd.fsf@fess.ebiederm.org>
Date: Tue, 10 Apr 2012 15:13:42 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Helge Deller <deller@....de>
Cc: Cong Wang <amwang@...hat.com>,
Octavian Purdila <octavian.purdila@...el.com>,
netdev@...r.kernel.org, David Miller <davem@...emloft.net>,
Andrew Morton <akpm@...ux-foundation.org>,
Frank Danapfel <fdanapfe@...hat.com>,
Laszlo Ersek <lersek@...hat.com>, shemminger@...tta.com
Subject: Re: [RFC] API to modify /proc/sys/net/ipv4/ip_local_reserved_ports

Helge Deller <deller@....de> writes:

> On 04/09/2012 10:43 AM, Cong Wang wrote:
>> On Wed, 2012-04-04 at 22:24 +0200, Helge Deller wrote:
>>> I would like to follow up on my last patch series to be able to modify
>>> the contents of the /proc/sys/net/ipv4/ip_local_reserved_ports port list
>>> from userspace.
>>>
>>> My last patch (https://lkml.org/lkml/2012/3/10/187) was based on
>>> modifications to the proc interface, which - based on the feedback here
>>> on the list - seemed to not be the right way to go (although I personally
>>> still like the idea very much :-)).
>>>
>>> Anyway, with this RFC I would like to get feedback about a new proposed
>>> API and attached kernel patch.
>>>
>>> The idea is to introduce a new <optname> value for get/setsockopt()
>>> named SO_RESERVED_PORTS to get/set the ip_local_reserved_ports
>>> bitmap via standard get/setsockopt() syscalls.
>>> As far as I understand this seems to be similar to how iptables works.
>>>
>>> An untested kernel patch for review and feedback is attached below.
>>>
>>> In userspace it then would be possible to write a new tool or to extend
>>> for example the "ip" tool to accept commands like:
>>> $> ip reserved_ports add 100-2000
>>> $> ip reserved_ports remove 50-60
>>> $> ip reserved_ports list (to show current reserved port list)
>>>
>>> This userspace tool could then read the port bitmap from kernel via
>>> a) socket(PF_INET, SOCK_RAW, IPPROTO_RAW)
>>> b) getsockopt(3, SOL_SOCKET, SO_RESERVED_PORTS, <bitmaplist>)
>>> and write back the results after modification via
>>> c) setsockopt(3, SOL_SOCKET, SO_RESERVED_PORTS, <bitmaplist>)
>>>
>>> Would that be an acceptable solution?
>> Hmm, it is indeed that bitmap fits for syscall rather than /proc file.
>>
>> But it seems that using getsockopt()/setsockopt() makes it like it is a
>> per-socket setting, actually it is a system-wide setting.
> Yes, that's the reason why I used SOL_SOCKET which configures at least
> a few system-wide settings too.
>
>> So I am
>> wondering if exporting a binary /proc file for this is a better
>> solution.
> Yeah - that's another solution, but (65536 ports)/(8 bits per byte) = 8 KByte,
> so we may again hit the 4k limit of /proc (unless you do binary reads, which
> should be done with a binary /proc-entry anyway).
>
> Again, I'm open to develop any kind of solution which would get an OK
> here.
Just looking at proc_do_large_bitmap, it does appear that there is a
very local 4k limit on writes.

Can you please just modify proc_do_large_bitmap so that there is not a
4k limit on writes?  Ideally the code would just read another 4k from
userspace when it is getting close to the end of its 4k buffer, or
perhaps we just read everything directly from userspace and run slowly.

The bitmap is installed atomically at the end, so any weird partial
states should not be a problem.

Eric
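
To make the suggestion above concrete, here is a minimal userspace sketch
of the idea, not the kernel's proc_do_large_bitmap.  It assumes the
comma-separated range syntax ("100-2000,3000") and keeps the parser state
between chunks, so a number or range may straddle a chunk boundary.  The
result is built in a scratch bitmap and only swapped in at the end.  All
names here (range_parser, parser_feed, parser_commit, parse_in_chunks, and
the reserved_ports pointer) are hypothetical stand-ins, and the pointer
swap merely models the "installed atomically at the end" step.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define NPORTS        65536
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS  (NPORTS / BITS_PER_LONG)   /* 8 KByte in total */

/* Hypothetical stand-in for the live, system-wide bitmap. */
static unsigned long *reserved_ports;

struct range_parser {
    unsigned long *bitmap;  /* scratch copy, invisible until committed */
    unsigned long lo, cur;  /* start of range, number being accumulated */
    int have_digit;         /* at least one digit seen for cur */
    int in_range;           /* '-' seen, so cur is the upper bound */
    int error;
};

static void set_port(unsigned long *bm, unsigned long p)
{
    bm[p / BITS_PER_LONG] |= 1UL << (p % BITS_PER_LONG);
}

static int test_port(const unsigned long *bm, unsigned long p)
{
    return (int)((bm[p / BITS_PER_LONG] >> (p % BITS_PER_LONG)) & 1UL);
}

static int parser_init(struct range_parser *rp)
{
    memset(rp, 0, sizeof(*rp));
    rp->bitmap = calloc(BITMAP_LONGS, sizeof(unsigned long));
    return rp->bitmap ? 0 : -1;
}

/* Finish the current token; called on ',', newline, or end of input. */
static void flush_token(struct range_parser *rp)
{
    if (!rp->have_digit) {
        if (rp->in_range)
            rp->error = 1;  /* "5-" without an upper bound */
    } else {
        unsigned long lo = rp->in_range ? rp->lo : rp->cur;
        if (lo > rp->cur)
            rp->error = 1;  /* reversed range such as "60-50" */
        else
            for (unsigned long p = lo; p <= rp->cur; p++)
                set_port(rp->bitmap, p);
    }
    rp->have_digit = rp->in_range = 0;
    rp->cur = 0;
}

/* Consume one chunk; a token may continue in the next chunk. */
static void parser_feed(struct range_parser *rp, const char *chunk, size_t len)
{
    assert(chunk != NULL || len == 0);
    for (size_t i = 0; i < len && !rp->error; i++) {
        char c = chunk[i];
        if (c >= '0' && c <= '9') {
            rp->cur = rp->cur * 10 + (unsigned long)(c - '0');
            rp->have_digit = 1;
            if (rp->cur >= NPORTS)
                rp->error = 1;
        } else if (c == '-' && rp->have_digit && !rp->in_range) {
            rp->lo = rp->cur;
            rp->cur = 0;
            rp->have_digit = 0;
            rp->in_range = 1;
        } else if (c == ',' || c == '\n') {
            flush_token(rp);
        } else {
            rp->error = 1;
        }
    }
}

/* Install the scratch bitmap only if the whole input parsed cleanly. */
static int parser_commit(struct range_parser *rp)
{
    flush_token(rp);
    if (rp->error) {
        free(rp->bitmap);
        rp->bitmap = NULL;
        return -1;          /* live bitmap is left untouched */
    }
    unsigned long *old = reserved_ports;
    reserved_ports = rp->bitmap;
    rp->bitmap = NULL;
    free(old);
    return 0;
}

/* Feed text in pieces of chunk_size bytes, as a bounded buffer would. */
static int parse_in_chunks(const char *text, size_t chunk_size)
{
    struct range_parser rp;
    size_t len = strlen(text);

    if (chunk_size == 0 || parser_init(&rp) != 0)
        return -1;
    for (size_t off = 0; off < len; off += chunk_size) {
        size_t n = len - off < chunk_size ? len - off : chunk_size;
        parser_feed(&rp, text + off, n);
    }
    return parser_commit(&rp);
}
```

Calling parse_in_chunks("100-2000,3000\n", 3) feeds the parser three bytes
at a time, which splits "2000" across two chunks.  Because a failed parse
never touches reserved_ports, a malformed write cannot leave a partial
state behind, whatever the chunk size.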