Message-ID: <CAGsizzKC_qnV_WdW+4vDLkE69615atHz3rm4Bj7o0wLJnd92rQ@mail.gmail.com>
Date:	Fri, 27 Jan 2012 23:26:02 +0100
From:	Štefan Gula <steweg@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	david.vrabel@...rix.com, netdev@...r.kernel.org,
	gregory.v.rose@...el.com
Subject: Re: Regression: "rtnetlink: Compute and store minimum ifinfo dump
 size" breaks glibc's getifaddrs()
2012/1/27 David Miller <davem@...emloft.net>:
> From: David Vrabel <david.vrabel@...rix.com>
> Date: Fri, 27 Jan 2012 12:36:47 +0000
>
>> Changeset c7ac8679bec9397afe8918f788cbcef88c38da54 (rtnetlink: Compute
>> and store minimum ifinfo dump size) applied to 3.1 increased the maximum
>> size of the RTM_GETLINK message response.
>>
>> glibc's getifaddrs() function uses a page sized (4 KiB) buffer for the
>> RTM_GETLINK response and returns a failure if the message is truncated.
>> This buffer is not large enough if there is a network card with many
>> virtual functions.
>>
>> What do you recommend to resolve this regression?
>
> Actually, glibc technically uses the CPP define value "PAGE_SIZE" if
> available, which is potentially different from the system page size.
>
> Using a statically defined PAGE_SIZE is wrong if the program is
> subsequently executed on a system with a different page size.
>
> On sparc, and powerpc I believe, this happens commonly.  A 32-bit
> executable will see a PAGE_SIZE value of 4K, but when executed on a
> 64-bit system the page size is actually 8K or larger.  That's why
> __getpagesize() should always be used.
>
> Anyways, if the page sizes are correct we're in a bit of a pickle.
>
> Do you have any idea what the computed value of min_ifinfo_dump_size
> is at the time of the failure?
>
> Greg, I think we're kind of screwed.  The defined minimum appropriate
> buffer size for a recvmsg() call on a netlink socket is defined as:
>
>        getpagesize() < 8192 ? getpagesize() : 8192
>
> and glibc essentially abides by this by unconditionally using page
> size, and therefore if an interface with many virtual interfaces takes
> us over this limit, we break basically every properly written piece of
> netlink code out there.
>
> iproute2 is funny, it uses a static 16K buffer size for netlink
> message reception, which tends to paper over this problem :-/
>
> Stefan, this plays into your work too, we're starting to create
> situations which are going to cause very serious problems.  Imagine
> an interface with lots of these new macvlan source rules, it would
> potentially exceed the limit in the above equation as well.
>
> Thanks.
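
For reference, the buffer sizing rule quoted above can be sketched in C roughly as follows. This is only an illustration: nl_min_recv_buf() is a made-up helper name, not a glibc or kernel function, and the page size is passed in as a parameter so the rule can be checked for different values.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of glibc or the kernel): returns the
 * minimum receive buffer size for a netlink recvmsg() call according to
 * the rule quoted above, i.e. the page size capped at 8192 bytes.
 */
static size_t nl_min_recv_buf(long page_size)
{
    /* The page size must be a positive value obtained at run time. */
    assert(page_size > 0);

    return page_size < 8192 ? (size_t)page_size : (size_t)8192;
}
```

A caller would pass in a value obtained at run time, for example from sysconf(_SC_PAGESIZE) or getpagesize(), rather than a compile-time PAGE_SIZE constant, which is the point made in the quoted mail. A reply larger than the buffer can be detected because recvmsg() sets MSG_TRUNC in msg_flags.
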
I can theoretically avoid this situation by using multiple small
rtnetlink messages instead of one huge one. I believe the ipset code
uses that approach; I just have to figure out the proper changes to my
macvlan code. But this doesn't solve the issue you are describing; it
only prevents it from happening for the macvlan code, since one page
will be more than enough for the buffer.
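
To make the idea of splitting concrete, here is a rough sketch of the arithmetic only, not the actual macvlan or ipset code. The helper names are invented for illustration, and it assumes each entry is sent as one netlink attribute with a 4-byte attribute header, padded to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the 4-byte alignment used for netlink attributes. */
#define ALIGN4(n) (((n) + 3u) & ~(size_t)3u)

/*
 * Hypothetical helper: how many fixed-size entries fit into one message
 * of at most buf_len bytes, after hdr_len bytes of message headers.
 * Each entry is assumed to be one attribute: a 4-byte attribute header
 * plus payload_len bytes of data, padded to 4 bytes.
 */
static size_t entries_per_msg(size_t buf_len, size_t hdr_len,
                              size_t payload_len)
{
    size_t attr_len = ALIGN4(4 + payload_len);

    /* At least one entry must fit next to the headers. */
    assert(buf_len >= hdr_len + attr_len);

    return (buf_len - hdr_len) / attr_len;
}

/* Hypothetical helper: number of messages needed to carry total entries. */
static size_t msgs_needed(size_t total, size_t per_msg)
{
    assert(per_msg > 0);

    return (total + per_msg - 1) / per_msg;
}
```

As an example under those assumptions, with a 4096-byte buffer, 32 bytes of headers and 6-byte payloads (the size of a MAC address), each entry takes 12 bytes, so 338 entries fit per message and 1000 entries would need 3 messages.
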