Message-ID: <20250221202332.GA6576@pendragon.ideasonboard.com>
Date: Fri, 21 Feb 2025 22:23:32 +0200
From: Laurent Pinchart <laurent.pinchart@...asonboard.com>
To: Jan Engelhardt <ej@...i.de>
Cc: David Laight <david.laight.linux@...il.com>,
"H. Peter Anvin" <hpa@...or.com>,
Greg KH <gregkh@...uxfoundation.org>,
Boqun Feng <boqun.feng@...il.com>,
Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
Christoph Hellwig <hch@...radead.org>,
rust-for-linux <rust-for-linux@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
David Airlie <airlied@...il.com>, linux-kernel@...r.kernel.org,
ksummit@...ts.linux.dev
Subject: Re: C aggregate passing (Rust kernel policy)
On Fri, Feb 21, 2025 at 09:06:14PM +0100, Jan Engelhardt wrote:
> On Friday 2025-02-21 19:34, David Laight wrote:
> >>
> >> Returning aggregates in C++ is often implemented with a secret extra
> >> pointer argument passed to the function. The C backend does not
> >> perform that kind of transformation automatically. I surmise ABI reasons.
> >
> > Have you really looked at the generated code?
> > For anything non-trivial it gets truly horrid.
> >
> > To pass a class by value the compiler has to call the C++ copy-operator to
> > generate a deep copy prior to the call, and then call the destructor after
> > the function returns - compare against passing a pointer to an existing
> > item (and not letting it be written to).
>
> And that is why people generally don't pass aggregates by value,
> irrespective of the programming language.

It's actually sometimes more efficient to pass aggregates by value.
Consider std::string, for instance:

	#include <string>
	#include <utility>

	std::string global;

	/* Takes ownership of s; the buffer is moved, not copied. */
	void setSomething(std::string s)
	{
		global = std::move(s);
	}

	void foo(int x)
	{
		std::string s = std::to_string(x);
		setSomething(std::move(s));
	}

Here, passing by value is the most efficient option. The backing storage
for the string is allocated once in foo() and then moved all the way into
global, never copied. If you instead did

	std::string global;

	void setSomething(const std::string &s)
	{
		global = s;
	}

	void foo(int x)
	{
		std::string s = std::to_string(x);
		setSomething(s);
	}

then the data would have to be copied when assigned to global.

The std::string object itself still needs to be moved in the first case,
of course, but that doesn't require a heap allocation. The best solution
depends on the type of aggregate you need to pass. This is one of the
reasons string handling is messy in C++: because of the need to
interoperate with zero-terminated strings, the optimal API convention
depends on the expected usage pattern in both callers and callees.
std::string_view is no silver bullet :-(
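
To make the difference between the two conventions observable, here is a
minimal sketch that checks whether global ends up owning the very buffer
the caller allocated. The names setByValue(), setByConstRef() and the two
bufferReused*() helpers are made up for illustration. The pointer
comparison relies on how mainstream standard libraries implement move
construction and move assignment for strings too long for the small-string
optimisation; the standard does not spell out pointer stability, so treat
it as an assumption.

```cpp
#include <string>
#include <utility>

std::string global;

// By-value sink: the caller decides whether to copy or move into s.
void setByValue(std::string s)
{
	global = std::move(s);
}

// Const-reference variant: the assignment has to copy the characters.
void setByConstRef(const std::string &s)
{
	global = s;
}

// Hypothetical helper: true if global took over the caller's buffer.
bool bufferReusedByValue()
{
	std::string s(64, 'x');		// long enough to defeat SSO
	const char *before = s.data();
	setByValue(std::move(s));
	return global.data() == before;
}

// Hypothetical helper: s stays alive and keeps its buffer, so global
// necessarily holds a separate copy of the characters.
bool bufferReusedByConstRef()
{
	std::string s(64, 'x');
	const char *before = s.data();
	setByConstRef(s);
	return global.data() == before;
}
```

With libstdc++ and libc++ I would expect the first helper to return true
and the second to return false. A caller that wants to keep its string can
still call setByValue(s) without std::move() and pay for exactly one copy.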
> > Returning a class member is probably worse and leads to nasty bugs.
> > In general the called code will have to do a deep copy from the item
> > being returned
>
> People have thought of that already and you can just
> `return std::move(a.b);`.

Doesn't that prevent NRVO (named return value optimization) in C++?
Starting in C++17, compilers are required to perform copy elision when a
function returns a prvalue; NRVO itself remains optional.

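
A rough way to see the different cases is to count constructor calls. The
sketch below uses a made-up Tracked type with global counters (g_copies,
g_moves); none of these names come from the thread. Only the prvalue case
is guaranteed by C++17. The named-local case depends on the compiler
choosing to apply NRVO.

```cpp
#include <utility>

// Hypothetical instrumentation: global counters for copies and moves.
int g_copies = 0;
int g_moves = 0;

struct Tracked {
	Tracked() = default;
	Tracked(const Tracked &) { ++g_copies; }
	Tracked(Tracked &&) noexcept { ++g_moves; }
};

struct Holder {
	Tracked b;
};

void resetCounts()
{
	g_copies = 0;
	g_moves = 0;
}

// Returns a prvalue: elision is mandatory since C++17.
Tracked makePrvalue() { return Tracked(); }

// Returns a named local: NRVO is allowed but optional, and the fallback
// is a move, not a copy.
Tracked makeNamed() { Tracked t; return t; }

// std::move() on a local rules out NRVO: always one move.
Tracked makeMoved() { Tracked t; return std::move(t); }

// A member is never an NRVO candidate: a plain return copies it,
// std::move() turns that copy into a move.
Tracked copyMember(Holder &h) { return h.b; }
Tracked moveMember(Holder &h) { return std::move(h.b); }
```

So std::move() in a return statement is a pessimisation for a named local,
but for a member such as a.b it is what avoids the copy, since NRVO would
not have applied there anyway.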
> > Then you get code like:
> > const char *foo = data.func().c_str();
> > very easily written looks fine, but foo points to garbage.
>
> Because foo is non-owning, and the only owner has gone out of scope.
> You have to be wary of that.
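
For reference, a small sketch of two ways to stay out of trouble with
c_str() on a temporary. Data, func() and both helper functions are
hypothetical stand-ins for the code being discussed, not anything from a
real API.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the object whose member function returns a
// std::string by value.
struct Data {
	std::string func() const
	{
		return "a string long enough to live on the heap";
	}
};

// Dangling (do not do this): the temporary returned by func() dies at the
// end of the full expression, so p would point to freed storage.
//	const char *p = data.func().c_str();

// Option 1: give the string a named owner that outlives the pointer.
std::size_t lengthWithOwner(const Data &data)
{
	const std::string owner = data.func();
	const char *p = owner.c_str();	// valid as long as owner lives
	return std::strlen(p);
}

// Option 2: consume the pointer inside the same full expression, while
// the temporary is still alive.
std::size_t lengthSameExpression(const Data &data)
{
	return std::strlen(data.func().c_str());
}
```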
>
> > You can return a reference - that doesn't go out of scope.
>
> That depends on the referred-to item.
> string &f() { string z; return z; }
> is going to explode (despite returning a reference).
>
> > (Apart from the fact that c++ makes it hard to ensure all the non-class
> > members are initialised.)
>
> struct stat x{};
> struct stat x = {};
>
> all of x's members (which are scalar and thus non-class) are
> initialized. The second line even works in C.
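
As a quick illustration of the two spellings, here is a sketch using a
made-up Sample struct in place of struct stat, to avoid depending on
<sys/stat.h>. Both forms value-initialize the aggregate in C++, which
zeroes every scalar member.

```cpp
// Hypothetical plain struct standing in for struct stat.
struct Sample {
	int count;
	double ratio;
	char tag[4];
	const void *ptr;
};

// C++11 brace form.
Sample braceInit()
{
	Sample s{};
	return s;
}

// The "= {}" form.
Sample equalsBraceInit()
{
	Sample s = {};
	return s;
}
```

Without the braces, a local "Sample s;" would leave all four members with
indeterminate values.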
--
Regards,
Laurent Pinchart