[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YyjYVUetY0OL/9sX@maniforge.dhcp.thefacebook.com>
Date: Mon, 19 Sep 2022 16:00:05 -0500
From: David Vernet <void@...ifault.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
martin.lau@...ux.dev, bpf@...r.kernel.org, song@...nel.org,
yhs@...com, john.fastabend@...il.com, kpsingh@...nel.org,
sdf@...gle.com, haoluo@...gle.com, jolsa@...nel.org, tj@...nel.org,
linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH v5 3/4] bpf: Add libbpf logic for user-space ring buffer
On Mon, Sep 19, 2022 at 03:22:09PM -0500, David Vernet wrote:
[...]
> > > + timeout_ns = timeout_ms * ns_per_ms;
> > > + do {
> > > + __u64 ns_remaining = timeout_ns - ns_elapsed;
> > > + int cnt, ms_remaining;
> > > + void *sample;
> > > + struct timespec curr;
> > > +
> > > + sample = user_ring_buffer__reserve(rb, size);
> > > + if (sample)
> > > + return sample;
> > > + else if (errno != ENOSPC)
> > > + return NULL;
> > > +
> > > + ms_remaining = timeout_ms == -1 ? -1 : ns_remaining / ns_per_ms;
> >
> > ok, so you've special-cased timeout_ms == -1 but still didn't do
> > max(0, ns_remaining). Can you prove that ns_elapsed will never be
> > bigger than timeout_ns due to various delays in waking up this thread?
> > If not, let's please have max(0) otherwise we can accidentally
> > epoll_wait(-1).
>
> Yes you're right, this was an oversight. Thanks for catching this!
Wait, actually, this can't happen because of the while check at the end of
the loop:
while (ns_elapsed <= timeout_ns)
So I don't think the max is necessary to prevent underflowing, but I do
think we need to have one more attempt to invoke
user_ring_buffer__reserve() at the end of the function to account for
wakeup delays after epoll_wait() returns. Otherwise, we might think we've
timed out when there's actually a sample left in the buffer. I also still
think your suggestion below for cleaning up makes sense, so I'll still add
it in v6, but I think I can leave off the max() call.
>
> > > + /* The kernel guarantees at least one event notification
> > > + * delivery whenever at least one sample is drained from the
> > > + * ring buffer in an invocation to bpf_ringbuf_drain(). Other
> > > + * additional events may be delivered at any time, but only one
> > > + * event is guaranteed per bpf_ringbuf_drain() invocation,
> > > + * provided that a sample is drained, and the BPF program did
> > > + * not pass BPF_RB_NO_WAKEUP to bpf_ringbuf_drain().
> > > + */
> > > + cnt = epoll_wait(rb->epoll_fd, &rb->event, 1, ms_remaining);
> > > + if (cnt < 0)
> > > + return NULL;
> > > +
> > > + if (timeout_ms == -1)
> > > + continue;
> > > +
> > > + err = clock_gettime(CLOCK_MONOTONIC, &curr);
> > > + if (err)
> > > + return NULL;
> > > +
> > > + ns_elapsed = ns_elapsed_timespec(&start, &curr);
> >
> > nit: if you move re-calculation of ms_remaining and ns_remaining to
> > here, I think overall loop logic will be even more straightforwad. You
> > can initialize ms_remaining to -1 if timeout_ms < 0 and never
> > recalculate it, right? Note that you can also do ns_elapsed conversion
> > to ms right here and then keep everything else in ms (so no need for
> > timeout_ns, ns_remaining, etc).
>
> Sounds good, let me give this a shot in v6.
>
> Thanks for another detailed review!
Powered by blists - more mailing lists