Message-ID: <CAL+tcoCZhakNunSGT4Y0RfaBi-UXbxDDcEU0n-OG9FXNb56Bcg@mail.gmail.com>
Date: Tue, 27 Aug 2024 23:27:05 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, dsahern@...nel.org, willemb@...gle.com, 
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next 1/2] tcp: make SOF_TIMESTAMPING_RX_SOFTWARE
 feature per socket

Hello Willem,

On Tue, Aug 27, 2024 at 9:20 PM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> Jason Xing wrote:
> > On Tue, Aug 27, 2024 at 2:43 AM Willem de Bruijn
> > <willemdebruijn.kernel@...il.com> wrote:
> > >
> > > Jason Xing wrote:
> > > > On Tue, Aug 27, 2024 at 12:03 AM Willem de Bruijn
> > > > <willemdebruijn.kernel@...il.com> wrote:
> > > > >
> > > > > Jason Xing wrote:
> > > > > > Hello Willem,
> > > > > >
> > > > > > On Mon, Aug 26, 2024 at 9:24 PM Willem de Bruijn
> > > > > > <willemdebruijn.kernel@...il.com> wrote:
> > > > > > >
> > > > > > > Jason Xing wrote:
> > > > > > > > From: Jason Xing <kernelxing@...cent.com>
> > > > > > > >
> > > > > > > > Normally, if we want to record and print the rx timestamp after
> > > > > > > > tcp_recvmsg_locked(), we must enable both SOF_TIMESTAMPING_SOFTWARE
> > > > > > > > and SOF_TIMESTAMPING_RX_SOFTWARE flags, from which we also can notice
> > > > > > > > through running rxtimestamp binary in selftests (see testcase 7).
> > > > > > > >
> > > > > > > > However, there is one particular case that fails the selftests with
> > > > > > > > "./rxtimestamp: Expected swtstamp to not be set." error printing in
> > > > > > > > testcase 6.
> > > > > > > >
> > > > > > > > How does it happen? When we keep running a thread starting a socket
> > > > > > > > and set SOF_TIMESTAMPING_RX_HARDWARE option first, then running

Sorry, I made a mistake here: this should read "and set
SOF_TIMESTAMPING_RX_SOFTWARE".

> > > > > > > > ./rxtimestamp, it will fail. The reason is the former thread
> > > > > > > > switching on netstamp_needed_key that makes the feature global,
> > > > > > > > every skb going through netif_receive_skb_list_internal() function
> > > > > > > > will get a current timestamp in net_timestamp_check(). So the skb
> > > > > > > > will have timestamp regardless of whether its socket option has
> > > > > > > > SOF_TIMESTAMPING_RX_SOFTWARE or not.
> > > > > > > >
> > > > > > > > After this patch, we can pass the selftest and control each socket
> > > > > > > > as we want when using rx timestamp feature.
> > > > > > > >
> > > > > > > > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > > > > > > > ---
> > > > > > > >  net/ipv4/tcp.c | 10 ++++++++--
> > > > > > > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > > > > > > index 8514257f4ecd..49e73d66c57d 100644
> > > > > > > > --- a/net/ipv4/tcp.c
> > > > > > > > +++ b/net/ipv4/tcp.c
> > > > > > > > @@ -2235,6 +2235,7 @@ void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk,
> > > > > > > >                       struct scm_timestamping_internal *tss)
> > > > > > > >  {
> > > > > > > >       int new_tstamp = sock_flag(sk, SOCK_TSTAMP_NEW);
> > > > > > > > +     u32 tsflags = READ_ONCE(sk->sk_tsflags);
> > > > > > > >       bool has_timestamping = false;
> > > > > > > >
> > > > > > > >       if (tss->ts[0].tv_sec || tss->ts[0].tv_nsec) {
> > > > > > > > @@ -2274,14 +2275,19 @@ void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk,
> > > > > > > >                       }
> > > > > > > >               }
> > > > > > > >
> > > > > > > > -             if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_SOFTWARE)
> > > > > > > > +             /* skb may contain timestamp because another socket
> > > > > > > > +              * turned on netstamp_needed_key which allows generate
> > > > > > > > +              * the timestamp. So we need to check the current socket.
> > > > > > > > +              */
> > > > > > > > +             if (tsflags & SOF_TIMESTAMPING_SOFTWARE &&
> > > > > > > > +                 tsflags & SOF_TIMESTAMPING_RX_SOFTWARE)
> > > > > > > >                       has_timestamping = true;
> > > > > > > >               else
> > > > > > > >                       tss->ts[0] = (struct timespec64) {0};
> > > > > > > >       }
[...]
> > >
> > > > Besides those two concepts you mentioned, could you explain if there
> > > > are side effects that the series has and what kind of bad consequences
> > > > that the series could bring?
> > >
> > > It doesn't do the same for hardware timestamping, creating
> > > inconsistency.
>
> Taking a closer look at the code, there are actually already two weird
> special cases here.
>
> SOF_TIMESTAMPING_RX_HARDWARE never has to be passed, as rx hardware
> timestamp generation is configured through SIOCSHWTSTAMP.

Do you mean patch [1/2]? More specifically, is this about the
incorrect commit message above, which I have just corrected?

The problem occurs when other, unrelated threads set
SOF_TIMESTAMPING_RX_SOFTWARE, not SOF_TIMESTAMPING_RX_HARDWARE.

Sorry for the confusion.

>
> SOF_TIMESTAMPING_RX_SOFTWARE already enables timestamp reporting from
> sock_recv_timestamp(), while reporting should not be conditional on
> this generation flag.

I'm not sure whether you're talking about patch [2/2] in the series,
but I guess so.

I can see what you mean here: you don't like combining the reporting
flag and the generation flag, right? But if we don't check that both
flags (SOF_TIMESTAMPING_RX_SOFTWARE __and__
SOF_TIMESTAMPING_SOFTWARE) are set in sock_recv_timestamp(), some
tests for other protocols such as UDP will fail, as we discussed before.

netstamp_needed_key cannot be made a per-socket feature: at the time
the driver passes the skb to the rx stack, we don't yet know which
socket the skb belongs to. Since we cannot prevent the timestamp from
being generated at that stage, I suppose we can defer the check to the
point where the timestamp is reported, that is, in
sock_recv_timestamp().
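
To make the idea concrete, here is a minimal userspace sketch of the
generation/reporting split. It is not kernel code: every model_* and
MODEL_* name is hypothetical, the bit values are copied from
include/uapi/linux/net_tstamp.h, and the real
netif_receive_skb_list_internal() and sock_recv_timestamp() paths do
much more than this.

```c
/* Userspace model, not kernel code. Every model_* / MODEL_* name is
 * hypothetical; the bit values mirror include/uapi/linux/net_tstamp.h.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SOF_TIMESTAMPING_RX_SOFTWARE (1u << 3)
#define MODEL_SOF_TIMESTAMPING_SOFTWARE    (1u << 4)

struct model_skb  { int64_t tstamp_ns; };  /* 0 means "no timestamp" */
struct model_sock { uint32_t tsflags; };

/* Stands in for netstamp_needed_key: one global switch for all sockets. */
static bool model_netstamp_needed;

/* Generation step: runs before the owning socket is known, so it can
 * only consult the global switch.
 */
static void model_net_timestamp_check(struct model_skb *skb, int64_t now_ns)
{
	assert(skb != NULL);
	if (model_netstamp_needed && skb->tstamp_ns == 0)
		skb->tstamp_ns = now_ns;
}

/* Reporting step: the socket is known here, so the per-socket flags can
 * decide whether an already generated timestamp is handed to userspace.
 */
static bool model_report_rx_sw_tstamp(const struct model_sock *sk,
				      const struct model_skb *skb)
{
	const uint32_t both = MODEL_SOF_TIMESTAMPING_RX_SOFTWARE |
			      MODEL_SOF_TIMESTAMPING_SOFTWARE;

	assert(sk != NULL && skb != NULL);
	return skb->tstamp_ns != 0 && (sk->tsflags & both) == both;
}
```

In this model, once any socket has flipped model_netstamp_needed, every
skb gets a timestamp, but a socket that set only
MODEL_SOF_TIMESTAMPING_SOFTWARE no longer has it reported.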

Or am I missing something? What would you suggest?

>
>         /*
>          * generate control messages if
>          * - receive time stamping in software requested
>          * - software time stamp available and wanted
>          * - hardware time stamps available and wanted
>          */
>         if (sock_flag(sk, SOCK_RCVTSTAMP) ||
>             (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE) ||
>             (kt && tsflags & SOF_TIMESTAMPING_SOFTWARE) ||
>             (hwtstamps->hwtstamp &&
>              (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE)))
>                 __sock_recv_timestamp(msg, sk, skb);
>
> I evidently already noticed this back in 2014, when I left a note in
> commit b9f40e21ef42 ("net-timestamp: move timestamp flags out of
> sk_flags"):
>
>     SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
>     timestamp logic (netstamp_needed). That can be simplified and this
>     last key removed, but will leave that for a separate patch.
>
> But I do not see __sock_recv_timestamp toggling the feature either
> then or now, so I think this is vestigial and can be removed.

I'm not so sure about the unix case. I can see this call trace:
unix_dgram_recvmsg()->__unix_dgram_recvmsg()->__sock_recv_timestamp().

I added the check in __sock_recv_timestamp() in patch [2/2] because of
the above call trace.

One thing I am sure of is that dropping the __sock_recv_timestamp()
change from that patch doesn't affect the selftests.

Please correct me if I'm wrong.

>
> > >
> > > Changing established interfaces always risks production issues. In
> > > this case, I'm not convinced that the benefit outweighs this risk.
> >
> > I got it.
> >
> > I don't think I'm the first or the last person to run into this
> > long-standing "issue". Could we at least document it somewhere, for
> > example in comments in the selftests or in Documentation, to avoid
> > similar confusion in the future? Or change the behaviour in the
> > rxtimestamp.c test? What do you think? Adding documentation or
> > comments is the simplest way :)
>
> I can see the value of your extra filter. Given the above examples, it
> won't be the first subtle variance from the API design, either.

Really appreciate that you understand me :)

>
> So either way is fine with me: change it or leave it.
>
> But in both ways, yes: please update the documentation accordingly.

Roger that, sir. I will do it.

>
> And if you do choose to change it, please be ready to revert on report
> of breakage. Applications that only pass SOF_TIMESTAMPING_SOFTWARE,
> because that always worked as they subtly relied on another daemon to
> enable SOF_TIMESTAMPING_RX_SOFTWARE, for instance.

Yes, I still choose to change it and try to move it in the right
direction. If there are any reports of breakage, please let me know; I
will certainly keep a close eye on this.

Thanks,
Jason
