netdev - Re: [PATCH net-next 02/13] selftests: ncdevmem: Remove validation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZuRzGmedL2LfTKXw@mini-arch>
Date: Fri, 13 Sep 2024 10:15:06 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com
Subject: Re: [PATCH net-next 02/13] selftests: ncdevmem: Remove validation

On 09/13, Mina Almasry wrote:
> On Thu, Sep 12, 2024 at 2:57 PM Stanislav Fomichev <stfomichev@...il.com> wrote:
> >
> > On 09/12, Mina Almasry wrote:
> > > On Thu, Sep 12, 2024 at 10:12 AM Stanislav Fomichev <sdf@...ichev.me> wrote:
> > > >
> > > > ncdevmem should (see next patches) print the payload on the stdout.
> > > > The validation can and should be done by the callers:
> > > >
> > > > $ ncdevmem -l ... > file
> > > > $ sha256sum file
> > > >
> > > > Cc: Mina Almasry <almasrymina@...gle.com>
> > > > Signed-off-by: Stanislav Fomichev <sdf@...ichev.me>
> > > > ---
> > > >  tools/testing/selftests/net/ncdevmem.c | 56 +++-----------------------
> > > >  1 file changed, 6 insertions(+), 50 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/net/ncdevmem.c b/tools/testing/selftests/net/ncdevmem.c
> > > > index 352dba211fb0..3712296d997b 100644
> > > > --- a/tools/testing/selftests/net/ncdevmem.c
> > > > +++ b/tools/testing/selftests/net/ncdevmem.c
> > > > @@ -64,24 +64,13 @@
> > > >  static char *server_ip = "192.168.1.4";
> > > >  static char *client_ip = "192.168.1.2";
> > > >  static char *port = "5201";
> > > > -static size_t do_validation;
> > > >  static int start_queue = 8;
> > > >  static int num_queues = 8;
> > > >  static char *ifname = "eth1";
> > > >  static unsigned int ifindex;
> > > >  static unsigned int dmabuf_id;
> > > >
> > > > -void print_bytes(void *ptr, size_t size)
> > > > -{
> > > > -       unsigned char *p = ptr;
> > > > -       int i;
> > > > -
> > > > -       for (i = 0; i < size; i++)
> > > > -               printf("%02hhX ", p[i]);
> > > > -       printf("\n");
> > > > -}
> > > > -
> > > > -void print_nonzero_bytes(void *ptr, size_t size)
> > > > +static void print_nonzero_bytes(void *ptr, size_t size)
> > > >  {
> > > >         unsigned char *p = ptr;
> > > >         unsigned int i;
> > > > @@ -91,30 +80,6 @@ void print_nonzero_bytes(void *ptr, size_t size)
> > > >         printf("\n");
> > > >  }
> > > >
> > > > -void validate_buffer(void *line, size_t size)
> > > > -{
> > > > -       static unsigned char seed = 1;
> > > > -       unsigned char *ptr = line;
> > > > -       int errors = 0;
> > > > -       size_t i;
> > > > -
> > > > -       for (i = 0; i < size; i++) {
> > > > -               if (ptr[i] != seed) {
> > > > -                       fprintf(stderr,
> > > > -                               "Failed validation: expected=%u, actual=%u, index=%lu\n",
> > > > -                               seed, ptr[i], i);
> > > > -                       errors++;
> > >
> > > FWIW the index at where the validation started to fail often gives
> > > critical clues about where the bug is, along with this line, which I'm
> > > glad is not removed:
> > >
> > > printf("received frag_page=%llu, in_page_offset=%llu,
> > > frag_offset=%llu, frag_size=%u, token=%u, total_received=%lu,
> > > dmabuf_id=%u\n",
> > >
> > > I think we can ensure that what is doing the validation above ncdevmem
> > > prints enough context about the error. Although, just to understand
> > > your thinking a bit, why not have this binary do the validation
> > > itself?
> >
> > Right, the debugging stuff will still be there to track the offending
> > part. And the caller can print out the idx of data where the validation
> > failed.
> >
> 
> Sorry to harp on this, but on second thought I don't think just
> printing out the idx is good enough. In many cases all the context
> printed by ncdevmem validation (page_frag/offset/dmabuf_id/etc) is
> useful, and it's useful to have it inline with where the check failed.
> 
> IIUC after your changes the frag_page/offset/dmabuf_id will go to
> stderr output of ncdevmem, but the validation check fail will go to a
> different log by the parent checker. Matching the failure in the 2
> logs in megs of frag output will be annoying.

Yes, it will definitely be more annoying to piece those two things
together. But I don't expect us to debug the payload validation issues
from the nipa dashboard logs. Even if you get a clear message of
"byte at position X is not expected" plus all the chunk info logs,
what do you really do with this info (especially if it's flaky)?

For development we can have some script to put those two things together
for debugging.

> > The reason I removed it from the tool is to be able to do the validation
> > with regular nc on the other side. I just do:
> >
> 
> To be honest I don't think the ncdevmem validation gets in the way of that.
> 
> To test rx, we can set up regular nc on the client side, and have
> ncdevmem do validation on rx.
> To test tx, we can have ncdevmem do tx (no validation), and have a
> script on top of nc do validation on the rx side.
> 
> I guess you may find the lack of symmetry annoying, but IMO it's less
> annoying than losing some debuggability when the test fails.
> 
> I think probably there are 3 hopefully agreeable things we can do here:
> 
> 1. Keep the validation as is and do the rx/tx test as I described above.
> 2. keep the ncdevmem validation, dump it to stderr, and ignore it in
> the test. Leave it be for folks wanting to run the test manually to
> debug it.
> 3. Move the validation to a parent process of ncdevmem, but that
> parent process needs to print full context of the failure in 1 log.
> 
> I prefer #1 TBH.

TBH, I find the existing ncdevmem validation scheme a bit lacking. It is
basically the same payload over and over (or maybe I misread how it
works). Maybe we should implement a PRNG-like sequence for validation
if you prefer to keep it internal?

How about we start with what I have (simple 'hello\nworld') and once
you do tx path, we can add the internal validation back? Both sides,
with a proper selftest this time. For the time being, we can use the existing
rx selftest as a smoke test.

Or I can just drop this patch from the series and let you follow up
on the validation in the selftests (i.e., convert from 'hello\nworld'
to whatever you prefer)