lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <pn2lkox4qj7ylx44jnqjf4kfpenaexbnw33xsebekeg4yz6yw4@y464o5qyj4f2>
Date:   Wed, 25 Oct 2023 14:07:18 -0400
From:   Daniel Jordan <daniel.m.jordan@...cle.com>
To:     Wang Jinchao <wangjinchao@...sion.com>
Cc:     Steffen Klassert <steffen.klassert@...unet.com>,
        linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
        stone.xulei@...sion.com
Subject: Re: [RFC/REFACT] Refactoring and significantly reducing code
 complexity

Hello,

On Thu, Sep 28, 2023 at 04:53:38PM +0800, Wang Jinchao wrote:
> This is a refactored version with the following main changes:

The RFC overall is a nice simplification, and the basic approach of using an
ordered workqueue seems to work.

> - The parallel workqueue no longer uses the WQ_UNBOUND attribute

What's the justification here?  If it improves performance, please show
numbers.  Earlier tests[0] showed a large improvement when adding this
flag.

[0] https://lore.kernel.org/linux-crypto/20190906014029.3345-1-daniel.m.jordan@oracle.com/

> - Removal of CPU-related logic, sysfs-related interfaces

I agree with Steffen that we should continue to honor the cpumasks that the
user sets.

The simplest way I see to make the parallel mask work with your refactor is to
just make the parallel workqueue unbound again, since setting workqueue
attributes is only allowed for unbound, and bring back some of the plumbing
that leads to the apply_workqueue_attrs call.

The serial mask is trickier.  Changing attributes of an ordered workqueue (the
cpumask in this case) makes the kernel throw a warning...

static int apply_workqueue_attrs_locked
        ...
        /* creating multiple pwqs breaks ordering guarantee */                   
        if (!list_empty(&wq->pwqs)) {                                            
                if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))                  
                        return -EINVAL;                                          
                                                                                 
                wq->flags &= ~__WQ_ORDERED;                                      
        }

...but I'm not sure this is a fundamental limitation.  The changelog of
0a94efb5acbb ("workqueue: implicit ordered attribute should be overridable")
says changes to "max_active and some attribute changes" are rejected, but it
might be possible to relax the warning to allow setting a cpumask while still
rejecting other changes.

> Testing was conducted using ltp's pcrypt_aead01, and the execution time
> comparison between the old and new versions is as follows:
> 
> Old Version:
> real 0m27.451s
> user 0m0.031s
> sys 0m0.260s
> 
> New Version:
> real 0m21.351s
> user 0m0.023s
> sys 0m0.260s

Great speedup.  A test that runs many requests for a long time in parallel is
also good to run, such as [0].

> @@ -986,57 +281,27 @@ struct padata_instance *padata_alloc(const char *name)
...
> +	pinst->serial_wq = alloc_ordered_workqueue ("%s_serial",
> +									WQ_MEM_RECLAIM | WQ_FREEZABLE,
> +									name);

Why add these two WQ_ flags?  Also, whitespace is kinda funky.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ