linux-kernel - Re: [PATCH 1/7] cpumask: fix checking valid cpu range

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xhsmhwn9k3ibb.mognet@vschneid.remote.csb>
Date:   Fri, 30 Sep 2022 18:04:08 +0100
From:   Valentin Schneider <vschneid@...hat.com>
To:     Yury Norov <yury.norov@...il.com>
Cc:     linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        "David S . Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>
Subject: Re: [PATCH 1/7] cpumask: fix checking valid cpu range

On 28/09/22 07:49, Yury Norov wrote:
> On Wed, Sep 28, 2022 at 01:18:20PM +0100, Valentin Schneider wrote:
>> On 19/09/22 14:05, Yury Norov wrote:
>> > @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
>> >  static inline
>> >  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>> >  {
>> > -	/* -1 is a legal arg here. */
>> > -	if (n != -1)
>> > -		cpumask_check(n);
>> > +	/* n is a prior cpu */
>> > +	cpumask_check(n + 1);
>> >       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
>>
>> I'm confused, this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
>> warning. The documentation does states:
>>
>> * @n: the cpu prior to the place to search (ie. return will be > @n)
>>
>> So n is a valid CPU number (with -1 being the exception for scan
>> initialization), this shouldn't exclude nr_cpu_ids-1.
>
> For a regular cpumask function, like cpumask_any_but(), the valid range is
> [0, nr_cpu_ids).
>
> 'Special' functions shift by 1 when call underlying find API:
>
>   static inline
>   unsigned int cpumask_next(int n, const struct cpumask *srcp)
>   {
>           /* n is a prior cpu */
>           cpumask_check(n + 1);
>           return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
>   }
>
> So, for them the valid range [0, nr_cpu_ids) must be shifted in other
> direction: [-1, nr_cpu_ids-1).
>

The way I've been seeing this is that the [0, nr_cpu_ids) range is extended
to [-1, nr_cpu_ids) to accommodate for iteration starts.

>> IMO passing nr_cpu_ids-1 should be treated the same as passing the
>> last set bit in a bitmap: no warning, and returns the bitmap
>> size.
>
> This is how cpumask_check() works for normal functions. For
> cpumask_next() passing nr_cpu_ids-1 is the same as passing nr_cpu_ids
> for cpumask_any_but(), and it should trigger warning in both cases.
> (Or should not, but it's a different story.)
>
>> calling code which seems like unnecessary boiler plate
>>
>> For instance, I trigger the cpumask_check() warning there:
>>
>> 3d2dcab932d0:block/blk-mq.c @l2047
>>         if (--hctx->next_cpu_batch <= 0) {
>> select_cpu:
>>                 next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
>>                                 cpu_online_mask);
>>                 if (next_cpu >= nr_cpu_ids)
>>                         next_cpu = blk_mq_first_mapped_cpu(hctx);
>>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>>         }
>>
>> next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
>> we do want it to reach nr_cpu_ids-1.
>
> next_cpu is a valid CPU number for all, but not for cpumask_next().
> The warning is valid. If we are at the very last cpu, what for we look
> for next?
>

Consider:

  nr_cpu_ids=4

  A)
  cpumask: 0.1.1.0
  CPU      0 1 2 3
  n            ^
  result: nr_cpu_ids

  B)
  cpumask: 0.0.1.1
  CPU      0 1 2 3
  n              ^
  result: nr_cpu_ids + WARN

Both scenarios are identical from a user perspective: a valid CPU number
was passed in (either from smp_processor_id() or from a previous call to
cpumask_next*()), but there are no more bits set in the cpumask. There's no
more CPUs to search for in both scenarios, but only one produces as WARN.

> The snippet above should be fixed like this:
>
>           if (--hctx->next_cpu_batch <= 0) {
>   select_cpu:
>                   if (next_cpu == nr_cpu_ids - 1)
>                           next_cpu = nr_cpu_ids;
>                   else
>                           next_cpu = cpumask_next_and(next_cpu,
>                                                       hctx->cpumask,
>                                                       cpu_online_mask);
>                   if (next_cpu >= nr_cpu_ids)
>                           next_cpu = blk_mq_first_mapped_cpu(hctx);
>                   hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>           }
>
> The original motivation for this special shifted semantics was to
> avoid passing '+1' in cpumask_next() everywhere where it's used to
> iterate over cpumask. This is especially ugly because it brings negative
> semantics in such a simple thing like an index, and makes people confused.
> It was a bad decision, but now it's so broadly used that we have to live
> with it.
>
> The strategy to mitigate this is to minimize using of that 'special'
> functions. They all are cpumask_next()-like. In this series I reworked
> for_each_cpu() to not use cpumask_next().
>
> Often, cpumask_next() is a part of opencoded for_each_cpu(), and this
> is relatively easy to fix. In case of blk_mq_hctx_next_cpu() that you
> mentioned above, cpumask_next_and() usage looks unavoidable, and
> there's nothing to do with that, except that being careful.
>
> It didn't trigger the warning in my test setup, so I didn't fix it.
> Feel free to submit a patch, if you observe the warning for yourself.
>
> Maybe we should consider nr_cpu_ids as a special valid index for
> cpumask_check(), a sign of the end of an array. This would help to
> silence many warnings, like this one. For now I'm leaning towards that
> it's more a hack than a meaningful change.
>

I agree, we definitely want to warn for e.g.

  cpumask_set_cpu(nr_cpu_ids, ...);

Could we instead make cpumask_next*() immediately return nr_cpu_ids when
passed n=nr_cpu_ids-1?

Also, what about cpumask_next_wrap()? That uses cpumask_next() under the
hood and is bound to warn when wrapping after n=nr_cpu_ids-1, I think.


> Thanks,
> Yury