linux-kernel - Re: [PATCH 2/3] irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87a6iyju92.mognet@arm.com>
Date:   Sun, 24 Oct 2021 16:51:53 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Marc Zyngier <maz@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        Will Deacon <will@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Ard Biesheuvel <ardb@...nel.org>
Subject: Re: [PATCH 2/3] irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve

On 23/10/21 10:48, Marc Zyngier wrote:
> On Fri, 22 Oct 2021 11:33:06 +0100,
> Valentin Schneider <valentin.schneider@....com> wrote:
>> @@ -5202,6 +5205,39 @@ int its_cpu_init(void)
>>      return 0;
>>  }
>>
>> +#ifdef CONFIG_EFI
>
> Why do we need this? I can't see anything that'd be problematic even
> if EFI was disabled.
>

You're right, that's not required.

>> +static int its_cpu_memreserve_lpi(unsigned int cpu)
>> +{
>> +	struct page *pend_page = gic_data_rdist()->pend_page;
>> +	phys_addr_t paddr;
>> +
>> +	/*
>> +	 * If the pending table was pre-programmed, free the memory we
>> +	 * preemptively allocated.
>> +	 */
>> +	if (pend_page &&
>> +	    (gic_data_rdist()->flags & RDIST_FLAGS_PENDTABLE_PREALLOCATED)) {
>> +		its_free_pending_table(gic_data_rdist()->pend_page);
>> +		gic_data_rdist()->pend_page = NULL;
>> +	}
>
> So you set it to NULL and carry on, ending up reserving a 64kB block
> at address 0 if the RESERVED flag isn't set. Can this happen at all?
> If, as I suspect, it cannot happen because the two flags are always
> set at the same time, why do we need two flags?
>

PREALLOCATED implies RESERVED, but the reverse isn't true.

> My gut feeling is that if pend_page is non-NULL and that the RESERVED
> flag is set, you should free the memory and leave the building.
> Otherwise, reserve the memory and set the flag. PREALLOCATED doesn't
> seem to make much sense on a per-CPU basis here.
>

One thing I was concerned about is that this cpuhp callback can be invoked
more than once on the same CPU, even with the removal in patch 3.
Consider a system with maxcpus=X on the cmdline; not all secondaries will
be brought up in smp_init(). You then get to userspace which can issue all
sorts of hotplug sequences. Let me try to paint a picture:

  maxcpus=2

  CPU0                      CPU1                      CPU2

  its_init()                                          <nothing ever happens here>
  [...]
  its_cpu_memreserve_lpi()
    flags |= RESERVED
  [...]
  smp_init()
                            its_cpu_memreserve_lpi()
                              flags |= RESERVED

                         [.....]

  cpu_down(CPU1, CPUHP_OFFLINE)
  cpu_up(CPU1, CPUHP_ONLINE)

                            its_cpu_memreserve_lpi()
                              // pend_page != NULL && (flags & RESERVED) != 0
                              // we musn't free the memory

Now, my approach clearly isn't great (I also went through the "wait those
two flags are the same thing" phase, which in hindsight wasn't a good sign).
What we could do instead is only have a PREALLOCATED flag (or RESERVED; in
any case just one rather than two) set in its_cpu_init_lpis(), and ensure
each CPU only ever executes the body of the callback exactly once.

  if (already_booted())
      return 0;

  if (PREALLOCATED)
      its_free_pending_table();
  else
      gic_reserve_range();

  out:
    // callback removal faff here
    return 0;

Unfortunately, the boot CPU will already be present in
cpus_booted_once_mask when this is first invoked for the BP, so AFAICT we'd
need some new tracking utility (either a new RDIST_LOCAL flag or a separate
cpumask).

WDYT?