Message-ID: <CAM_iQpXBJ2eAz4TW_GmOWRLrkyJcO=u-8VSnmv91ZABG-21Agg@mail.gmail.com>
Date: Tue, 16 Apr 2019 16:47:55 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Borislav Petkov <bp@...en8.de>,
LKML <linux-kernel@...r.kernel.org>, linux-edac@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()
On Tue, Apr 16, 2019 at 4:28 PM Luck, Tony <tony.luck@...el.com> wrote:
>
> On Tue, Apr 16, 2019 at 04:18:57PM -0700, Cong Wang wrote:
> > > The problem case occurs when we've seen enough distinct
> > > errors that we have filled every entry, then we try to
> > > look up a pfn that is larger that any seen before.
> > >
> > > The loop:
> > >
> > > while (min < max) {
> > > ...
> > > }
> > >
> > > will terminate with "min" set to MAX_ELEMS. Then we
> > > execute:
> > >
> > > this_pfn = PFN(ca->array[min]);
> > >
> > > which references beyond the end of the space allocated
> > > for ca->array.
> >
> > Exactly.
>
> Hmmm. But can we ever really have this happen? The call
> sequence to get here looks like:
>
>
> mutex_lock(&ce_mutex);
>
> if (ca->n == MAX_ELEMS)
> WARN_ON(!del_lru_elem_unlocked(ca));
>
> ret = find_elem(ca, pfn, &to);
>
> I.e. if the array was all the way full, we delete one element
> before calling find_elem(). So when we get here:
>
> static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
> {
> u64 this_pfn;
> int min = 0, max = ca->n;
>
> The biggest value "max" can have is MAX_ELEMS-1
This is exactly the explanation for why the crash is inside
memmove() rather than inside find_elem(). del_elem() actually
accesses off-by-two once we pass its 'if' check in line 232:
229 static void del_elem(struct ce_array *ca, int idx)
230 {
231 /* Save us a function call when deleting the last element. */
232 if (ca->n - (idx + 1))
233 memmove((void *)&ca->array[idx],
234 (void *)&ca->array[idx + 1],
235 (ca->n - (idx + 1)) * sizeof(u64));
236
237 ca->n--;
238 }
When idx is ca->n and ca->n is MAX_ELEMS-1, ca->n - (idx + 1) is
nonzero, so the if statement above is true. The source of the memmove()
is then &ca->array[idx + 1], and idx+1 is MAX_ELEMS, which is just beyond
the valid range.
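To make the arithmetic concrete, here is a minimal userspace sketch that only computes the values del_elem() would hand to memmove(), without touching any memory. It is not kernel code: MAX_ELEMS = 512, the unsigned type of n, and the helper name del_elem_args_for() are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 512 slots, i.e. one 4 KiB page of u64 entries. */
#define MAX_ELEMS 512u

/* Hypothetical helper type: what del_elem() would compute for (n, idx). */
struct del_elem_args {
	unsigned int cond;	/* value tested by the 'if' at line 232 */
	size_t src_idx;		/* index of the memmove() source element */
	size_t len;		/* byte count passed to memmove() */
};

/*
 * Mirror the expressions from del_elem() without moving any memory.
 * Assumption: n is unsigned, so n - (idx + 1) wraps instead of
 * going negative.
 */
static struct del_elem_args del_elem_args_for(unsigned int n, int idx)
{
	struct del_elem_args a;

	a.cond = n - (idx + 1);
	a.src_idx = (size_t)idx + 1;
	a.len = (size_t)a.cond * sizeof(uint64_t);
	return a;
}
```

Calling del_elem_args_for(MAX_ELEMS - 1, MAX_ELEMS - 1) gives a nonzero cond, a src_idx equal to MAX_ELEMS, and a len far larger than the whole array, whereas deleting the real last element (idx = n - 1) gives cond == 0 and skips the memmove().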
Thanks.