linux-kernel - Re: [EXT] Re: [PATCH 0/5] net: atlantic: more fuzzing fixes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANEJEGvsfnry=tFOyx+cTRHJyTo2-TNOe1u4AWV+J=amrFyZpw@mail.gmail.com>
Date:   Tue, 3 May 2022 11:07:08 -0700
From:   Grant Grundler <grundler@...omium.org>
To:     Dmitrii Bezrukov <dbezrukov@...vell.com>
Cc:     Grant Grundler <grundler@...omium.org>,
        Igor Russkikh <irusskikh@...vell.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        netdev <netdev@...r.kernel.org>,
        "David S . Miller" <davem@...emloft.net>,
        LKML <linux-kernel@...r.kernel.org>,
        Aashay Shringarpure <aashay@...gle.com>,
        Yi Chou <yich@...gle.com>,
        Shervin Oloumi <enlightened@...gle.com>
Subject: Re: [EXT] Re: [PATCH 0/5] net: atlantic: more fuzzing fixes

Hi Dmitrii!

On Tue, May 3, 2022 at 4:15 AM Dmitrii Bezrukov <dbezrukov@...vell.com> wrote:
>
> Hi Grants,
>
> >[1/5] net: atlantic: limit buff_ring index value
>
> >diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >index d875ce3ec759..e72b9d86f6ad 100644
> >--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >@@ -981,7 +981,9 @@  int hw_atl_b0_hw_ring_rx_receive(struct aq_hw_s
> >*self, struct aq_ring_s *ring)
> >
> >                       if (buff->is_lro) {
> >                               /* LRO */
> >-                              buff->next = rxd_wb->next_desc_ptr;
> >+                              buff->next =
> >+                                      (rxd_wb->next_desc_ptr < ring->size) ?
> >+                                      rxd_wb->next_desc_ptr : 0U;
> >                               ++ring->stats.rx.lro_packets;
> >                       } else {
> >                               /* jumbo */
>
> I don’t find this way correct. At least in this functions there is no access to buffers by this index "next".

Did I misunderstand what this code (line 378) is doing in aq_ring.c?
342 #define AQ_SKB_ALIGN SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
343 int aq_ring_rx_clean(struct aq_ring_s *self,
344                      struct napi_struct *napi,
345                      int *work_done,
346                      int budget)
347 {
...
371                                 if (buff_->next >= self->size) {
372                                         err = -EIO;
373                                         goto err_exit;
374                                 }
375
376                                 frag_cnt++;
377                                 next_ = buff_->next,
378                                 buff_ = &self->buff_ring[next_];

My change is redundant with Lines 371-374 - they essentially do the
same thing and were added on
2021-12-26 by 5f50153288452 (Zekun Shen)

The original fuzzing work was done on chromeos-v5.4 kernel and didn't
include this change. I'll backport 5f50153288452t to chromeos-v5.4 and
drop my proposed change.

> Following this fix, if it happens then during assembling of LRO session it could be that this buffer (you suggesting to use buffer with index 0) becomes a part of current LRO session and will be also processed as a single packet or as a part of other LRO huge packet.
> To be honest there are lot of possibilities depends on current values of head and tail which may cause either memory leak or double free or something else.

*nod*

> There is a code which calls this function aq_ring.c: aq_ring_rx_clean.

Exactly.

> From my POV it's better to check that indexes don't point to outside of ring in code which collects LRO session.

Sounds good to me. I don't have a preference. I'm ok with
dropping/ignoring [1/5] patch.

> There is expectation that "next" is in range "cleaned" descriptors, which means that HW put data there wrote descriptor and buffers are ready to be process by assembling code.
> So in case if "next" points to something outside of ring then guess it would be better just to stop collecting these buffers to one huge packet and treat this LRO session as completed.
> Then rest of buffers (which should be it that chain) will be collected again without beginning and just dropped by stack later.

That makes sense to me. And apologies for not noticing Zekun Shen's
2021-12-26 change earlier. I've been working on this off and on for
several months.

> > [4/5] net: atlantic: add check for MAX_SKB_FRAGS
> >
> >diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> >b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> >index bc1952131799..8201ce7adb77 100644
> >--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> >+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> >@@ -363,6 +363,7 @@  int aq_ring_rx_clean(struct aq_ring_s *self,
> >                       continue;
> >
> >               if (!buff->is_eop) {
> >+                      unsigned int frag_cnt = 0U;
> >                       buff_ = buff;
> >                       do {
> >                               bool is_rsc_completed = true;
> >@@ -371,6 +372,8 @@  int aq_ring_rx_clean(struct aq_ring_s *self,
> >                                       err = -EIO;
> >                                       goto err_exit;
> >                               }
> >+
> >+                              frag_cnt++;
> >                               next_ = buff_->next,
> >                               buff_ = &self->buff_ring[next_];
> >                               is_rsc_completed =
> >@@ -378,7 +381,8 @@  int aq_ring_rx_clean(struct aq_ring_s *self,
> >                                                           next_,
> >                                                           self->hw_head);
> >
> >-                              if (unlikely(!is_rsc_completed)) {
> >+                              if (unlikely(!is_rsc_completed) ||
> >+                                  frag_cnt > MAX_SKB_FRAGS) {
> >                                       err = 0;
> >                                       goto err_exit;
> >                               }
>
> Number of fragments are limited by HW configuration: hw_atl_b0_internal.h: #define HW_ATL_B0_LRO_RXD_MAX 16U.

Should this loop be checking against HW_ATL_B0_LRO_RXD_MAX instead?

> Let's imagine if it happens: driver stucks at this point it will wait for session completion and session will not be completed due to too much fragments.
> Guess in case if number of buffers exceeds this limit then it is required to close session and continue (submit packet to stack and finalize clearing if processed descriptors/buffers).

Sorry, I'm not understanding your conclusion. Is the "goto err_exit"
in this case doing what you described?
Does this patch have the right idea (even if it should test against a
different constant)?

My main concern is the CPU gets stuck in this loop for a very long
(infinite?) time.

>
> > [5/5] net: atlantic: verify hw_head_ is reasonable diff --git
> >a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >index e72b9d86f6ad..9b6b93bb3e86 100644
> >--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
> >@@ -889,6 +889,27 @@  int hw_atl_b0_hw_ring_tx_head_update(struct aq_hw_s *self,
> >               err = -ENXIO;
> >               goto err_exit;
> >       }
> >+
> >+      /* Validate that the new hw_head_ is reasonable. */
> >+      if (hw_head_ >= ring->size) {
> >+              err = -ENXIO;
> >+              goto err_exit;
> >+      }
> >+
> >+      if (ring->sw_head >= ring->sw_tail) {
> >+              /* Head index hasn't wrapped around to below tail index. */
> >+              if (hw_head_ < ring->sw_head && hw_head_ >= ring->sw_tail) {
> >+                      err = -ENXIO;
> >+                      goto err_exit;
> >+              }
> >+      } else {
> >+              /* Head index has wrapped around and is below tail index. */
> >+              if (hw_head_ < ring->sw_head || hw_head_ >= ring->sw_tail) {
> >+                      err = -ENXIO;
> >+                      goto err_exit;
> >+              }
> >+      }
> >+
> >       ring->hw_head = hw_head_;
> >       err = aq_hw_err_from_flags(self);
>
> Simple example. One packet with one buffer was sent. Sw_tail = 1, sw_head=0. From interrupt this function is called and hw_head_ is 1.
> Code will follow "else" branch of second "if" that you add and condition "if (hw_head_ < ring->sw_head || hw_head_ >= ring->sw_tail) {" will be true which seems to be not expected.

Correct - I've got this wrong (head/tail swapped). Even if I had it
right, As Igor observed, debatable if it's necessary. Please
drop/ignore this patch as well. Aashay and I need to discuss this
more.

thank you again!

cheers,
grant


>
> Best regards,
> Dmitrii Bezrukov
>
> -----Original Message-----
> From: Grant Grundler <grundler@...omium.org>
> Sent: Tuesday, April 26, 2022 7:21 PM
> To: Igor Russkikh <irusskikh@...vell.com>
> Cc: Grant Grundler <grundler@...omium.org>; Dmitrii Bezrukov <dbezrukov@...vell.com>; Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>; netdev <netdev@...r.kernel.org>; David S . Miller <davem@...emloft.net>; LKML <linux-kernel@...r.kernel.org>; Aashay Shringarpure <aashay@...gle.com>; Yi Chou <yich@...gle.com>; Shervin Oloumi <enlightened@...gle.com>
> Subject: Re: [EXT] Re: [PATCH 0/5] net: atlantic: more fuzzing fixes
>
> [reply-all again since I forgot to tell gmail to post this as "plain text"...grrh... so much for AI figuring this stuff out.]
>
>
> On Tue, Apr 26, 2022 at 9:00 AM Igor Russkikh <irusskikh@...vell.com> wrote:
> >
> > Hi Grant,
> >
> > Sorry for the delay, I was on vacation.
> > Thanks for working on this.
>
> Hi Igor!
> Very welcome! And yes, I was starting to wonder... but I'm now glad that you didn't review them before you got back. These patches are no reason to ruin a perfectly good vacation. :)
>
> > I'm adding here Dmitrii, to help me review the patches.
> > Dmitrii, here is a full series:
> >
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.
> > org_project_netdevbpf_cover_20220418231746.2464800-2D1-2Dgrundler-40ch
> > romium.org_&d=DwIFaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=AliKLBUTg9lFc5sIMTzJt8
> > MdPiAgKbsC8IpLIHmdX9w&m=1LeNSCJMgZ7UkGBm56FcvL_Oza8VOX45LEtQf31qPE2KLQ
> > cr5Q36aMIUR2DzLhi7&s=fVFxLPRO8K2DYFpGUOggf38nbDFaHKg8aATsjB1TuB0&e=
> >
> > Grant, I've reviewed and also quite OK with patches 1-4.
>
> Excellent! \o/
>
>
> > For patch 5 - why do you think we need these extra comparisons with software head/tail?
>
> The ChromeOS security team (CC'd) believes the driver needs to verify "expected behavior". In other words, the driver expects the device to provide new values of tail index which are between [tail,head) ("available to fill").
>
> Your question makes me chuckle because I asked exactly the same question. :D Everyone agrees it is a minimum requirement to verify the index was "in bounds". And I agree it's prudent to verify the device is "well behaved" where we can. I haven't looked at the code enough to know what could go wrong if, for example, the tail index is decremented instead of incremented or a "next fragment" index falls in the "available to fill" range.
>
> However, I didn't run the fuzzer and, for now, I'm taking the ChromeOS security team's word that this check is needed. If you (or Dmitrii) feel strongly the driver can handle malicious or firmware bugs in other ways, I'm not offended if you decline this patch. However, I would be curious what those other mechanisms are.
>
> > From what I see in logic, only the size limiting check is enough there..
> >
> > Other extra checks are tricky and non intuitive..
>
> Yes, somewhat tricky in the code but conceptually simple: For the RX buffer ring, IIUC, [head,tail) is "CPU to process" and [tail, head) is "available to fill". New tail values should always be in the latter range.
>
> The trickiness comes in because this is a ring buffer and [tail, head) it is equally likely that head =< tail  or head > tail numerically.
>
> If you like, feel free to add comments explaining the ring behavior or ask me to add such a comment (and repost #5). I'm a big fan of documenting non-intuitive things in the code. That way the next person to look at the code can verify the code and the IO device do what the comment claims.
>
> On the RX buffer ring, I'm also wondering if there is a race condition such that the driver uses stale values of the tail pointer when walking the RX fragment lists and validating index values. Aashay assures me this race condition is not possible and I am convinced this is true for the TX buffer ring where the driver is the "producer"
> (tells the device what is in the TX ring). I still have to review the RX buffer handling code more and will continue the conversation with him until we agree.
>
> cheers,
> grant
>
> >
> > Regards,
> >   Igor
> >
> > On 4/21/2022 9:53 PM, Grant Grundler wrote:
> > > External Email
> > >
> > > --------------------------------------------------------------------
> > > --
> > > Igor,
> > > Will you have a chance to comment on this in the near future?
> > > Should someone else review/integrate these patches?
> > >
> > > I'm asking since I've seen no comments in the past three days.
> > >
> > > cheers,
> > > grant
> > >
> > >
> > > On Mon, Apr 18, 2022 at 4:17 PM Grant Grundler
> > > <grundler@...omium.org>
> > > wrote:
> > >>
> > >> The Chrome OS fuzzing team posted a "Fuzzing" report for atlantic
> > >> driver in Q4 2021 using Chrome OS v5.4 kernel and "Cable Matters
> > >> Thunderbolt 3 to 10 Gb Ethernet" (b0 version):
> > >>
> > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.google.co
> > >> m_document_d_e_2PACX-2D1vT4oCGNhhy-5FAuUqpu6NGnW0N9HF-5Fjxf2kS7raOp
> > >> OlNRqJNiTHAtjiHRthXYSeXIRTgfeVvsEt0qK9qK_pub&d=DwIBaQ&c=nKjWec2b6R0
> > >> mOyPaz7xtfQ&r=3kUjVPjrPMvlbd3rzgP63W0eewvCq4D-kzQRqaXHOqU&m=QoxR8Wo
> > >> QQ-hpWu_tThQydP3-6zkRWACvRmj_7aY1qo2FG6DdPdI86vAYrfKQFMHX&s=620jqeS
> > >> vQrGg6aotI35cWwQpjaL94s7TFeFh2cYSyvA&e=
> > >>
> > >> It essentially describes four problems:
> > >> 1) validate rxd_wb->next_desc_ptr before populating buff->next
> > >> 2) "frag[0] not initialized" case in aq_ring_rx_clean()
> > >> 3) limit iterations handling fragments in aq_ring_rx_clean()
> > >> 4) validate hw_head_ in hw_atl_b0_hw_ring_tx_head_update()
> > >>
> > >> I've added one "clean up" contribution:
> > >>     "net: atlantic: reduce scope of is_rsc_complete"
> > >>
> > >> I tested the "original" patches using chromeos-v5.4 kernel branch:
> > >>
> > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__chromium-2Drev
> > >> iew.googlesource.com_q_hashtag-3Apcinet-2Datlantic-2D2022q1-2B-28st
> > >> atus-3Aopen-2520OR-2520status-3Amerged-29&d=DwIBaQ&c=nKjWec2b6R0mOy
> > >> Paz7xtfQ&r=3kUjVPjrPMvlbd3rzgP63W0eewvCq4D-kzQRqaXHOqU&m=QoxR8WoQQ-
> > >> hpWu_tThQydP3-6zkRWACvRmj_7aY1qo2FG6DdPdI86vAYrfKQFMHX&s=1a1YwJqrY-
> > >> be2oDgGAG5oOyZDnqIok_2p5G-N8djo2I&e=
> > >>
> > >> The fuzzing team will retest using the chromeos-v5.4 patches and
> > >> the b0 HW.
> > >>
> > >> I've forward ported those patches to 5.18-rc2 and compiled them but
> > >> am currently unable to test them on 5.18-rc2 kernel (logistics problems).
> > >>
> > >> I'm confident in all but the last patch:
> > >>    "net: atlantic: verify hw_head_ is reasonable"
> > >>
> > >> Please verify I'm not confusing how ring->sw_head and ring->sw_tail
> > >> are used in hw_atl_b0_hw_ring_tx_head_update().
> > >>
> > >> Credit largely goes to Chrome OS Fuzzing team members:
> > >>     Aashay Shringarpure, Yi Chou, Shervin Oloumi
> > >>
> > >> cheers,
> > >> grant