Date:      Thu, 10 Mar 2022 13:41:28 +0200
From:      Ze Dupsys <zedupsys@gmail.com>
To:        Roger Pau Monné <roger.pau@citrix.com>
Cc:        freebsd-xen@freebsd.org, buhrow@nfbcal.org
Subject:   Re: ZFS + FreeBSD XEN dom0 panic
Message-ID:  <5dfdecd5-f94d-29b4-791e-0adde5405cf5@gmail.com>
In-Reply-To: <YihojHNbzJagm4SI@Air-de-Roger>
References:  <CAOEWpzc2WVViMJHrrtuU-G_7yck4eehm6b=JQPSZU1MH-bzmiw@mail.gmail.com> <202203011540.221FeR4f028103@nfbcal.org> <CAOEWpzdC41ithfd7R_qa66%2Bsh_UXeku7OcVC_b%2BXUaLr_9SSTA@mail.gmail.com> <Yh93uLIBqk5NC2xf@Air-de-Roger> <CAOEWpzfsajhbvXfAw5-F1p83jjmSggobANBEyeYFAfiumAWRCA@mail.gmail.com> <YiCa70%2BHQScsoaKX@Air-de-Roger> <3d4691a7-c4b3-1c91-9eaa-7af071561bb6@gmail.com> <YihojHNbzJagm4SI@Air-de-Roger>

On 2022.03.09. 10:42, Roger Pau Monné wrote:
> On Sun, Mar 06, 2022 at 02:41:17PM +0200, Ze Dupsys wrote:
>> Then i caught another sysctl variable that is growing due to XEN,
>> "kern.msgbuf: Contents of kernel message buffer". I do not know how this
>> variable grows or by which component it is managed, but in VM start/stop
>> case it grows and contains lines with pattern like so:
>> ..
>> xnb(xnb_rxpkt2rsp:2059): Got error -1 for hypervisor gnttab_copy status
>> xnb(xnb_ring2pkt:1526): Unknown extra info type 255.  Discarding packet
>> xnb(xnb_dump_txreq:299): netif_tx_request index =0
>> xnb(xnb_dump_txreq:300): netif_tx_request.gref  =0
>> xnb(xnb_dump_txreq:301): netif_tx_request.offset=0
>> xnb(xnb_dump_txreq:302): netif_tx_request.flags =8
>> xnb(xnb_dump_txreq:303): netif_tx_request.id    =69
>> xnb(xnb_dump_txreq:304): netif_tx_request.size  =1000
>> xnb(xnb_dump_txreq:299): netif_tx_request index =1
>> xnb(xnb_dump_txreq:300): netif_tx_request.gref  =255
>> xnb(xnb_dump_txreq:301): netif_tx_request.offset=0
>> xnb(xnb_dump_txreq:302): netif_tx_request.flags =0
>> xnb(xnb_dump_txreq:303): netif_tx_request.id    =0
>> xnb(xnb_dump_txreq:304): netif_tx_request.size  =0
>> ..
>>
>> Those lines in that variable just keep growing and growing, it is not that
>> they are flushed, trimmed or anything. Each time i get the same message on
>> serial output, it has one more section of error appended to "same-previous"
>> serial output message and sysctl variable as well. Thus at some point serial
>> output and sysctl contains a large block of those errors while VM is
>> starting. So at some point the value of this sysctl could be reaching max
>> allowed/available and this makes the system panic.  I do not know the reason
>> for those errors, but actually if there was a patch to suppress them, this
>> could be "solved". Another diff chunk might be related to this:
>> +dev.xnb.1.xenstore_peer_path: /local/domain/7/device/vif/0
>> +dev.xnb.1.xenbus_peer_domid: 7
>> +dev.xnb.1.xenbus_connection_state: InitWait
>> +dev.xnb.1.xenbus_dev_type: vif
>> +dev.xnb.1.xenstore_path: backend/vif/7/0
>> +dev.xnb.1.dump_rings:
>> +dev.xnb.1.unit_test_results: xnb_rxpkt2rsp_empty:1765 Assertion Error:
>> nr_reqs == 0
>> +xnb_rxpkt2rsp_empty:1767 Assertion Error: memcmp(&rxb_backup,
>> &xnb_unit_pvt.rxb, sizeof(rxb_backup)) == 0
>> +xnb_rxpkt2rsp_empty:1769 Assertion Error: memcmp(&rxs_backup,
>> xnb_unit_pvt.rxs, sizeof(rxs_backup)) == 0
>> +52 Tests Passed
>> +1 Tests FAILED
> So you have failed tests for netback. Maybe the issue is with
> netback rather than blkback.

I just found that there are most probably no errors with the network.
Based on https://reviews.freebsd.org/D9234?id=24172#201024, since I was
monitoring sysctl variables with 'sysctl -a', that most probably ran
"sysctl dev.xnb.0.unit_test_results" implicitly for each xnb interface.
That explains the extra output, since reading that value actually runs
the tests. I have now changed my monitoring command to 'sysctl -a -N',
which lists only the variable names. I never expected that reading a
sysctl variable could trigger test suite execution.
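
For monitoring that still needs values, here is a minimal sketch that lists
names with 'sysctl -a -N' and then reads every value except the nodes that
run tests on read. The helper names (safe_names, list_names, read_values)
and the skip list are assumptions made for illustration. Only the
'sysctl -a -N' command and the unit_test_results node come from this thread.

```python
import subprocess

# Assumption: nodes with this suffix run the xnb unit tests when read,
# so they are listed by name but never read.
SKIP_SUFFIXES = (".unit_test_results",)


def safe_names(names, skip_suffixes=SKIP_SUFFIXES):
    """Return the sysctl names considered safe to read, in input order."""
    return [n for n in names if n and not n.endswith(skip_suffixes)]


def list_names():
    """List all sysctl names without reading any value (sysctl -a -N)."""
    result = subprocess.run(
        ["sysctl", "-a", "-N"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def read_values(names):
    """Read each name with 'sysctl -n'; names that fail to read are skipped."""
    values = {}
    for name in names:
        result = subprocess.run(
            ["sysctl", "-n", name], capture_output=True, text=True
        )
        if result.returncode == 0:
            values[name] = result.stdout.rstrip("\n")
    return values
```

On the dom0, read_values(safe_names(list_names())) would give a snapshot
that can be diffed between VM start/stop cycles without triggering the
netback tests. Reading one name per call is slow but keeps one failing
node from hiding the rest.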

I did not understand from the comments, though: how do you instantiate
an xnb interface?

"ifconfig xnb create" returned:
ifconfig: SIOCIFCREATE2: Invalid argument

Thanks.






Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5dfdecd5-f94d-29b4-791e-0adde5405cf5>