From owner-freebsd-net@freebsd.org Tue Nov 21 13:26:29 2017
From: Vincenzo Maffione
Date: Tue, 21 Nov 2017 14:26:27 +0100
Subject: Re: netmap/vale periodic deadlock
To: Harry Schmalzbauer
Cc: "freebsd-net@freebsd.org", Giuseppe Lettieri
In-Reply-To: <5A13F8A8.2020209@omnilan.de>
References: <5A0F14CD.3040407@omnilan.de> <5A13F8A8.2020209@omnilan.de>

It may be that yours is not a deadlock but some kind of crash. Enabling
debugging features would probably help (e.g. to get a stack trace).
Maybe your lockup/crash happened because you did some reconfiguration (ring
size, number of rings, etc.) while netmap was active, and in doing so you
triggered some hidden bug.
You probably need to enable all these options:
https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-options.html

2017-11-21 10:58 GMT+01:00 Harry Schmalzbauer:

> Regarding Vincenzo Maffione's message of 21.11.2017 09:39 (localtime):
> > Hi,
> > It's hard to say, especially because it happens after two days of
> > normal use.
> > Can't you enable deadlock debugging features in your kernel?
> > https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> >
> > However, if I understand correctly you have created some VLAN interfaces
> > vlan0, vlan1, vlan2, ... on top of a NIC (say em0). And you have
> > attached each VLAN interface to a vale switch:
> >
> > # vale-ctl -a vale0:vlan0
> > # vale-ctl -a vale1:vlan1
> > # vale-ctl -a vale2:vlan2
> >
> > and each VALE switch is attached to a different set of bhyve guests.
>
> Hello Vincenzo,
>
> thank you very much for your help again!
>
> Your assumption is correct, here's my vale-ctl:
> 603.811416 bdg_ctl [148] bridge:0 port:0 vale1:nic1_dmz
> 603.811428 bdg_ctl [148] bridge:0 port:1 vale1:styx0
> 603.811430 bdg_ctl [148] bridge:0 port:2 vale1:korso
> 603.811432 bdg_ctl [148] bridge:0 port:3 vale1:kallisto
>
> 603.811434 bdg_ctl [148] bridge:1 port:0 vale2:nic1_inop
> 603.811435 bdg_ctl [148] bridge:1 port:1 vale2:styx0
>
> 603.811437 bdg_ctl [148] bridge:2 port:0 vale3:nic1_vnl
> 603.811439 bdg_ctl [148] bridge:2 port:1 vale3:styx0
>
> 603.811441 bdg_ctl [148] bridge:3 port:0 vale4:nic1_egn
> 603.811442 bdg_ctl [148] bridge:3 port:1 vale4:styx0
> 603.811444 bdg_ctl [148] bridge:3 port:2 vale4:preed
> …
> >
> > If this is the case, although you are allowed to do that, I don't think
> > it's a convenient way to use netmap.
> > Since VLAN interfaces like vlan0 do not have (and cannot have) native
> > netmap support, you are falling back to emulated netmap adapters (which
> > are probably buggy on FreeBSD, especially when combined with VALE).
> > Apart from bugs I think that with this setup you can't get decent
> > performance that would justify using netmap rather than the standard
> > kernel bridge and TAP devices.
>
> I'm aware of the lost netmap performance benefit due to the emulated
> netmap fallback.
> But there were some reasons why I chose vale(4) instead of if_bridge(4):
>
> 1) Inter-guest traffic (virtio-net causes a lot of LAPIC/IRQ overhead, but
> still less overhead than tap(4)/if_bridge(4))
>
> 2) Future ptnetmap(4) upgrade path (which should save a lot of LAPIC/IRQ
> CPU cycles and unleash huge performance benefits with inter-VM traffic)
>
> 3) Admin mess and MTU limitation. Each if_bridge(4) causes a host-stack
> interface, which I don't use and which spams ifconfig(8) output; which
> if_vtnet(4) even doubles.
> Most important disadvantage: if_bridge(4) needs all members to have
> exactly the same MTU. This has been a problem for me many times over
> the last years in various setups.
>
> So with my current setup the overhead/efficiency of host-external packet
> flow of
> bhyve_virtio-net+dyn_vale_port+vale(4)
> is equal to
> bhyve_virtio-net+if_vtnet(4)+if_bridge(4)
>
> But I have fewer disadvantages with vale(4), as long as emulated netmap
> mode doesn't destabilize my setup :-(
>
>
> My second choice was ng_bridge(4), with which I have had great experiences
> in my router VM, running on the host in question (and which in turn uses
> virtio-net interfaces attached to the individual vale(4) switches on the
> host).
> [ Even more impressive: pf(4) runs in a VIMAGE jail in that guest,
> utilizing those vale(4) interfaces. Reason for that complicated setup:
> Closest hardware abstraction possible. The setup (guest) should be
> easily migratable to real hardware ].
> >
> > The right way to do it imho would be to write your own (userspace)
> > netmap application that forwards packets between your bhyve guests and
> > the NIC, prepending/stripping VLAN headers according to configuration
> > (e.g. guest A is configured to be on VLAN 100, guest B on VLAN 200), etc.
> > I think this would be a very interesting netmap application in general,
> > and more importantly you would get the performance that you can't get
> > with your setup.
>
> I agree that having a userland application which, like you described,
> utilizes netmap to enable minimalistic SDN features, would be a great
> solution. But I would really need a lot of time, since my C skills are
> lousy, and I really don't have any time, not even one more day.
>

I see. But just FYI, there isn't that much to implement :)

Cheers,
  Vincenzo

>
> I'll see if I can get any useful information with the kernel deadlock
> debugging feature you suggested
> (https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html),
> as soon as the problem shows up again.
> Since I forgot to add all production RAM, I had to shut down yesterday,
> so the lockup counter was reset ;-)
> Another last-minute change was with netmap ring size: I changed the
> vale-uplink interface. The one I used for passthrough had 2 queues
> (with EM_MULTIQUEUE support) and the one for the vale uplink only one,
> and during the evaluation phase I reduced rx/tx descriptors to make
> netmap's default ring size work.
> Now I use the 2-queue NIC with vale uplink and increased ring size to
> 81920 while leaving the hardware default of 4096 rx/tx descriptors.
>
> But my wording wasn't technically correct I think, because I guess what
> I'm suffering isn't a real deadlock in terms of locking, but some
> netmap-internal lockup/overflow/limit/whatever. Just guessing here! I
> don't know netmap code!
> I only link symptoms, and since that setup is
> working really nicely for some limited time, I hoped you or any other
> netmap expert could teach me how to find the root cause.
> Your sentence about FreeBSD's netmap-interface-emulation gives me a bad
> feeling...
>
> Thank you very much,
>
> -harry
>

-- 
Vincenzo Maffione
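
To give a rough idea of the "prepending/stripping VLAN headers" step
discussed in this thread, here is a minimal sketch of the 802.1Q tag
handling on raw Ethernet frames. It is not code from the thread and does
not use the netmap API at all: the function names (add_vlan_tag,
strip_vlan_tag) and the GUEST_VLANS table are hypothetical, and a real
forwarder would apply the same byte manipulation to buffers taken from
netmap rings, most likely in C.

```python
import struct

MAC_ADDRS_LEN = 12      # destination MAC (6 bytes) + source MAC (6 bytes)
TPID_8021Q = 0x8100     # EtherType value that marks an 802.1Q tag

# Hypothetical configuration, mirroring the example in the thread:
# guest A is on VLAN 100, guest B on VLAN 200.
GUEST_VLANS = {"guestA": 100, "guestB": 200}


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert a 4-byte 802.1Q tag right after the two MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    if len(frame) < MAC_ADDRS_LEN + 2:
        raise ValueError("frame too short for an Ethernet header")
    # TCI = 3 bits priority, 1 bit DEI (left at 0), 12 bits VLAN ID.
    tci = (priority << 13) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:MAC_ADDRS_LEN] + tag + frame[MAC_ADDRS_LEN:]


def strip_vlan_tag(frame: bytes):
    """Return (vlan_id, untagged_frame); vlan_id is None if untagged."""
    if len(frame) < MAC_ADDRS_LEN + 2:
        raise ValueError("frame too short for an Ethernet header")
    (tpid,) = struct.unpack_from("!H", frame, MAC_ADDRS_LEN)
    if tpid != TPID_8021Q:
        return None, frame
    if len(frame) < MAC_ADDRS_LEN + 6:
        raise ValueError("truncated 802.1Q tag")
    (tci,) = struct.unpack_from("!H", frame, MAC_ADDRS_LEN + 2)
    untagged = frame[:MAC_ADDRS_LEN] + frame[MAC_ADDRS_LEN + 4:]
    return tci & 0x0FFF, untagged
```

In such a design, frames going from a guest to the NIC would be passed
through add_vlan_tag() with the VLAN ID configured for that guest, and
frames arriving from the NIC would go through strip_vlan_tag(), with the
returned VLAN ID selecting the destination guest.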