From: Jack Vogel
To: dennis berger
Cc: FreeBSD stable
Date: Wed, 15 May 2013 10:00:38 -0700
Subject: Re: still mbuf leak in 9.0 / 9.1?

So, you stop getting 10G transmission, and so you are looking at mbuf leaks?
I don't see anything in your data that makes it look like you've run out of
available mbufs. You said you're running jumbos; what size? You do realize
that if you do this the clusters are coming from different pools, and you
are not displaying those. What are all your nmb limits set to?

So, is this 9.1 RELEASE, or stable? If you are using the driver from release
I would first off suggest you test the code from HEAD. What is the 10G
device? I see it's using Twinax, and I have been told there is a problem at
times with those that is corrected in recent shared code; this is why you
should try the latest code.

Cheers,

Jack

On Wed, May 15, 2013 at 2:00 AM, dennis berger wrote:

> Hi list,
>
> since we activated 10GbE on ixgbe cards + jumbo frames (9k) on 9.0 and now
> on 9.1, we have noticed that after a random period of time, sometimes a
> week, sometimes only a day, the system doesn't send any packets out. The
> symptom is that you can't log in via ssh, and NFS and istgt are not
> operative. Yet you can log in on the console and execute commands.
> A clean shutdown isn't possible, though. It hangs after vnode cleaning;
> normally you would see detaching of USB devices here, or other devices
> maybe?
> I've read the other post on this ML about the mbuf leak in the ARP handling
> code in if_ether.c line 558. We don't see any of those notices in dmesg, so
> I don't think that glebius' fix would apply to us.
> I'm collecting system and memory information every hour.
>
> Script looks like this.
>
> less /etc/periodic/hourly/100.report-memory.sh
> #!/bin/sh
>
> reporttimestamp=`date +%d-%m-%Y-%H-%M`
> reportname=${reporttimestamp}.txt
>
> cd /root/memory-mon
>
> top -b > $reportname
> echo "" >> $reportname
> vmstat -m >> $reportname
> echo "" >> $reportname
> vmstat -z >> $reportname
> echo "" >> $reportname
> netstat -Q >> $reportname
> echo "" >> $reportname
> netstat -n -x >> $reportname
> echo "" >> $reportname
> netstat -m >> $reportname
> /usr/bin/perl /usr/local/bin/zfs-stats -a >> $reportname
>
> When you grep for mbuf or mbuf usage you will see this for example:
>
> root@freenas:/root/memory-mon # grep mbuf_packet: *
> 14-05-2013-14-09.txt:mbuf_packet: 256, 0, 9246, 2786,201700429, 0, 0
> 14-05-2013-15-09.txt:mbuf_packet: 256, 0, 9256, 2776,201773122, 0, 0
> 14-05-2013-16-09.txt:mbuf_packet: 256, 0, 9266, 2766,201871553, 0, 0
> 14-05-2013-17-09.txt:mbuf_packet: 256, 0, 9276, 2756,201915405, 0, 0
> 14-05-2013-18-09.txt:mbuf_packet: 256, 0, 9286, 2746,201927956, 0, 0
> 14-05-2013-19-09.txt:mbuf_packet: 256, 0, 9296, 2352,201935681, 0, 0
> 14-05-2013-20-09.txt:mbuf_packet: 256, 0, 9306, 2342,201943754, 0, 0
> 14-05-2013-21-09.txt:mbuf_packet: 256, 0, 9316, 2332,201950961, 0, 0
> 14-05-2013-22-09.txt:mbuf_packet: 256, 0, 9326, 2450,201958150, 0, 0
> 14-05-2013-23-09.txt:mbuf_packet: 256, 0, 9336, 2440,201967178, 0, 0
> 15-05-2013-00-09.txt:mbuf_packet: 256, 0, 9346, 2430,201974561, 0, 0
> 15-05-2013-01-09.txt:mbuf_packet: 256, 0, 9356, 2420,201982105, 0, 0
> 15-05-2013-02-09.txt:mbuf_packet: 256, 0, 9366, 2410,201989463, 0, 0
> 15-05-2013-03-09.txt:mbuf_packet: 256, 0, 9378, 1502,203019168, 0, 0
> 15-05-2013-04-09.txt:mbuf_packet: 256, 0, 9384, 1624,205953601, 0, 0
> 15-05-2013-05-09.txt:mbuf_packet: 256, 0, 9394, 1870,205959258, 0, 0
> 15-05-2013-06-09.txt:mbuf_packet: 256, 0, 9404, 2500,205969396, 0, 0
> 15-05-2013-07-09.txt:mbuf_packet: 256, 0, 9414, 3386,207945161, 0, 0
> 15-05-2013-08-09.txt:mbuf_packet: 256, 0, 9424, 3376,208094689, 0, 0
> 15-05-2013-09-09.txt:mbuf_packet: 256, 0, 9434, 2982,208172465, 0, 0
> 15-05-2013-10-09.txt:mbuf_packet: 256, 0, 9444, 3100,208270369, 0, 0
>
> and
>
> root@freenas:/root/memory-mon # grep "mbufs in use" *
> 14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
> 14-05-2013-15-09.txt:58455/5805/64260 mbufs in use (current/cache/total)
> 14-05-2013-16-09.txt:58464/5796/64260 mbufs in use (current/cache/total)
> 14-05-2013-17-09.txt:58475/5785/64260 mbufs in use (current/cache/total)
> 14-05-2013-18-09.txt:58484/5776/64260 mbufs in use (current/cache/total)
> 14-05-2013-19-09.txt:58493/5767/64260 mbufs in use (current/cache/total)
> 14-05-2013-20-09.txt:58503/5757/64260 mbufs in use (current/cache/total)
> 14-05-2013-21-09.txt:58513/5747/64260 mbufs in use (current/cache/total)
> 14-05-2013-22-09.txt:58523/5737/64260 mbufs in use (current/cache/total)
> 14-05-2013-23-09.txt:58533/5727/64260 mbufs in use (current/cache/total)
> 15-05-2013-00-09.txt:58543/5717/64260 mbufs in use (current/cache/total)
> 15-05-2013-01-09.txt:58554/5706/64260 mbufs in use (current/cache/total)
> 15-05-2013-02-09.txt:58563/5697/64260 mbufs in use (current/cache/total)
> 15-05-2013-03-09.txt:58639/5621/64260 mbufs in use (current/cache/total)
> 15-05-2013-04-09.txt:58581/5679/64260 mbufs in use (current/cache/total)
> 15-05-2013-05-09.txt:58591/5669/64260 mbufs in use (current/cache/total)
> 15-05-2013-06-09.txt:58602/5658/64260 mbufs in use (current/cache/total)
> 15-05-2013-07-09.txt:58613/5647/64260 mbufs in use (current/cache/total)
> 15-05-2013-08-09.txt:58623/6027/64650 mbufs in use (current/cache/total)
> 15-05-2013-09-09.txt:58634/6016/64650 mbufs in use (current/cache/total)
> 15-05-2013-10-09.txt:58645/6005/64650 mbufs in use (current/cache/total)
>
> This increasing number of used mbuf_packets and mbufs in use makes me
> nervous.
> See the complete reports http://knownhosts.org:/reports-14-15.tgz
>
> Thanks for help,
>
> -dennis
>
> --------------BEGIN System information---------------
> It's a stock FreeBSD 9.1, yet the hostname is called freenas. Don't be
> confused.
>
> igb0: flags=8c02 metric 0 mtu 1500
>         options=401bb
>         ether 00:25:90:34:c1:12
>         nd6 options=21
>         media: Ethernet autoselect (1000baseT )
>         status: active
> igb1: flags=8843 metric 0 mtu 1500
>         options=401bb
>         ether 00:25:90:34:c1:13
>         inet 172.16.1.6 netmask 0xfffff000 broadcast 172.16.15.255
>         inet6 fe80::225:90ff:fe34:c113%igb1 prefixlen 64 scopeid 0x2
>         nd6 options=21
>         media: Ethernet autoselect (1000baseT )
>         status: active
> ix0: flags=8843 metric 0 mtu 9000
>         options=401bb
>         ether 00:1b:21:cc:12:8b
>         inet 10.254.254.242 netmask 0xfffffffc broadcast 10.254.254.243
>         inet6 fe80::21b:21ff:fecc:128b%ix0 prefixlen 64 scopeid 0xb
>         nd6 options=21
>         media: Ethernet autoselect (10Gbase-Twinax )
>         status: active
> ix1: flags=8843 metric 0 mtu 9000
>         options=401bb
>         ether 00:1b:21:cc:12:8a
>         inet 10.254.254.254 netmask 0xfffffffc broadcast 10.254.254.255
>         inet6 fe80::21b:21ff:fecc:128a%ix1 prefixlen 64 scopeid 0xc
>         nd6 options=21
>         media: Ethernet autoselect (10Gbase-Twinax )
>         status: active
> ix2: flags=8843 metric 0 mtu 9000
>         options=401bb
>         ether 00:1b:21:cc:12:b3
>         inet 10.254.254.246 netmask 0xfffffffc broadcast 10.254.254.247
>         inet6 fe80::21b:21ff:fecc:12b3%ix2 prefixlen 64 scopeid 0xd
>         nd6 options=21
>         media: Ethernet autoselect
>         status: no carrier
> ix3: flags=8802 metric 0 mtu 1500
>         options=401bb
>         ether 00:1b:21:cc:12:b2
>         nd6 options=21
>         media: Ethernet autoselect
>         status: no carrier
> lo0: flags=8049 metric 0 mtu 16384
>         options=600003
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xf
>         inet 127.0.0.1 netmask 0xff000000
>         nd6 options=21
>
> #dmesg
> ...
> mfi0: 21294 (421879975s/0x0008/info) - Battery started charging
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
>
> I should add that the servers that are directly connected to this freebsd
> server reboot every night. This is why you see ix0 UP/DOWN
> messages in dmesg.
>
> ------------- END System information------------
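
The hourly "mbufs in use" lines quoted above can be reduced to a growth rate, which makes it easier to judge how quickly the count is really climbing. What follows is only a rough sketch: `mbuf_growth` is a hypothetical helper name that does not appear in the thread, and it assumes its input is exactly what `grep "mbufs in use"` prints for the report files (with or without a `filename:` prefix).

```sh
#!/bin/sh
# mbuf_growth: hypothetical helper, not part of the original report script.
# Reads lines such as
#   14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
# on stdin and prints the first and last "current" count, their difference,
# and the average change per report interval.
#
# Intended use (assumes the report directory from the quoted script):
#   grep "mbufs in use" /root/memory-mon/*.txt | mbuf_growth
mbuf_growth() {
    awk '
        /mbufs in use/ {
            line = $0
            sub(/^[^:]*:/, "", line)   # drop the "filename:" prefix grep adds
            split(line, f, "/")        # f[1] = current, f[2] = cache
            cur = f[1] + 0
            if (n == 0) first = cur
            last = cur
            n++
        }
        END {
            if (n < 2) { print "need at least two reports"; exit 1 }
            printf "first=%d last=%d delta=%d avg_per_report=%.2f\n", first, last, last - first, (last - first) / (n - 1)
        }'
}
```

For the 21 lines quoted above, the first and last current counts are 58444 and 58645: a rise of 201 mbufs over 20 hourly intervals, or roughly 10 per hour. This covers only the plain mbuf count; the jumbo cluster pools Jack asks about are reported separately in the `vmstat -z` and `netstat -m` output.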