From: Alan Somers
Date: Thu, 28 Sep 2017 14:30:03 -0600
Subject: Re: Help with mbuf exhaustion
To: Josh Gitlin
Cc: FreeBSD Net
In-Reply-To: <322F6F4B-1153-4ECE-B854-B2981B0CDDF2@goboomtown.com>
List-Id: Networking and TCP/IP with FreeBSD

First of all, 10.3-RELEASE-p2 is very old and has known security vulnerabilities. Have you tried 10.3-RELEASE-p21 or even 10.4-RELEASE?

On Thu, Sep 28, 2017 at 1:30 PM, Josh Gitlin wrote:
> Hi FreeBSD Gurus!
>
> We're having an issue with mbuf exhaustion on a FreeBSD server which was recently upgraded from 10.3-STABLE to 10.3-RELEASE-p2.
> In the course of normal operation, we see mbuf usage steadily increasing until we reach the kern.ipc.nmbufs limit, at which point the machine becomes unresponsive over the network (due to lack of mbufs for network access) and the console displays:
>
> cxl0: Interface stopped DISTRIBUTING, possible flapping
> cxl1: Interface stopped DISTRIBUTING, possible flapping
> [zone: mbuf] kern.ipc.nmbufs limit reached
> [zone: mbuf] kern.ipc.nmbufs limit reached
>
> The machine runs pf and acts as a packet filter, router, gateway and DHCP/DNS server. It has two Chelsio NICs in it, and is a CARP master with a secondary. The secondary has an identical hardware and software configuration and does not exhibit this issue.
>
> Given the downtime this causes, we set up our Nagios/Check_MK to graph the output of `netstat -m` and alert when mbufs in use approaches `kern.ipc.nmbufs`, and we see a steady linear increase in mbuf usage until we reboot:
>
> https://i.stack.imgur.com/8bzAq.png
>
> mbuf *clusters* in use does not change when this happens, and increasing mbuf cluster limits has no effect:
>
> https://i.stack.imgur.com/7OzdN.png
>
> This appears to me to be a kernel bug of some sort. I'm looking for advice on further troubleshooting or assistance in resolving this!
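For anyone who wants to reproduce the kind of check described above (alerting when mbufs in use approach kern.ipc.nmbufs), here is a minimal sketch in Python. It is not the poster's actual Nagios/Check_MK plugin. The function names and the 80% threshold are hypothetical, and the limit of 1587540 is taken from the LIMIT column of the mbuf zone in the `vmstat -z` output later in this message, on the assumption that it matches kern.ipc.nmbufs on that machine.

```python
import re

# First two lines of the `netstat -m` output quoted in this thread.
SAMPLE_NETSTAT_M = (
    "679270/3080/682350 mbufs in use (current/cache/total)\n"
    "10243/1657/11900/985360 mbuf clusters in use (current/cache/total/max)\n"
)

# Assumption: the mbuf zone LIMIT shown by `vmstat -z` equals kern.ipc.nmbufs.
ASSUMED_NMBUFS = 1587540


def parse_mbufs_in_use(netstat_m_output):
    """Return current/cache/total from the 'mbufs in use' line of `netstat -m`."""
    match = re.search(
        r"^\s*(\d+)/(\d+)/(\d+) mbufs in use", netstat_m_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'mbufs in use' line found")
    current, cache, total = (int(group) for group in match.groups())
    return {"current": current, "cache": cache, "total": total}


def mbuf_usage_fraction(netstat_m_output, nmbufs_limit):
    """Fraction of the limit consumed by mbufs currently in use."""
    return parse_mbufs_in_use(netstat_m_output)["current"] / nmbufs_limit


def should_alert(netstat_m_output, nmbufs_limit, warn_fraction=0.8):
    """True when usage has reached the (hypothetical) warning threshold."""
    return mbuf_usage_fraction(netstat_m_output, nmbufs_limit) >= warn_fraction


if __name__ == "__main__":
    fraction = mbuf_usage_fraction(SAMPLE_NETSTAT_M, ASSUMED_NMBUFS)
    alert = should_alert(SAMPLE_NETSTAT_M, ASSUMED_NMBUFS)
    print(f"mbufs in use: {fraction:.1%} of assumed limit, alert={alert}")
```

On a live system the text would come from running `netstat -m`, and the limit from `sysctl -n kern.ipc.nmbufs`, rather than from the constants used here.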
>
> Helpful (maybe) information:
>
> netstat -m:
>
> 679270/3080/682350 mbufs in use (current/cache/total)
> 10243/1657/11900/985360 mbuf clusters in use (current/cache/total/max)
> 10243/1648 mbuf+clusters out of packet secondary zone in use (current/cache)
> 8128/482/8610/124025 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/36748 9k jumbo clusters in use (current/cache/total/max)
> 128/0/128/20670 16k jumbo clusters in use (current/cache/total/max)
> 224863K/6012K/230875K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
> vmstat -z|grep -E '^ITEM|mbuf':
>
> ITEM                SIZE    LIMIT    USED  FREE        REQ FAIL SLEEP
> mbuf_packet:         256, 1587540,  10239, 1652,  84058893,   0,    0
> mbuf:                256, 1587540, 671533, 1206, 914478880,   0,    0
> mbuf_cluster:       2048,  985360,  11891,    9,     11891,   0,    0
> mbuf_jumbo_page:    4096,  124025,   8128,  512,  15011847,   0,    0
> mbuf_jumbo_9k:      9216,   36748,      0,    0,         0,   0,    0
> mbuf_jumbo_16k:    16384,   20670,    128,    0,       128,   0,    0
> mbuf_ext_refcnt:       4,       0,      0,    0,         0,   0,    0
>
> vmstat -m:
>
> Type               InUse  MemUse HighUse  Requests  Size(s)
> NFSD lckfile           1      1K       -         1  256
> filedesc             103    383K       -   1134731  16,32,128,2048,4096,8192,16384,65536
> sigio                  1      1K       -         1  64
> filecaps               0      0K       -       973  64
> kdtrace              292     59K       -   1099386  64,256
> kenv                 121     13K       -       125  16,32,64,128,8192
> kqueue                14     22K       -      5374  256,2048,8192
> proc-args             54      5K       -    578448  16,32,64,128,256
> hhook                  2      1K       -         2  256
> ithread              146     24K       -       146  32,128,256
> KTRACE               100     13K       -       100  128
> NFS fh                 1      1K       -       584  32
> linker               207   1052K       -       234  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
> lockf                 29      3K       -     20042  64,128
> loginclass             2      1K       -      1192  64
> devbuf             17205  36362K       -     17523  16,32,64,128,256,512,1024,2048,4096,8192,65536
> temp                 149     51K       -   1280113  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
> ip6opt                 5      2K       -         6  256
> ip6ndp                27      2K       -        27  64,128
> module               230     29K       -       230  128
> mtx_pool               2     16K       -         2  8192
> osd                    3      1K       -         5  16,32,64
> pmchooks               1      1K       -         1  128
> pgrp                  30      4K       -      2222  128
> session               29      4K       -      2187  128
> proc                   2     32K       -         2  16384
> subproc              211    368K       -   1099014  512,4096
> cred                 204     32K       -   6025704  64,256
> plimit                19      5K       -      3985  256
> uidinfo                9      5K       -     11892  128,4096
> NFSD session           1      1K       -         1  1024
> sysctl                 0      0K       -     63851  16,32,64
> sysctloid           7196    365K       -      7369  16,32,64,128
> sysctltmp              0      0K       -     17834  16,32,64,128
> tidhash                1     32K       -         1  32768
> callout                5   2184K       -         5
> umtx                 522     66K       -       522  128
> p1003.1b               1      1K       -         1  16
> SWAP                   2    549K       -         2  64
> bus                  802     86K       -      6536  16,32,64,128,256,1024
> bus-sc                57   1671K       -      2431  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
> newnfsmnt              1      1K       -         1  1024
> devstat                8     17K       -         8  32,4096
> eventhandler         116     10K       -       116  64,128
> kobj                 124    496K       -       296  4096
> acpiintr               1      1K       -         1  64
> Per-cpu                1      1K       -         1  32
> acpica             14355   1420K       -    216546  16,32,64,128,256,512,1024,2048,4096
> pci_link              16      2K       -        16  64,128
> pfs_nodes             21      6K       -        21  256
> rman                 316     37K       -       716  16,32,128
> sbuf                   1      1K       -     41375  16,32,64,128,256,512,1024,2048,4096,8192,16384
> sglist                 8      8K       -         8  1024
> GEOM                  88     15K       -      1871  16,32,64,128,256,512,1024,2048,8192,16384
> acpipwr                5      1K       -         5  64
> taskqueue             43      7K       -        43  16,32,256
> Unitno                22      2K       -   1208250  32,64
> vmem                   3    144K       -         6  1024,4096,8192
> ioctlops               0      0K       -    185700  256,512,1024,2048,4096
> select                89     12K       -        89  128
> iov                    0      0K       -  19808992  16,64,128,256,512,1024
> msg                    4     30K       -         4  2048,4096,8192,16384
> sem                    4    106K       -         4  2048,4096
> shm                    1     32K       -         1  32768
> tty                   20     20K       -       499  1024
> pts                    1      1K       -       480  256
> accf                   2      1K       -         2  64
> mbuf_tag               0      0K       - 291472282  32,64,128
> shmfd                  1      8K       -         1  8192
> soname                32      4K       -   1210442  16,32,128
> pcb                   36    663K       -     76872  16,32,64,128,1024,2048,8192
> CAM CCB                0      0K       -    182128  2048
> acl                    0      0K       -         2  4096
> vfscache               1   2048K       -         1
> cl_savebuf             0      0K       -       480  64
> vfs_hash               1   1024K       -         1
> vnodes                 1      1K       -         1  256
> entropy             1026     65K       -     49107  32,64,4096
> mount                 64      3K       -       140  16,32,64,128,256
> vnodemarker            0      0K       -      4212  512
> BPF                  112  20504K       -       131  16,64,128,512,4096
> CAM path              11      1K       -        63  32
> ifnet                 29     57K       -        30  128,256,2048
> ifaddr               315    105K       -       315  32,64,128,256,512,2048,4096
> ether_multi          232     13K       -       282  16,32,64
> clone                 10      2K       -        10  128
> arpcom                23      1K       -        23  16
> gif                    4      1K       -         4  32,256
> lltable              155     53K       -       551  256,512
> UART                   6      5K       -         6  16,1024
> vlan                  56      5K       -        74  64,128
> acpitask               1     16K       -         1  16384
> acpisem              110     14K       -       110  128
> raid_data              0      0K       -       108  32,128,256
> routetbl             516    136K       -    101735  32,64,128,256,512
> igmp                  28      7K       -        28  256
> CARP                  76     30K       -        83  16,32,64,128,256,512,1024
> ipid                   2     24K       -         2  8192,16384
> in_mfilter           112    112K       -       112  1024
> in_multi              43     11K       -        43  256
> ip_moptions          224     35K       -       224  64,256
> CAM periph             7      2K       -        19  16,32,64,128,256
> acpidev              128      8K       -       128  64
> CAM queue             15      5K       -        39  16,32,512
> encap_export_host      4      4K       -         4  1024
> sctp_a_it              0      0K       -        36  16
> sctp_vrf               1      1K       -         1  64
> sctp_ifa             115     15K       -       204  128
> sctp_ifn              21      3K       -        23  128
> sctp_iter              0      0K       -        36  256
> hostcache              1     32K       -         1  32768
> syncache               1     64K       -         1  65536
> in6_mfilter            1      1K       -         1  1024
> in6_multi             15      2K       -        15  32,256
> ip6_moptions           2      1K       -         2  32,256
> CAM dev queue          6      1K       -         6  64
> kbdmux                 6     22K       -         6  16,512,1024,2048,16384
> mld                   26      4K       -        26  128
> LED                   20      2K       -        20  16,128
> inpcbpolicy          365     12K       -    119277  32
> secasvar               7      2K       -       214  256
> sahead                10      3K       -        10  256
> ipsecpolicy          748    187K       -    241562  256
> ipsecrequest          18      3K       -        72  128
> ipsec-misc            56      2K       -      1712  16,32,64
> ipsec-saq              0      0K       -        24  128
> ipsec-reg              3      1K       -         3  32
> pfsync                 2      2K       -       893  32,256,1024
> pf_temp                0      0K       -        78  128
> pf_hash                3   2880K       -         3
> pf_ifnet              36     11K       -      9510  256,2048
> pf_tag                 7      1K       -         7  128
> pf_altq                5      2K       -       125  256
> pf_rule              964    904K       -     17500  128,1024
> pf_osfp             1130    115K       -     28250  64,128
> pf_table              49     98K       -       948  2048
> crypto                37     11K       -      1072  64,128,256,512,1024
> xform                  7      1K       -   1530156  16,32,64,128,256
> rpc                   12     20K       -       304  64,128,512,1024,8192
> audit_evclass        187      6K       -       231  32
> ufs_dirhash           93     18K       -        93  16,32,64,128,256,512
> ufs_quota              1   1024K       -         1
> ufs_mount              3     13K       -         3  512,4096,8192
> vm_pgdata              2    513K       -         2  128
> UMAHash                5      6K       -        10  512,1024,2048
> CAM SIM                6      2K       -         6  256
> CAM XPT               30      3K       -      1850  16,32,64,128,256,512,1024,2048,65536
> CAM DEV                9     18K       -        16  2048
> fpukern_ctx            3      6K       -         3  2048
> memdesc                1      4K       -         1  4096
> USB                   23     33K       -        24  16,128,256,512,1024,2048,4096
> DEVFS3               136     34K       -      2027  256
> DEVFS1               108     54K       -       594  512
> apmdev                 1      1K       -         1  128
> madt_table             0      0K       -         1  4096
> DEVFS_RULE            55     26K       -        55  64,512
> DEVFS                 12      1K       -        13  16,128
> DEVFSP                22      2K       -       167  64
> io_apic                1      2K       -         1  2048
> isadev                 8      1K       -         8  128
> MCA                   15      2K       -        15  32,128
> msi                   30      4K       -        30  128
> nexusdev               5      1K       -         5  16
> USBdev                21      8K       -        21  32,64,128,256,512,1024,4096
> NFSD V4client          1      1K       -         1  256
> cdev                   5      2K       -         5  256
> cxgbe                 41    956K       -        44  128,256,512,1024,2048,4096,8192,16384
> ipmi                   0      0K       -     20155  128,2048
> htcp data            127      4K       -     13675  32
> aesni_data             3      3K       -         3  1024
> solaris              142  12302K       -      3189  16,32,64,128,512,1024,8192
> kstat_data             6      1K       -         6  64
>
> TCP States:
>
> https://i.stack.imgur.com/G7850.png
>
>
> --
>
> Josh Gitlin
> Senior Full Stack Developer
> (415) 690-1610 x155
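
As a companion to the `vmstat -z` output quoted above, the following is a rough Python sketch of how one might rank the mbuf zones by how close USED is to LIMIT. The helper names are hypothetical. The parser assumes the seven-column layout shown in this message (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP), and it treats a LIMIT of 0 as "no limit" and leaves such zones out of the ranking.

```python
# Rows copied from the `vmstat -z | grep -E '^ITEM|mbuf'` output in this thread.
VMSTAT_Z_SAMPLE = """\
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
mbuf_packet: 256, 1587540, 10239, 1652,84058893, 0, 0
mbuf: 256, 1587540, 671533, 1206,914478880, 0, 0
mbuf_cluster: 2048, 985360, 11891, 9, 11891, 0, 0
mbuf_jumbo_page: 4096, 124025, 8128, 512,15011847, 0, 0
mbuf_jumbo_9k: 9216, 36748, 0, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 20670, 128, 0, 128, 0, 0
mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0, 0
"""


def parse_vmstat_z(text):
    """Map zone name -> dict of integer counters, skipping non-data lines."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header line or anything without a 'zone:' prefix
        fields = [field.strip() for field in rest.split(",")]
        if len(fields) < 7 or not all(f.isdigit() for f in fields[:7]):
            continue  # layout differs from the one assumed here
        size, limit, used, free, req, fail, sleep = (int(f) for f in fields[:7])
        zones[name.strip()] = {
            "size": size, "limit": limit, "used": used, "free": free,
            "req": req, "fail": fail, "sleep": sleep,
        }
    return zones


def rank_by_utilization(zones):
    """Return (zone, used/limit) pairs, highest first; zones with limit 0 are skipped."""
    ranked = [
        (name, counters["used"] / counters["limit"])
        for name, counters in zones.items()
        if counters["limit"] > 0
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, fraction in rank_by_utilization(parse_vmstat_z(VMSTAT_Z_SAMPLE)):
        print(f"{name:<18} {fraction:7.2%}")
```

With the sample above, the plain `mbuf` zone comes out well ahead of the cluster zones, which fits the observation that mbufs grow while clusters stay flat.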