From: Daniel Gerzo
Date: Thu, 6 Mar 2008 21:05:32 +0100
To: current@FreeBSD.org
Cc: yongari@FreeBSD.org
Subject: re(4) problem
Message-ID: <20080306200532.GA84961@cvsup.sk.freebsd.org>
Organization: The FreeBSD Project
List-Id: Discussions about the use of FreeBSD-current

Hello people,

I would like to report a problem with a re(4) device. I am running the
following system:

    FreeBSD 7.0-STABLE #2: Sat Mar 1 18:55:23 CET 2008 amd64

The system is built with the patch available at:

    http://people.freebsd.org/~yongari/re/re.HEAD.patch

The problem has already occurred three times within about a week, each
time suddenly after the machine had been running for a while.
I don't know how to reproduce it :-(

The machine in question has two NICs, one em(4) based and one re(4)
based. When the problem occurs, I can still connect to the machine
through em(4) without any problems.

The symptoms are as follows:

- The machine does not reply to ICMP echo requests sent to the re(4)
  device.

- When I try to ping a remote host over the re(4) based card, I get:

    ping: sendto: No buffer space available

- When I run tcpdump -vv -i re0, I see only ARP requests (ha-web1 is
  the machine in question) and no other meaningful traffic:

    20:30:20.945662 arp who-has 85.10.197.188 tell 85.10.197.161
    20:30:20.947624 arp who-has 85.10.197.189 tell 85.10.197.161
    20:30:20.949021 arp who-has 85.10.197.190 tell 85.10.197.161
    20:30:21.136417 arp who-has ha-web1 tell 85.10.199.1
    20:30:22.153493 arp who-has 85.10.197.169 tell 85.10.197.161
    20:30:23.286400 arp who-has ha-web1 tell 85.10.199.1
    20:30:23.299547 arp who-has 85.10.199.12 tell 85.10.199.1

- The output of netstat -m:

    root@[ha-web1 /home/danger]# netstat -m
    1047/648/1695 mbufs in use (current/cache/total)
    879/335/1214/25600 mbuf clusters in use (current/cache/total/max)
    879/267 mbuf+clusters out of packet secondary zone in use (current/cache)
    16/265/281/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
    0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
    0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
    2092K/1892K/3984K bytes allocated to network (current/cache/total)
    0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
    0/0/0 requests for jumbo clusters denied (4k/9k/16k)
    0/0/0 sfbufs in use (current/peak/max)
    0 requests for sfbufs denied
    0 requests for sfbufs delayed
    37742 requests for I/O initiated by sendfile
    0 calls to protocol drain routines

- ifconfig re0 output:

    danger@[ha-web1 ~]> ifconfig
    re0: flags=8c43 metric 0 mtu 1500
            options=19b
            ether 00:1d:92:34:12:7a
            inet 85.10.199.6 netmask 0xffffffe0 broadcast 85.10.199.31
            media: Ethernet autoselect (100baseTX )
            status: active

- When I run ifconfig re0 down, the device does not go down until I
  also run ifconfig re0 up. In the meantime, ifconfig still reports the
  device as active, and /var/log/messages does not mention that it has
  gone down. Once I run ifconfig re0 up, the device goes down and
  immediately comes back up, but the network still does not work;
  however, I no longer get the ENOBUFS error when I try to ping a
  remote host. After this procedure I am also unable to ssh to the box
  over em(4) (ping works).

  When I then run /etc/rc.d/netif restart, I can connect to the machine
  over em(4) again. When I ping a remote host over re(4), I get:

    ping: sendto: No route to host

  When I run /etc/rc.d/routing restart, ping doesn't report anything,
  but I can again see ARP requests in tcpdump.

- No interrupt storms are reported in /var/log/messages, and neither it
  nor dmesg contains anything unusual.

I suppose it's a bug in re(4); otherwise I assume that the network
wouldn't work over em(4) either.

If you need any further information to help debug this problem, please
let me know. I will leave the machine in this state if the customer
permits me to do so.

--
Best Regards,
Daniel Gerzo
mailto:danger@FreeBSD.org
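
For anyone triaging a similar hang, the netstat -m check above can be automated. The following is only a rough sketch: check_mbuf_denied is a hypothetical helper name, not an existing FreeBSD tool, and it assumes the netstat -m line format shown in this report. It reads the output on stdin and reports whether any "denied" counter is non-zero. If it prints none-denied while ping still returns ENOBUFS, as in this report, mbuf exhaustion looks less likely and the driver's transmit path is probably the better place to look.

```shell
#!/bin/sh
# Sketch only: check_mbuf_denied is a hypothetical helper, not a FreeBSD tool.
# Reads `netstat -m` output on stdin and inspects every line containing
# "denied". The first field is either "a/b/c" or a single counter.
check_mbuf_denied() {
    awk '
        BEGIN { denied = 0 }
        /denied/ {
            # Split the leading counter field on "/" and look for any
            # value greater than zero.
            n = split($1, counts, "/")
            for (i = 1; i <= n; i++)
                if (counts[i] + 0 > 0)
                    denied = 1
        }
        END {
            if (denied) print "denied"; else print "none-denied"
            # Non-zero exit status when at least one request was denied.
            exit (denied ? 1 : 0)
        }
    '
}
```

Typical use on the affected machine would be: netstat -m | check_mbuf_denied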