From owner-freebsd-current@FreeBSD.ORG  Thu Mar  6 20:11:40 2008
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: current@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 661A91065672;
	Thu,  6 Mar 2008 20:11:40 +0000 (UTC)
	(envelope-from danger@cvsup.sk.freebsd.org)
Received: from cvsup.sk.freebsd.org (priest.sk.FreeBSD.org
	[IPv6:2a01:b0:10aa:200::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 02A528FC25;
	Thu,  6 Mar 2008 20:11:38 +0000 (UTC)
	(envelope-from danger@cvsup.sk.freebsd.org)
Received: from cvsup.sk.freebsd.org (danger@localhost [127.0.0.1])
	by cvsup.sk.freebsd.org (8.13.8/8.13.4) with ESMTP id m26K5Wpc085439;
	Thu, 6 Mar 2008 21:05:32 +0100 (CET)
	(envelope-from danger@cvsup.sk.freebsd.org)
Received: (from danger@localhost)
	by cvsup.sk.freebsd.org (8.13.8/8.13.3/Submit) id m26K5WfM085438;
	Thu, 6 Mar 2008 21:05:32 +0100 (CET) (envelope-from danger)
Date: Thu, 6 Mar 2008 21:05:32 +0100
From: Daniel Gerzo <danger@FreeBSD.org>
To: current@FreeBSD.org
Message-ID: <20080306200532.GA84961@cvsup.sk.freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Organization: The FreeBSD Project
User-Agent: Mutt/1.5.12-2006-07-14
Cc: yongari@FreeBSD.org
Subject: re(4) problem
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 06 Mar 2008 20:11:40 -0000

Hello people,

  I would like to report a problem with the re(4) device.

  I am running the following system:
  FreeBSD 7.0-STABLE #2: Sat Mar  1 18:55:23 CET 2008 amd64

  The system was built with the patch available at:
  http://people.freebsd.org/~yongari/re/re.HEAD.patch

  The problem has already occurred 3 times (within about a week),
  always suddenly after some time. I don't know how to reproduce
  it :-(

  The machine in question has two NICs, one em(4) based and one
  re(4) based. When the problem occurs, I am able to connect to the
  machine only through em(4), which works with no problems.

  The symptoms are as follows:

  - The machine does not reply to ICMP echo requests sent to the re(4)
    device.

  - When I try to ping a remote host over the re(4) based card, I get:
 
  ping: sendto: No buffer space available

  - When I run tcpdump -vv -i re0, I can see only ARP requests (ha-web1
    is the machine in question) and no other meaningful traffic:

20:30:20.945662 arp who-has 85.10.197.188 tell 85.10.197.161
20:30:20.947624 arp who-has 85.10.197.189 tell 85.10.197.161
20:30:20.949021 arp who-has 85.10.197.190 tell 85.10.197.161
20:30:21.136417 arp who-has ha-web1 tell 85.10.199.1
20:30:22.153493 arp who-has 85.10.197.169 tell 85.10.197.161
20:30:23.286400 arp who-has ha-web1 tell 85.10.199.1
20:30:23.299547 arp who-has 85.10.199.12 tell 85.10.199.1
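
  To check a longer capture the same way without reading it line by
  line, something like the following Python sketch could be used. It
  counts ARP who-has requests per target and reports how many lines are
  anything else. The helper name summarize_arp is made up for
  illustration, and the regular expression assumes the exact tcpdump
  line format shown above.

```python
import re
from collections import Counter

# Matches tcpdump lines such as:
#   20:30:21.136417 arp who-has ha-web1 tell 85.10.199.1
ARP_RE = re.compile(r"^\S+ arp who-has (\S+) tell (\S+)$")


def summarize_arp(lines):
    """Return (who-has requests per target, count of lines that are not ARP who-has)."""
    targets = Counter()
    other = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the capture
        match = ARP_RE.match(line)
        if match:
            targets[match.group(1)] += 1
        else:
            other += 1
    return targets, other


# The capture quoted above, copied verbatim.
capture = """\
20:30:20.945662 arp who-has 85.10.197.188 tell 85.10.197.161
20:30:20.947624 arp who-has 85.10.197.189 tell 85.10.197.161
20:30:20.949021 arp who-has 85.10.197.190 tell 85.10.197.161
20:30:21.136417 arp who-has ha-web1 tell 85.10.199.1
20:30:22.153493 arp who-has 85.10.197.169 tell 85.10.197.161
20:30:23.286400 arp who-has ha-web1 tell 85.10.199.1
20:30:23.299547 arp who-has 85.10.199.12 tell 85.10.199.1
""".splitlines()

targets, other = summarize_arp(capture)
print(targets["ha-web1"], other)  # prints: 2 0
```

  For the capture above this gives two requests for ha-web1 and zero
  non-ARP lines, which matches what I see by eye.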

  - The output of netstat -m:

root@[ha-web1 /home/danger]# netstat -m
1047/648/1695 mbufs in use (current/cache/total)
879/335/1214/25600 mbuf clusters in use (current/cache/total/max)
879/267 mbuf+clusters out of packet secondary zone in use (current/cache)
16/265/281/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
2092K/1892K/3984K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
37742 requests for I/O initiated by sendfile
0 calls to protocol drain routines
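
  The two figures I looked at in this output are the denied-request
  counters and the cluster usage against its maximum. A small Python
  sketch of that check follows; parse_netstat_m is a hypothetical
  helper, and it assumes the line wording shown above, which may differ
  between FreeBSD versions.

```python
import re


def parse_netstat_m(text):
    """Extract mbuf cluster usage and denied mbuf requests from `netstat -m` text."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", line)
        if m:
            current, _cache, _total, maximum = map(int, m.groups())
            stats["clusters_current"] = current
            stats["clusters_max"] = maximum
        m = re.match(r"(\d+)/(\d+)/(\d+) requests for mbufs denied", line)
        if m:
            # Sum of the mbufs/clusters/mbuf+clusters denial counters.
            stats["denied"] = sum(map(int, m.groups()))
    return stats


# Three lines copied from the output above.
sample = """\
1047/648/1695 mbufs in use (current/cache/total)
879/335/1214/25600 mbuf clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
"""

stats = parse_netstat_m(sample)
usage = stats["clusters_current"] / stats["clusters_max"]
print(stats["denied"], f"{usage:.1%}")  # prints: 0 3.4%
```

  With no denied requests and only about 3.4% of the cluster limit in
  use, the ENOBUFS error does not look like system-wide mbuf
  exhaustion.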

  - ifconfig re0 output:

danger@[ha-web1 ~]> ifconfig
re0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1d:92:34:12:7a
        inet 85.10.199.6 netmask 0xffffffe0 broadcast 85.10.199.31
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
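
  Note the OACTIVE flag on re0. As far as I understand it, the driver
  sets this flag while its transmit queue is full, which would fit the
  ENOBUFS error from ping. Checking for that state in a script could
  look roughly like the Python sketch below; parse_flags is a made-up
  helper name, and the parsing assumes the flags=HEX<NAME,...> layout
  that ifconfig prints above.

```python
import re


def parse_flags(line):
    """Split an ifconfig flags line into its numeric value and the set of flag names."""
    m = re.search(r"flags=([0-9a-fA-F]+)<([^>]*)>", line)
    if m is None:
        raise ValueError("no flags field found")
    value = int(m.group(1), 16)
    names = set(m.group(2).split(",")) if m.group(2) else set()
    return value, names


# First line of the ifconfig output above.
line = "re0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500"

value, names = parse_flags(line)
# Interface is up but reports a busy transmit side.
stalled = "UP" in names and "OACTIVE" in names
print(hex(value), stalled)  # prints: 0x8c43 True
```

  OACTIVE can also appear briefly during normal operation under load,
  so a single sample proves little. Here the flag stays set the whole
  time.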

  - When I run ifconfig re0 down, the device doesn't go down until I
    also type ifconfig re0 up. In the meantime, ifconfig still says that
    the device is active, and /var/log/messages doesn't mention that it
    has gone down.
    When I then type ifconfig re0 up, the device goes down and
    immediately comes back up, but the network still doesn't work.
    However, I no longer get the ENOBUFS error when I try to ping a
    remote host.
    After this procedure I can no longer ssh to this box over em(4)
    either (ping still works).
    Now, when I run /etc/rc.d/netif restart, I can connect to the
    machine over em(4) again. When I ping a remote host over re(4), I get
    ping: sendto: No route to host. When I run /etc/rc.d/routing
    restart, ping doesn't report anything, but I can again see ARP
    requests in tcpdump.

  - No interrupt storms are reported in /var/log/messages, and neither
    it nor dmesg contains anything unusual.

  I suppose it's a bug in re(4); otherwise I assume that the network
  wouldn't work over em(4) either.

  If you need any information I can provide to help debug this problem,
  please let me know. I will leave the machine in this state if the
  customer permits me to do so.

-- 
Best Regards,
   Daniel Gerzo				mailto:danger@FreeBSD.org