Date:      Fri, 27 Jul 2007 16:35:27 +0300
From:      Oleg <agile.quad@gmail.com>
To:        freebsd-net@freebsd.org
Subject:   reincarnation of bug kern/95665: [if_tun] "ping: sendto: No buffer space available"
Message-ID:  <d0fcb8ea0707270635k5c260c8fvf0dc55257782591b@mail.gmail.com>

[-- Attachment #1 --]
Hi All,

I can easily reproduce this bug with the tap echo server (attached), which I
have slightly reworked.
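
For readers less familiar with tap: unlike tun, which delivers bare IP packets, a tap device hands the program whole Ethernet frames, so every read carries a 14-byte Ethernet header in front of the IP header. The sketch below shows only that offset arithmetic. The helper name `icmp_len_in_frame` and the constant `ETH_HDR_LEN` are my own, for illustration; they are not taken from the attached program.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14  /* dst MAC (6) + src MAC (6) + EtherType (2) */

/*
 * Hypothetical helper: given a raw Ethernet frame that carries IPv4,
 * return the number of bytes following the IP header (for an echo
 * request, the length of the ICMP message), or -1 if the frame is too
 * short or its length fields are inconsistent.
 */
long icmp_len_in_frame(const uint8_t *frame, size_t frame_len)
{
    size_t ip_hdr_len, ip_total_len;

    if (frame_len < ETH_HDR_LEN + 20)
        return -1;

    /* Low nibble of the first IP byte is the header length in 32-bit words. */
    ip_hdr_len = (size_t)(frame[ETH_HDR_LEN] & 0x0F) * 4;

    /* Bytes 2..3 of the IP header are the total length, big-endian. */
    ip_total_len = ((size_t)frame[ETH_HDR_LEN + 2] << 8) | frame[ETH_HDR_LEN + 3];

    if (ip_hdr_len < 20 || ip_total_len < ip_hdr_len ||
        ETH_HDR_LEN + ip_total_len > frame_len)
        return -1;

    return (long)(ip_total_len - ip_hdr_len);
}
```

With the default ping payload of 56 data bytes the frame is 14 + 20 + 8 + 56 = 98 bytes long, and the helper returns 64, the figure ping reports as "64 bytes from".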

Steps (almost the same as in kern/95665):
(All IP and MAC addresses are hardcoded in the code.)

On the first machine run the echo server; on the second add a route:

root@pc2# route add -net 192.168.125.1/24 ip-addr-of-first-machine

and

root@pc2# ping -f -n 192.168.125.2

While the flood ping is running, run a normal ping on the first machine as a check:

root@pc1#  ping 192.168.125.2
PING 192.168.125.2 (192.168.125.2): 56 data bytes
64 bytes from 192.168.125.2 : icmp_seq=0 ttl=64 time=0.554 ms
64 bytes from 192.168.125.2: icmp_seq=1 ttl=64 time=0.180 ms
...
wait for a while
...
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
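
The text ping prints here is the error string for errno ENOBUFS as returned by sendto(2). If someone wants to probe this from their own test program instead of ping, a rough sketch of a send wrapper that retries a few times on ENOBUFS could look like the following. The name `sendto_retry`, the retry count and the 10 ms pause are arbitrary choices of mine, not something taken from ping or from the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Hypothetical helper: call sendto(2) and, if it fails with ENOBUFS,
 * pause briefly and try again, up to max_tries attempts in total.
 * Returns the number of bytes sent, or -1 with errno set.
 */
ssize_t sendto_retry(int fd, const void *buf, size_t len,
                     const struct sockaddr *to, socklen_t tolen,
                     int max_tries)
{
    const struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms, arbitrary */
    ssize_t n = -1;
    int i;

    if (max_tries <= 0) {
        errno = EINVAL;
        return -1;
    }
    for (i = 0; i < max_tries; i++) {
        n = sendto(fd, buf, len, 0, to, tolen);
        if (n >= 0 || errno != ENOBUFS)
            break;                      /* success, or some other error */
        if (i + 1 < max_tries)
            nanosleep(&pause, NULL);    /* give the queue time to drain */
    }
    return n;
}
```

If the call still fails with ENOBUFS after all attempts, the output queue is evidently not draining, which is the state shown above.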

With best regards,
    Oleg Dolgov.

[-- Attachment #2 --]
/*
Ping reply server:-) using TUN (FreeBSD 6.0).

Test originally made to test the throughput of TUN using 'ping' in flood
ping mode. However there seems to be a bug in FreeBSD that when sending
a massive amount of requests to the opened TUN interface it will "hang".
Ping will say "ping: sendto: No buffer space available"

I have searched google for this error and many people have reported it
with no solution presented. The problem seems to exist for a number of
network drivers (I suffer this with the nve driver as well). This is
a simple test program to produce this bug with a TUN interface.

This test assumes you have a lan with two PCs. The FreeBSD box is PC1.

  +-- LAN --+
  |         |
 PC1       PC2

I'm lazy so to configure the TUN interface on PC1 do:
# sysctl net.inet.ip.forwarding=1
# cat /dev/tap
# ^C
# ifconfig

You should see at least one tap interface listed. Replace the
define TUN_DEVICE in this file with the listed interface. e.g.
/dev/tap1

Compile this file like this:
# gcc -o tap_icmp_echo_reply tun_icmp_echo_reply.c

Run the program:
# ./tap_icmp_echo_reply

Configure the TUN interface. The reason it's done now is because when
the TUN device is opened by the program the route is removed.

The IP address 10.0.0.4 should be replaced by the IP address of the
external interface of PC1.

# ifconfig tap1 inet 10.0.0.4 192.168.254.33 netmask 255.255.255.255 

On PC1, 'ifconfig' should show this:

tap1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500
        inet 10.0.0.4 --> 192.168.254.33 netmask 0xffffffff 
        inet6 fe80::213:d4ff:fe83:e378%tap1 prefixlen 64 scopeid 0x6 
        Opened by PID 1253

'netstat -nr' should show this:

192.168.254.33     10.0.0.4           UH          0        0   tap1

Okay, now to the tests. If you flood ping this program locally from
PC1 you will suffer from the "No buffer space available" syndrome
after some X thousand packets. If you on the other hand flood ping
from PC2 over the LAN everything will be dandy. Probably because the
LAN is slower. I have sent 1,000,000 packets this way with no problem.

Do this locally:
# ping -c 100000 -s 972 -f -n -q 192.168.254.33

You can run the 'ifstat' program in another terminal window to see
when nothing goes to the TUN interface anymore.

If the ping command above seems to have frozen, do a ^C to abort it.
If it doesn't freeze, up the number of packets to send.

# ping 192.168.254.33
PING 192.168.254.33 (192.168.254.33): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available

Voila!

Sending fewer packets (so ping doesn't hang) confirms that no packets
are lost. So I don't see how any buffer space could be used up.

# ping -c 20000 -s 972 -f -n -q 192.168.254.33
PING 192.168.254.33 (192.168.254.33): 972 data bytes

--- 192.168.254.33 ping statistics ---
20000 packets transmitted, 20000 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.013/0.020/0.191/0.004 ms


To ping from PC2 you have to setup a route to PC1 of course but I don't
describe how to do that.

Remember that you have to configure the tap interface with ifconfig
every time you restart this program (since the route is removed when
the tap interface is opened).

Some information about PC1 in my case:
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x20fb1  Stepping = 1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,<b25>,LM,3DNow+,3DNow>
  Hyperthreading: 2 logical CPUs
real memory  = 1073676288 (1023 MB)
avail memory = 1037393920 (989 MB)
ACPI APIC Table: <Nvidia AWRDACPI>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
*/

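#include <stdio.h>
#include <stdlib.h>
#include <string.h>      /* memcpy */
#include <unistd.h>      /* read, write, close */
#include <fcntl.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>  /* ntohs, htons */
#include <arpa/inet.h>   /* inet_addr */
#include <net/ethernet.h>
#include <net/if_arp.h>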

typedef unsigned char  U8;
typedef unsigned short U16;
typedef unsigned int   U32;

#define TAP_DEVICE   "tap3"
#define TAP_MAC      "00:13:EC:00:00:AE"
#define TAP_IP       "192.168.125.2"

#define TAP_INET     "192.168.125.1"
#define TAP_INET_MAC "00:13:EC:12:34:56"
#define TAP_MASK     "255.255.255.0"
#define TAP_BCAST    "192.168.125.255"

typedef struct
{
    U8  ver_hdr_len;     /* Version, Header Length (in words) */
    U8  tos;             /* Type of Service */
    U16 pkt_len;         /* Packet Length (in bytes) */
    U16 id;              /* Identification */
    U16 flags_frag_off;  /* Flags, Fragment Offset */
    U8  ttl;             /* Time to Live */
    U8  proto;           /* Protocol */
    U16 cksum;           /* Header Checksum */
    U32 s_addr;          /* Source Address */
    U32 d_addr;          /* Destination Address */
} IpHeader_t;

typedef struct
{
    U8  type;
    U8  code;
    U16 cksum;
    U16 id;
    U16 seqno;
} IcmpHeader_t;

U32 IpHeader_GetHeaderLength(IpHeader_t* p);

U32 IpHeader_GetPacketLength(IpHeader_t* p);

U32 IpHeader_GetProtocoll(IpHeader_t* p);

void IpHeader_SwapAddr(IpHeader_t* p);

void EthHeader_SwapMac(struct ether_header* p);

void ArpHeader_SwapAddr(struct arphdr *p);

U16 IpCksum(U16* data, int n_bytes);

void HexDump(U8* p, int n_bytes);

/*
 * Read from previously setup /dev/tapX device, calculate ICMP echo reply and write it back.
 */
int main ()
{
    IpHeader_t*   ip_hdr;
    IcmpHeader_t* icmp_hdr;
    U32 buf[400];  /* Should cover MTU = 1500 */
    U32 s_addr;
    U32 ip_hdr_len;
    U32 ip_pkt_len;
    U32 icmp_pkt_len;
    int tap_fd;
    int n_pkts = 0;
	struct ether_header *eth_hdr;
	struct arphdr *arp_hdr;

    tap_fd = open("/dev/"TAP_DEVICE, O_RDWR);
    if (tap_fd == -1)
    {
        perror("failed to open /dev/" TAP_DEVICE);
        exit(1);
    }

	if (system("ifconfig " TAP_DEVICE " link " TAP_INET_MAC) != 0 ||
		system("ifconfig " TAP_DEVICE " inet " TAP_INET " netmask " TAP_MASK " broadcast " TAP_BCAST) != 0)
	{
		printf("can't configure " TAP_DEVICE " device\n");
		exit(2);
	}

    /* packet (pkt) = header (hdr) + payload */
    
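    /* Compare as int so that a read() error (-1) ends the loop instead of wrapping to a huge unsigned value. */
    while ((int)(ip_pkt_len = read(tap_fd, buf, sizeof(buf))) >= (int)sizeof(IpHeader_t))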
    {
		//printf("readed %d\n", ip_pkt_len);
		eth_hdr = (struct ether_header*) buf;
		if (ntohs(eth_hdr->ether_type) == ETHERTYPE_ARP) {
			arp_hdr = (struct arphdr*) (eth_hdr+1);
			if (ntohs(arp_hdr->ar_op) != ARPOP_REQUEST) {
				printf("E: unknown arp op 0x%08X\n", arp_hdr->ar_op);
				continue;
			}
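			/* Compare as a 32-bit value: u_long is 64 bits wide on LP64 platforms. */
			if (inet_addr(TAP_IP) != *(U32*)ar_tpa(arp_hdr)) {
				const U8* tpa = (const U8*)ar_tpa(arp_hdr);
				printf("E: not my ip address %u.%u.%u.%u\n", tpa[0], tpa[1], tpa[2], tpa[3]);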
				continue;
			}
			arp_hdr->ar_op = htons(ARPOP_REPLY);
			memcpy(ar_tha(arp_hdr), ether_aton(TAP_MAC), ETHER_ADDR_LEN);
			ArpHeader_SwapAddr(arp_hdr);
			//memcpy(eth_hdr->ether_dhost, ether_aton(TAP_MAC), ETHER_ADDR_LEN);
			EthHeader_SwapMac(eth_hdr);
			write(tap_fd, buf, ip_pkt_len);
			//printf("arp reply sended (%lu)\n", ip_pkt_len);
			continue;
		}
		if (ntohs(eth_hdr->ether_type) != ETHERTYPE_IP) {
			printf("E: unknown packet 0x%08X\n", eth_hdr->ether_type);
			continue;
		}
        ip_hdr       = (IpHeader_t*)(eth_hdr+1);
        ip_hdr_len   = IpHeader_GetHeaderLength(ip_hdr);
        icmp_hdr     = (IcmpHeader_t*)((char*)ip_hdr + ip_hdr_len*4);
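        /* ip_pkt_len is the full frame length read from tap, so subtract the Ethernet header too. */
        icmp_pkt_len = ip_pkt_len - ETHER_HDR_LEN - (ip_hdr_len * 4);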

        if (ip_pkt_len - ETHER_HDR_LEN != IpHeader_GetPacketLength(ip_hdr))
        {
            printf("E: read data length does not match IP packet length (%d %d)\n",
                   ip_pkt_len, IpHeader_GetPacketLength(ip_hdr));
            continue;
        }
        
        if (IpHeader_GetProtocoll(ip_hdr) != 1)
        {
            printf("E: No ICMP IP packet!\n");
            continue;
        }
        
        if (icmp_pkt_len < sizeof(*icmp_hdr))
        {
            printf("E: No ICMP data in IP packet!\n");
            continue;
        }
        
        if (icmp_hdr->type != 8)
        {
            printf("E: No ICMP echo message n_pkts=%d!\n", n_pkts);
            continue;
        }

        IpHeader_SwapAddr(ip_hdr);
        
        icmp_hdr->type  = 0;
        icmp_hdr->cksum = 0;

        icmp_hdr->cksum = IpCksum((U16*)icmp_hdr, icmp_pkt_len);

		EthHeader_SwapMac(eth_hdr);

        /* Send back packet. */
        write(tap_fd, buf, ip_pkt_len);
        
        n_pkts++;
    }
    
    printf("That's enough for today kids!\n");
    
    close(tap_fd);
    
    return 0;
}


/* Get header length in 4 byte words. */
U32 IpHeader_GetHeaderLength(IpHeader_t* p)
{
    return p->ver_hdr_len & 0x0F;
}

U32 IpHeader_GetPacketLength(IpHeader_t* p)
{
    return ntohs(p->pkt_len);
}

U32 IpHeader_GetProtocoll(IpHeader_t* p)
{
    return p->proto;
}

void IpHeader_SwapAddr(IpHeader_t* p)
{
    U32 s_addr = p->s_addr;
    p->s_addr  = p->d_addr;
    p->d_addr  = s_addr;
}

void EthHeader_SwapMac(struct ether_header* p)
{
	u_char ether_host[ETHER_ADDR_LEN];

	memcpy(ether_host,     p->ether_dhost, ETHER_ADDR_LEN);
	memcpy(p->ether_dhost, p->ether_shost, ETHER_ADDR_LEN);
	memcpy(p->ether_shost, ether_host,     ETHER_ADDR_LEN);
}

void ArpHeader_SwapAddr(struct arphdr *p)
{
	caddr_t sha = ar_sha(p);
	caddr_t tha = ar_tha(p);
	size_t const tsz = p->ar_hln + p->ar_pln;
	u_char t[2 * 255];	/* ar_hln and ar_pln are one byte each, so tsz <= 510 */

	/* Swap the (hardware, protocol) address pairs of sender and target. */
	memcpy(t, sha, tsz);
	memcpy(sha, tha, tsz);
	memcpy(tha, t, tsz);
}

void HexDump(U8* p, int n_bytes)
{
	while(n_bytes--) {
		printf("%02X ", (int) *p++);
	}
	printf("\n");
}


/**
 * Checksum routine for Internet Protocol family headers (C Version)
 *
 * @param data     Pointer to data to calculate checksum for.
 * @param n_bytes  Length in bytes.
 */
U16 IpCksum(U16* data, int n_bytes)
{
    int nleft;
    int sum;
    U16 *w;
    union {
        U16 us;
        U8  uc[2];
    } last;
    U16 answer;

    nleft = n_bytes;
    sum = 0;
    w = data;

    /*
     * Our algorithm is simple, using a 32 bit accumulator (sum), we add
     * sequential 16 bit words to it, and at the end, fold back all the
     * carry bits from the top 16 bits into the lower 16 bits.
     */
    while (nleft > 1)
    {
        sum += *w++;
        nleft -= 2;
    }

    /* mop up an odd byte, if necessary */
    if (nleft == 1)
    {
        last.uc[0] = *(U8 *)w;
        last.uc[1] = 0;
        sum += last.us;
    }

    /* add back carry outs from top 16 bits to low 16 bits */
    sum = (sum >> 16) + (sum & 0xffff);     /* add hi 16 to low 16 */
    sum += (sum >> 16);                     /* add carry */
    answer = ~sum;                          /* truncate to 16 bits */
    
    return(answer);
}
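
As a cross-check for the routine above, here is a small byte-wise variant that does not depend on host byte order or on the alignment of the buffer. It is only a sketch; the name `inet_cksum_bytes` is mine, and it returns the checksum as a number whose high byte goes first on the wire.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: Internet checksum (one's complement of the
 * one's-complement sum of big-endian 16-bit words) over n bytes.
 * An odd trailing byte is treated as the high byte of a final word.
 */
uint16_t inet_cksum_bytes(const uint8_t *data, size_t n)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < n; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    if (i < n)                          /* odd length: pad with a zero byte */
        sum += (uint32_t)data[i] << 8;

    while (sum >> 16)                   /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A handy property for testing: once the checksum has been stored in the message (high byte first), running the routine over the whole message again yields 0. For the 8-byte echo request header 08 00 00 00 12 34 00 01 the routine returns 0xE5CA.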
