From: Mitya
Date: Mon, 20 Aug 2012 17:46:12 +0300
To: freebsd-hackers@freebsd.org, freebsd-net@freebsd.org
Subject: Replace bcopy() to update ether_addr

Hi.

I found some unnecessary overhead in /src/sys/net/if_ethersubr.c and /src/sys/netgraph/ng_ether.c. They contain lines like

    bcopy(src, dst, ETHER_ADDR_LEN);

where src and dst are "struct ether_addr *" and ETHER_ADDR_LEN is 6. This code runs every time we send an Ethernet packet. For example, on my machine it is invoked nearly 20K times per second.

Why do we use bcopy() to copy only 6 bytes? The answer: on some architectures we cannot directly copy unaligned data.

I propose the following solution.

In the file /usr/src/include/net/ethernet.h, add these lines:

    static inline void
    ether_addr_copy(const struct ether_addr *src, struct ether_addr *dst)
    {
    #if defined(__i386__) || defined(__amd64__)
            /* x86 tolerates unaligned loads and stores */
            *dst = *src;
    #else
            bcopy(src, dst, ETHER_ADDR_LEN);
    #endif
    }

On i386, gcc produces code like this:

    leal    -30(%ebp), %eax
    leal    6(%eax), %ecx
    leal    -44(%ebp), %edx
    movl    (%edx), %eax
    movl    %eax, (%ecx)
    movzwl  4(%edx), %eax
    movw    %ax, 4(%ecx)

And clang produces this:

    movl    -48(%ebp), %ecx
    movl    %ecx, -26(%ebp)
    movw    -44(%ebp), %si
    movw    %si, -22(%ebp)

Both of these variants are much faster than calling bcopy().
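
For anyone who wants to try the idea outside the kernel, here is a minimal userland sketch of the proposed helper. It is only an illustration under some assumptions: the struct ether_addr below is a hypothetical stand-in (a plain 6-byte array), not the real definition from the header; memcpy() replaces bcopy() in the fallback branch so the file builds with standard C only; and struct frame_hdr and ether_addr_copy_selftest() are names invented here to exercise copies at odd offsets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6        /* same value as quoted in the message */

/* Hypothetical stand-in for the kernel's struct ether_addr. */
struct ether_addr {
	uint8_t octet[ETHER_ADDR_LEN];
};

/* Argument order (src, dst) follows bcopy() and the proposal. */
static inline void
ether_addr_copy(const struct ether_addr *src, struct ether_addr *dst)
{
#if defined(__i386__) || defined(__amd64__) || defined(__x86_64__)
	*dst = *src;                        /* plain struct assignment */
#else
	memcpy(dst, src, ETHER_ADDR_LEN);   /* userland stand-in for bcopy() */
#endif
}

/* Hypothetical header layout: the pad byte puts both addresses at odd offsets. */
struct frame_hdr {
	uint8_t pad;
	struct ether_addr dst;
	struct ether_addr src;
};

/* Copies src to dst inside the header; returns 1 if the bytes match. */
static int
ether_addr_copy_selftest(void)
{
	static const uint8_t mac[ETHER_ADDR_LEN] =
	    { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };
	struct frame_hdr h;

	assert(sizeof(struct ether_addr) == ETHER_ADDR_LEN);
	memset(&h, 0, sizeof(h));
	memcpy(h.src.octet, mac, ETHER_ADDR_LEN);
	ether_addr_copy(&h.src, &h.dst);
	return (memcmp(h.dst.octet, mac, ETHER_ADDR_LEN) == 0);
}
```

A note on the design: the stand-in struct here has byte alignment, so the plain assignment should be valid C on any architecture and the compiler is expected to pick a safe instruction sequence by itself. Whether that also holds for the real struct ether_addr, and whether the #if is needed at all, should be checked against the actual header and the generated code on each platform.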