From: Mitya
Date: Wed, 22 Aug 2012 17:26:47 +0300
To: Luigi Rizzo, freebsd-net@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: speed tests (Re: Replace bcopy() to update ether_addr)
Message-ID: <5034EC27.1070203@cabletv.dp.ua>
In-Reply-To: <20120822143632.GA64686@onelab2.iet.unipi.it>
References: <20120821112415.GA50078@onelab2.iet.unipi.it> <201208220232.q7M2WLCL020204@ref10-i386.freebsd.org> <20120822143632.GA64686@onelab2.iet.unipi.it>

22.08.2012 17:36, Luigi Rizzo wrote:
> On Wed, Aug 22, 2012 at 02:32:21AM +0000, Bruce Evans wrote:
>> luigi wrote:
>>
>>> even more orthogonal:
>>>
>>> I found that copying 8n + (5, 6 or 7) bytes was much much slower than
>>> copying a multiple of 8 bytes. For n=0, 1,2,4,8 bytes are efficient,
>>> other cases are slow (turned into 2 or 3 different writes).
>>>
>>> The netmap code uses a pkt_copy routine that does exactly this
>>> rounding, gaining some 10-20ns per packet for small sizes.
>> I don't believe 10-20ns for just the extra bytes. memcpy() ends up
>> with a movsb to copy the extra bytes. This can be slow, but I don't
>> believe 10-20ns (except on machines running at i486 speeds of course).
> I am adding at the end a test program so people can try things on their hw.
>
> Build it with
>
> cc -O2 -Werror -Wall -Wextra -lpthread -lrt testlock.c -o testlock
>

# uname -a
FreeBSD m18.cabletv.dp.ua 9.0-STABLE FreeBSD 9.0-STABLE #1: Tue Apr 24 13:23:05 EEST 2012 root@m18.cabletv.dp.ua:/usr/src/sys/i386/compile/m18 i386

cc -O2 -Werror -Wall -Wextra -lpthread -lrt testlock.c -o testlock
testlock.c: In function 'test_rdtsc':
testlock.c:151: error: can't find a register in class 'AD_REGS' while reloading 'asm'
testlock.c:151: error: 'asm' operand has impossible constraints
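
A note on the error above: testlock.c is not included in this message, so the following is only a guess at the cause. On i386, AD_REGS is the register class GCC uses for the "A" constraint (the EAX:EDX pair), and this error usually means the asm statement at line 151 asks for that pair while other operands or clobbers also need EAX or EDX. One common way to read the TSC that builds on both i386 and amd64 is to request the two halves separately with "=a" and "=d". The helper name read_tsc() below is hypothetical and does not come from testlock.c; the non-x86 branch is only a fallback so the sketch compiles elsewhere.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not taken from testlock.c): return a 64-bit
 * timestamp. On x86 this is the TSC; elsewhere it is a monotonic
 * clock reading in nanoseconds.
 */
static inline uint64_t
read_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t lo, hi;

	/*
	 * rdtsc leaves the low 32 bits in EAX and the high 32 bits in EDX.
	 * Asking for them as two separate outputs avoids the "A" constraint,
	 * so the compiler does not need the combined AD_REGS class.
	 */
	__asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
	return (((uint64_t)hi << 32) | lo);
#else
	/* Assumption: a POSIX monotonic clock is an acceptable stand-in. */
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
#endif
}
```

If test_rdtsc() does something along these lines already, the conflict is more likely in the clobber list or in a second asm operand on the same line, and this sketch will not help.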