From: Luigi Rizzo
To: Luigi Rizzo, Marcus Cenzatti, "freebsd-net@freebsd.org", Navdeep Parhar
Date: Sat, 23 Jan 2016 16:11:02 -0800
Subject: Re: Chelsio T520-SO-CR low performance (netmap tested) for RX
In-Reply-To: <20160123211816.GE4574@ox>
References: <20160123053428.2091EA0121@smtp.hushmail.com>
 <20160123154052.GA4574@ox>
 <20160123171300.0F448A0121@smtp.hushmail.com>
 <20160123174840.32B1DA0121@smtp.hushmail.com>
 <20160123183836.GB4574@ox>
 <20160123211816.GE4574@ox>
List-Id: Networking and TCP/IP with FreeBSD

On Sat, Jan 23, 2016 at 1:18 PM, Navdeep Parhar wrote:
> On Sat, Jan 23, 2016 at 11:12:28AM -0800, Luigi Rizzo wrote:
>> On Sat, Jan 23, 2016 at 10:38 AM, Navdeep Parhar wrote:
>> > On Sat, Jan 23, 2016 at 03:48:39PM -0200, Marcus Cenzatti wrote:
>> > ...
>> >>
>> >> woops, my bad, yes probably we had some drop, with -S and -D now I get 1.2Mpps.
>> >
>> > Run "netstat -hdw1 -i cxl" on the receiver during your test.
>>
>> Navdeep, does this give any info on the ncxl port rather
>> than the cxl port connected to the host stack ?
>
> You're right, it should have been "netstat -hdw 1 -I ncxl".  In these
> kinds of experiments it might even be best to run two netstats in
> parallel on cxl and ncxl.
>
>>
>> ...
>> > Do you know if the transmitter will pad up so as not to put runts on the
>> > wire?  If not then you might want to bump up the size of the frame
>> > explicitly (there's some pkt-gen knob for this).
>> >
>>
>> ix/ixl do automatic padding, and in any case pkt-gen
>> by default generates valid packet sizes (and so it does
>> with the variable-size tests I suggested).
>>
>> Is there any parameter that controls interrupt moderation ?
>>
>> In any case we need to know the numbers when sending to the
>> ncxl MAC address as opposed to broadcast.
>>
>> I suspects one of these problems:
>>
>> - the interrupt moderation interval is too long thus limiting
>>   the rate to one ring/interval. Unlikely though, even
>>   with 1k slots, the measured 1.2 Mpps corresponds to almost
>>   1ms which is too long
>>
>> - the receiver cannot cope with the input load and somehow
>>   takes a long time to recover from congestion. If this is
>>   the case, running the sender at a lower rate might reach
>>   a peak throughput > 1.2 Mpps when the receiver can still
>>   keep up, and then drop to the low rate under congestion.
>>
>> - and of course bus errors, when the device is connected on
>>   a PCIe slot with only 1-2 data lanes.
>>   This actually happens a lot, physical connector sizes
>>   do not reflect the number of active PCIe lanes.
>
> There are no drops or PAUSE or any sign of backpressure.  The netstat
> counters show 900K incoming and 0 drops/errors, which means 900K packets
> on the wire for the port and all were delivered to the driver
> successfully.

I am not 100% convinced by the above explanation, both because of the
way the experiment was conducted (broadcast destination, duplication
in the hw to two queues, stats only from cxl0 and not ncxl0), and
because sending to the unicast DMAC already showed a higher throughput.

So I suspect the 900k counter does not actually reflect packets on the wire.
Where and how the drops are counted I have no idea (presumably
some internal counter in the NIC, but for broadcast traffic
replicated to multiple queues I don't know which counter
tracks the drop: one per queue, a global one, etc.).

> The mismatch in the transmitter's counter and the incoming counter can
> only be explained by
> a) Frames whose DMAC address didn't match the local interface's MAC.

the DMAC was the same for all frames.

> This can be tested by switching cxl0 and ncxl0 to promisc mode to see if
> that opens the flood gates.
> b) Frames mangled badly enough to be discarded.  But these should show
> as an error or drop in at least one of these:
>
> sysctl dev.cxl..stats
> sysctl -n dev.t5nex.0.misc.tp_err_stats

one question -- do the sysctls for cxl. and t5nex.0 also report
the ncxl* stats, or do you have a separate entry for those ?

cheers
luigi
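
For reference, the interrupt-moderation estimate quoted earlier in the thread (one ring of 1k slots drained per interval, at the measured 1.2 Mpps) can be checked with a quick calculation. This is only a sketch: the function name is made up for illustration, and "1k slots" is assumed to mean 1024 descriptors.

```python
def ring_interval_us(slots: int, rate_pps: float) -> float:
    """Time needed to fill one ring of `slots` descriptors at `rate_pps`,
    in microseconds. If the NIC delivered exactly one full ring per
    interrupt, this would be the implied moderation interval."""
    return slots / rate_pps * 1e6


# Assumed inputs: 1024 slots ("1k" in the thread) and the measured 1.2 Mpps.
interval = ring_interval_us(1024, 1.2e6)
print(f"{interval:.0f} us per ring")
```

The result is roughly 0.85 ms, i.e. the "almost 1ms" mentioned in the quoted text, which is why an overlong moderation interval looks like an unlikely explanation.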