From: Charlie Smurthwaite <charlie@atech.media>
Date: Tue, 2 Jan 2018 23:07:01 +0000
Subject: Re: Linux netmap memory allocation
To: Vincenzo Maffione
Cc: "freebsd-net@freebsd.org"
List-Id: Networking and TCP/IP with FreeBSD

Hi Vincenzo,

>> I am using poll(), and I am not specifying NETMAP_NO_TX_POLL, and have found that sometimes frames are sent only when the TX buffer is full, and sometimes they are not sent at all. They are never sent as expected on every invocation of poll(). If I run ioctl(NIOCTXSYNC) manually, everything works correctly. I assume I have simply missed something from my nmreq.

> I don't think you have missed anything within nmreq. I see that you are waiting for POLLIN only (and this is right in your router case), so poll() will actually invoke txsync on interface #i only when netmap intercepts an RX or TX interrupt on interface #i. This means that packets may stall for a long time in the TX rings if you don't call ioctl(TXSYNC). The manual is not wrong, however. You can look at the apps/bridge/bridge.c example to understand where this "poll automatically calls txsync" thing is useful.
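The pattern discussed in this thread (sync a TX ring explicitly, but only when frames were actually queued on it) can be sketched roughly as below. This is a simplified model, not netmap code: struct model_txring, model_txsync(), model_enqueue() and forward_batch() are hypothetical names, the ring is a plain counter rather than a real struct netmap_ring, and model_txsync() merely stands in for ioctl(fd, NIOCTXSYNC).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IFS 4 /* assumed upper bound on interfaces, for illustration */

/* Simplified stand-in for a TX ring; NOT the real struct netmap_ring. */
struct model_txring {
    unsigned space;      /* free slots, analogous to nm_ring_space() */
    unsigned queued;     /* frames queued since the last sync */
    unsigned sync_calls; /* number of times the stub sync ran */
    unsigned dropped;    /* frames dropped because the ring was full */
};

/* Stub for ioctl(fd, NIOCTXSYNC). In this model the slots of all queued
 * frames are reclaimed immediately, which real hardware does not promise. */
static void model_txsync(struct model_txring *r)
{
    r->space += r->queued;
    r->queued = 0;
    r->sync_calls++;
}

/* Queue one frame; returns false and counts a drop if the ring is full. */
static bool model_enqueue(struct model_txring *r)
{
    if (r->space == 0) {
        r->dropped++;
        return false;
    }
    r->space--;
    r->queued++;
    return true;
}

/* Forward one batch: out_if[i] is the output interface chosen for frame i.
 * Only rings that received at least one frame are synced afterwards. */
static void forward_batch(struct model_txring *rings, size_t nifs,
                          const unsigned *out_if, size_t nframes)
{
    bool dirty[MAX_IFS] = { false };

    assert(nifs <= MAX_IFS);
    for (size_t i = 0; i < nframes; i++) {
        unsigned o = out_if[i];
        assert(o < nifs);
        if (model_enqueue(&rings[o]))
            dirty[o] = true;
    }
    for (size_t i = 0; i < nifs; i++) {
        if (dirty[i])
            model_txsync(&rings[i]);
    }
}
```

In a real program the batch would come from the RX rings after poll() returns with POLLIN, and the per-interface dirty flag keeps the relatively expensive sync call off interfaces that had nothing to send.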

Thank you for the clarification. I have now altered my code to call TXSYNC after each iteration, but only if I have modified the TX ring for that interface. This seems to work perfectly. The patch can be seen at https://github.com/catphish/netmap-router/commit/2961ab16f14a8b2a2561c9d73f73857e523cc177

>> You also mentioned: "whether netmap calls or does not call txsync/rxsync on certain rings depends on the parameters passed to nm_open()". I do not use the nm_open helper method, but I am extremely interested to know what parameters would affect this behaviour, as this would seem very relevant to my problem.

> Yes, we do not normally use the low level interface (ioctl(REGIF)), because it's just simpler to use the nm_open() interface. Within the first parameter of nm_open() you can specify to open just one RX/TX ring pair, e.g. with "enp1f0s1-3". Then you usually want to mmap() just once (as you do in your program); with nm_open(), you do that with the NM_OPEN_NO_MMAP flag.

I did look at nm_open, and even read the source of nm_open to discover how to implement the shared memory, but (for no good reason) I preferred to set up the interface manually.

>> If you are interested or if it helps explain my question, my complete code (hopefully well commented but far from complete) can be found here: https://github.com/catphish/netmap-router/blob/58a9b957c19b0a012088c491bd58bc3161a56ff1/router.c

>> Specifically, if the ioctl call at line 92 is removed, the code does not work (packets are not transmitted, or are only transmitted when the buffer is full; which of these two behaviours occurs seems to be random). However, I would expect it to work because I do not specify NETMAP_NO_TX_POLL, and I would therefore hope that the poll() call on line 80 would have the same effect.

> Yes, that depends on when netmap_poll() is called by the kernel, which depends on when something is ready for receive on the file descriptor.
> Looking at your program, I think you need to call ioctl(TXSYNC), at least because you don't want to introduce artificial/unbounded latency. However, since these calls are expensive, you could use them only when necessary (e.g. when nm_ring_space(txring) == 0 or when you actually forwarded some packets on txring).

Per the patch above I now call TXSYNC on an interface only after pushing a batch of packets to it, and this seems to work perfectly, at least with a good balance between performance and latency. If nm_ring_space(txring) == 0 I just drop frames until the next batch. I don't TXSYNC partway through a batch; it hasn't yet seemed necessary, but I may need to look into this later.

I'm running this on a 6-core 2.8GHz Xeon with a 4-port i350-T4 NIC. I thought I'd just post some stats of the performance I observe using my code (excluding the routing table lookup as this isn't relevant to netmap). Not really looking for any advice here, just thought I'd share my results.

All examples are with 1.488Mpps (1 x 1Gbps) input and no packet loss observed:

1 thread  - CPU usage = 100%, batch size = 4
2 threads - CPU usage = 54% (27% x 2), batch size = 12
4 threads - CPU usage = 98% (25% x 4), batch size = 8
6 threads - CPU usage = 124% (21% x 6), batch size = 8

And again with 2.976Mpps (2 x 1Gbps) input and no packet loss observed:

1 thread  - CPU usage = 100%, batch size = 12
2 threads - CPU usage = 68% (34% x 2), batch size = 21
4 threads - CPU usage = 100% (25% x 4), batch size = 17
6 threads - CPU usage = 105% (18% x 6), batch size = 16

These results seem excellent and demonstrate that netmap is scaling as expected with both threads and packet volume. The higher thread count will be more beneficial when I am doing more processing on each packet.

>> I hope this all makes sense, and again, I hope I have simply missed something from the nmreq I pass to NIOCREGIF.

>> It is worth mentioning that with the exception of this problem / confusion, I am getting extremely good results from this code and netmap in general.

> That's nice to hear :)
> Your program looks simple enough that we could even add it to the examples (as an example of routing logic).

I'd be very happy to contribute to the documentation in any way that may be helpful. I have added a permissive licence to my GitHub repository just in case my code is of use to anyone else. It is currently somewhat incomplete as an IPv4 router, as it doesn't update MAC addresses on frames before forwarding them and the interface names are hardcoded, but when it's more complete I'd be very happy for it to be contributed to the examples. Of course anyone is free to use my code for any purpose too.

Thanks for all your assistance! I'm happy enough with this that I will move on to looking at my IP routing code.

Charlie

Charlie Smurthwaite
Technical Director
email. charlie@atech.media
web. https://atech.media