From: "Jonathan T. Looney"
To: Ryan Stone
CC: Gleb Smirnoff, "freebsd-transport@freebsd.org"
Date: Fri, 18 Dec 2015 20:32:10 -0500
Subject: Re: Extending FIBs to support multi-tenancy
List-Id: Discussions of transport level network protocols in FreeBSD

On 12/18/15, 5:26 PM, "owner-freebsd-transport@freebsd.org on behalf of
Ryan Stone" wrote:

>- they may use independent routing tables
[...]
>- traffic from different tenant networks is not guaranteed to be
>segregated
>in any way -- it might all come in the same network interface, without any
>vlan tagging or any other encapsulation that might differentiate tenant
>networks

The combination of these two requirements seems slightly odd to me.
Usually, you need separate routing tables because you have separate
interfaces. When you have shared interfaces, you can usually use the
same routing table. I think it might help to have more information
about the reasoning for these requirements, as it seems that this
combination is what is leading you towards making the FIB assignment
be an address property.

>1)
>We don't really want to change all of our services to instantiate one
>listening socket for every tenant network. Instead we're looking at
>implementing (and upstreaming) a kernel extension that allows a listening
>socket to be wildcarded across all FIBs (note: yesterday I described this
>feature as allowing us to pick-and-choose FIBs, but people internally have
>convinced me that a wildcard match would make their lives significantly
>easier). When a new connection attempt to a listening socket in this mode
>is accepted, the socket would not inherit its FIB from the listening
>socket.
>Instead, it would be set based on the local IP address of the
>connection.

Makes sense. My employer does something similar in their stack: listen
sockets can be assigned to a particular FIB or be wildcard entries that
listen in all FIBs. We haven't noticed any scaling problems, but we
typically don't have high connection setup rates, either. In any case,
I think this makes sense.

>2)
>Currently, FIBs are a property of an interface (struct ifnet). We aren't
>very enthusiastic about the prospect of having to create thousands of
>interfaces to support thousands of network interfaces. We would instead
>like to make the FIB a property of the interface address.

I don't understand the motivation for this. It would help if you would
provide more context for the use case. (See my earlier comments.)

At minimum, before proceeding, you should connect with the folks who
had talked about wanting to make changes to ifnet. (Among other things,
I think they had considered creating separate physical interface,
logical interface, and interface address constructs.) I'm not sure of
that project's current status, but I think it is still ongoing. I think
Gleb (cc'd) was involved in that, so you might want to check with him.

>3)
>The idea of a per-thread FIB has gotten the most pushback so far, and I
>understand the objection. I'll explain the problem that we're trying to
>solve with this. When a new request comes in, we may need to perform
>authentication through LDAP or Kerberos. The problem is that the existing
>open-source implementations that we are using manage sockets directly. We
>really don't want to have to go through them and make their APIs entirely
>FIB-aware -- that is far too much churn. By moving awareness of the
>current FIB into the kernel, existing calls to socket() can do the right
>thing transparently.
>
>We're not entirely happy with the solution, but the "right" way to solve
>the problem involves rototilling a number of libraries.
>Even if we could
>convince the upstream projects to take patches, it's far more work than
>we're willing to take on.

Thanks for sharing more details on the use case. It certainly helps
clarify the reasoning.

However, I wonder if this really solves all of your problems. For
example, you talk about needing to perform LDAP or Kerberos
authentication. You are already going to need to make your application
smart enough to figure out which servers to use based on the source of
the incoming request. That may or may not require adding intelligence
to your libraries to give you enough information to identify the
incoming connection.

Further, per-thread FIBs may not solve your scaling problem. You
initially stated that your objection to VNET was that you would need a
minimum of "A * B * C threads to ensure that any given service on any
single tenant network could fully utilize the system's resources to
process requests". If you assign threads to a particular FIB, then you
are back in the A * B * C scaling model that you didn't want.

On the other hand, if you maintain a smaller pool of threads and
continually reassign their FIB, you could hit interesting problems if
any of your libraries implement their own thread pools or are
event-driven (e.g. libisc2). In those cases, they may try to switch
contexts between connections as events occur. How will you ensure the
thread's FIB is always assigned correctly? It seems like this could
become quite complicated, depending on the exact situation.

Per-thread FIBs have a lot of potential concerns, ESPECIALLY when used
by programs or libraries that aren't expecting to work this way. The
biggest concerns I see are complexity and troubleshooting: you need to
make sure that every thread knows which FIB it is using and only
handles connections for that FIB. If you make one mistake, your
connection can suddenly go to the wrong place.

Just my 2c. Others may disagree.

Jonathan