From: Warner Losh
Date: Fri, 15 Dec 2017 20:28:00 -0700
Subject: Re: loader.efi architecture for replacing boot1.efi
To: Eric McCorkle
Cc: "freebsd-arch@freebsd.org", Warner Losh, Allan Jude
List-Id: Discussion related to FreeBSD architecture

On Fri, Dec 15, 2017 at 7:05 PM, Warner Losh wrote:

> On Dec 15, 2017 6:43 PM, "Eric McCorkle" wrote:
>
> On 12/15/2017 20:09, Warner Losh wrote:
>
> > This should be second. UEFI variables trump all.
> >
> > 2) If not, then attempt to read EFI vars to determine the boot
> > location
> >
> > 3) If no EFI vars are defined, and no partition was specified, fall
> > back to looking for an installed system on devices
> >
> > This is fine, so long as it is only on the device that the loader
> > loaded from.
>
> It's fine if it's configurable, but there needs to be sane behavior if
> the EFI vars aren't set.
>
> Where do we get this info for such a broken setup? Do you have actual
> examples?
>
> > 4) At the very last, do the legacy (what loader.efi currently does)
> > behavior.
> >
> > This is bogus. It violates the UEFI boot loader protocol. We must
> > abandon this legacy behavior. The behavior is actively harmful since
> > something random will boot. This has caused actual operational issues at
> > Netflix. Guessing is really bad.
>
> We can't just ditch the current behavior and break everyone's existing
> install, though. Legacy behavior should be supported at least until the
> next major release.
>
> What useful setups does this break? Absent a real example, we absolutely
> are breaking this. There is a real cost to doing this that, as the de facto
> maintainer of stand, I'm unwilling to maintain, test or commit to not
> breaking. The legacy behavior is broken and has caused me hours of pain in
> production. There has been no articulated use case this enables, especially
> since the boot loader can be interrupted to specify something in recovery
> scenarios.
>
> > Step (3) is done by attempting to stat /boot/loader.conf and
> > /boot/kernel. First, all partitions on the same disk are searched,
> > then all remaining partitions are searched.
> >
> > This should allow mechanisms like EFI vars and command-line args to
> > work without interference from the fallback mechanisms. However, it also
> > provides robustness in the face of failure modes and uninitialized
> > systems (I personally ran into a problem a while back with a Linux
> > system, where I couldn't boot with EFI, because the EFI vars weren't
> > set, because I couldn't set them if I couldn't boot with EFI; had to
> > use Shell.efi to sort out the mess...)
> >
> > More importantly, it provides a seamless transition from the way
> > things are now to the way we want things to be.
> >
> > Please provide comments and feedback.
> >
> > Please listen when I say searching all devices is actively harmful. The
> > UEFI boot manager, which I'm in the process of bringing in, offers a way
> > to specifically say what you want to boot. If someone needs something
> > complicated, they must use that moving forward. Part of what makes the
> > protocol work is loaders giving up early so the next one on the list can
> > be tried.
>
> We also have to deal with the reality that some EFI implementations are
> adversarial. We have to be able to deal with implementations that make
> it difficult to set EFI vars, or which mess with their values (Lenovo is
> particularly notorious for this).
>
> You can disable fallback mechanisms with command-line args or macros or
> whatever, but they need to be there.
>
> No. Absent a sane use case, I refuse. Give me a reasonable use case, I
> will reconsider.

So the current behavior leads to absurd results that nobody else produces,
and that we don't produce for legacy boot: if we boot loader.efi/boot1.efi
off a hard drive and find there's no kernel, we'll load off a CDROM or a
floppy if we happen to find a kernel there. That's nuts. What's more, we'll
load off a different device (say a thumb drive), which is also crazy. The
last thing you want is to accidentally pick the thumb drive recovery kernel
that happens to be in a USB slot when you have a primary and secondary
partition on two main disks, but today's behavior chooses that.

It's so crazy that I can see no benefit from supporting, testing and
maintaining this. If someone wants to recover a system, they can do it at
the boot loader prompt now (they couldn't before). If someone really wants
to boot his crazy thing, we have a new way to specify it explicitly, without
any ambiguity based on how the devices might move around.

We already support about 100 boot scenarios that are hard enough to test. I
don't want to commit to supporting this and making it 120 or 150 once you
work out all the combinatorics.
We have to trim the matrix of useless things. So absent a use case that
makes sense, that people are actually doing, I'm having a hard time
justifying keeping it around as we transition.

Warner

P.S. On x86, we support geli/nogeli, gpt/mbr, ufs/zfs, and uefi/legacy/both
(24 combinations). Plus we support booting off CDROM, netbooting, etc. For
arm and arm64 we have a similar number that are possible: zfs/ufs,
u-boot/uefi, and mbr/gpt (plus a number of different u-boot boards). For
mips we have a similar mix. On PowerPC we support 4 or 6 ways. It's just too
much to hope to test and ensure it all works. Each new thing has a
non-trivial cost, and I see zero benefit from this one more thing,
especially since it gets in the way of UEFI boot manager support.
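
As a quick check of the 24 figure in the P.S., here is a minimal Python sketch that multiplies out the four x86 options listed there. The axis names, the dictionary layout and the boot_matrix helper are illustrative assumptions, not anything taken from the loader sources.

```python
from itertools import product

# Hypothetical labels for the four x86 axes named in the P.S.
X86_AXES = {
    "encryption": ["geli", "nogeli"],
    "partitioning": ["gpt", "mbr"],
    "filesystem": ["ufs", "zfs"],
    "firmware": ["uefi", "legacy", "both"],
}


def boot_matrix(axes):
    """Return every combination that picks one choice per axis."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


x86_matrix = boot_matrix(X86_AXES)
print(len(x86_matrix))  # 2 * 2 * 2 * 3 = 24
```

Any further independent two-way option doubles that count, which is the kind of growth in the test matrix the message objects to.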