Date: Tue, 17 Nov 2015 00:06:35 -0700
From: Warner Losh <imp@bsdimp.com>
To: Alfred Perlstein <bright@mu.org>
Cc: Elizabeth Myers <elizabeth@interlinked.me>,
    Anna Wilcox <AWilcox@wilcox-tech.com>,
    "Brian McGovern (bmcgover)" <bmcgover@cisco.com>,
    freebsd-arch <freebsd-arch@freebsd.org>,
    Marius Strobl <marius@alchemy.franken.de>,
    Sean Bruno <sbruno@freebsd.org>,
    "sparc64@freebsd.org" <sparc64@freebsd.org>,
    Jordan Hubbard <jkh@mail.turbofuzz.com>
Subject: Re: Sparc64 doesn't care about you, and you shouldn't care about Sparc64
Message-ID: <E3676EF4-60AC-4501-A720-BA9BF923672D@bsdimp.com>
In-Reply-To: <564ACCB3.4070603@mu.org>
References: <563A5893.1030607@freebsd.org>
    <2AAC0EF3-528B-476F-BA9C-CDC3004465D0@bsdimp.com>
    <20151108155501.GA1901@alchemy.franken.de>
    <563F8385.3090603@freebsd.org>
    <56417100.5050600@Wilcox-Tech.com>
    <CANCZdfqO-SdjnonGzRr2H0pDon5oALsDGsmG3KOxPGRVdTbHPQ@mail.gmail.com>
    <39947478-4710-47D8-BAB1-FC93979570B6@mail.turbofuzz.com>
    <f4d1114833994331bd1fd2273f305abc@XCH-RTP-005.cisco.com>
    <5646D19C.9010304@interlinked.me>
    <CANCZdfoH7i9MBxjw1j4Pc3CpiZP=aP5vah2ay38cazkc7%2BreTA@mail.gmail.com>
    <564A889C.9070209@mu.org>
    <CANCZdfr2Ce3FAaAwJukcbuTvAN=1DDvEsRQHg6HdZCHamqLd_Q@mail.gmail.com>
    <564ACCB3.4070603@mu.org>
> On Nov 16, 2015, at 11:44 PM, Alfred Perlstein <bright@mu.org> wrote:
>
> Warner, thanks for addressing this email. I think I wasn't clear, which led to some misunderstanding. I'll keep this reply succinct and the rest of it inline. Please don't take the succinctness as anything other than getting to the point.
>
> On 11/16/15 10:22 PM, Warner Losh wrote:
>>
>> On Mon, Nov 16, 2015 at 6:53 PM, Alfred Perlstein <bright@mu.org> wrote:
>>
>> On 11/14/15 9:16 AM, Warner Losh wrote:
>> On Fri, Nov 13, 2015 at 11:15 PM, Elizabeth Myers <elizabeth@interlinked.me> wrote:
>>
>> You are seriously going to use "we're not NetBSD" as an argument?
>>
>> You noticed I didn't reply to it. The argument is completely lame. FreeBSD runs today in a variety of markets. Some new, some not so new. The thing that makes each of these areas unique is that there's a thriving community around them, FreeBSD still runs well enough on these machines to get something done, and when things break, they get fixed in a timely manner.
>>
>> Alpha was removed because it got broken by some changes, and stayed broken for a long time despite repeated requests to fix it. Sparc64 is on the cusp of that: some minor things are broken, but have been fixed. The current crisis is due to the end of life of gcc in the tree and its fallout coupled with some neglect of the port due to time constraints.
>>
>> At first I was all for removal. With more data, I'm less sure. If the promises made in this thread are kept, it looks to remain viable for a while, though the lack of a qemu-user solution means that packages for a slow platform (where they are really quite useful) will remain limited.
>> Maybe there's enough hardware around that third-party pkg repos can fill the gap, maybe not. I think we should experiment with this model and see what it produces. Give the branching of 11 as the deadline to show something viable...
>>
>> One of the things I never understood about FreeBSD's method of maintaining a port was the way the platform porting was done. We really do things in a different manner than what my perception of other OSes is.
>>
>> My impression (please do correct me if I'm wrong) was that other OSes such as NetBSD and Linux had "platform maintainers".
>>
>> These maintainers were around to:
>> 1) keep the ship sailing on those platforms
>> 2) guide the general code base away from becoming non-portable to other architectures (within reason).
>> 3) drive the release of the architecture in question, helping the release engineer with image building and release testing.
>>
>> For point 1 above, what that meant to me was that let's say Linus or NetBSD in general wanted to do a major or minor change on a tier 1 platform, then it was the responsibility of the *platform maintainers* to do the work on the non tier-1 platforms to keep them up to date. Those "platform maintainers" kept those ships sailing and in return they got to be called "the $arch maintainer" which looks plenty good on a resume and also feels good for those that get excited for status.
>>
>> I'm not sure how the people that actually take care of these things on FreeBSD differ. There are people recognized as go-to people for the different ports that are fairly active in the on-going issues that come up with kernel code. Userland code doesn't seem to matter that much given the platforms we support.
>>
>> For PowerPC, you have Nathan W and Justin Hibbits. For mips, there's Adrian, myself, Juli Mallett and a few others. For arm there's a long cast of characters. For PC98 there's Takahashi-san.
>> For sparc64 there's Marius. These people keep the ship sailing (or in some cases they remove the ship from the tree). They advise discussions about issues that are relevant to the platform, like cache lines and cache coherence,
>
> This is how we diverge:
>> they call people out when they break these platforms or when people used to big systems adjust the tuning and break small systems.
> That is not done as far as I can tell in NetBSD/Linux. In Linux/NetBSD it is the job of the maintainer to keep the platform up to date, not to call out when someone breaks it.
>
> Meaning ideal:
> "Oh someone broke alignment in this struct on my platform, let me ask them how to fix it."
>
> As opposed to (not ideal):
> "Something broke my platform, let me track that guy down and make them fix it."

The person that broke it is often the best person to fix it. If they don’t fix it themselves the maintainers fix it for them. It’s common courtesy. I think you’re making too fine a distinction here to actually be useful.

And if you know anything about NetBSD, you’ll know they do exactly the same thing when someone does something that breaks a particular platform. The port maintainers, who build and run the stuff the most (though they aren’t listed in any file), notice and complain, often within hours of the commit. They generally ask the original committer to fix it, just like we do. And just like we do, those suggestions sometimes come in the form of a patch or a general description of what to do and why.

Tell me, who are the NetBSD/hpcmips maintainers? Who are the NetBSD/hpcarm folks that are still active? I just did a search, and couldn’t find this information. You could look at the hpcmips or hpcarm trees, but they are rather quiet these last few years, and many of the folks that contributed code there have wandered off.
>>
>> For point 2, let's say someone had a change that pushed some form of *completely* non-portable code into the base which would break a reasonable-to-support platform, then the "platform maintainer" would speak up and tell the general community "uh no, you can't do X on this platform, we need to rethink this".
>>
>> People generally don't push this kind of code into the tree these days. When they do, they get called out on it. Some of them even listen to the calling out and fix things, others don't and one of the platform maintainers has to fix the stupid pushed into the tree. Sometimes this happens right away and sometimes there's a lag. Sometimes it's code for newer versions of the platform that breaks older versions (or vice versa). Other times there's code from another platform that breaks things.
>>
>> USB is a textbook example of this happening. It went in, and didn't work worth a damn on arm or mips. The port maintainers of the arm and mips platforms tried to explain what the issues were. It took some time, but it got mostly worked out as the embedded folks got to know USB issues, and hps got to understand the issues with embedded hardware.
>>
>> For point 3, there may be a lag between release of the OS for tier 1 (x86/x64) and the secondary architectures, but that is OK because the maintainer will eventually provide images themselves or in collaboration with the release engineer.
>>
>> For this point, we've pushed the knowledge of how to build the images into the release engineer. The folks that are around that are using the port test the images. Sometimes it's the port maintainers, but recently it has been a large cast of characters for popular platforms.
>>
>> FreeBSD seems to take a different approach. This approach is that someone (or some people) form a team to port to a platform. These "platform porter" groups' sole responsibility is to get a new architecture running.
>> After it's mostly running they are mostly without responsibility, however we tend to give them the right of change-set veto in perpetuity of the marginal relevance of the ported to platform.
>>
>> so like when did this actually happen?
> Well, earlier in this email you said this exact thing:
>
>> they call people out when they break these platforms or when people used to big systems adjust the tuning and break small systems.
> That's how I see the difference.

First step is always education. That’s a strength, not a weakness. You educate people that break things because more often than not, those people didn’t bother to ask for a review. Part of the education is needing to make sure people socialize changes appropriately.

If that’s *ALL* maintainers did, I’d agree with you. However, there’s much proactive education that also happens, which is exactly your number 2: everybody working together to make sure that new changes fit well with the platform set. That is real. It happens every day. Focusing on only one, narrow situation that happens maybe once a month and saying we’re doing that wrong seems petty and wrong-headed and would actually break more than it fixes if we changed it.

So because you have a world view that doesn’t match what’s going on, you want to change what’s going on to match some ideal from another project when this project is actually doing that ideal? I still don’t get it.

>
>> What this means is that instead of assigning a title and ownership of the platform to someone, who maintains the status as "maintainer" by keeping that platform working. By keeping the platform working I am saying that they would do items 1, 2 and 3 from the NetBSD/Linux list. However, we instead nearly immediately hoist the "platform maintenance" onto the general community of people that may not have access to the hardware in question.
>>
>> Do you have a specific example of when we've done this? As far as I know, based on powerpc, arm and mips anyway, the people claiming to be maintainers are actively doing 1, 2 and helping RE do number 3 to varying degrees. As far as I know, they all have access to some or all of the hardware they are maintaining, and many of our power users participate in the process as well.
>>
>> Maybe this is just my perception, but it would seem to make a ton more sense to follow the NetBSD/Linux model which implies a somewhat decoupled release model (not all arches must come out on the same ) and assigning ownership and responsibility in exchange for status based on being the "platform maintainer".
>>
>> So, rather than generalizations, be specific. Who do you think is claiming to be a port maintainer, blocking progress and needs to be replaced?
>>
>> And what, beyond what the re@ does today, would you do differently? What do we gain over what we do for tier 1 platforms? Is there a platform wanting a release that isn't getting one? mips has two different groups that have put out releases for it, with one of them fading into the background. Adrian is making wifi builds available, already following this model you say we should adopt. The Japanese user groups are putting out PC98 releases now that the re@ has dropped them (they never really stopped in the meantime). sparc64, powerpc, arm, i386 and amd64 are all released by re@. ia64 has been removed from the tree. What other platforms are there? What else needs to be done?
>>
>> Finally it would be pretty obvious when everyone steps down or just doesn't participate in the release process that it may be time to sunset a platform.
>>
>> That's why we are having the conversation about sparc64. It looked like it might no longer be participating in the normal process.
>> Now, while there are some issues that were identified with sparc64, some of them are real (see qemu and difficulties building in the cluster). Some of them were just perception (the reduced numbers of commits to sparc64 didn't seem to represent a problem with the platform and the perceived issues had been cleared up).
>>
>> So what, specific, actionable items do we as a project actually need to do here? I'm sure there are some and that we can improve our process. I'm having trouble teasing out what I, as someone who dabbles in arm and mips to varying degrees of 'maintainership' for different parts, can do better or different.
>>
> Three things:
> 1) I am wondering if core (or the community in general) should have some way of nominating a particular person as a platform maintainer. This would give accolades to that person and at the same time give us a point person. I believe part of the problem is that we don't give enough status to the port maintainers, are they on the website? How would I know who is the "king of mips" right now? Does someone get to put it on their resume with backing from the project? If not, then the maintainer will be grumbly as opposed to facilitating.

We don’t have a “king of the kernel” or “king of mips” or anything like that. We document that you send mail to mips@ when you don’t know the right person to send it to.

As for status reports: I agree with you on that.

As for putting things on one’s resume: I never needed the project’s blessing to claim to have done a lot with FreeBSD/mips, FreeBSD/arm, CardBus, PC Card, SD, MMC, PCI, etc. I just did it. The lack of a stamp from the project hasn’t really been an issue.

> 2) The role of platform maintainer needs not to be a blocking role, but rather a continual porting role.
> Specifically to avoid the "calling out" (sorry for the quotes here, but just want to get the point across) and instead function as a do-er/facilitator. Meaning if something breaks a secondary platform it's a shared responsibility of the port maintainer and developer to fix it, and not solely the developer's.

To the best of my knowledge, it isn’t a blocking role today. All the people working on arm, mips and powerpc that I interact with already do that. Rather than just fixing the stupid mistakes people make, educating them not to make them again in the future is the best way to keep them from repeating. That sure as hell sounds like a shared responsibility to me.

I fail to see any evidence to support your assertion that port maintainers just whine when things break. I don’t see that in the commit logs (although they are full of hundreds of commits from people that do complain). I don’t see that in the interactions I’ve had.

> 3) Tight coupling of the system. While it's good to cross train release engineering with various platforms and even more so it's REALLY great that this is being well received, the end result is a very tight coupling of the system that can lead to frustrations and logjams when things go wrong. Something that FreeBSD very often gets wrong is the strong unification of its various parts, you can see this in so many aspects of how we do things (releases, VFS, platforms) as opposed to other projects. It is an admirable aspiration, however it results in much lock-step of the entire project and doesn't scale.

I can’t recall any release ever being held up for longer than it takes to build on the slowest hardware the RE has used to cut the release. I can recall many times a release wasn’t held, or I wasn’t allowed to put changes into a secondary platform because they might affect the primary one too close to a release.
Absent any evidence of when, where and how these things happen, I’m having a hard time seeing that the actual situation on the ground matches the supposed one you are complaining about, at least as it comes to other platforms. What logjams are they causing? I’ve not seen any. Do you have specific examples?

Warner