Date:      Tue, 17 Nov 2015 00:06:35 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Alfred Perlstein <bright@mu.org>
Cc:        Elizabeth Myers <elizabeth@interlinked.me>, Anna Wilcox <AWilcox@wilcox-tech.com>, "Brian McGovern (bmcgover)" <bmcgover@cisco.com>, freebsd-arch <freebsd-arch@freebsd.org>, Marius Strobl <marius@alchemy.franken.de>, Sean Bruno <sbruno@freebsd.org>, "sparc64@freebsd.org" <sparc64@freebsd.org>, Jordan Hubbard <jkh@mail.turbofuzz.com>
Subject:   Re: Sparc64 doesn't care about you, and you shouldn't care about Sparc64
Message-ID:  <E3676EF4-60AC-4501-A720-BA9BF923672D@bsdimp.com>
In-Reply-To: <564ACCB3.4070603@mu.org>
References:  <563A5893.1030607@freebsd.org> <2AAC0EF3-528B-476F-BA9C-CDC3004465D0@bsdimp.com> <20151108155501.GA1901@alchemy.franken.de> <563F8385.3090603@freebsd.org> <56417100.5050600@Wilcox-Tech.com> <CANCZdfqO-SdjnonGzRr2H0pDon5oALsDGsmG3KOxPGRVdTbHPQ@mail.gmail.com> <39947478-4710-47D8-BAB1-FC93979570B6@mail.turbofuzz.com> <f4d1114833994331bd1fd2273f305abc@XCH-RTP-005.cisco.com> <5646D19C.9010304@interlinked.me> <CANCZdfoH7i9MBxjw1j4Pc3CpiZP=aP5vah2ay38cazkc7%2BreTA@mail.gmail.com> <564A889C.9070209@mu.org> <CANCZdfr2Ce3FAaAwJukcbuTvAN=1DDvEsRQHg6HdZCHamqLd_Q@mail.gmail.com> <564ACCB3.4070603@mu.org>

> On Nov 16, 2015, at 11:44 PM, Alfred Perlstein <bright@mu.org> wrote:
>
> Warner, thanks for addressing this email.  I think I wasn't clear,
> which led to some misunderstanding.  I'll keep this reply succinct and
> the rest of it inline.  Please don't take the succinctness as anything
> other than getting to the point.
>
> On 11/16/15 10:22 PM, Warner Losh wrote:
>>
>> On Mon, Nov 16, 2015 at 6:53 PM, Alfred Perlstein <bright@mu.org> wrote:
>>
>> On 11/14/15 9:16 AM, Warner Losh wrote:
>> On Fri, Nov 13, 2015 at 11:15 PM, Elizabeth Myers <elizabeth@interlinked.me> wrote:
>>
>> You are seriously going to use "we're not NetBSD" as an argument?
>>
>> You noticed I didn't reply to it. The argument is completely lame.
>> FreeBSD runs today in a variety of markets. Some new, some not so new.
>> The thing that makes each of these areas unique is that there's a
>> thriving community around them, FreeBSD still runs well enough on
>> these machines to get something done, and when things break, they get
>> fixed in a timely manner.
>>
>> Alpha was removed because it got broken by some changes, and stayed
>> broken for a long time despite repeated requests to fix it. Sparc64 is
>> on the cusp of that: some minor things are broken, but have been
>> fixed. The current crisis is due to the end of life of gcc in the tree
>> and its fallout coupled with some neglect of the port due to time
>> constraints.
>>
>> At first I was all for removal. With more data, I'm less sure. If the
>> promises made in this thread are kept, it looks to remain viable for a
>> while, though the lack of a qemu-user solution means that packages for
>> a slow platform (where they are really quite useful) will remain
>> limited. Maybe there's enough hardware around that third-party pkg
>> repos can fill the gap, maybe not. I think we should experiment with
>> this model and see what it produces. Give the branching of 11 as the
>> deadline to show something viable...
>>
>> One of the things I never understood about FreeBSD's method of
>> maintaining a port was the way the platform porting was done.  We
>> really do things in a different manner than what my perception of
>> other OSes is.
>>
>> My impression (please do correct me if I'm wrong) was that other OSes
>> such as NetBSD and Linux had "platform maintainers".
>>
>> These maintainers were around to:
>> 1) keep the ship sailing on those platforms
>> 2) guide the general code base from becoming non-portable to other
>>    architectures (within reason).
>> 3) drive the release of the architecture in question, helping the
>>    release engineer with image building and release testing.
>>
>> For point 1 above, what that meant to me was that, let's say Linus or
>> NetBSD in general wanted to do a major or minor change on a tier 1
>> platform, then it was the responsibility of the *platform maintainers*
>> to do the work on the non tier-1 platforms to keep them up to date.
>> Those "platform maintainers" kept those ships sailing and in return
>> they got to be called "the $arch maintainer" which looks plenty good
>> on a resume and also feels good for those that get excited for status.
>>
>> I'm not sure how the people that actually take care of these things
>> on FreeBSD differ.  There are people recognized as go-to people for
>> the different ports that are fairly active in the on-going issues that
>> come up with kernel code. Userland code doesn't seem to matter that
>> much given the platforms we support.
>>
>> For PowerPC, you have Nathan W and Justin Hibbits. For mips, there's
>> Adrian, myself, Julie Mallet and a few others. For arm there's a long
>> cast of characters. For PC98 there's Takahashi-san. For sparc64
>> there's Marius. These people keep the ship sailing (or in some cases
>> they remove the ship from the tree). They advise discussions about
>> issues that are relevant to the platform, like cache lines and cache
>> coherence,
>
> This is how we diverge:
>> they call people out when they break these platforms or when people
>> used to big systems adjust the tuning and break small systems.
> That is not done as far as I can tell in NetBSD/Linux.  In
> Linux/NetBSD it is the job of the maintainer to keep the platform up to
> date, not to call out when someone breaks it.
>
> Meaning ideal:
> "Oh someone broke alignment in this struct on my platform, let me ask
> them how to fix it."
>
> As opposed to (not ideal):
> "Something broke my platform, let me track that guy down and make them
> fix it."

The person that broke it is often the best person to fix it. If they don’t
fix it themselves, the maintainers fix it for them. It’s common courtesy. I
think you’re making too fine a distinction here for it to actually be useful.

And if you know anything about NetBSD, you’ll know they do exactly the same
thing when someone does something that breaks a particular platform. The port
maintainers, who build and run the stuff the most (though they aren’t listed
in any file), notice and complain, often within hours of the commit. They
generally ask the original committer to fix it, just like we do. And just like
we do, those suggestions sometimes come in the form of a patch or a general
description of what to do and why.

Tell me, who are the NetBSD/hpcmips maintainers? Who are the NetBSD/hpcarm
folks that are still active? I just did a search, and couldn’t find this
information. You could look at the hpcmips or hpcarm trees, but they have been
rather quiet these last few years, and many of the folks that contributed code
there have wandered off.


>>
>> For point 2, let's say someone had a change that pushed some form of
>> *completely* non-portable code into the base which would break a
>> reasonable to support platform, then the "platform maintainer" would
>> speak up and tell the general community "uh no, you can't do X on this
>> platform, we need to rethink this".
>>
>> People generally don't push this kind of code into the tree these
>> days. When they do, they get called out on it. Some of them even
>> listen to the calling out and fix things, others don't and one of the
>> platform maintainers has to fix the stupid pushed into the tree.
>> Sometimes this happens right away and sometimes there's a lag.
>> Sometimes it's code for newer versions of the platform that breaks
>> older versions (or vice versa). Other times there's code from another
>> platform that breaks things.
>>
>> USB is a textbook example of this happening. It went in, and didn't
>> work worth a damn on arm or mips. The port maintainers of the arm and
>> mips platforms tried to explain what the issues were. It took some
>> time, but it got mostly worked out as the embedded folks got to know
>> USB issues, and hps got to understand the issues with embedded
>> hardware.
>>
>> For point 3, there may be a lag between release of the OS for tier 1
>> (x86/x64) and the secondary architectures, but that is OK because the
>> maintainer will eventually provide images themselves or in
>> collaboration with the release engineer.
>>
>> For this point, we've pushed the knowledge of how to build the images
>> into the release engineer. The folks that are around that are using
>> the port test the images. Sometimes it's the port maintainers, but
>> recently it has been a large cast of characters for popular platforms.
>>
>> FreeBSD seems to take a different approach.  This approach is that
>> someone (or some people) form a team to port to a platform.  These
>> "platform porter" groups' sole responsibility is to get a new
>> architecture running.  After it's mostly running they are mostly
>> without responsibility, however we tend to give them the right of
>> change-set veto in perpetuity of the marginal relevance of the
>> ported-to platform.
>>
>> so like when did this actually happen?
> Well, earlier in this email you said this exact thing:
>
>> they call people out when they break these platforms or when people
>> used to big systems adjust the tuning and break small systems.
> That's how I see the difference.

First step is always education. That’s a strength, not a weakness. You educate
people that break things because more often than not, those people didn’t
bother to ask for a review. Part of the education is making sure people
socialize changes appropriately.

If that’s *ALL* maintainers did, I’d agree with you. However, there’s much
proactive education that also happens, which is exactly your number 2:
everybody working together to make sure that new changes fit well with the
platform set. That is real. It happens every day. Focusing on only one narrow
situation that happens maybe once a month and saying we’re doing that wrong
seems petty and wrong-headed, and changing it would actually break more than
it fixes.

So because you have a world view that doesn’t match what’s going on, you want
to change what’s going on to match some ideal from another project when this
project is actually doing that ideal? I still don’t get it.

>>
>> What this means is that instead of assigning a title and ownership of
>> the platform to someone who maintains the status of "maintainer" by
>> keeping that platform working (by which I mean doing items 1, 2 and 3
>> from the NetBSD/Linux list), we nearly immediately hoist the "platform
>> maintenance" onto the general community of people that may not have
>> access to the hardware in question.
>>
>> Do you have a specific example of when we've done this? As far as I
>> know, based on powerpc, arm and mips anyway, the people claiming to be
>> maintainers are actively doing 1, 2 and helping RE do number 3 to
>> varying degrees. As far as I know, they all have access to some or all
>> of the hardware they are maintaining, and many of our power users
>> participate in the process as well.
>>
>> Maybe this is just my perception, but it would seem to make a ton
>> more sense to follow the NetBSD/Linux model which implies a somewhat
>> decoupled release model (not all arches must come out on the same )
>> and assigning ownership and responsibility in exchange for status
>> based on being the "platform maintainer".
>>
>> So, rather than generalizations, be specific. Who do you think is
>> claiming to be a port maintainer, blocking progress and needs to be
>> replaced?
>>
>> And what, beyond what the re@ does today, would you do differently?
>> What do we gain over what we do for tier 1 platforms? Is there a
>> platform wanting a release that isn't getting one? mips has two
>> different groups that have put out releases for it, with one of them
>> fading into the background. Adrian is making wifi builds available,
>> already following this model you say we should adopt. The Japanese
>> user groups are putting out PC98 releases now that the re@ has dropped
>> them (they never really stopped in the mean time). sparc64, powerpc,
>> arm, i386 and amd64 are all released by re@. ia64 has been removed
>> from the tree. What other platforms are there? What else needs to be
>> done?
>>
>> Finally it would be pretty obvious when everyone steps down or just
>> doesn't participate in the release process that it may be time to
>> sunset a platform.
>>
>> That's why we are having the conversation about sparc64. It looked
>> like it might no longer be participating in the normal process. Now,
>> some issues were identified with sparc64. Some of them are real (see
>> qemu and difficulties building in the cluster). Some of them were just
>> perception (the reduced numbers of commits to sparc64 didn't seem to
>> represent a problem with the platform and the perceived issues had
>> been cleared up).
>>
>> So what specific, actionable items do we as a project actually need
>> to do here? I'm sure there are some and that we can improve our
>> process. I'm having trouble teasing out what I, as someone who dabbles
>> in arm and mips to varying degrees of 'maintainership' for different
>> parts, can do better or different.
>>
> Three things:
> 1) I am wondering if core (or the community in general) should have
> some way of nominating a particular person as a platform maintainer.
> This would give accolades to that person and at the same time give us a
> point person.  I believe part of the problem is that we don't give
> enough status to the port maintainers, are they on the website?  How
> would I know who is the "king of mips" right now?  Does someone get to
> put it on their resume with backing from the project?  If not, then the
> maintainer will be grumbly as opposed to facilitating.

We don’t have a “king of the kernel” or “king of mips” or anything like that.
We document that you send mail to mips@ when you don’t know the right person
to send it to.

As for status reports: I agree with you on that.

As for putting things on one’s resume: I never needed the project’s blessing
to claim to have done a lot with FreeBSD/mips, FreeBSD/arm, CardBus, PC Card,
SD, MMC, PCI, etc. I just did it. The lack of a stamp from the project hasn’t
really been an issue.

> 2) The role of platform maintainer needs not to be a blocking role,
> but rather a continual porting role.  Specifically to avoid the "calling
> out" (sorry for the quotes here, but just want to get the point across)
> and instead function as a do-er/facilitator.  Meaning if something
> breaks a secondary platform it's a shared responsibility of the port
> maintainer and developer to fix it, but not solely the developer.

To the best of my knowledge, it isn’t a blocking role today. All the people
working on arm, mips and powerpc that I interact with already do that. Rather
than just fixing the stupid mistakes people make, educating them not to make
those mistakes again is the best way to keep them from recurring. That sure as
hell sounds like a shared responsibility to me. I fail to see any evidence to
support your assertion that port maintainers just whine when things break. I
don’t see that in the commit logs (although they are full of hundreds of
commits from people that do complain). I don’t see that in the interactions
I’ve had.

> 3) Tight coupling of the system.  While it's good to cross train
> release engineering with various platforms and even more so it's
> REALLY great that this is being well received, the end result is a very
> tight coupling of the system that can lead to frustrations and logjams
> when things go wrong.  Something that FreeBSD very often gets wrong is
> the strong unification of its various parts; you can see this in so many
> aspects of how we do things (releases, VFS, platforms) as opposed to
> other projects.  It is an admirable aspiration, however it results in
> much lock-step of the entire project and doesn't scale.

I can’t recall any release ever being held up for longer than it takes to
build on the slowest hardware the RE has used to cut the release. I can recall
many times a release wasn’t held, or I wasn’t allowed to put changes into a
secondary platform because they might affect the primary one too close to a
release. Absent any evidence of when, where and how these things happen, I’m
having a hard time seeing that the actual situation on the ground matches the
supposed one you are complaining about, at least when it comes to other
platforms. What logjams are they causing? I’ve not seen any. Do you have
specific examples?

Warner



