Date:      Fri, 4 Jan 2019 01:17:52 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        mmel@freebsd.org
Cc:        Dennis Clarke <dclarke@blastwave.org>, freebsd-arm@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: A reliable port cross-build failure (hangup) in my context (amd64->armv7 cross build, with native-tool speedup involved)
Message-ID:  <F6A047AF-D7E2-4298-86A2-02DE2ABA0D17@yahoo.com>
In-Reply-To: <ddae0990-f68b-549e-18da-1388e772a3ff@freebsd.org>
References:  <FF9B4284-4E6B-4D36-86A0-18861B527AC0@yahoo.com> <865A13C8-9749-486E-9F79-5EEDDECBE621@yahoo.com> <0154C3AC-D85B-4FCF-BA63-454BC26BC1A2@yahoo.com> <A6A58CE3-062B-4B79-A8C2-ADFDAA04C6AF@yahoo.com> <13f5e4dd-33fb-2170-e31a-1b5d5f155869@freebsd.org> <ABA957EA-B8EE-4B8C-9C2F-B745BA652BF6@yahoo.com> <2E3F6196-4652-40D2-937F-8860B6005A35@yahoo.com> <d99f28fd-5c6c-db6f-2d78-9ea6a697af2e@blastwave.org> <ddae0990-f68b-549e-18da-1388e772a3ff@freebsd.org>


On 2019-Jan-3, at 22:56, Michal Meloun <melounmichal at gmail.com> wrote:

> On 29.12.2018 18:47, Dennis Clarke wrote:
>> On 12/28/18 9:56 PM, Mark Millard via freebsd-arm wrote:
>>> 
>>> On 2018-Dec-28, at 12:12, Mark Millard <marklmi at yahoo.com> wrote:
>>> 
>>>> On 2018-Dec-28, at 05:13, Michal Meloun <melounmichal at gmail.com>
>>>> wrote:
>>>> 
>>>>> Mark,
>>>>> this is a known problem with qemu-user-static.
>>>>> Emulation of every single interruptible syscall is broken by design (it
>>>>> has signal-related races). These races cannot be solved without a major
>>>>> rewrite of the syscall emulation code.
>>>>> Unfortunately, nobody is actively working on this, I think.
>>>>> 
>> 
>> Following along here quietly and I had to blink at this a few times.
>> Is there a bug report somewhere within the qemu world related to this
>>  'broken by design' qemu feature?
> 
> Firstly, I apologize for the late answer. Writing a technically accurate but
> still comprehensible report is extremely difficult for me.

Thanks for doing so.

> . . .
> Mark, I hope that this is also the answer to your question posted to
> hackers@ and also the explanation of why you see the hang.

Again, thanks: it helped me gain some understanding of the code
structure.

But it turns out that another of your list of problems is involved
in the hang-up:

> . . .
> - and the last major one. At this time, all guest structures are maintained
> by hand. Due to the huge number of these structures, this is an extremely
> error-prone approach.  We should convert this to script-generated code,
> including the guest syscall definitions.

It turns out that "struct target_cmsghdr" has the wrong overall size,
the wrong first field size, and the wrong offsets for later fields
for amd64->aarch64 use (or likely any 64-bit->64-bit host-target
pair, even amd64->x86_64). In fact the code reports via:

          gemu_log("Unsupported ancillary data: %d/%d\n",
              cmsg->cmsg_level, cmsg->cmsg_type);


because cmsg->cmsg_level and cmsg->cmsg_type end up with
messed up values. It hangs after that message shows up. The
more complete code containing that gemu_log call is:

      if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
          (cmsg->cmsg_type == SCM_RIGHTS)) {
          int *fd = (int *)data;
          int *target_fd = (int *)target_data;
          int i, numfds = len / sizeof(int);

          for (i = 0; i < numfds; i++) {
              fd[i] = tswap32(target_fd[i]);
          }
      } else if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
          (cmsg->cmsg_type == SCM_TIMESTAMP) &&
          (len == sizeof(struct timeval)))  {
          /* copy struct timeval to host */
          struct timeval *tv = (struct timeval *)data;
          struct target_freebsd_timeval *target_tv =
              (struct target_freebsd_timeval *)target_data;
          __get_user(tv->tv_sec, &target_tv->tv_sec);
          __get_user(tv->tv_usec, &target_tv->tv_usec);
      } else {
          gemu_log("Unsupported ancillary data: %d/%d\n",
              cmsg->cmsg_level, cmsg->cmsg_type);
          memcpy(data, target_data, len);
      }
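
To illustrate how a mis-sized first field produces such messed up
cmsg_level/cmsg_type values, here is a minimal sketch. The struct
names, the 32-bit vs. 64-bit widths chosen for the length field, and
the demo values are assumptions for illustration only; they are not
the actual qemu definitions. The sketch writes a header in a 12-byte
layout and then decodes the same bytes through a 16-byte layout, so
the "level" read back is really the type and the "type" is really the
first payload word:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header as the guest writes it: 32-bit length field. */
struct guest_cmsghdr {
    uint32_t cmsg_len;
    int32_t  cmsg_level;
    int32_t  cmsg_type;
};

/* Hypothetical mis-declared emulator view: length widened to 64 bits. */
struct wide_cmsghdr {
    uint64_t cmsg_len;
    int32_t  cmsg_level;
    int32_t  cmsg_type;
};

/* The two views disagree on overall size and on every later offset. */
_Static_assert(sizeof(struct guest_cmsghdr) == 12, "guest header size");
_Static_assert(offsetof(struct guest_cmsghdr, cmsg_level) == 4, "guest level");
_Static_assert(offsetof(struct guest_cmsghdr, cmsg_type) == 8, "guest type");
_Static_assert(sizeof(struct wide_cmsghdr) == 16, "wide header size");
_Static_assert(offsetof(struct wide_cmsghdr, cmsg_level) == 8, "wide level");
_Static_assert(offsetof(struct wide_cmsghdr, cmsg_type) == 12, "wide type");

/* Illustrative values only (assumptions, not taken from any header). */
enum { DEMO_LEVEL = 0xffff, DEMO_TYPE = 1, DEMO_FD = 7 };

/* Build guest-layout bytes: 12-byte header plus one 32-bit payload word.
 * Alignment padding that a real CMSG_LEN() would add is ignored here. */
void build_guest_message(unsigned char buf[16])
{
    struct guest_cmsghdr h = { 16, DEMO_LEVEL, DEMO_TYPE };
    int32_t fd = DEMO_FD;

    memcpy(buf, &h, sizeof(h));
    memcpy(buf + sizeof(h), &fd, sizeof(fd));
}

/* Decode the same bytes through the wide layout. memcpy is used so the
 * sketch has no alignment or aliasing problems of its own. */
struct wide_cmsghdr decode_as_wide(const unsigned char buf[16])
{
    struct wide_cmsghdr w;

    memcpy(&w, buf, sizeof(w));
    return w;
}
```

With these assumed layouts, a check like the SCM_RIGHTS test above
would compare DEMO_TYPE against the expected level and DEMO_FD
against the expected type, fail, and fall through to the
"Unsupported ancillary data" branch.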

Of the 3 types of hangups that I've run into recently, one was from a
missing statement, one was from struct target_kevent having the
wrong overall size and wrong field offsets after the first field
(amd64->armv7 was an example), and one is the struct target_cmsghdr
problem above. (There may be more to the target_cmsghdr one.)

> Again, my apologies for the slightly (or very) chaotic report, but this is the
> best I am capable of.

Not chaotic in my view.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



