Date:      Wed, 11 May 2022 16:42:39 +0100
From:      Marco Devesas Campos <devesas.campos@gmail.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Ronald Klop <ronald-lists@klop.ws>,  "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject:   Re: [PATCH] Experimental vchiq and bcm2835_audio support for arm64
Message-ID:  <CADOynoTLzJfwHUkOzGeUWsY-V3GCSji9b3_JLo-xjvZRg=ianw@mail.gmail.com>
In-Reply-To: <CANCZdfpvUcmOu9KmpdXMOhmqabt1iS9wEKfqg%2B3JMHQVQNtOXA@mail.gmail.com>
References:  <A0775CDC-7382-4A15-8131-482572032308@gmail.com> <a02d8dd2-020a-3125-3418-08f0a069aa5e@klop.ws> <8EC05647-00D9-455B-98A9-B83A33DDFC5D@gmail.com> <48190d6a-fc5d-7da9-ddfd-fded48d429db@klop.ws> <106195874.50.1644310150579@localhost> <E7561C63-D0DF-4F38-9101-12B0D473982E@gmail.com> <CANCZdfoS0TaGkBWSkbzdisr_MJt6chZts2xBRJ5GQXjXoVaENA@mail.gmail.com> <op.1ibeyzzokndu52@joepie> <87A63A19-5807-4BA9-9821-D3378129CDB5@gmail.com> <CANCZdfpvUcmOu9KmpdXMOhmqabt1iS9wEKfqg%2B3JMHQVQNtOXA@mail.gmail.com>

Hi Warner, and List,

So, quite clearly, this ended up not being a two-day job...

On the other hand the vchiq code now not only works with the
bcm2835_audio driver but should be on par, feature-wise, with
the existing 32 bit code.

To wit, the patch below

  * updates vchiq and bcm2835_audio to work on 64 bit pi's

  * implements compat_freebsd32 ioctl calls so that 32 bit apps can
    be run on a 64 bit system -- including omxplayer*

  * fixes a few panics, stalls, busy waits and data corruptions
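
For anyone unfamiliar with compat_freebsd32: a 32 bit process lays out
ioctl argument structures with 4-byte pointers and sizes, so a 64 bit
kernel has to widen each field before handing the request to the native
code path. The sketch below only illustrates that idea. `demo_args`,
`demo_args32` and `demo_args_from32` are hypothetical names made up for
the example; they are not the structures added to vchiq_ioctl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical native argument block, as a 64 bit caller lays it out. */
struct demo_args {
	void	*buf;
	size_t	 len;
	int	 mode;
};

/* The same block as a 32 bit caller lays it out: pointers are 4 bytes. */
struct demo_args32 {
	uint32_t buf;
	uint32_t len;
	int32_t	 mode;
};

/* Widen every field of the 32 bit layout into the native layout. */
static void
demo_args_from32(const struct demo_args32 *in, struct demo_args *out)
{
	out->buf = (void *)(uintptr_t)in->buf;
	out->len = in->len;
	out->mode = in->mode;
}
```

A real handler would do this conversion on entry for each compat ioctl
and the reverse on exit for any fields the kernel writes back.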

In the process of debugging things I also got the userland utilities
to work on arm64 and if this gets accepted for inclusion I'll update
the port.

As for the issues that remain: audio playback stalls intermittently
after a day or two of inactivity**, and running vchiq_test -p in
parallel with audio or video playback results in stuttering.

Anyway, output of git format-patch below.

Best,
Marco

* although the pi4 needs a special 32 bit version, not the one
from ports -- I'll get that out if there's interest

** the workaround is to use the sysctl to change the output
destination and then change it back




From 89f464839efca9483eabca454db0d78495e2f4ac Mon Sep 17 00:00:00 2001
From: Marco Devesas Campos <devesas.campos@gmailcom>
Date: Wed, 11 May 2022 15:19:41 +0100
Subject: [PATCH] arm64: Add support to vchiq and bcm2835_audio (plus some
 fixes)

Add 64 bit support to vchiq:
    * update fields to the appropriate fixed bit-size variants
      (everywhere [cf. e.g., ref:sizes and ref:sizes2])
    * update printfs to account for said sizes (everywhere)
    * update printfs to the different size of pointers (everywhere)
    * refer to event semaphores (that go into the very 32 bit VC) by
      offset instead of pointers [ref:sems]
    * dsb() is dsb(sy) in arm64 (vchiq_{core.c,core.h,kmod.c}) [ref:dsb]
    * comment out some unneeded code in parse_rx_slots around
      VCHIQ_MSG_BULK_RX (cf. [ref:deadcode])
    * adapt remote_event_signal to arm64 caching behaviours (vchiq_kmod.c)
    * refactor synchronization around remote_event_signal, forcing a
      wmb to be on the safe side; thereby make it look more like what linux
      does [ref:sync] (vchiq_{core,kmod}.c); and make a comment in
      vchiq_core.c true (wasn't before)
    * add a few more syncs to be on the safe side (vchiq_2835_arm.c)

    * use arm64 dcache invalidation mechanisms (vchiq_2835_arm.c)
    * explicitly invalidate pages on arm64 post bulk-read (vchiq_2835_arm.c)
    * support bulk transfers on rpi-4 (aka "long address space"
      transfers), by hard-coding their vc offset (0) and different
      bit-shift [ref:longbulk]  (vchiq_2835_arm.c)
    * refactor a loop-of-constant-test (vchiq_2835_arm.c)
    * use the correct (hard-coded) cache-line size on arm64

    * rework the handling of chipset "features" to account for the
      extra behaviours with 64 bit chipsets. (vchiq_kmod.c)
    * add compat_freebsd32 ioctls and respective datatypes.
      (vchiq_arm.c, vchiq_ioctl.h)
    * add sysctl-s (log, arm_log) to control debug (vchiq_kmod.c)

    * add example kernel config (GENERIC-VCHIQ)
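
As a reading aid for the "long address space" item above, here is a
minimal userland sketch of how the pagelist grouping in the diff appears
to encode runs of contiguous pages: one 32 bit word per run, holding the
page-aligned base plus the number of extra pages. With the long layout
the base is shifted right by 4 and the run counter is limited to 8 bits.
`encode_run`, `group_pages` and `SKETCH_PAGE_SIZE` are hypothetical
names, 4 KiB pages are assumed, and the VC bus offset applied in the
short layout is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Encode one run: aligned base address plus (pages in run - 1). */
static uint32_t
encode_run(uint64_t base, unsigned int run, int long_space)
{
	uint64_t run_ceil = long_space ? 0x100 : SKETCH_PAGE_SIZE;
	unsigned int shift = long_space ? 4 : 0;

	return ((uint32_t)(((base >> shift) & ~(run_ceil - 1)) + run));
}

/*
 * Group page addresses into runs of contiguous pages and write one
 * encoded word per run into out[]. Returns the number of words written.
 */
static size_t
group_pages(const uint64_t *pages, size_t n, int long_space, uint32_t *out)
{
	uint64_t run_ceil = long_space ? 0x100 : SKETCH_PAGE_SIZE;
	uint64_t base, next;
	unsigned int run = 0;
	size_t i, idx = 0;

	if (n == 0)
		return (0);
	base = pages[0];
	next = base + SKETCH_PAGE_SIZE;
	for (i = 1; i < n; i++) {
		if (pages[i] == next && run < run_ceil - 1) {
			/* Still contiguous and the counter has room. */
			next += SKETCH_PAGE_SIZE;
			run++;
		} else {
			/* Close the current run and start a new one. */
			out[idx++] = encode_run(base, run, long_space);
			base = pages[i];
			next = base + SKETCH_PAGE_SIZE;
			run = 0;
		}
	}
	out[idx++] = encode_run(base, run, long_space);
	return (idx);
}
```

Feeding it the pages 0x10000, 0x11000, 0x12000 and 0x40000 yields two
words in either layout: a run of three pages followed by a single page.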

Fixes:
    * Rework error handling in create_pagelist, avoiding a potential
      panic when freeing memory that had been dmamem_alloc, a potential
      null dereference, and a leak when having problems pinning pages
      (vchiq_2835_arm.c)
    * fix a confusion about the behaviour of cv_wait_sig that led to
      uninterruptible looping (vchiq_bsd.c)
    * implement detection of fatal signals (vchiq_bsd.c)
    * fix a confusion with the name of a variable introduced by
      #a0b8746 that could lead to a panic when closing the cdev file
      (vchiq_arm.c)
    * release user connection when destructing cdevpriv and avoid
      user processes sharing connection data, which led to stalls
      and data corruption.  (vchiq_arm.c)

Update bcm2835_audio to work on 64bit systems:
    * update VC audio fields (vc_vchi_audioserv_defs.h, bcm2835_audio.c)
    * repurpose the hitherto unused `callback` field to help push a 64 bit
      pointer in (bcm2835_audio.c)
    * increase (hopefully) the robustness of the code that shifts data to
      VC (bcm2835_audio.c)
    * add a sysctl to control the amount of debugging info output by
      bcm2835_audio.c
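
The `callback` trick above boils down to carrying a 64 bit host pointer
through two 32 bit message fields and reassembling it when the VC echoes
them back. A minimal standalone sketch of that round trip follows;
`wire_fields`, `pack_ptr` and `unpack_ptr` are hypothetical names used
only for illustration, not identifiers from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two 32 bit fields of a VC audio message. */
struct wire_fields {
	uint32_t callback;	/* high 32 bits of the pointer */
	uint32_t cookie;	/* low 32 bits of the pointer */
};

/* Split a host pointer into the two 32 bit fields. */
static struct wire_fields
pack_ptr(const void *p)
{
	uint64_t v = (uint64_t)(uintptr_t)p;
	struct wire_fields w;

	w.callback = (uint32_t)(v >> 32);
	w.cookie = (uint32_t)(v & 0xffffffffu);
	return (w);
}

/* Rebuild the host pointer from the echoed fields. */
static void *
unpack_ptr(struct wire_fields w)
{
	uint64_t v = ((uint64_t)w.callback << 32) | (uint64_t)w.cookie;

	return ((void *)(uintptr_t)v);
}
```

On a 32 bit host the high half is simply zero, so the same round trip
degenerates to passing the pointer in `cookie` alone.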

Tested on armv6 userspace in zero2 and 4+ with ping, functional, bulk
and control vchiq_test-s, and omxplayer

    [ref:dsb]: https://github.com/raspberrypi/linux/commit/35b7ebda57affcfd3616d39d5a727a4495b31123
    [ref:sems]: https://github.com/raspberrypi/linux/commit/24a4262afb10907fce3cdbc3ae336fcf4cdaece5
    [ref:sizes]: https://github.com/raspberrypi/linux/commit/e64568b8ea6c04e747e432c17ce2452652075216
    [ref:sizes2]: https://github.com/raspberrypi/linux/commit/f9bee6dd24addfa00c2c8d50c25b73efbfbb28ba
    [ref:deadcode]: https://github.com/raspberrypi/linux/commit/14f4d72fb799a9b3170a45ab80d4a3ddad541960
    [ref:sync]: https://github.com/raspberrypi/linux/commit/51c071265079319583e4c6e8c61e09660300d0bf
    [ref:longbulk]: https://github.com/raspberrypi/linux/commit/37f6f19a83722c9b866cecb5e455b2e16e5bbc6b
---
 sys/arm/broadcom/bcm2835/bcm2835_audio.c      | 248 +++++++--
 .../broadcom/bcm2835/vc_vchi_audioserv_defs.h |   8 +-
 sys/arm64/conf/GENERIC-VCHIQ                  |  23 +
 sys/contrib/vchiq/interface/compat/vchi_bsd.c |  12 +-
 .../interface/vchiq_arm/vchiq_2835_arm.c      | 159 +++++-
 .../vchiq/interface/vchiq_arm/vchiq_arm.c     | 494 +++++++++++++-----
 .../vchiq/interface/vchiq_arm/vchiq_core.c    | 285 +++++-----
 .../vchiq/interface/vchiq_arm/vchiq_core.h    |  11 +-
 .../vchiq/interface/vchiq_arm/vchiq_ioctl.h   | 121 +++++
 .../interface/vchiq_arm/vchiq_kern_lib.c      |   8 +-
 .../vchiq/interface/vchiq_arm/vchiq_kmod.c    |  76 ++-
 .../interface/vchiq_arm/vchiq_pagelist.h      |   8 +-
 .../vchiq/interface/vchiq_arm/vchiq_shim.c    |   4 +-
 13 files changed, 1100 insertions(+), 357 deletions(-)
 create mode 100644 sys/arm64/conf/GENERIC-VCHIQ

diff --git a/sys/arm/broadcom/bcm2835/bcm2835_audio.c
b/sys/arm/broadcom/bcm2835/bcm2835_audio.c
index 36b1dc86535b..8d978bc20f85 100644
--- a/sys/arm/broadcom/bcm2835/bcm2835_audio.c
+++ b/sys/arm/broadcom/bcm2835/bcm2835_audio.c
@@ -27,6 +27,10 @@
 #include "opt_snd.h"
 #endif

+/*
+    For the PRIu64 identifier
+*/
+#include <machine/_inttypes.h>
 #include <dev/sound/pcm/sound.h>
 #include <dev/sound/chip.h>

@@ -116,6 +120,12 @@ struct bcm2835_audio_chinfo {
  uint64_t retrieved_samples;
  uint64_t underruns;
  int starved;
+ struct bcm_log_vars {
+ unsigned int bsize ;
+ int slept_for_lack_of_space ;
+ } log_vars;
+#define DEFAULT_LOG_VALUES \
+ ((struct bcm_log_vars) { .bsize = 0 , .slept_for_lack_of_space = 0 })
 };

 struct bcm2835_audio_info {
@@ -135,6 +145,7 @@ struct bcm2835_audio_info {

  uint32_t flags_pending;

+ int verbose_trace;
  /* Worker thread state */
  int worker_state;
 };
@@ -143,6 +154,35 @@ struct bcm2835_audio_info {
 #define BCM2835_AUDIO_LOCKED(sc) mtx_assert(&(sc)->lock, MA_OWNED)
 #define BCM2835_AUDIO_UNLOCK(sc) mtx_unlock(&(sc)->lock)

+/* things that really have to be reported */
+#define REPORT_ERROR(sc,...) \
+ do{ device_printf((sc)->dev,__VA_ARGS__); }while(0)
+/* things that shouldn't clobber the output */
+#define INFORM_THAT(sc,...) \
+ do { \
+ if(sc->verbose_trace>0){ \
+ device_printf((sc)->dev,__VA_ARGS__); \
+ } \
+ }while(0)
+/* things that might clobber the output */
+#define WARN_THAT(sc,...) \
+ do { \
+ if(sc->verbose_trace>1){ \
+ device_printf((sc)->dev,__VA_ARGS__); \
+ } \
+ }while(0)
+/* things that are expected to (will) clobber the output */
+#define TRACE(sc,...) \
+ do { \
+ if(sc->verbose_trace>2){ \
+ device_printf((sc)->dev,__VA_ARGS__); \
+ } \
+ }while(0)
+
+/* Useful for circular buffer calcs */
+#define MOD_DIFF(front,rear,mod) (((mod) + (front) - (rear)) % (mod))
+
+
 static const char *
 dest_description(uint32_t dest)
 {
@@ -216,10 +256,21 @@ bcm2835_audio_callback(void *param, const
VCHI_CALLBACK_REASON_T reason, void *m
      m.type);
  }
  } else if (m.type == VC_AUDIO_MSG_TYPE_COMPLETE) {
- struct bcm2835_audio_chinfo *ch = m.u.complete.cookie;
+   unsigned int signaled = 0;
+ struct bcm2835_audio_chinfo *ch ;
+#if defined(__aarch64__)
+ ch = (void *) ((((size_t)m.u.complete.callback) << 32)
+ | ((size_t)m.u.complete.cookie));
+#else
+ ch = (void *) (m.u.complete.cookie);
+#endif
+

  int count = m.u.complete.count & 0xffff;
  int perr = (m.u.complete.count & (1U << 30)) != 0;
+
+ TRACE(sc,"in:: count:0x%x perr:%d\n",m.u.complete.count,perr);
+
  ch->callbacks++;
  if (perr)
  ch->underruns++;
@@ -239,18 +290,41 @@ bcm2835_audio_callback(void *param, const
VCHI_CALLBACK_REASON_T reason, void *m
  device_printf(sc->dev, "available_space == %d, count = %d, perr=%d\n",
      ch->available_space, count, perr);
  device_printf(sc->dev,
-     "retrieved_samples = %lld, submitted_samples = %lld\n",
+     "retrieved_samples = %"PRIu64", submitted_samples = %"PRIu64"\n",
      ch->retrieved_samples, ch->submitted_samples);
  }
- ch->available_space +=3D count;
- ch->retrieved_samples +=3D count;
  }
- if (perr || (ch->available_space >=3D VCHIQ_AUDIO_PACKET_SIZE))
- cv_signal(&sc->worker_cv);
+ ch->available_space +=3D count;
+ ch->retrieved_samples +=3D count;
+ /*
+  *  XXXMDC
+  *  Experimental: if VC says it's empty, believe it
+  *  Has to come after the usual adjustments
+  */
+ if(perr){
+ ch->available_space =3D VCHIQ_AUDIO_BUFFER_SIZE;
+ perr =3D ch->retrieved_samples; // shd be !=3D 0
+ }
+
+ if ((ch->available_space >=3D 1*VCHIQ_AUDIO_PACKET_SIZE)){
+ cv_signal(&sc->worker_cv);
+ signaled =3D 1;
+ }
  }
  BCM2835_AUDIO_UNLOCK(sc);
+ if(perr){
+ WARN_THAT(sc,
+ "VC starved; reported %u for a total of %u\n"
+ "worker %s\n" ,
+   count,perr,
+ (signaled ? "signaled": "not signaled")
+ );
+ }
  } else
- printf("%s: unknown m.type: %d\n", __func__, m.type);
+ WARN_THAT(sc,
+ "%s: unknown m.type: %d\n",
+ __func__, m.type
+ );
 }

 /* VCHIQ stuff */
@@ -262,13 +336,13 @@ bcm2835_audio_init(struct bcm2835_audio_info *sc)
  /* Initialize and create a VCHI connection */
  status =3D vchi_initialise(&sc->vchi_instance);
  if (status !=3D 0) {
- printf("vchi_initialise failed: %d\n", status);
+ REPORT_ERROR(sc,"vchi_initialise failed: %d\n", status);
  return;
  }

  status =3D vchi_connect(NULL, 0, sc->vchi_instance);
  if (status !=3D 0) {
- printf("vchi_connect failed: %d\n", status);
+ REPORT_ERROR(sc,"vchi_connect failed: %d\n", status);
  return;
  }

@@ -300,7 +374,7 @@ bcm2835_audio_release(struct bcm2835_audio_info *sc)
  if (sc->vchi_handle !=3D VCHIQ_SERVICE_HANDLE_INVALID) {
  success =3D vchi_service_close(sc->vchi_handle);
  if (success !=3D 0)
- printf("vchi_service_close failed: %d\n", success);
+ REPORT_ERROR(sc,"vchi_service_close failed: %d\n", success);
  vchi_service_release(sc->vchi_handle);
  sc->vchi_handle =3D VCHIQ_SERVICE_HANDLE_INVALID;
  }
@@ -330,7 +404,10 @@ bcm2835_audio_start(struct bcm2835_audio_chinfo *ch)
      &m, sizeof m, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);

  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed (err %d)\n", __func__, ret);
+ REPORT_ERROR(sc,
+ "%s: vchi_msg_queue failed (err %d)\n",
+ __func__, ret
+ );
  }
 }

@@ -345,11 +422,15 @@ bcm2835_audio_stop(struct bcm2835_audio_chinfo *ch)
  m.type =3D VC_AUDIO_MSG_TYPE_STOP;
  m.u.stop.draining =3D 0;

+ INFORM_THAT(sc,"sending stop\n");
  ret =3D vchi_msg_queue(sc->vchi_handle,
      &m, sizeof m, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);

  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed (err %d)\n", __func__, ret);
+ REPORT_ERROR(sc,
+ "%s: vchi_msg_queue failed (err %d)\n",
+ __func__, ret
+ );
  }
 }

@@ -365,7 +446,10 @@ bcm2835_audio_open(struct bcm2835_audio_info *sc)
      &m, sizeof m, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);

  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed (err %d)\n", __func__, ret);
+ REPORT_ERROR(sc,
+ "%s: vchi_msg_queue failed (err %d)\n",
+ __func__, ret
+ );
  }
 }

@@ -387,7 +471,10 @@ bcm2835_audio_update_controls(struct
bcm2835_audio_info *sc, uint32_t volume, ui
      &m, sizeof m, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);

  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed (err %d)\n", __func__, ret);
+ REPORT_ERROR(sc,
+ "%s: vchi_msg_queue failed (err %d)\n",
+ __func__, ret
+ );
  }
 }

@@ -407,7 +494,10 @@ bcm2835_audio_update_params(struct
bcm2835_audio_info *sc, uint32_t fmt, uint32_
      &m, sizeof m, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);

  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed (err %d)\n", __func__, ret);
+ REPORT_ERROR(sc,
+ "%s: vchi_msg_queue failed (err %d)\n",
+ __func__, ret
+ );
  }
 }

@@ -415,18 +505,25 @@ static bool
 bcm2835_audio_buffer_should_sleep(struct bcm2835_audio_chinfo *ch)
 {

+ ch->log_vars.slept_for_lack_of_space =3D 0;
  if (ch->playback_state !=3D PLAYBACK_PLAYING)
  return (true);

  /* Not enough data */
- if (sndbuf_getready(ch->buffer) < VCHIQ_AUDIO_PACKET_SIZE) {
- printf("starve\n");
+ /* XXXMDC Take unsubmitted stuff into account */
+ if (sndbuf_getready(ch->buffer)
+ - MOD_DIFF(
+ ch->unsubmittedptr,
+ sndbuf_getreadyptr(ch->buffer),
+ sndbuf_getsize(ch->buffer)
+ ) < VCHIQ_AUDIO_PACKET_SIZE) {
  ch->starved++;
  return (true);
  }

  /* Not enough free space */
  if (ch->available_space < VCHIQ_AUDIO_PACKET_SIZE) {
+ ch->log_vars.slept_for_lack_of_space =3D 1;
  return (true);
  }

@@ -447,22 +544,27 @@ bcm2835_audio_write_samples(struct
bcm2835_audio_chinfo *ch, void *buf, uint32_t
  m.type = VC_AUDIO_MSG_TYPE_WRITE;
  m.u.write.count = count;
  m.u.write.max_packet = VCHIQ_AUDIO_PACKET_SIZE;
- m.u.write.callback = NULL;
- m.u.write.cookie = ch;
+#if defined(__aarch64__)
+ m.u.write.callback = (uint32_t)(((size_t) ch) >> 32) & 0xffffffff;
+ m.u.write.cookie = (uint32_t)(((size_t) ch) & 0xffffffff);
+#else
+ m.u.write.callback = (uint32_t) NULL;
+ m.u.write.cookie = (uint32_t) ch;
+#endif
  m.u.write.silence = 0;

  ret =3D vchi_msg_queue(sc->vchi_handle,
      &m, sizeof m, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);

  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed (err %d)\n", __func__, ret);
+ REPORT_ERROR(sc,"%s: vchi_msg_queue failed (err %d)\n", __func__, ret);

  while (count > 0) {
  int bytes =3D MIN((int)m.u.write.max_packet, (int)count);
  ret =3D vchi_msg_queue(sc->vchi_handle,
      buf, bytes, VCHI_FLAGS_BLOCK_UNTIL_QUEUED, NULL);
  if (ret !=3D 0)
- printf("%s: vchi_msg_queue failed: %d\n",
+ REPORT_ERROR(sc,"%s: vchi_msg_queue failed: %d\n",
      __func__, ret);
  buf =3D (char *)buf + bytes;
  count -=3D bytes;
@@ -494,6 +596,10 @@ bcm2835_audio_worker(void *data)
  while ((sc->flags_pending =3D=3D 0) &&
      bcm2835_audio_buffer_should_sleep(ch)) {
  cv_wait_sig(&sc->worker_cv, &sc->lock);
+ if((sc-> flags_pending =3D=3D 0)
+     && ch->log_vars.slept_for_lack_of_space) {
+ TRACE(sc,"slept for lack of space\n");
+ }
  }
  flags =3D sc->flags_pending;
  /* Clear pending flags */
@@ -520,16 +626,32 @@ bcm2835_audio_worker(void *data)
  BCM2835_AUDIO_LOCK(sc);
  bcm2835_audio_reset_channel(&sc->pch);
  ch->playback_state =3D PLAYBACK_IDLE;
+ long sub_total =3D ch->submitted_samples;
+ long retd =3D ch->retrieved_samples;
  BCM2835_AUDIO_UNLOCK(sc);
+ INFORM_THAT(sc,
+ "stopped audio. submitted a total of %lu "
+ "having been acked %lu\n",
+ sub_total, retd
+ );
  continue;
  }

  /* Requested to start playback */
  if ((flags & AUDIO_PLAY) &&
      (ch->playback_state =3D=3D PLAYBACK_IDLE)) {
+ INFORM_THAT(sc,
+ "starting audio\n"
+ );
+ unsigned int bsize =3D sndbuf_getsize(ch->buffer);
  BCM2835_AUDIO_LOCK(sc);
  ch->playback_state =3D PLAYBACK_PLAYING;
+ ch->log_vars.bsize =3D bsize;
  BCM2835_AUDIO_UNLOCK(sc);
+ INFORM_THAT(sc,
+ "buffer size is %u\n",
+ bsize
+ );
  bcm2835_audio_start(ch);
  }

@@ -538,20 +660,69 @@ bcm2835_audio_worker(void *data)

  if (sndbuf_getready(ch->buffer) =3D=3D 0)
  continue;
-
- count =3D sndbuf_getready(ch->buffer);
+ uint32_t i_count;
+
+ /* XXXMDC Take unsubmitted stuff into account */
+ count
+ =3D i_count
+ =3D sndbuf_getready(ch->buffer)
+ - MOD_DIFF(
+ ch->unsubmittedptr,
+ sndbuf_getreadyptr(ch->buffer),
+ sndbuf_getsize(ch->buffer)
+ );
  size =3D sndbuf_getsize(ch->buffer);
- readyptr =3D sndbuf_getreadyptr(ch->buffer);
+ readyptr =3D ch->unsubmittedptr;

+ int size_changed=3D0;
+ unsigned int available;
  BCM2835_AUDIO_LOCK(sc);
- if (readyptr + count > size)
+ if(size !=3D ch->log_vars.bsize){
+ ch->log_vars.bsize =3D size;
+ size_changed =3D 1;
+ }
+ available =3D ch->available_space;
+ /*
+  *  XXXMDC
+  *
+  *  On arm64, got into situations where
+  *  readyptr was less than a packet away
+  *  from the end of the buffer, which led
+  *  to count being set to 0 and, inexorably, starvation.
+  *  Code below tries to take that into account.
+  *  The problem might have been fixed with some of the
+  *  other changes that were made in the meantime,
+  *  but for now this works fine.
+  */
+ if (readyptr + count > size){
  count =3D size - readyptr;
- count =3D min(count, ch->available_space);
- count -=3D (count % VCHIQ_AUDIO_PACKET_SIZE);
+ }
+ if(count > ch->available_space){
+ count =3D ch->available_space;
+ count -=3D (count % VCHIQ_AUDIO_PACKET_SIZE);
+ }else if (count > VCHIQ_AUDIO_PACKET_SIZE){
+ count -=3D (count % VCHIQ_AUDIO_PACKET_SIZE);
+ }else if (size > count + readyptr) {
+ count =3D 0;
+ }
  BCM2835_AUDIO_UNLOCK(sc);
-
- if (count < VCHIQ_AUDIO_PACKET_SIZE)
+ if(count % VCHIQ_AUDIO_PACKET_SIZE !=3D 0){
+   WARN_THAT(sc,
+   "count: %u  initial count: %u  "
+         "size: %u  readyptr: %u  available: %u"
+ "\n",
+ count,i_count,size,readyptr, available);
+ }
+ if(size_changed) INFORM_THAT(sc,"bsize changed to %u\n",size);
+
+ if (count =3D=3D 0){
+ WARN_THAT(sc,
+ "not enough room for a packet: count %d,"
+ " i_count %d, rptr %d, size %d\n",
+ count, i_count, readyptr, size
+ );
  continue;
+ }

  buf =3D (uint8_t*)sndbuf_getbuf(ch->buffer) + readyptr;

@@ -560,8 +731,17 @@ bcm2835_audio_worker(void *data)
  ch->unsubmittedptr =3D (ch->unsubmittedptr + count) %
sndbuf_getsize(ch->buffer);
  ch->available_space -=3D count;
  ch->submitted_samples +=3D count;
+ long sub =3D count;
+ long sub_total =3D ch->submitted_samples;
+ long retd =3D ch->retrieved_samples;
  KASSERT(ch->available_space >=3D 0, ("ch->available_space =3D=3D %d\n",
ch->available_space));
  BCM2835_AUDIO_UNLOCK(sc);
+
+ TRACE(sc,
+ "submitted %lu for a total of %lu having been acked %lu; "
+ "rptr %d, had %u available \n",
+ sub, sub_total, retd, readyptr, available);
+
  }

  BCM2835_AUDIO_LOCK(sc);
@@ -580,7 +760,9 @@ bcm2835_audio_create_worker(struct bcm2835_audio_info *sc)
  sc->worker_state = WORKER_RUNNING;
  if (kproc_create(bcm2835_audio_worker, (void*)sc, &newp, 0, 0,
      "bcm2835_audio_worker") != 0) {
- printf("failed to create bcm2835_audio_worker\n");
+ REPORT_ERROR(sc,
+ "failed to create bcm2835_audio_worker\n"
+ );
  }
 }

@@ -613,6 +795,8 @@ bcmchan_init(kobj_t obj, void *devinfo, struct
snd_dbuf *b, struct pcm_channel *
  return NULL;
  }

+ ch->log_vars =3D DEFAULT_LOG_VALUES;
+
  BCM2835_AUDIO_LOCK(sc);
  bcm2835_worker_update_params(sc);
  BCM2835_AUDIO_UNLOCK(sc);
@@ -833,6 +1017,9 @@ vchi_audio_sysctl_init(struct bcm2835_audio_info *sc)
  SYSCTL_ADD_INT(ctx, tree, OID_AUTO, "starved",
  CTLFLAG_RD, &sc->pch.starved,
  sc->pch.starved, "number of starved conditions");
+ SYSCTL_ADD_INT(ctx, tree, OID_AUTO, "trace",
+ CTLFLAG_RW, &sc->verbose_trace,
+ sc->verbose_trace, "enable tracing of transfers");
 }

 static void
@@ -864,6 +1051,7 @@ bcm2835_audio_delayed_init(void *xsc)
  bcm2835_audio_open(sc);
  sc->volume =3D 75;
  sc->dest =3D DEST_AUTO;
+ sc->verbose_trace =3D 0;

      if (mixer_init(sc->dev, &bcmmixer_class, sc)) {
  device_printf(sc->dev, "mixer_init failed\n");
diff --git a/sys/arm/broadcom/bcm2835/vc_vchi_audioserv_defs.h
b/sys/arm/broadcom/bcm2835/vc_vchi_audioserv_defs.h
index 143c54385916..04292df1c261 100644
--- a/sys/arm/broadcom/bcm2835/vc_vchi_audioserv_defs.h
+++ b/sys/arm/broadcom/bcm2835/vc_vchi_audioserv_defs.h
@@ -114,8 +114,8 @@ typedef struct
 typedef struct
 {
  uint32_t count; /* in bytes */
- void *callback;
- void *cookie;
+ uint32_t callback;
+ uint32_t cookie;
  uint16_t silence;
  uint16_t max_packet;
 } VC_AUDIO_WRITE_T;
@@ -131,8 +131,8 @@ typedef struct
 typedef struct
 {
  int32_t count;  /* Success value */
- void *callback;
- void *cookie;
+ uint32_t callback;
+ uint32_t cookie;
 } VC_AUDIO_COMPLETE_T;

 /* Message header for all messages in HOST->VC direction */
diff --git a/sys/arm64/conf/GENERIC-VCHIQ b/sys/arm64/conf/GENERIC-VCHIQ
new file mode 100644
index 000000000000..422ed425894c
--- /dev/null
+++ b/sys/arm64/conf/GENERIC-VCHIQ
@@ -0,0 +1,23 @@
+#
+# GENERIC-VCHIQ
+#
+# Custom kernel for arm64 plus VCHIQ
+#
+# $FreeBSD$
+
+#NO_UNIVERSE
+
+include GENERIC
+ident GENERIC-VCHIQ
+
+device vchiq
+
+# If you want to have any chance of compiling this in a RPI Zero 2
+# uncomment the stuff below
+
+# nomakeoptions DEBUG
+# nomakeoptions WITH_CTF
+# nooptions DDB_CTF
+# makeoptions MALLOC_PRODUCTION=1
+
+
diff --git a/sys/contrib/vchiq/interface/compat/vchi_bsd.c
b/sys/contrib/vchiq/interface/compat/vchi_bsd.c
index f831880f5e13..e039992036aa 100644
--- a/sys/contrib/vchiq/interface/compat/vchi_bsd.c
+++ b/sys/contrib/vchiq/interface/compat/vchi_bsd.c
@@ -341,7 +341,6 @@ down_interruptible(struct semaphore *s)
  int ret ;

  ret =3D 0;
-
  mtx_lock(&s->mtx);

  while (s->value =3D=3D 0) {
@@ -349,13 +348,11 @@ down_interruptible(struct semaphore *s)
  ret =3D cv_wait_sig(&s->cv, &s->mtx);
  s->waiters--;

- if (ret =3D=3D EINTR) {
+ /* XXXMDC As per its semaphore.c, linux can only return EINTR */
+ if (ret) {
  mtx_unlock(&s->mtx);
- return (-EINTR);
+ return -EINTR;
  }
-
- if (ret =3D=3D ERESTART)
- continue;
  }

  s->value--;
@@ -442,8 +439,7 @@ flush_signals(VCHIQ_THREAD_T thr)
 int
 fatal_signal_pending(VCHIQ_THREAD_T thr)
 {
- printf("Implement ME: %s\n", __func__);
- return (0);
+ return (curproc_sigkilled());
 }

 /*
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_2835_arm.c
b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_2835_arm.c
index 279aacd0880a..7a48ad9d21b6 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_2835_arm.c
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_2835_arm.c
@@ -65,9 +65,24 @@ MALLOC_DEFINE(M_VCPAGELIST, "vcpagelist",
"VideoCore pagelist memory");

 #define MAX_FRAGMENTS (VCHIQ_NUM_CURRENT_BULKS * 2)

+/*
+ *  XXXMDC
+ * Do this less ad-hoc-y -- e.g.
+ * https://github.com/raspberrypi/linux/commit/c683db8860a80562a2bb5b451d77b3e471d24f36
+ */
+#if defined(__aarch64__)
+int g_cache_line_size = 64;
+#else
 int g_cache_line_size = 32;
+#endif
 static int g_fragment_size;

+unsigned int g_long_bulk_space = 0;
+#define VM_PAGE_TO_VC_BULK_PAGE(x) (\
+ g_long_bulk_space ? VM_PAGE_TO_PHYS(x)\
+  : PHYS_TO_VCBUS(VM_PAGE_TO_PHYS(x))\
+)
+
 typedef struct vchiq_2835_state_struct {
    int inited;
    VCHIQ_ARM_STATE_T arm_state;
@@ -113,6 +128,59 @@ vchiq_dmamap_cb(void *arg, bus_dma_segment_t
*segs, int nseg, int err)
  *addr =3D PHYS_TO_VCBUS(segs[0].ds_addr);
 }

+#if defined(__aarch64__) /* See comment in free_pagelist */
+static int
+invalidate_cachelines_in_range_of_ppage(
+ vm_page_t p,
+ size_t offset,
+ size_t count
+)
+{
+ if(offset + count > PAGE_SIZE){ return EINVAL; }
+        uint8_t *dst =3D (uint8_t*)pmap_quick_enter_page(p);
+        if (!dst){
+                return ENOMEM;
+ }
+ cpu_dcache_inv_range((vm_offset_t)dst + offset, count);
+ pmap_quick_remove_page((vm_offset_t)dst);
+ return 0;
+}
+
+/* XXXMDC bulk instead of loading and invalidating single pages? */
+static void
+invalidate_cachelines_in_range_of_ppage_seq(
+ vm_page_t *p,
+ size_t start,
+ size_t count
+)
+{
+ if(start >= PAGE_SIZE) goto invalid_input;
+
+#define _NEXT_AT(x,_m) (((x)+((_m)-1)) & ~((_m)-1))   /* for power of two m */
+ size_t offset = _NEXT_AT(start,g_cache_line_size);
+#undef _NEXT_AT
+ count = (offset < start + count) ? count - (offset - start) : 0;
+ offset = offset & (PAGE_SIZE - 1);
+ for(
+ size_t done = 0;
+ count > done;
+ p++, done += PAGE_SIZE - offset, offset = 0
+ ){
+ size_t in_page = PAGE_SIZE - offset;
+ size_t todo = (count-done > in_page) ? in_page : count-done;
+ int e = invalidate_cachelines_in_range_of_ppage(*p, offset, todo);
+ if(e != 0)
+ goto problem_in_loop;
+ goto problem_in_loop;
+ }
+ return;
+
+problem_in_loop:
+invalid_input:
+ WARN_ON(1);
+ return;
+}
+#endif
+
 static int
 copyout_page(vm_page_t p, size_t offset, void *kaddr, size_t size)
 {
@@ -171,7 +239,7 @@ vchiq_platform_init(VCHIQ_STATE_T *state)
  goto failed_load;
  }

- WARN_ON(((int)g_slot_mem & (PAGE_SIZE - 1)) !=3D 0);
+ WARN_ON(((size_t)g_slot_mem & (PAGE_SIZE - 1)) !=3D 0);

  vchiq_slot_zero =3D vchiq_init_slots(g_slot_mem, g_slot_mem_size);
  if (!vchiq_slot_zero) {
@@ -204,8 +272,8 @@ vchiq_platform_init(VCHIQ_STATE_T *state)
  bcm_mbox_write(BCM2835_MBOX_CHAN_VCHIQ, (unsigned int)g_slot_phys);

  vchiq_log_info(vchiq_arm_log_level,
- "vchiq_init - done (slots %x, phys %x)",
- (unsigned int)vchiq_slot_zero, g_slot_phys);
+ "vchiq_init - done (slots %zx, phys %zx)",
+ (size_t)vchiq_slot_zero, g_slot_phys);

    vchiq_call_connected_callbacks();

@@ -393,13 +461,14 @@ pagelist_page_free(vm_page_t pp)
 ** from increased speed as a result.
 */

+
 static int
 create_pagelist(char __user *buf, size_t count, unsigned short type,
  struct proc *p, BULKINFO_T *bi)
 {
  PAGELIST_T *pagelist;
  vm_page_t* pages;
- unsigned long *addrs;
+ uint32_t *addrs;
  unsigned int num_pages, i;
  vm_offset_t offset;
  int pagelist_size;
@@ -436,7 +505,7 @@ create_pagelist(char __user *buf, size_t count,
unsigned short type,

  err = bus_dmamem_alloc(bi->pagelist_dma_tag, (void **)&pagelist,
      BUS_DMA_COHERENT | BUS_DMA_WAITOK, &bi->pagelist_dma_map);
- if (err) {
+ if (err || !pagelist) {
  vchiq_log_error(vchiq_core_log_level, "Unable to allocate pagelist memory");
  err = -ENOMEM;
  goto failed_alloc;
@@ -449,14 +518,12 @@ create_pagelist(char __user *buf, size_t count,
unsigned short type,
  if (err) {
  vchiq_log_error(vchiq_core_log_level, "cannot load DMA map for pagelist memory");
  err = -ENOMEM;
+ bi->pagelist = pagelist;
  goto failed_load;
  }

  vchiq_log_trace(vchiq_arm_log_level,
- "create_pagelist - %x (%d bytes @%p)", (unsigned int)pagelist, count, buf);
-
- if (!pagelist)
- return -ENOMEM;
+ "create_pagelist - %zx (%zu bytes @%p)", (size_t)pagelist, count, buf);

  addrs =3D pagelist->addrs;
  pages =3D (vm_page_t*)(addrs + num_pages);
@@ -467,8 +534,9 @@ create_pagelist(char __user *buf, size_t count,
unsigned short type,

  if (actual_pages !=3D num_pages) {
  vm_page_unhold_pages(pages, actual_pages);
- free(pagelist, M_VCPAGELIST);
- return (-ENOMEM);
+ err =3D -ENOMEM;
+ bi->pagelist =3D pagelist;
+ goto failed_hold;
  }

  pagelist->length =3D count;
@@ -477,27 +545,28 @@ create_pagelist(char __user *buf, size_t count,
unsigned short type,

  /* Group the pages into runs of contiguous pages */

- base_addr = (void *)PHYS_TO_VCBUS(VM_PAGE_TO_PHYS(pages[0]));
+ size_t run_ceil = g_long_bulk_space ? 0x100 : PAGE_SIZE;
+ unsigned int pg_addr_rshift = g_long_bulk_space ? 4 : 0;
+ base_addr = (void *) VM_PAGE_TO_VC_BULK_PAGE(pages[0]);
  next_addr = base_addr + PAGE_SIZE;
  addridx = 0;
  run = 0;
-
+#define _PG_BLOCK(base,run) \
+ ((((size_t) (base)) >> pg_addr_rshift) & ~(run_ceil-1)) + (run)
  for (i = 1; i < num_pages; i++) {
- addr = (void *)PHYS_TO_VCBUS(VM_PAGE_TO_PHYS(pages[i]));
- if ((addr == next_addr) && (run < (PAGE_SIZE - 1))) {
+ addr = (void *)VM_PAGE_TO_VC_BULK_PAGE(pages[i]);
+ if ((addr == next_addr) && (run < run_ceil - 1)) {
  next_addr += PAGE_SIZE;
  run++;
  } else {
- addrs[addridx] = (unsigned long)base_addr + run;
- addridx++;
+ addrs[addridx++] = (uint32_t) _PG_BLOCK(base_addr,run);
  base_addr = addr;
  next_addr = addr + PAGE_SIZE;
  run = 0;
  }
  }
-
- addrs[addridx] = (unsigned long)base_addr + run;
- addridx++;
+ addrs[addridx++] = _PG_BLOCK(base_addr, run);
+#undef _PG_BLOCK

  /* Partial cache lines (fragments) require special measures */
  if ((type =3D=3D PAGELIST_READ) &&
@@ -519,12 +588,24 @@ create_pagelist(char __user *buf, size_t count,
unsigned short type,
  g_free_fragments =3D *(char **) g_free_fragments;
  up(&g_free_fragments_mutex);
  pagelist->type =3D
-  PAGELIST_READ_WITH_FRAGMENTS +
-  (fragments - g_fragments_base)/g_fragment_size;
+  PAGELIST_READ_WITH_FRAGMENTS
+  + (fragments - g_fragments_base)/g_fragment_size;
+#if defined(__aarch64__)
+  bus_dmamap_sync(bcm_slots_dma_tag, bcm_slots_dma_map, BUS_DMASYNC_PREREAD);
+#endif
  }

+#if defined(__aarch64__)
+ if(type =3D=3D PAGELIST_READ){
+ cpu_dcache_wbinv_range((vm_offset_t)buf,count);
+ }else{
+ cpu_dcache_wb_range((vm_offset_t)buf,count);
+ }
+ dsb(sy);
+#else
  pa =3D pmap_extract(PCPU_GET(curpmap), (vm_offset_t)buf);
  dcache_wbinv_poc((vm_offset_t)buf, pa, count);
+#endif

  bus_dmamap_sync(bi->pagelist_dma_tag, bi->pagelist_dma_map,
BUS_DMASYNC_PREWRITE);

@@ -532,6 +613,8 @@ create_pagelist(char __user *buf, size_t count,
unsigned short type,

  return 0;

+failed_hold:
+ bus_dmamap_unload(bi->pagelist_dma_tag,bi->pagelist_dma_map);
 failed_load:
  bus_dmamem_free(bi->pagelist_dma_tag, bi->pagelist, bi->pagelist_dma_map);
 failed_alloc:
@@ -550,7 +633,7 @@ free_pagelist(BULKINFO_T *bi, int actual)
  pagelist =3D bi->pagelist;

  vchiq_log_trace(vchiq_arm_log_level,
- "free_pagelist - %x, %d (%lu bytes @%p)", (unsigned int)pagelist,
actual, pagelist->length, bi->buf);
+ "free_pagelist - %zx, %d (%u bytes @%p)", (size_t)pagelist, actual,
pagelist->length, bi->buf);

  num_pages =3D
  (pagelist->length + pagelist->offset + PAGE_SIZE - 1) /
@@ -558,6 +641,27 @@ free_pagelist(BULKINFO_T *bi, int actual)

  pages =3D (vm_page_t*)(pagelist->addrs + num_pages);

+#if defined(__aarch64__)
+ /*
+         * On arm64, even if the user keeps their end of the bargain
+  * -- do NOT touch the buffers sent to VC -- but reads around the
+  * pagelist after the invalidation above, the arm might preemptively
+  * load (and validate) cache lines for areas inside the page list,
+  * so we must invalidate them again.
+  *
+  * The functional test does it and without this it doesn't pass.
+  *
+  * XXXMDC might it be enough to invalidate a couple of pages at
+  * the ends of the page list?
+  */
+ if(pagelist->type >=3D PAGELIST_READ && actual > 0)
+ invalidate_cachelines_in_range_of_ppage_seq(
+ pages,
+ pagelist->offset,
+ actual
+ );
+#endif
+
  /* Deal with any partial cache lines (fragments) */
  if (pagelist->type >=3D PAGELIST_READ_WITH_FRAGMENTS) {
  char *fragments =3D g_fragments_base +
@@ -594,13 +698,18 @@ free_pagelist(BULKINFO_T *bi, int actual)
  up(&g_free_fragments_sema);
  }

- for (i =3D 0; i < num_pages; i++) {
- if (pagelist->type !=3D PAGELIST_WRITE) {
+ if (pagelist->type !=3D PAGELIST_WRITE) {
+ for (i =3D 0; i < num_pages; i++) {
  vm_page_dirty(pages[i]);
  pagelist_page_free(pages[i]);
  }
  }

+#if defined(__aarch64__)
+ /* XXXMDC necessary? */
+ dsb(sy);
+#endif
+
  bus_dmamap_unload(bi->pagelist_dma_tag, bi->pagelist_dma_map);
  bus_dmamem_free(bi->pagelist_dma_tag, bi->pagelist, bi->pagelist_dma_map);
  bus_dma_tag_destroy(bi->pagelist_dma_tag);
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_arm.c
b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_arm.c
index 763cd9ce9417..bfcff315a543 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_arm.c
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_arm.c
@@ -386,7 +386,7 @@ static void
 user_service_free(void *userdata)
 {
  USER_SERVICE_T *user_service =3D userdata;
-
+
  _sema_destroy(&user_service->insert_event);
  _sema_destroy(&user_service->remove_event);

@@ -410,7 +410,7 @@ static void close_delivered(USER_SERVICE_T *user_service)

  /* Wake the user-thread blocked in close_ or remove_service */
  up(&user_service->close_event);
-
+
  user_service->close_pending =3D 0;
  }
 }
@@ -442,12 +442,23 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
 #define _IOC_TYPE(x) IOCGROUP(x)

  vchiq_log_trace(vchiq_arm_log_level,
-  "vchiq_ioctl - instance %x, cmd %s, arg %p",
- (unsigned int)instance,
+  "vchiq_ioctl - instance %zx, cmd %s, arg %p",
+ (size_t)instance,
  ((_IOC_TYPE(cmd) =3D=3D VCHIQ_IOC_MAGIC) &&
  (_IOC_NR(cmd) <=3D VCHIQ_IOC_MAX)) ?
  ioctl_names[_IOC_NR(cmd)] : "<invalid>", arg);

+#ifdef COMPAT_FREEBSD32
+/* A fork in the road to freebsd32 compatibility */
+#define _CF32_FORK(compat_c,native_c)\
+ { \
+ int _____dont_call_your_vars_this =3D 0;\
+ switch(cmd){_CF32_CASE {_____dont_call_your_vars_this =3D 1;} break;} \
+ if(_____dont_call_your_vars_this) { compat_c } else { native_c } \
+ }
+#else
+#define _CF32_FORK(compat_c,native_c) { native_c }
+#endif
  switch (cmd) {
  case VCHIQ_IOC_SHUTDOWN:
  if (!instance->connected)
@@ -496,13 +507,32 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  "vchiq: could not connect: %d", status);
  break;

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_CREATE_SERVICE32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_CREATE_SERVICE: {
  VCHIQ_CREATE_SERVICE_T args;
  USER_SERVICE_T *user_service =3D NULL;
  void *userdata;
  int srvstate;

+_CF32_FORK(
+ VCHIQ_CREATE_SERVICE32_T args32;
+ memcpy(&args32, (const void*)arg, sizeof(args32));
+ args.params.fourcc =3D args32.params.fourcc;
+/* XXXMDC not actually used? overwritten straight away */
+ args.params.callback = (VCHIQ_CALLBACK_T)(uintptr_t) args32.params.callback;
+ args.params.userdata =3D (void*)(uintptr_t)args32.params.userdata;
+ args.params.version =3D args32.params.version;
+ args.params.version_min =3D args32.params.version_min;
+ args.is_open =3D args32.is_open;
+ args.is_vchi =3D args32.is_vchi;
+ args.handle  =3D args32.handle;
+,
  memcpy(&args, (const void*)arg, sizeof(args));
+)

  user_service =3D kmalloc(sizeof(USER_SERVICE_T), GFP_KERNEL);
  if (!user_service) {
@@ -558,15 +588,22 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  break;
  }
  }
-
 #ifdef VCHIQ_IOCTL_DEBUG
  printf("%s: [CREATE SERVICE] handle = %08x\n", __func__, service->handle);
 #endif
+_CF32_FORK(
+ memcpy((void *)
+ &(((VCHIQ_CREATE_SERVICE32_T*)
+ arg)->handle),
+ (const void *)&service->handle,
+ sizeof(service->handle));
+,
  memcpy((void *)
  &(((VCHIQ_CREATE_SERVICE_T*)
  arg)->handle),
  (const void *)&service->handle,
  sizeof(service->handle));
+);

  service =3D NULL;
  } else {
@@ -574,6 +611,7 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd, caddr_t
arg, int fflag,
  kfree(user_service);
  }
  } break;
+#undef _CF32_CASE

  case VCHIQ_IOC_CLOSE_SERVICE: {
  VCHIQ_SERVICE_HANDLE_T handle;
@@ -673,9 +711,22 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D -EINVAL;
  } break;

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_QUEUE_MESSAGE32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_QUEUE_MESSAGE: {
  VCHIQ_QUEUE_MESSAGE_T args;
+_CF32_FORK(
+ VCHIQ_QUEUE_MESSAGE32_T args32;
+ memcpy(&args32, (const void*)arg, sizeof(args32));
+ args.handle =3D args32.handle;
+ args.count =3D args32.count;
+ args.elements =3D (VCHIQ_ELEMENT_T *)(uintptr_t)args32.elements;
+,
  memcpy(&args, (const void*)arg, sizeof(args));
+)

 #ifdef VCHIQ_IOCTL_DEBUG
  printf("%s: [QUEUE MESSAGE] handle =3D %08x\n", __func__, args.handle);
@@ -686,8 +737,22 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  if ((service !=3D NULL) && (args.count <=3D MAX_ELEMENTS)) {
  /* Copy elements into kernel space */
  VCHIQ_ELEMENT_T elements[MAX_ELEMENTS];
- if (copy_from_user(elements, args.elements,
- args.count * sizeof(VCHIQ_ELEMENT_T)) =3D=3D 0)
+ long cp_ret;
+_CF32_FORK(
+ VCHIQ_ELEMENT32_T elements32[MAX_ELEMENTS];
+ cp_ret =3D copy_from_user(elements32, args.elements,
+ args.count * sizeof(VCHIQ_ELEMENT32_T));
+ for(int i=3D0;cp_ret =3D=3D 0 && i < args.count;++i){
+ elements[i].data =3D
+ (void *)(uintptr_t)elements32[i].data;
+ elements[i].size =3D elements32[i].size;
+ }
+
+,
+ cp_ret =3D copy_from_user(elements, args.elements,
+ args.count * sizeof(VCHIQ_ELEMENT_T));
+)
+ if (cp_ret =3D=3D 0)
  status =3D vchiq_queue_message
  (args.handle,
  elements, args.count);
@@ -697,16 +762,37 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D -EINVAL;
  }
  } break;
+#undef _CF32_CASE

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_QUEUE_BULK_TRANSMIT32: \
+ case VCHIQ_IOC_QUEUE_BULK_RECEIVE32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_QUEUE_BULK_TRANSMIT:
  case VCHIQ_IOC_QUEUE_BULK_RECEIVE: {
  VCHIQ_QUEUE_BULK_TRANSFER_T args;
+
  struct bulk_waiter_node *waiter =3D NULL;
  VCHIQ_BULK_DIR_T dir =3D
- (cmd =3D=3D VCHIQ_IOC_QUEUE_BULK_TRANSMIT) ?
+ (cmd == VCHIQ_IOC_QUEUE_BULK_TRANSMIT) || (cmd == VCHIQ_IOC_QUEUE_BULK_TRANSMIT32)?
  VCHIQ_BULK_TRANSMIT : VCHIQ_BULK_RECEIVE;

+_CF32_FORK(
+ VCHIQ_QUEUE_BULK_TRANSFER32_T args32;
+ memcpy(&args32, (const void*)arg, sizeof(args32));
+ /* XXXMDC parens needed: unparenthesized commas would split the macro argument */
+ args =3D ((VCHIQ_QUEUE_BULK_TRANSFER_T) {
+ .handle =3D args32.handle,
+ .data =3D (void *)(uintptr_t) args32.data,
+ .size =3D args32.size,
+ .userdata =3D (void *)(uintptr_t) args32.userdata,
+ .mode =3D args32.mode,
+ });
+,
  memcpy(&args, (const void*)arg, sizeof(args));
+)

  service =3D find_service_for_instance(instance, args.handle);
  if (!service) {
@@ -734,7 +820,6 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd, caddr_t
arg, int fflag,
  list_del(pos);
  break;
  }
-
  }
  lmutex_unlock(&instance->bulk_waiter_list_mutex);
  if (!waiter) {
@@ -745,10 +830,11 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  break;
  }
  vchiq_log_info(vchiq_arm_log_level,
- "found bulk_waiter %x for pid %d",
- (unsigned int)waiter, current->p_pid);
+ "found bulk_waiter %zx for pid %d",
+ (size_t)waiter, current->p_pid);
  args.userdata =3D &waiter->bulk_waiter;
  }
+
  status =3D vchiq_bulk_transfer
  (args.handle,
   VCHI_MEM_HANDLE_INVALID,
@@ -776,17 +862,31 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  list_add(&waiter->list, &instance->bulk_waiter_list);
  lmutex_unlock(&instance->bulk_waiter_list_mutex);
  vchiq_log_info(vchiq_arm_log_level,
- "saved bulk_waiter %x for pid %d",
- (unsigned int)waiter, current->p_pid);
+ "saved bulk_waiter %zx for pid %d",
+ (size_t)waiter, current->p_pid);

+_CF32_FORK(
+ memcpy((void *)
+ &(((VCHIQ_QUEUE_BULK_TRANSFER32_T *)
+ arg)->mode),
+ (const void *)&mode_waiting,
+ sizeof(mode_waiting));
+,
  memcpy((void *)
  &(((VCHIQ_QUEUE_BULK_TRANSFER_T *)
  arg)->mode),
  (const void *)&mode_waiting,
  sizeof(mode_waiting));
+)
  }
  } break;
+#undef _CF32_CASE

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_AWAIT_COMPLETION32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_AWAIT_COMPLETION: {
  VCHIQ_AWAIT_COMPLETION_T args;
  int count =3D 0;
@@ -797,7 +897,17 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  break;
  }

+_CF32_FORK(
+ VCHIQ_AWAIT_COMPLETION32_T args32;
+                memcpy(&args32, (const void*)arg, sizeof(args32));
+ args.count =3D args32.count;
+ args.buf =3D (VCHIQ_COMPLETION_DATA_T *)(uintptr_t)args32.buf;
+ args.msgbufsize =3D args32.msgbufsize;
+ args.msgbufcount =3D args32.msgbufcount;
+ args.msgbufs =3D (void **)(uintptr_t)args32.msgbufs;
+,
                 memcpy(&args, (const void*)arg, sizeof(args));
+)

  lmutex_lock(&instance->completion_mutex);

@@ -860,9 +970,9 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd, caddr_t
arg, int fflag,
  if (args.msgbufsize < msglen) {
  vchiq_log_error(
  vchiq_arm_log_level,
- "header %x: msgbufsize"
+ "header %zx: msgbufsize"
  " %x < msglen %x",
- (unsigned int)header,
+ (size_t)header,
  args.msgbufsize,
  msglen);
  WARN(1, "invalid message "
@@ -877,6 +987,19 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  break;
  /* Get the pointer from user space */
  msgbufcount--;
+_CF32_FORK(
+ uint32_t *msgbufs32 =3D (uint32_t *) args.msgbufs;
+ uint32_t msgbuf32 =3D 0;
+ if (copy_from_user(&msgbuf32,
+ (const uint32_t __user *)
+ &msgbufs32[msgbufcount],
+ sizeof(msgbuf32)) !=3D 0) {
+ if (count =3D=3D 0)
+ ret =3D -EFAULT;
+ break;
+ }
+ msgbuf =3D (void __user *)(uintptr_t)msgbuf32;
+,
  if (copy_from_user(&msgbuf,
  (const void __user *)
  &args.msgbufs[msgbufcount],
@@ -885,6 +1008,7 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D -EFAULT;
  break;
  }
+)

  /* Copy the message to user space */
  if (copy_to_user(msgbuf, header,
@@ -908,7 +1032,26 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  VCHIQ_SERVICE_CLOSED) &&
  !instance->use_close_delivered)
  unlock_service(service1);
-
+_CF32_FORK(
+ VCHIQ_COMPLETION_DATA32_T comp32 =3D {0};
+ comp32.reason
+ =3D (uint32_t)(size_t) completion->reason;
+ comp32.service_userdata
+ =3D (uint32_t)(size_t) completion->service_userdata;
+ comp32.bulk_userdata
+ =3D (uint32_t)(size_t) completion->bulk_userdata;
+ comp32.header =3D (uint32_t)(size_t)completion->header;
+
+ VCHIQ_COMPLETION_DATA32_T __user *buf_loc;
+ buf_loc =3D (VCHIQ_COMPLETION_DATA32_T __user *) args.buf;
+ buf_loc +=3D count;
+ if (copy_to_user(
+ buf_loc, &comp32, sizeof(comp32)
+    ) !=3D 0){
+ if (ret =3D=3D 0)
+ ret =3D -EFAULT;
+ }
+,
  if (copy_to_user((void __user *)(
  (size_t)args.buf +
  count * sizeof(VCHIQ_COMPLETION_DATA_T)),
@@ -918,6 +1061,7 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D -EFAULT;
  break;
  }
+)

  /* Ensure that the above copy has completed
  ** before advancing the remove pointer. */
@@ -927,18 +1071,33 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  }

  if (msgbufcount !=3D args.msgbufcount) {
+_CF32_FORK(
+ memcpy(
+ (void __user *)
+ &((VCHIQ_AWAIT_COMPLETION32_T *)arg)->
+ msgbufcount,
+ &msgbufcount,
+ sizeof(msgbufcount));
+,
  memcpy((void __user *)
  &((VCHIQ_AWAIT_COMPLETION_T *)arg)->
  msgbufcount,
  &msgbufcount,
  sizeof(msgbufcount));
+)
  }

   if (count !=3D args.count)
   {
+_CF32_FORK(
+ memcpy((void __user *)
+ &((VCHIQ_AWAIT_COMPLETION32_T *)arg)->count,
+ &count, sizeof(count));
+,
  memcpy((void __user *)
  &((VCHIQ_AWAIT_COMPLETION_T *)arg)->count,
  &count, sizeof(count));
+)
  }
  }

@@ -947,9 +1106,9 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,

  if ((ret =3D=3D 0) && instance->closing)
  ret =3D -ENOTCONN;
- /*
+ /*
   * XXXBSD: ioctl return codes are not negative as in linux, so
-  * we can not indicate success with positive number of passed
+  * we can not indicate success with positive number of passed
   * messages
   */
  if (ret > 0)
@@ -958,14 +1117,29 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  lmutex_unlock(&instance->completion_mutex);
  DEBUG_TRACE(AWAIT_COMPLETION_LINE);
  } break;
+#undef _CF32_CASE

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_DEQUEUE_MESSAGE32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_DEQUEUE_MESSAGE: {
  VCHIQ_DEQUEUE_MESSAGE_T args;
  USER_SERVICE_T *user_service;
  VCHIQ_HEADER_T *header;

  DEBUG_TRACE(DEQUEUE_MESSAGE_LINE);
+_CF32_FORK(
+ VCHIQ_DEQUEUE_MESSAGE32_T args32;
+ memcpy(&args32, (const void*)arg, sizeof(args32));
+ args.handle =3D args32.handle;
+ args.blocking =3D args32.blocking;
+ args.bufsize =3D args32.bufsize;
+ args.buf =3D (void *)(uintptr_t)args32.buf;
+,
  memcpy(&args, (const void*)arg, sizeof(args));
+)
  service =3D find_service_for_instance(instance, args.handle);
  if (!service) {
  ret =3D -EINVAL;
@@ -1022,8 +1196,19 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  header->data,
  header->size) =3D=3D 0)) {
  args.bufsize =3D header->size;
+_CF32_FORK(
+ VCHIQ_DEQUEUE_MESSAGE32_T args32;
+ args32.handle =3D args.handle;
+ args32.blocking =3D args.blocking;
+ args32.bufsize =3D args.bufsize;
+ args32.buf =3D (uintptr_t)(void *)args.buf;
+
+ memcpy((void *)arg, &args32,
+     sizeof(args32));
+,
  memcpy((void *)arg, &args,
      sizeof(args));
+)
  vchiq_release_message(
  service->handle,
  header);
@@ -1031,14 +1216,15 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D -EFAULT;
  } else {
  vchiq_log_error(vchiq_arm_log_level,
- "header %x: bufsize %x < size %x",
- (unsigned int)header, args.bufsize,
+ "header %zx: bufsize %x < size %x",
+ (size_t)header, args.bufsize,
  header->size);
  WARN(1, "invalid size\n");
  ret =3D -EMSGSIZE;
  }
  DEBUG_TRACE(DEQUEUE_MESSAGE_LINE);
  } break;
+#undef _CF32_CASE

  case VCHIQ_IOC_GET_CLIENT_ID: {
  VCHIQ_SERVICE_HANDLE_T handle;
@@ -1048,11 +1234,24 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D vchiq_get_client_id(handle);
  } break;

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_GET_CONFIG32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_GET_CONFIG: {
  VCHIQ_GET_CONFIG_T args;
  VCHIQ_CONFIG_T config;
-
+_CF32_FORK(
+ VCHIQ_GET_CONFIG32_T args32;
+
+ memcpy(&args32, (const void*)arg, sizeof(args32));
+ args.config_size =3D args32.config_size;
+ args.pconfig =3D (VCHIQ_CONFIG_T *)
+ (uintptr_t)args32.pconfig;
+,
  memcpy(&args, (const void*)arg, sizeof(args));
+)
  if (args.config_size > sizeof(config)) {
  ret =3D -EINVAL;
  break;
@@ -1066,6 +1265,7 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  }
  }
  } break;
+#undef _CF32_CASE

  case VCHIQ_IOC_SET_SERVICE_OPTION: {
  VCHIQ_SET_SERVICE_OPTION_T args;
@@ -1082,18 +1282,31 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  args.handle, args.option, args.value);
  } break;

+#ifdef COMPAT_FREEBSD32
+#define _CF32_CASE \
+ case VCHIQ_IOC_DUMP_PHYS_MEM32:
+ _CF32_CASE
+#endif
  case VCHIQ_IOC_DUMP_PHYS_MEM: {
  VCHIQ_DUMP_MEM_T  args;

+_CF32_FORK(
+ VCHIQ_DUMP_MEM32_T args32;
+ memcpy(&args32, (const void*)arg, sizeof(args32));
+ args.virt_addr =3D (void *)(uintptr_t)args32.virt_addr;
+ args.num_bytes =3D (size_t)args32.num_bytes;
+,
  memcpy(&args, (const void*)arg, sizeof(args));
+)
  printf("IMPLEMENT ME: %s:%d\n", __FILE__, __LINE__);
 #if 0
  dump_phys_mem(args.virt_addr, args.num_bytes);
 #endif
  } break;
+#undef _CF32_CASE

  case VCHIQ_IOC_LIB_VERSION: {
- unsigned int lib_version =3D (unsigned int)arg;
+ size_t lib_version =3D (size_t)arg;

  if (lib_version < VCHIQ_VERSION_MIN)
  ret =3D -EINVAL;
@@ -1119,6 +1332,7 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  ret =3D -ENOTTY;
  break;
  }
+#undef _CF32_FORK

  if (service)
  unlock_service(service);
@@ -1155,18 +1369,14 @@ vchiq_ioctl(struct cdev *cdev, u_long cmd,
caddr_t arg, int fflag,
  return ret;
 }

-static void
-instance_dtr(void *data)
-{

- kfree(data);
-}

 /****************************************************************************
 *
 *   vchiq_open
 *
 ***************************************************************************/
+static void instance_dtr(void *data);

 static int
 vchiq_open(struct cdev *dev, int flags, int fmt __unused, struct thread *td)
@@ -1206,7 +1416,7 @@ vchiq_open(struct cdev *dev, int flags, int fmt
__unused, struct thread *td)
  INIT_LIST_HEAD(&instance->bulk_waiter_list);

  devfs_set_cdevpriv(instance, instance_dtr);
- }
+ }
  else {
  vchiq_log_error(vchiq_arm_log_level,
  "Unknown minor device");
@@ -1222,143 +1432,151 @@ vchiq_open(struct cdev *dev, int flags, int
fmt __unused, struct thread *td)
 *
 ***************************************************************************/

+
 static int
-vchiq_close(struct cdev *dev, int flags __unused, int fmt __unused,
-                struct thread *td)
+_vchiq_close_instance(VCHIQ_INSTANCE_T instance)
 {
  int ret =3D 0;
- if (1) {
- VCHIQ_INSTANCE_T instance;
- VCHIQ_STATE_T *state =3D vchiq_get_state();
- VCHIQ_SERVICE_T *service;
- int i;
-
- if ((ret =3D devfs_get_cdevpriv((void**)&instance))) {
- printf("devfs_get_cdevpriv failed: error %d\n", ret);
- return (ret);
- }
-
- vchiq_log_info(vchiq_arm_log_level,
- "vchiq_release: instance=3D%lx",
- (unsigned long)instance);
-
- if (!state) {
- ret =3D -EPERM;
- goto out;
- }
+ VCHIQ_STATE_T *state =3D vchiq_get_state();
+ VCHIQ_SERVICE_T *service;
+ int i;

- /* Ensure videocore is awake to allow termination. */
- vchiq_use_internal(instance->state, NULL,
- USE_TYPE_VCHIQ);
+ vchiq_log_info(vchiq_arm_log_level,
+ "vchiq_release: instance=3D%lx",
+ (unsigned long)instance);

- lmutex_lock(&instance->completion_mutex);
+ if (!state) {
+ ret =3D -EPERM;
+ goto out;
+ }

- /* Wake the completion thread and ask it to exit */
- instance->closing =3D 1;
- up(&instance->insert_event);
+ /* Ensure videocore is awake to allow termination. */
+ vchiq_use_internal(instance->state, NULL,
+ USE_TYPE_VCHIQ);

- lmutex_unlock(&instance->completion_mutex);
+ lmutex_lock(&instance->completion_mutex);

- /* Wake the slot handler if the completion queue is full. */
- up(&instance->remove_event);
+ /* Wake the completion thread and ask it to exit */
+ instance->closing =3D 1;
+ up(&instance->insert_event);

- /* Mark all services for termination... */
- i =3D 0;
- while ((service =3D next_service_by_instance(state, instance,
- &i)) !=3D NULL) {
- USER_SERVICE_T *user_service =3D service->base.userdata;
+ lmutex_unlock(&instance->completion_mutex);

- /* Wake the slot handler if the msg queue is full. */
- up(&user_service->remove_event);
+ /* Wake the slot handler if the completion queue is full. */
+ up(&instance->remove_event);

- vchiq_terminate_service_internal(service);
- unlock_service(service);
- }
+ /* Mark all services for termination... */
+ i =3D 0;
+ while ((service =3D next_service_by_instance(state, instance,
+ &i)) !=3D NULL) {
+ USER_SERVICE_T *user_service =3D service->base.userdata;

- /* ...and wait for them to die */
- i =3D 0;
- while ((service =3D next_service_by_instance(state, instance, &i))
- !=3D NULL) {
- USER_SERVICE_T *user_service =3D service->base.userdata;
+ /* Wake the slot handler if the msg queue is full. */
+ up(&user_service->remove_event);

- down(&service->remove_event);
+ vchiq_terminate_service_internal(service);
+ unlock_service(service);
+ }

- BUG_ON(service->srvstate !=3D VCHIQ_SRVSTATE_FREE);
+ /* ...and wait for them to die */
+ i =3D 0;
+ while ((service =3D next_service_by_instance(state, instance, &i))
+ !=3D NULL) {
+ USER_SERVICE_T *user_service =3D service->base.userdata;

- spin_lock(&msg_queue_spinlock);
+ down(&service->remove_event);

- while (user_service->msg_remove !=3D
- user_service->msg_insert) {
- VCHIQ_HEADER_T *header =3D user_service->
- msg_queue[user_service->msg_remove &
- (MSG_QUEUE_SIZE - 1)];
- user_service->msg_remove++;
- spin_unlock(&msg_queue_spinlock);
+ BUG_ON(service->srvstate !=3D VCHIQ_SRVSTATE_FREE);

- if (header)
- vchiq_release_message(
- service->handle,
- header);
- spin_lock(&msg_queue_spinlock);
- }
+ spin_lock(&msg_queue_spinlock);

+ while (user_service->msg_remove !=3D
+ user_service->msg_insert) {
+ VCHIQ_HEADER_T *header =3D user_service->
+ msg_queue[user_service->msg_remove &
+ (MSG_QUEUE_SIZE - 1)];
+ user_service->msg_remove++;
  spin_unlock(&msg_queue_spinlock);

- unlock_service(service);
+ if (header)
+ vchiq_release_message(
+ service->handle,
+ header);
+ spin_lock(&msg_queue_spinlock);
  }

- /* Release any closed services */
- while (instance->completion_remove !=3D
- instance->completion_insert) {
- VCHIQ_COMPLETION_DATA_T *completion;
- VCHIQ_SERVICE_T *service1;
- completion =3D &instance->completions[
- instance->completion_remove &
- (MAX_COMPLETIONS - 1)];
- service1 =3D completion->service_userdata;
- if (completion->reason =3D=3D VCHIQ_SERVICE_CLOSED)
- {
- USER_SERVICE_T *user_service =3D
- service->base.userdata;
-
- /* Wake any blocked user-thread */
- if (instance->use_close_delivered)
- up(&user_service->close_event);
- unlock_service(service1);
- }
- instance->completion_remove++;
- }
+ spin_unlock(&msg_queue_spinlock);

- /* Release the PEER service count. */
- vchiq_release_internal(instance->state, NULL);
+ unlock_service(service);
+ }

+ /* Release any closed services */
+ while (instance->completion_remove !=3D
+ instance->completion_insert) {
+ VCHIQ_COMPLETION_DATA_T *completion;
+ VCHIQ_SERVICE_T *service;
+ completion =3D &instance->completions[
+ instance->completion_remove &
+ (MAX_COMPLETIONS - 1)];
+ service =3D completion->service_userdata;
+ if (completion->reason =3D=3D VCHIQ_SERVICE_CLOSED)
  {
- struct list_head *pos, *next;
- list_for_each_safe(pos, next,
- &instance->bulk_waiter_list) {
- struct bulk_waiter_node *waiter;
- waiter =3D list_entry(pos,
- struct bulk_waiter_node,
- list);
- list_del(pos);
- vchiq_log_info(vchiq_arm_log_level,
- "bulk_waiter - cleaned up %x "
- "for pid %d",
- (unsigned int)waiter, waiter->pid);
-                 _sema_destroy(&waiter->bulk_waiter.event);
- kfree(waiter);
- }
- }
+ USER_SERVICE_T *user_service =3D
+ service->base.userdata;

+ /* Wake any blocked user-thread */
+ if (instance->use_close_delivered)
+ up(&user_service->close_event);
+
+ unlock_service(service);
+ }
+ instance->completion_remove++;
  }
- else {
- vchiq_log_error(vchiq_arm_log_level,
- "Unknown minor device");
- ret =3D -ENXIO;
+
+ /* Release the PEER service count. */
+ vchiq_release_internal(instance->state, NULL);
+
+ {
+ struct list_head *pos, *next;
+ list_for_each_safe(pos, next,
+ &instance->bulk_waiter_list) {
+ struct bulk_waiter_node *waiter;
+ waiter =3D list_entry(pos,
+ struct bulk_waiter_node,
+ list);
+ list_del(pos);
+ vchiq_log_info(vchiq_arm_log_level,
+ "bulk_waiter - cleaned up %zx "
+ "for pid %d",
+ (size_t)waiter, waiter->pid);
+ _sema_destroy(&waiter->bulk_waiter.event);
+ kfree(waiter);
+ }
  }

 out:
  return ret;
+
+}
+
+static void
+instance_dtr(void *data)
+{
+ VCHIQ_INSTANCE_T instance =3D  data;
+ _vchiq_close_instance(instance);
+ kfree(data);
+}
+
+static int
+vchiq_close(struct cdev *dev, int flags __unused, int fmt __unused,
+                struct thread *td)
+{
+
+ /* XXXMDC per-open state lives in cdevpriv; instance_dtr does the cleanup */
+ /* XXXMDC d_close only fires once no open fds remain on the vnode */
+
+ return(0);
+
 }

 /****************************************************************************
@@ -1435,9 +1653,9 @@ vchiq_dump_platform_instances(void *dump_context)
  instance =3D service->instance;
  if (instance && !instance->mark) {
  len =3D snprintf(buf, sizeof(buf),
- "Instance %x: pid %d,%s completions "
+ "Instance %zx: pid %d,%s completions "
  "%d/%d",
- (unsigned int)instance, instance->pid,
+ (size_t)instance, instance->pid,
  instance->connected ? " connected, " :
  "",
  instance->completion_insert -
@@ -1465,8 +1683,8 @@ vchiq_dump_platform_service_state(void
*dump_context, VCHIQ_SERVICE_T *service)
  char buf[80];
  int len;

- len =3D snprintf(buf, sizeof(buf), "  instance %x",
- (unsigned int)service->instance);
+ len =3D snprintf(buf, sizeof(buf), "  instance %zx",
+ (size_t)service->instance);

  if ((service->base.callback =3D=3D service_callback) &&
  user_service->is_vchi) {
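
An aside, not part of the patch: the COMPAT_FREEBSD32 branches in the
vchiq_arm.c changes above all follow one pattern. They copy the 32-bit layout
out of the ioctl argument, then widen each field into the native struct,
passing pointer-sized members through `uintptr_t`. Below is a minimal userland
sketch of that round trip. The struct and function names are hypothetical
stand-ins, not the driver's VCHIQ_*_T types, and the field types are assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit userland layout: the pointer travels as a uint32_t. */
struct dequeue_msg32 {
	uint32_t handle;
	int32_t  blocking;
	uint32_t bufsize;
	uint32_t buf;
};

/* Hypothetical native layout, as seen by a 64-bit kernel. */
struct dequeue_msg {
	unsigned int handle;
	int          blocking;
	unsigned int bufsize;
	void        *buf;
};

/* Widen: copy the 32-bit struct out of arg, then convert field by field. */
static void
dequeue_msg_from32(struct dequeue_msg *dst, const void *arg)
{
	struct dequeue_msg32 a32;

	memcpy(&a32, arg, sizeof(a32));
	dst->handle = a32.handle;
	dst->blocking = a32.blocking;
	dst->bufsize = a32.bufsize;
	/* Going through uintptr_t zero-extends the 32-bit address. */
	dst->buf = (void *)(uintptr_t)a32.buf;
}

/* Narrow: build the 32-bit struct and copy it back over arg. */
static void
dequeue_msg_to32(void *arg, const struct dequeue_msg *src)
{
	struct dequeue_msg32 a32;

	a32.handle = src->handle;
	a32.blocking = src->blocking;
	a32.bufsize = src->bufsize;
	/* Safe only because the address came from a 32-bit process. */
	a32.buf = (uint32_t)(uintptr_t)src->buf;
	memcpy(arg, &a32, sizeof(a32));
}
```

Writing the result back through the 32-bit layout matters because the two
structs place `bufsize` and `buf` at different offsets once pointers grow to
eight bytes.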
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.c
b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.c
index 2e30dd7dc3de..80a3a531e8b5 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.c
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.c
@@ -31,6 +31,9 @@
  * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+/* For the PRIu64 format identifier */
+#include <machine/_inttypes.h>
+
 #include "vchiq_core.h"
 #include "vchiq_killable.h"

@@ -392,9 +395,9 @@ make_service_callback(VCHIQ_SERVICE_T *service,
VCHIQ_REASON_T reason,
  VCHIQ_HEADER_T *header, void *bulk_userdata)
 {
  VCHIQ_STATUS_T status;
- vchiq_log_trace(vchiq_core_log_level, "%d: callback:%d (%s, %x, %x)",
+ vchiq_log_trace(vchiq_core_log_level, "%d: callback:%d (%s, %zx, %zx)",
  service->state->id, service->localport, reason_names[reason],
- (unsigned int)header, (unsigned int)bulk_userdata);
+ (size_t)header, (size_t)bulk_userdata);
  status =3D service->base.callback(reason, header, service->handle,
  bulk_userdata);
  if (status =3D=3D VCHIQ_ERROR) {
@@ -417,13 +420,15 @@ vchiq_set_conn_state(VCHIQ_STATE_T *state,
VCHIQ_CONNSTATE_T newstate)
  vchiq_platform_conn_state_changed(state, oldstate, newstate);
 }

+#define ACTUAL_EVENT_SEM_ADDR(ref,offset)\
+ ((struct semaphore *)(((size_t) ref) + ((size_t) offset)))
 static inline void
-remote_event_create(REMOTE_EVENT_T *event)
+remote_event_create(VCHIQ_STATE_T *ref, REMOTE_EVENT_T *event)
 {
  event->armed =3D 0;
  /* Don't clear the 'fired' flag because it may already have been set
  ** by the other side. */
- _sema_init(event->event, 0);
+ _sema_init(ACTUAL_EVENT_SEM_ADDR(ref,event->event), 0);
 }

 __unused static inline void
@@ -433,13 +438,18 @@ remote_event_destroy(REMOTE_EVENT_T *event)
 }

 static inline int
-remote_event_wait(REMOTE_EVENT_T *event)
+remote_event_wait(VCHIQ_STATE_T *ref, REMOTE_EVENT_T *event)
 {
  if (!event->fired) {
  event->armed =3D 1;
+#if defined(__aarch64__)
+ dsb(sy);
+#else
  dsb();
+#endif
+
  if (!event->fired) {
- if (down_interruptible(event->event) !=3D 0) {
+ if (down_interruptible(ACTUAL_EVENT_SEM_ADDR(ref,event->event)) !=3D 0) {
  event->armed =3D 0;
  return 0;
  }
@@ -453,26 +463,32 @@ remote_event_wait(REMOTE_EVENT_T *event)
 }

 static inline void
-remote_event_signal_local(REMOTE_EVENT_T *event)
+remote_event_signal_local(VCHIQ_STATE_T *ref, REMOTE_EVENT_T *event)
 {
+/*
+ * Mirror
+ * https://github.com/raspberrypi/linux/commit/a50c4c9a65779ca835746b5fd79d3d5278afbdbe
+ * for extra safety
+ */
+ event->fired =3D 1;
  event->armed =3D 0;
- up(event->event);
+ up(ACTUAL_EVENT_SEM_ADDR(ref,event->event));
 }

 static inline void
-remote_event_poll(REMOTE_EVENT_T *event)
+remote_event_poll(VCHIQ_STATE_T *ref, REMOTE_EVENT_T *event)
 {
  if (event->fired && event->armed)
- remote_event_signal_local(event);
+ remote_event_signal_local(ref,event);
 }

 void
 remote_event_pollall(VCHIQ_STATE_T *state)
 {
- remote_event_poll(&state->local->sync_trigger);
- remote_event_poll(&state->local->sync_release);
- remote_event_poll(&state->local->trigger);
- remote_event_poll(&state->local->recycle);
+ remote_event_poll(state , &state->local->sync_trigger);
+ remote_event_poll(state , &state->local->sync_release);
+ remote_event_poll(state , &state->local->trigger);
+ remote_event_poll(state , &state->local->recycle);
 }

 /* Round up message sizes so that any space at the end of a slot is always big
@@ -553,7 +569,7 @@ request_poll(VCHIQ_STATE_T *state, VCHIQ_SERVICE_T
*service, int poll_type)
  wmb();

  /* ... and ensure the slot handler runs. */
- remote_event_signal_local(&state->local->trigger);
+ remote_event_signal_local(state, &state->local->trigger);
 }

 /* Called from queue_message, by the slot handler and application threads,
@@ -640,8 +656,8 @@ process_free_queue(VCHIQ_STATE_T *state)

  rmb();

- vchiq_log_trace(vchiq_core_log_level, "%d: pfq %d=3D%x %x %x",
- state->id, slot_index, (unsigned int)data,
+ vchiq_log_trace(vchiq_core_log_level, "%d: pfq %d=%zx %x %x",
+ state->id, slot_index, (size_t)data,
  local->slot_queue_recycle, slot_queue_available);

  /* Initialise the bitmask for services which have used this
@@ -675,13 +691,13 @@ process_free_queue(VCHIQ_STATE_T *state)
  vchiq_log_error(vchiq_core_log_level,
  "service %d "
  "message_use_count=3D%d "
- "(header %x, msgid %x, "
+ "(header %zx, msgid %x, "
  "header->msgid %x, "
  "header->size %x)",
  port,
  service_quota->
  message_use_count,
- (unsigned int)header, msgid,
+ (size_t)header, msgid,
  header->msgid,
  header->size);
  WARN(1, "invalid message use count\n");
@@ -704,24 +720,24 @@ process_free_queue(VCHIQ_STATE_T *state)
  up(&service_quota->quota_event);
  vchiq_log_trace(
  vchiq_core_log_level,
- "%d: pfq:%d %x@%x - "
+ "%d: pfq:%d %x@%zx - "
  "slot_use->%d",
  state->id, port,
  header->size,
- (unsigned int)header,
+ (size_t)header,
  count - 1);
  } else {
  vchiq_log_error(
  vchiq_core_log_level,
  "service %d "
  "slot_use_count"
- "=3D%d (header %x"
+ "=3D%d (header %tx"
  ", msgid %x, "
  "header->msgid"
  " %x, header->"
  "size %x)",
  port, count,
- (unsigned int)header,
+ (size_t)header,
  msgid,
  header->msgid,
  header->size);
@@ -735,9 +751,9 @@ process_free_queue(VCHIQ_STATE_T *state)
  pos +=3D calc_stride(header->size);
  if (pos > VCHIQ_SLOT_SIZE) {
  vchiq_log_error(vchiq_core_log_level,
- "pfq - pos %x: header %x, msgid %x, "
+ "pfq - pos %x: header %zx, msgid %x, "
  "header->msgid %x, header->size %x",
- pos, (unsigned int)header, msgid,
+ pos, (size_t)header, msgid,
  header->msgid, header->size);
  WARN(1, "invalid slot position\n");
  }
@@ -885,17 +901,16 @@ queue_message(VCHIQ_STATE_T *state,
VCHIQ_SERVICE_T *service,
  int slot_use_count;

  vchiq_log_info(vchiq_core_log_level,
- "%d: qm %s@%x,%x (%d->%d)",
+ "%d: qm %s@%zx,%x (%d->%d)",
  state->id,
  msg_type_str(VCHIQ_MSG_TYPE(msgid)),
- (unsigned int)header, size,
+ (size_t)header, size,
  VCHIQ_MSG_SRCPORT(msgid),
  VCHIQ_MSG_DSTPORT(msgid));

  BUG_ON(!service);
  BUG_ON((flags & (QMFLAGS_NO_MUTEX_LOCK |
   QMFLAGS_NO_MUTEX_UNLOCK)) !=3D 0);
-
  for (i =3D 0, pos =3D 0; i < (unsigned int)count;
  pos +=3D elements[i++].size)
  if (elements[i].size) {
@@ -951,9 +966,9 @@ queue_message(VCHIQ_STATE_T *state,
VCHIQ_SERVICE_T *service,
  VCHIQ_SERVICE_STATS_ADD(service, ctrl_tx_bytes, size);
  } else {
  vchiq_log_info(vchiq_core_log_level,
- "%d: qm %s@%x,%x (%d->%d)", state->id,
+ "%d: qm %s@%zx,%x (%d->%d)", state->id,
  msg_type_str(VCHIQ_MSG_TYPE(msgid)),
- (unsigned int)header, size,
+ (size_t)header, size,
  VCHIQ_MSG_SRCPORT(msgid),
  VCHIQ_MSG_DSTPORT(msgid));
  if (size !=3D 0) {
@@ -1017,7 +1032,7 @@ queue_message_sync(VCHIQ_STATE_T *state,
VCHIQ_SERVICE_T *service,
  (lmutex_lock_interruptible(&state->sync_mutex) !=3D 0))
  return VCHIQ_RETRY;

- remote_event_wait(&local->sync_release);
+ remote_event_wait(state, &local->sync_release);

  rmb();

@@ -1036,9 +1051,9 @@ queue_message_sync(VCHIQ_STATE_T *state,
VCHIQ_SERVICE_T *service,
  int i, pos;

  vchiq_log_info(vchiq_sync_log_level,
- "%d: qms %s@%x,%x (%d->%d)", state->id,
+ "%d: qms %s@%zx,%x (%d->%d)", state->id,
  msg_type_str(VCHIQ_MSG_TYPE(msgid)),
- (unsigned int)header, size,
+ (size_t)header, size,
  VCHIQ_MSG_SRCPORT(msgid),
  VCHIQ_MSG_DSTPORT(msgid));

@@ -1065,9 +1080,9 @@ queue_message_sync(VCHIQ_STATE_T *state,
VCHIQ_SERVICE_T *service,
  VCHIQ_SERVICE_STATS_ADD(service, ctrl_tx_bytes, size);
  } else {
  vchiq_log_info(vchiq_sync_log_level,
- "%d: qms %s@%x,%x (%d->%d)", state->id,
+ "%d: qms %s@%zx,%x (%d->%d)", state->id,
  msg_type_str(VCHIQ_MSG_TYPE(msgid)),
- (unsigned int)header, size,
+ (size_t)header, size,
  VCHIQ_MSG_SRCPORT(msgid),
  VCHIQ_MSG_DSTPORT(msgid));
  if (size != 0) {
@@ -1098,9 +1113,6 @@ queue_message_sync(VCHIQ_STATE_T *state, VCHIQ_SERVICE_T *service,
  size);
  }

- /* Make sure the new header is visible to the peer. */
- wmb();
-
  remote_event_signal(&state->remote->sync_trigger);

  if (VCHIQ_MSG_TYPE(msgid) != VCHIQ_MSG_PAUSE)
@@ -1368,26 +1380,26 @@ resolve_bulks(VCHIQ_SERVICE_T *service, VCHIQ_BULK_QUEUE_T *queue)
  "Send Bulk to" : "Recv Bulk from";
  if (bulk->actual != VCHIQ_BULK_ACTUAL_ABORTED)
  vchiq_log_info(SRVTRACE_LEVEL(service),
- "%s %c%c%c%c d:%d len:%d %x<->%x",
+ "%s %c%c%c%c d:%d len:%d %tx<->%tx",
  header,
  VCHIQ_FOURCC_AS_4CHARS(
  service->base.fourcc),
  service->remoteport,
  bulk->size,
- (unsigned int)bulk->data,
- (unsigned int)bulk->remote_data);
+ (size_t)bulk->data,
+ (size_t)bulk->remote_data);
  else
  vchiq_log_info(SRVTRACE_LEVEL(service),
  "%s %c%c%c%c d:%d ABORTED - tx len:%d,"
- " rx len:%d %x<->%x",
+ " rx len:%d %tx<->%tx",
  header,
  VCHIQ_FOURCC_AS_4CHARS(
  service->base.fourcc),
  service->remoteport,
  bulk->size,
  bulk->remote_size,
- (unsigned int)bulk->data,
- (unsigned int)bulk->remote_data);
+ (size_t)bulk->data,
+ (size_t)bulk->remote_data);
  }

  vchiq_complete_bulk(bulk);
@@ -1522,8 +1534,8 @@ parse_open(VCHIQ_STATE_T *state, VCHIQ_HEADER_T *header)

  fourcc = payload->fourcc;
  vchiq_log_info(vchiq_core_log_level,
- "%d: prs OPEN@%x (%d->'%c%c%c%c')",
- state->id, (unsigned int)header,
+ "%d: prs OPEN@%tx (%d->'%c%c%c%c')",
+ state->id, (size_t)header,
  localport,
  VCHIQ_FOURCC_AS_4CHARS(fourcc));

@@ -1661,7 +1673,7 @@ parse_rx_slots(VCHIQ_STATE_T *state)

  header = (VCHIQ_HEADER_T *)(state->rx_data +
  (state->rx_pos & VCHIQ_SLOT_MASK));
- DEBUG_VALUE(PARSE_HEADER, (int)header);
+ DEBUG_VALUE(PARSE_HEADER, (size_t)header);
  msgid = header->msgid;
  DEBUG_VALUE(PARSE_MSGID, msgid);
  size = header->size;
@@ -1695,20 +1707,20 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  remoteport);
  if (service)
  vchiq_log_warning(vchiq_core_log_level,
- "%d: prs %s@%x (%d->%d) - "
+ "%d: prs %s@%tx (%d->%d) - "
  "found connected service %d",
  state->id, msg_type_str(type),
- (unsigned int)header,
+ (size_t)header,
  remoteport, localport,
  service->localport);
  }

  if (!service) {
  vchiq_log_error(vchiq_core_log_level,
- "%d: prs %s@%x (%d->%d) - "
+ "%d: prs %s@%zx (%d->%d) - "
  "invalid/closed service %d",
  state->id, msg_type_str(type),
- (unsigned int)header,
+ (size_t)header,
  remoteport, localport, localport);
  goto skip_message;
  }
@@ -1734,12 +1746,12 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  min(16, size));
  }

- if (((unsigned int)header & VCHIQ_SLOT_MASK) + calc_stride(size)
+ if (((size_t)header & VCHIQ_SLOT_MASK) + calc_stride(size)
  > VCHIQ_SLOT_SIZE) {
  vchiq_log_error(vchiq_core_log_level,
- "header %x (msgid %x) - size %x too big for "
+ "header %tx (msgid %x) - size %x too big for "
  "slot",
- (unsigned int)header, (unsigned int)msgid,
+ (size_t)header, (unsigned int)msgid,
  (unsigned int)size);
  WARN(1, "oversized for slot\n");
  }
@@ -1758,8 +1770,8 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  service->peer_version = payload->version;
  }
  vchiq_log_info(vchiq_core_log_level,
- "%d: prs OPENACK@%x,%x (%d->%d) v:%d",
- state->id, (unsigned int)header, size,
+ "%d: prs OPENACK@%tx,%x (%d->%d) v:%d",
+ state->id, (size_t)header, size,
  remoteport, localport, service->peer_version);
  if (service->srvstate ==
  VCHIQ_SRVSTATE_OPENING) {
@@ -1776,8 +1788,8 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  WARN_ON(size != 0); /* There should be no data */

  vchiq_log_info(vchiq_core_log_level,
- "%d: prs CLOSE@%x (%d->%d)",
- state->id, (unsigned int)header,
+ "%d: prs CLOSE@%tx (%d->%d)",
+ state->id, (size_t)header,
  remoteport, localport);

  mark_service_closing_internal(service, 1);
@@ -1794,8 +1806,8 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  break;
  case VCHIQ_MSG_DATA:
  vchiq_log_info(vchiq_core_log_level,
- "%d: prs DATA@%x,%x (%d->%d)",
- state->id, (unsigned int)header, size,
+ "%d: prs DATA@%tx,%x (%d->%d)",
+ state->id, (size_t)header, size,
  remoteport, localport);

  if ((service->remoteport == remoteport)
@@ -1819,14 +1831,23 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  break;
  case VCHIQ_MSG_CONNECT:
  vchiq_log_info(vchiq_core_log_level,
- "%d: prs CONNECT@%x",
- state->id, (unsigned int)header);
+ "%d: prs CONNECT@%tx",
+ state->id, (size_t)header);
  state->version_common = ((VCHIQ_SLOT_ZERO_T *)
   state->slot_data)->version;
  up(&state->connect);
  break;
+/*
+ * XXXMDC Apparently nothing uses this
+ * https://github.com/raspberrypi/linux/commit/14f4d72fb799a9b3170a45ab80d4a3ddad541960
+ * but taking out the master bits is a whole new job
+ */
  case VCHIQ_MSG_BULK_RX:
- case VCHIQ_MSG_BULK_TX: {
+ case VCHIQ_MSG_BULK_TX:
+ WARN_ON(1);
+ break;
+#if 0
+ {
  VCHIQ_BULK_QUEUE_T *queue;
  WARN_ON(!state->is_master);
  queue = (type == VCHIQ_MSG_BULK_RX) ?
@@ -1854,12 +1875,12 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  wmb();

  vchiq_log_info(vchiq_core_log_level,
- "%d: prs %s@%x (%d->%d) %x@%x",
+ "%d: prs %s@%tx (%d->%d) %x@%tx",
  state->id, msg_type_str(type),
- (unsigned int)header,
+ (size_t)header,
  remoteport, localport,
  bulk->remote_size,
- (unsigned int)bulk->remote_data);
+ (size_t)bulk->remote_data);

  queue->remote_insert++;

@@ -1888,9 +1909,11 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  lmutex_unlock(&service->bulk_mutex);
  if (resolved)
  notify_bulks(service, queue,
- 1/*retry_poll*/);
+ 1//retry_poll
+ );
  }
- } break;
+ }
+#endif
  case VCHIQ_MSG_BULK_RX_DONE:
  case VCHIQ_MSG_BULK_TX_DONE:
  WARN_ON(state->is_master);
@@ -1912,10 +1935,10 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  if ((int)(queue->remote_insert -
  queue->local_insert) >= 0) {
  vchiq_log_error(vchiq_core_log_level,
- "%d: prs %s@%x (%d->%d) "
+ "%d: prs %s@%tx (%d->%d) "
  "unexpected (ri=%d,li=%d)",
  state->id, msg_type_str(type),
- (unsigned int)header,
+ (size_t)header,
  remoteport, localport,
  queue->remote_insert,
  queue->local_insert);
@@ -1932,11 +1955,11 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  queue->remote_insert++;

  vchiq_log_info(vchiq_core_log_level,
- "%d: prs %s@%x (%d->%d) %x@%x",
+ "%d: prs %s@%tx (%d->%d) %x@%tx",
  state->id, msg_type_str(type),
- (unsigned int)header,
+ (size_t)header,
  remoteport, localport,
- bulk->actual, (unsigned int)bulk->data);
+ bulk->actual, (size_t)bulk->data);

  vchiq_log_trace(vchiq_core_log_level,
  "%d: prs:%d %cx li=%x ri=%x p=%x",
@@ -1958,14 +1981,14 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  break;
  case VCHIQ_MSG_PADDING:
  vchiq_log_trace(vchiq_core_log_level,
- "%d: prs PADDING@%x,%x",
- state->id, (unsigned int)header, size);
+ "%d: prs PADDING@%tx,%x",
+ state->id, (size_t)header, size);
  break;
  case VCHIQ_MSG_PAUSE:
  /* If initiated, signal the application thread */
  vchiq_log_trace(vchiq_core_log_level,
- "%d: prs PAUSE@%x,%x",
- state->id, (unsigned int)header, size);
+ "%d: prs PAUSE@%tx,%x",
+ state->id, (size_t)header, size);
  if (state->conn_state == VCHIQ_CONNSTATE_PAUSED) {
  vchiq_log_error(vchiq_core_log_level,
  "%d: PAUSE received in state PAUSED",
@@ -1988,8 +2011,8 @@ parse_rx_slots(VCHIQ_STATE_T *state)
  break;
  case VCHIQ_MSG_RESUME:
  vchiq_log_trace(vchiq_core_log_level,
- "%d: prs RESUME@%x,%x",
- state->id, (unsigned int)header, size);
+ "%d: prs RESUME@%tx,%x",
+ state->id, (size_t)header, size);
  /* Release the slot mutex */
  lmutex_unlock(&state->slot_mutex);
  if (state->is_master)
@@ -2010,8 +2033,8 @@ parse_rx_slots(VCHIQ_STATE_T *state)

  default:
  vchiq_log_error(vchiq_core_log_level,
- "%d: prs invalid msgid %x@%x,%x",
- state->id, msgid, (unsigned int)header, size);
+ "%d: prs invalid msgid %x@%tx,%x",
+ state->id, msgid, (size_t)header, size);
  WARN(1, "invalid message\n");
  break;
  }
@@ -2051,7 +2074,7 @@ slot_handler_func(void *v)
  while (1) {
  DEBUG_COUNT(SLOT_HANDLER_COUNT);
  DEBUG_TRACE(SLOT_HANDLER_LINE);
- remote_event_wait(&local->trigger);
+ remote_event_wait(state, &local->trigger);

  rmb();

@@ -2141,8 +2164,7 @@ recycle_func(void *v)
  VCHIQ_SHARED_STATE_T *local = state->local;

  while (1) {
- remote_event_wait(&local->recycle);
-
+ remote_event_wait(state, &local->recycle);
  process_free_queue(state);
  }
  return 0;
@@ -2165,7 +2187,7 @@ sync_func(void *v)
  int type;
  unsigned int localport, remoteport;

- remote_event_wait(&local->sync_trigger);
+ remote_event_wait(state, &local->sync_trigger);

  rmb();

@@ -2179,10 +2201,10 @@ sync_func(void *v)

  if (!service) {
  vchiq_log_error(vchiq_sync_log_level,
- "%d: sf %s@%x (%d->%d) - "
+ "%d: sf %s@%tx (%d->%d) - "
  "invalid/closed service %d",
  state->id, msg_type_str(type),
- (unsigned int)header,
+ (size_t)header,
  remoteport, localport, localport);
  release_message_sync(state, header);
  continue;
@@ -2213,8 +2235,8 @@ sync_func(void *v)
  service->peer_version = payload->version;
  }
  vchiq_log_info(vchiq_sync_log_level,
- "%d: sf OPENACK@%x,%x (%d->%d) v:%d",
- state->id, (unsigned int)header, size,
+ "%d: sf OPENACK@%tx,%x (%d->%d) v:%d",
+ state->id, (size_t)header, size,
  remoteport, localport, service->peer_version);
  if (service->srvstate == VCHIQ_SRVSTATE_OPENING) {
  service->remoteport = remoteport;
@@ -2228,8 +2250,8 @@ sync_func(void *v)

  case VCHIQ_MSG_DATA:
  vchiq_log_trace(vchiq_sync_log_level,
- "%d: sf DATA@%x,%x (%d->%d)",
- state->id, (unsigned int)header, size,
+ "%d: sf DATA@%tx,%x (%d->%d)",
+ state->id, (size_t)header, size,
  remoteport, localport);

  if ((service->remoteport == remoteport) &&
@@ -2248,8 +2270,8 @@ sync_func(void *v)

  default:
  vchiq_log_error(vchiq_sync_log_level,
- "%d: sf unexpected msgid %x@%x,%x",
- state->id, msgid, (unsigned int)header, size);
+ "%d: sf unexpected msgid %x@%tx,%x",
+ state->id, msgid, (size_t)header, size);
  release_message_sync(state, header);
  break;
  }
@@ -2282,7 +2304,7 @@ get_conn_state_name(VCHIQ_CONNSTATE_T conn_state)
 VCHIQ_SLOT_ZERO_T *
 vchiq_init_slots(void *mem_base, int mem_size)
 {
- int mem_align = (VCHIQ_SLOT_SIZE - (int)mem_base) & VCHIQ_SLOT_MASK;
+ int mem_align = (int)((VCHIQ_SLOT_SIZE - (long)mem_base) & VCHIQ_SLOT_MASK);
  VCHIQ_SLOT_ZERO_T *slot_zero =
  (VCHIQ_SLOT_ZERO_T *)((char *)mem_base + mem_align);
  int num_slots = (mem_size - mem_align)/VCHIQ_SLOT_SIZE;
@@ -2334,8 +2356,8 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero,
  if (slot_zero->magic != VCHIQ_MAGIC) {
  vchiq_loud_error_header();
  vchiq_loud_error("Invalid VCHIQ magic value found.");
- vchiq_loud_error("slot_zero=%x: magic=%x (expected %x)",
- (unsigned int)slot_zero, slot_zero->magic, VCHIQ_MAGIC);
+ vchiq_loud_error("slot_zero=%tx: magic=%x (expected %x)",
+ (size_t)slot_zero, slot_zero->magic, VCHIQ_MAGIC);
  vchiq_loud_error_footer();
  return VCHIQ_ERROR;
  }
@@ -2348,9 +2370,9 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero,
  if (slot_zero->version < VCHIQ_VERSION_MIN) {
  vchiq_loud_error_header();
  vchiq_loud_error("Incompatible VCHIQ versions found.");
- vchiq_loud_error("slot_zero=%x: VideoCore version=%d "
+ vchiq_loud_error("slot_zero=%tx: VideoCore version=%d "
  "(minimum %d)",
- (unsigned int)slot_zero, slot_zero->version,
+ (size_t)slot_zero, slot_zero->version,
  VCHIQ_VERSION_MIN);
  vchiq_loud_error("Restart with a newer VideoCore image.");
  vchiq_loud_error_footer();
@@ -2360,9 +2382,9 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero,
  if (VCHIQ_VERSION < slot_zero->version_min) {
  vchiq_loud_error_header();
  vchiq_loud_error("Incompatible VCHIQ versions found.");
- vchiq_loud_error("slot_zero=%x: version=%d (VideoCore "
+ vchiq_loud_error("slot_zero=%tx: version=%d (VideoCore "
  "minimum %d)",
- (unsigned int)slot_zero, VCHIQ_VERSION,
+ (size_t)slot_zero, VCHIQ_VERSION,
  slot_zero->version_min);
  vchiq_loud_error("Restart with a newer kernel.");
  vchiq_loud_error_footer();
@@ -2375,25 +2397,25 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero,
   (slot_zero->max_slots_per_side != VCHIQ_MAX_SLOTS_PER_SIDE)) {
  vchiq_loud_error_header();
  if (slot_zero->slot_zero_size != sizeof(VCHIQ_SLOT_ZERO_T))
- vchiq_loud_error("slot_zero=%x: slot_zero_size=%x "
+ vchiq_loud_error("slot_zero=%tx: slot_zero_size=%x "
  "(expected %zx)",
- (unsigned int)slot_zero,
+ (size_t)slot_zero,
  slot_zero->slot_zero_size,
  sizeof(VCHIQ_SLOT_ZERO_T));
  if (slot_zero->slot_size != VCHIQ_SLOT_SIZE)
- vchiq_loud_error("slot_zero=%x: slot_size=%d "
+ vchiq_loud_error("slot_zero=%tx: slot_size=%d "
  "(expected %d",
- (unsigned int)slot_zero, slot_zero->slot_size,
+ (size_t)slot_zero, slot_zero->slot_size,
  VCHIQ_SLOT_SIZE);
  if (slot_zero->max_slots != VCHIQ_MAX_SLOTS)
- vchiq_loud_error("slot_zero=%x: max_slots=%d "
+ vchiq_loud_error("slot_zero=%tx: max_slots=%d "
  "(expected %d)",
- (unsigned int)slot_zero, slot_zero->max_slots,
+ (size_t)slot_zero, slot_zero->max_slots,
  VCHIQ_MAX_SLOTS);
  if (slot_zero->max_slots_per_side != VCHIQ_MAX_SLOTS_PER_SIDE)
- vchiq_loud_error("slot_zero=%x: max_slots_per_side=%d "
+ vchiq_loud_error("slot_zero=%tx: max_slots_per_side=%d "
  "(expected %d)",
- (unsigned int)slot_zero,
+ (size_t)slot_zero,
  slot_zero->max_slots_per_side,
  VCHIQ_MAX_SLOTS_PER_SIDE);
  vchiq_loud_error_footer();
@@ -2478,24 +2500,24 @@ vchiq_init_state(VCHIQ_STATE_T *state, VCHIQ_SLOT_ZERO_T *slot_zero,
  state->data_use_count = 0;
  state->data_quota = state->slot_queue_available - 1;

- local->trigger.event = &state->trigger_event;
- remote_event_create(&local->trigger);
+ local->trigger.event = offsetof(VCHIQ_STATE_T, trigger_event);
+ remote_event_create(state, &local->trigger);
  local->tx_pos = 0;

- local->recycle.event = &state->recycle_event;
- remote_event_create(&local->recycle);
+ local->recycle.event = offsetof(VCHIQ_STATE_T, recycle_event);
+ remote_event_create(state, &local->recycle);
  local->slot_queue_recycle = state->slot_queue_available;

- local->sync_trigger.event = &state->sync_trigger_event;
- remote_event_create(&local->sync_trigger);
+ local->sync_trigger.event = offsetof(VCHIQ_STATE_T, sync_trigger_event);
+ remote_event_create(state, &local->sync_trigger);

- local->sync_release.event = &state->sync_release_event;
- remote_event_create(&local->sync_release);
+ local->sync_release.event = offsetof(VCHIQ_STATE_T, sync_release_event);
+ remote_event_create(state, &local->sync_release);

  /* At start-of-day, the slot is empty and available */
  ((VCHIQ_HEADER_T *)SLOT_DATA_FROM_INDEX(state, local->slot_sync))->msgid
  = VCHIQ_MSGID_PADDING;
- remote_event_signal_local(&local->sync_release);
+ remote_event_signal_local(state, &local->sync_release);

  local->debug[DEBUG_ENTRIES] =3D DEBUG_MAX;

@@ -2775,18 +2797,18 @@ release_service_messages(VCHIQ_SERVICE_T *service)
  if ((port == service->localport) &&
  (msgid & VCHIQ_MSGID_CLAIMED)) {
  vchiq_log_info(vchiq_core_log_level,
- "  fsi - hdr %x",
- (unsigned int)header);
+ "  fsi - hdr %tx",
+ (size_t)header);
  release_slot(state, slot_info, header,
  NULL);
  }
  pos += calc_stride(header->size);
  if (pos > VCHIQ_SLOT_SIZE) {
  vchiq_log_error(vchiq_core_log_level,
- "fsi - pos %x: header %x, "
+ "fsi - pos %x: header %tx, "
  "msgid %x, header->msgid %x, "
  "header->size %x",
- pos, (unsigned int)header,
+ pos, (size_t)header,
  msgid, header->msgid,
  header->size);
  WARN(1, "invalid slot position\n");
@@ -3360,10 +3382,10 @@ vchiq_bulk_transfer(VCHIQ_SERVICE_HANDLE_T handle,
  wmb();

  vchiq_log_info(vchiq_core_log_level,
- "%d: bt (%d->%d) %cx %x@%x %x",
+ "%d: bt (%d->%d) %cx %x@%tx %tx",
  state->id,
  service->localport, service->remoteport, dir_char,
- size, (unsigned int)bulk->data, (unsigned int)userdata);
+ size, (size_t)bulk->data, (size_t)userdata);

  /* The slot mutex must be held when the service is being closed, so
     claim it here to ensure that isn't happening */
@@ -3382,7 +3404,7 @@ vchiq_bulk_transfer(VCHIQ_SERVICE_HANDLE_T handle,
  (dir == VCHIQ_BULK_TRANSMIT) ?
  VCHIQ_POLL_TXNOTIFY : VCHIQ_POLL_RXNOTIFY);
  } else {
- int payload[2] = { (int)bulk->data, bulk->size };
+ uint32_t payload[2] = { (uint32_t)(uintptr_t)bulk->data, bulk->size };
  VCHIQ_ELEMENT_T element = { payload, sizeof(payload) };

  status = queue_message(state, NULL,
@@ -3526,7 +3548,6 @@ static void
 release_message_sync(VCHIQ_STATE_T *state, VCHIQ_HEADER_T *header)
 {
  header->msgid = VCHIQ_MSGID_PADDING;
- wmb();
  remote_event_signal(&state->remote->sync_release);
 }

@@ -3710,12 +3731,12 @@ vchiq_dump_state(void *dump_context, VCHIQ_STATE_T *state)
  vchiq_dump(dump_context, buf, len + 1);

  len = snprintf(buf, sizeof(buf),
- "  tx_pos=%x(@%x), rx_pos=%x(@%x)",
+ "  tx_pos=%x(@%tx), rx_pos=%x(@%tx)",
  state->local->tx_pos,
- (uint32_t)state->tx_data +
+ (size_t)state->tx_data +
  (state->local_tx_pos & VCHIQ_SLOT_MASK),
  state->rx_pos,
- (uint32_t)state->rx_data +
+ (size_t)state->rx_data +
  (state->rx_pos & VCHIQ_SLOT_MASK));
  vchiq_dump(dump_context, buf, len + 1);

@@ -3817,8 +3838,8 @@ vchiq_dump_service_state(void *dump_context, VCHIQ_SERVICE_T *service)
  vchiq_dump(dump_context, buf, len + 1);

  len = snprintf(buf, sizeof(buf),
- "  Ctrl: tx_count=%d, tx_bytes=%llu, "
- "rx_count=%d, rx_bytes=%llu",
+ "  Ctrl: tx_count=%d, tx_bytes=%"PRIu64", "
+ "rx_count=%d, rx_bytes=%"PRIu64"",
  service->stats.ctrl_tx_count,
  service->stats.ctrl_tx_bytes,
  service->stats.ctrl_rx_count,
@@ -3826,8 +3847,8 @@ vchiq_dump_service_state(void *dump_context, VCHIQ_SERVICE_T *service)
  vchiq_dump(dump_context, buf, len + 1);

  len = snprintf(buf, sizeof(buf),
- "  Bulk: tx_count=%d, tx_bytes=%llu, "
- "rx_count=%d, rx_bytes=%llu",
+ "  Bulk: tx_count=%d, tx_bytes=%"PRIu64", "
+ "rx_count=%d, rx_bytes=%"PRIu64"",
  service->stats.bulk_tx_count,
  service->stats.bulk_tx_bytes,
  service->stats.bulk_rx_count,
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.h b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.h
index 38ede407f4f4..4e3f41203bc4 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.h
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_core.h
@@ -184,12 +184,21 @@ enum {
 #if VCHIQ_ENABLE_DEBUG

 #define DEBUG_INITIALISE(local) int *debug_ptr = (local)->debug;
+#if defined(__aarch64__)
+#define DEBUG_TRACE(d) \
+ do { debug_ptr[DEBUG_ ## d] = __LINE__; dsb(sy); } while (0)
+#define DEBUG_VALUE(d, v) \
+ do { debug_ptr[DEBUG_ ## d] = (v); dsb(sy); } while (0)
+#define DEBUG_COUNT(d) \
+ do { debug_ptr[DEBUG_ ## d]++; dsb(sy); } while (0)
+#else
 #define DEBUG_TRACE(d) \
  do { debug_ptr[DEBUG_ ## d] = __LINE__; dsb(); } while (0)
 #define DEBUG_VALUE(d, v) \
  do { debug_ptr[DEBUG_ ## d] = (v); dsb(); } while (0)
 #define DEBUG_COUNT(d) \
  do { debug_ptr[DEBUG_ ## d]++; dsb(); } while (0)
+#endif

 #else /* VCHIQ_ENABLE_DEBUG */

@@ -265,7 +274,7 @@ typedef struct vchiq_bulk_queue_struct {
 typedef struct remote_event_struct {
  int armed;
  int fired;
- struct semaphore *event;
+ uint32_t event;
 } REMOTE_EVENT_T;

 typedef struct opaque_platform_state_t *VCHIQ_PLATFORM_STATE_T;
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_ioctl.h b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_ioctl.h
index 617479eff136..90348ca4b0d0 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_ioctl.h
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_ioctl.h
@@ -127,4 +127,125 @@ typedef struct {
 #define VCHIQ_IOC_CLOSE_DELIVERED      _IO(VCHIQ_IOC_MAGIC,   17)
 #define VCHIQ_IOC_MAX                  17

+
+/*
+ * COMPAT_FREEBSD32
+ */
+
+typedef struct {
+ unsigned int config_size;
+ /*VCHIQ_CONFIG_T * */ uint32_t pconfig;
+} VCHIQ_GET_CONFIG32_T;
+
+typedef struct {
+ unsigned int handle;
+ /*void * */ uint32_t data;
+ unsigned int size;
+ /*void * */ uint32_t userdata;
+ VCHIQ_BULK_MODE_T mode;
+} VCHIQ_QUEUE_BULK_TRANSFER32_T;
+
+typedef struct {
+ unsigned int handle;
+ unsigned int count;
+ const /*VCHIQ_ELEMENT_T * */ uint32_t elements;
+} VCHIQ_QUEUE_MESSAGE32_T;
+
+typedef struct {
+ unsigned int handle;
+ int blocking;
+ unsigned int bufsize;
+ /*void * */ uint32_t buf;
+} VCHIQ_DEQUEUE_MESSAGE32_T;
+
+typedef struct {
+ /*void * */ uint32_t virt_addr;
+ /*size_t*/  uint32_t num_bytes;
+} VCHIQ_DUMP_MEM32_T;
+
+typedef struct {
+ VCHIQ_REASON_T reason;
+ /* VCHIQ_HEADER_T * */ uint32_t header;
+ /* void * */ uint32_t service_userdata;
+ /* void * */ uint32_t bulk_userdata;
+} VCHIQ_COMPLETION_DATA32_T;
+
+typedef struct {
+ unsigned int count;
+ /* VCHIQ_COMPLETION_DATA32_T * */ uint32_t buf;
+ unsigned int msgbufsize;
+ unsigned int msgbufcount; /* IN/OUT */
+ /* void ** */ uint32_t msgbufs;
+} VCHIQ_AWAIT_COMPLETION32_T;
+
+typedef struct vchiq_service_params32_struct {
+ int fourcc;
+ /* VCHIQ_CALLBACK_T */ uint32_t  callback;
+ /*void * */ uint32_t userdata;
+ short version;       /* Increment for non-trivial changes */
+ short version_min;   /* Update for incompatible changes */
+} VCHIQ_SERVICE_PARAMS32_T;
+
+typedef struct {
+ VCHIQ_SERVICE_PARAMS32_T params;
+ int is_open;
+ int is_vchi;
+ unsigned int handle;       /* OUT */
+} VCHIQ_CREATE_SERVICE32_T;
+
+typedef struct {
+ const /*void */ uint32_t data;
+ unsigned int size;
+} VCHIQ_ELEMENT32_T;
+
+
+#define VCHIQ_IOC_GET_CONFIG32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_GET_CONFIG, \
+ VCHIQ_GET_CONFIG32_T \
+ )
+
+#define VCHIQ_IOC_QUEUE_BULK_TRANSMIT32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_QUEUE_BULK_TRANSMIT, \
+ VCHIQ_QUEUE_BULK_TRANSFER32_T \
+ )
+
+#define VCHIQ_IOC_QUEUE_BULK_RECEIVE32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_QUEUE_BULK_RECEIVE, \
+ VCHIQ_QUEUE_BULK_TRANSFER32_T \
+ )
+
+#define VCHIQ_IOC_QUEUE_MESSAGE32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_QUEUE_MESSAGE, \
+ VCHIQ_QUEUE_MESSAGE32_T \
+ )
+
+#define VCHIQ_IOC_DEQUEUE_MESSAGE32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_DEQUEUE_MESSAGE, \
+ VCHIQ_DEQUEUE_MESSAGE32_T \
+ )
+
+#define VCHIQ_IOC_DUMP_PHYS_MEM32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_DUMP_PHYS_MEM, \
+ VCHIQ_DUMP_MEM32_T \
+ )
+
+#define VCHIQ_IOC_AWAIT_COMPLETION32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_AWAIT_COMPLETION, \
+ VCHIQ_AWAIT_COMPLETION32_T \
+ )
+
+#define VCHIQ_IOC_CREATE_SERVICE32 \
+ _IOC_NEWTYPE( \
+ VCHIQ_IOC_CREATE_SERVICE, \
+ VCHIQ_CREATE_SERVICE32_T \
+ )
+
+
 #endif
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kern_lib.c b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kern_lib.c
index 1f849a09d854..22b988dcf436 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kern_lib.c
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kern_lib.c
@@ -151,9 +151,9 @@ VCHIQ_STATUS_T vchiq_shutdown(VCHIQ_INSTANCE_T instance)
  list);
  list_del(pos);
  vchiq_log_info(vchiq_arm_log_level,
- "bulk_waiter - cleaned up %x "
+ "bulk_waiter - cleaned up %tx "
  "for pid %d",
- (unsigned int)waiter, waiter->pid);
+ (size_t)waiter, waiter->pid);
  _sema_destroy(&waiter->bulk_waiter.event);

  kfree(waiter);
@@ -454,8 +454,8 @@ vchiq_blocking_bulk_transfer(VCHIQ_SERVICE_HANDLE_T handle, void *data,
  list_add(&waiter->list, &instance->bulk_waiter_list);
  lmutex_unlock(&instance->bulk_waiter_list_mutex);
  vchiq_log_info(vchiq_arm_log_level,
- "saved bulk_waiter %x for pid %d",
- (unsigned int)waiter, current->p_pid);
+ "saved bulk_waiter %tx for pid %d",
+ (size_t)waiter, current->p_pid);
  }

  return status;
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c
index 5b47377735f1..5c7cf9035413 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c
@@ -47,7 +47,11 @@ __FBSDID("$FreeBSD$");
 #include <dev/ofw/ofw_bus_subr.h>

 #include <machine/bus.h>
+/* XXXMDC Is this necessary at all? */
+#if defined(__aarch64__)
+#else
 #include <machine/fdt.h>
+#endif

 #include "vchiq_arm.h"
 #include "vchiq_2835.h"
@@ -78,13 +82,31 @@ struct bcm_vchiq_softc {

 static struct bcm_vchiq_softc *bcm_vchiq_sc = NULL;

-#define BSD_DTB 1
-#define UPSTREAM_DTB 2
+
+#define CONFIG_INVALID 0
+#define CONFIG_VALID 1 << 0
+#define BSD_REG_ADDRS 1 << 1
+#define LONG_BULK_SPACE 1 << 2
+
+/*
+ * Also controls the use of the standard VC address offset for bulk data DMA
+ * (normal bulks use that offset; bulks for long address spaces use physical
+ * page addresses)
+ */
+extern unsigned int g_long_bulk_space;
+
+
+/*
+ * XXXMDC
+ * The man page for ofw_bus_is_compatible describes ``features''
+ * as ``can be used''. Here we understand them as ``must be used''
+ */
+
 static struct ofw_compat_data compat_data[] = {
- {"broadcom,bcm2835-vchiq", BSD_DTB},
- {"brcm,bcm2835-vchiq", UPSTREAM_DTB},
- {"brcm,bcm2711-vchiq", UPSTREAM_DTB},
- {NULL, 0}
+ {"broadcom,bcm2835-vchiq", BSD_REG_ADDRS | CONFIG_VALID},
+ {"brcm,bcm2835-vchiq", CONFIG_VALID},
+ {"brcm,bcm2711-vchiq", LONG_BULK_SPACE | CONFIG_VALID},
+ {NULL, CONFIG_INVALID}
 };

 #define vchiq_read_4(reg) \
@@ -119,13 +141,23 @@ bcm_vchiq_intr(void *arg)
 void
 remote_event_signal(REMOTE_EVENT_T *event)
 {
- event->fired = 1;

+ wmb();
+
+ event->fired = 1;
  /* The test on the next line also ensures the write on the previous line
  has completed */
+ /* UPDATE: not on arm64, it would seem... */
+#if defined(__aarch64__)
+ dsb(sy);
+#endif
  if (event->armed) {
  /* trigger vc interrupt */
+#if defined(__aarch64__)
+ dsb(sy);
+#else
  dsb();
+#endif
  vchiq_write_4(0x48, 0);
  }
 }
@@ -134,13 +166,17 @@ static int
 bcm_vchiq_probe(device_t dev)
 {

- if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 0)
+ if ((ofw_bus_search_compatible(dev, compat_data)->ocd_data & CONFIG_VALID) == 0)
  return (ENXIO);

  device_set_desc(dev, "BCM2835 VCHIQ");
  return (BUS_PROBE_DEFAULT);
 }

+/* debug_sysctl */
+extern int vchiq_core_log_level;
+extern int vchiq_arm_log_level;
+
 static int
 bcm_vchiq_attach(device_t dev)
 {
@@ -168,14 +204,36 @@ bcm_vchiq_attach(device_t dev)
  return (ENXIO);
  }

- if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == UPSTREAM_DTB)
+ uintptr_t dev_compat_d = ofw_bus_search_compatible(dev, compat_data)->ocd_data;
+ /* XXXMDC: shouldn't happen (checked for in probe)--but, for symmetry */
+ if ((dev_compat_d & CONFIG_VALID) == 0){
+ device_printf(dev, "attempting to attach using invalid config.\n");
+ bus_release_resource(dev, SYS_RES_IRQ, rid, sc->irq_res);
+ return (EINVAL);
+ }
+ if ((dev_compat_d & BSD_REG_ADDRS) == 0)
  sc->regs_offset =3D -0x40;
+ if(dev_compat_d & LONG_BULK_SPACE)
+ g_long_bulk_space = 1;

  node = ofw_bus_get_node(dev);
  if ((OF_getencprop(node, "cache-line-size", &cell, sizeof(cell))) > 0)
  g_cache_line_size =3D cell;

  vchiq_core_initialize();
+
+ /* debug_sysctl */
+        struct sysctl_ctx_list *ctx_l = device_get_sysctl_ctx(dev);
+        struct sysctl_oid *tree_node = device_get_sysctl_tree(dev);
+        struct sysctl_oid_list *tree = SYSCTL_CHILDREN(tree_node);
+ SYSCTL_ADD_INT(
+ ctx_l, tree, OID_AUTO, "log", CTLFLAG_RW,
+ &vchiq_core_log_level, vchiq_core_log_level, "log level"
+ );
+ SYSCTL_ADD_INT(
+ ctx_l, tree, OID_AUTO, "arm_log", CTLFLAG_RW,
+ &vchiq_arm_log_level, vchiq_arm_log_level, "arm log level"
+ );

  /* Setup and enable the timer */
  if (bus_setup_intr(dev, sc->irq_res, INTR_TYPE_MISC | INTR_MPSAFE,
diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_pagelist.h b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_pagelist.h
index 72c362464cc2..d1cb9f1e1658 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_pagelist.h
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_pagelist.h
@@ -42,10 +42,10 @@
 #define PAGELIST_READ_WITH_FRAGMENTS 2

 typedef struct pagelist_struct {
- unsigned long length;
- unsigned short type;
- unsigned short offset;
- unsigned long addrs[1]; /* N.B. 12 LSBs hold the number of following
+ uint32_t length;
+ uint16_t type;
+ uint16_t offset;
+ uint32_t addrs[1]; /* N.B. 12 LSBs hold the number of following
     pages at consecutive addresses. */
 } PAGELIST_T;

diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_shim.c b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_shim.c
index cc8ef2e071f8..f33c545cea45 100644
--- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_shim.c
+++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_shim.c
@@ -398,7 +398,7 @@ EXPORT_SYMBOL(vchi_msg_queuev);
  ***********************************************************/
 int32_t vchi_held_msg_release(VCHI_HELD_MSG_T *message)
 {
- vchiq_release_message((VCHIQ_SERVICE_HANDLE_T)message->service,
+ vchiq_release_message((VCHIQ_SERVICE_HANDLE_T)(size_t)message->service,
  (VCHIQ_HEADER_T *)message->message);

  return 0;
@@ -444,7 +444,7 @@ int32_t vchi_msg_hold(VCHI_SERVICE_HANDLE_T handle,
  *msg_size = header->size;

  message_handle->service =
- (struct opaque_vchi_service_t *)service->handle;
+ (struct opaque_vchi_service_t *)(unsigned long)service->handle;
  message_handle->message =3D header;

  return 0;
-- 
2.32.0
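
A note for anyone reviewing the format-string hunks above, offered as a minimal userland sketch and not as part of the patch. Casting a pointer to unsigned int keeps only the low 32 bits on arm64, which is why the old `(unsigned int)header` casts had to go. The patch prints the new `(size_t)` casts with `%tx`; strictly speaking `t` is the length modifier for ptrdiff_t and `z` is the one for size_t, although both types have the same width on arm64. The sketch below sidesteps that question by using `uintptr_t` with `PRIxPTR` from <inttypes.h>. The helper names `addr_truncated` and `format_addr` are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * The old 32-bit idiom: converting an address to unsigned int keeps only
 * the low-order bits that fit, so the upper half is lost on LP64 targets.
 */
static unsigned int
addr_truncated(uintptr_t addr)
{
	return ((unsigned int)addr);
}

/*
 * Format a pointer as hex without truncation. uintptr_t is wide enough to
 * hold any object pointer, and PRIxPTR is the matching conversion.
 * Returns what snprintf returns (the number of characters needed).
 */
static int
format_addr(char *buf, size_t len, const void *p)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%" PRIxPTR, (uintptr_t)p));
}
```

In the kernel itself the choice of conversion is a matter of taste and of what printf(9) accepts; the point here is only that the argument type and the length modifier should agree.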

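The REMOTE_EVENT_T hunk in vchiq_core.h replaces the semaphore pointer with a 32-bit value holding offsetof(VCHIQ_STATE_T, ...), and the event functions gain a state argument. Presumably this keeps the structure at the size the 32-bit VideoCore side expects, since a kernel pointer is 64 bits wide on arm64. A minimal sketch of how such an offset could be turned back into a pointer follows. The types `fake_sema`, `fake_event` and `fake_state` and the helper `event_sema` are hypothetical stand-ins, not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct semaphore. */
struct fake_sema {
	int count;
};

/* Same field order as REMOTE_EVENT_T after the patch. */
struct fake_event {
	int armed;
	int fired;
	uint32_t event;		/* byte offset into the owning state */
};

/* Hypothetical stand-in for VCHIQ_STATE_T. */
struct fake_state {
	int id;
	struct fake_sema trigger_event;
	struct fake_sema recycle_event;
};

/*
 * Resolve the stored offset against the state that owns the semaphore.
 * The assert rejects offsets that would point outside the state.
 */
static struct fake_sema *
event_sema(struct fake_state *state, const struct fake_event *ev)
{
	assert(ev->event + sizeof(struct fake_sema) <= sizeof(*state));
	return ((struct fake_sema *)(void *)((char *)state + ev->event));
}
```

Storing an offset instead of an address means the shared structure has the same layout on 32 and 64 bit kernels, at the cost of passing the state to every function that needs the semaphore.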

On Mon, Feb 28, 2022 at 7:42 PM Warner Losh <imp@bsdimp.com> wrote:
>
>
>
> On Mon, Feb 28, 2022, 12:36 PM Marco Devesas Campos <devesas.campos@gmail.com> wrote:
>>
>> Entirely right, Ronald - thanks for catching it!
>
>
> Oops
>
>> Warner, can I send you a consolidated patch later in the week? What's the best way to submit it?
>
>
> Git format-patch is likely best.
>
> Warner
>
>>
>> Best,
>> Marco
>>
>> On 28 Feb 2022, at 19:26, Ronald Klop <ronald-lists@klop.ws> wrote:
>>
>> On Sun, 27 Feb 2022 17:41:25 +0100, Warner Losh <imp@bsdimp.com> wrote:
>>
>>
>>
>> On Sun, Feb 27, 2022 at 8:44 AM Marco Devesas Campos <devesas.campos@gmail.com> wrote:
>>>
>>> Hi, List
>>>
>>> On the back of Ronald Klop's comments (thanks!), I went and got myself an
>>> RPI 4 and it turns out all that was needed was adding the right dtb
>>> reference and it all works (seemingly) fine (incremental patch attached).
>>
>>
>> I've committed the patch below. If it turns out we need more, we can always augment.
>>
>>
>>
>> Hi Marco, Warner,
>>
>> Isn't the patch from https://lists.freebsd.org/archives/freebsd-arm/2022-February/000949.html needed also?
>> As you mention the patch below is an incremental patch?
>>
>> Regards,
>> Ronald.
>>
>>
>>
>>
>>
>>
>>
>> Warner
>>
>>>
>>> One of the potential projects highlighted in the latest call for proposals
>>> was exactly to get hdmi audio output in 64 bit Pis, viz. the 400-s. If
>>> anyone who voted for that reads this list, it would be nice to get some input on
>>> the patches.
>>>
>>> Best,
>>> Marco
>>>
>>>
>>>
>>> diff --git a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c
>>> index dc18678b99a3..344267ff0c1c 100644
>>> --- a/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c
>>> +++ b/sys/contrib/vchiq/interface/vchiq_arm/vchiq_kmod.c
>>> @@ -83,6 +83,7 @@ static struct bcm_vchiq_softc *bcm_vchiq_sc = NULL;
>>>  static struct ofw_compat_data compat_data[] = {
>>>         {"broadcom,bcm2835-vchiq",      BSD_DTB},
>>>         {"brcm,bcm2835-vchiq",          UPSTREAM_DTB},
>>> +       {"brcm,bcm2711-vchiq",          UPSTREAM_DTB},
>>>         {NULL,                          0}
>>>  };
>>>
>>>
>>>
>>> > On 8 Feb 2022, at 08:49, Ronald Klop <ronald-lists@klop.ws> wrote:
>>> >
>>> > From: Ronald Klop <ronald-lists@klop.ws>
>>> > Date: Monday, 7 February 2022 21:05
>>> > To: Marco Devesas Campos <devesas.campos@gmail.com>, freebsd-arm@freebsd.org
>>> > Subject: Re: [PATCH] Experimental vchiq and bcm2835_audio support for arm64
>>> >
>>> > On 2/6/22 14:46, Marco Devesas Campos wrote:
>>> > > Hi Ronald,
>>> > >
>>> > > Thanks so much for trying the patch out.
>>> > >
>>> > >> On 6 Feb 2022, at 13:05, Ronald Klop <ronald-lists@klop.ws> wrote:
>>> > >>
>>> > >> Hi,
>>> > >>
>>> > >> I compiled this on a RPI4 + 14-CURRENT. It boots, but I see no difference in available devices.
>>> > >> I can try to boot it on a RPI3B+ another time.
>>> > >
>>> > > I *think* the GPU/VC in RPI-4 is a very different beast from the others. I'll
>>> > > look into it, but if you could give it a try on the 3+ I'd be much obliged.
>>> > >
>>> > >>
>>> > >> What would be the expected outcome? Where should I look (or listen)?
>>> > >>
>>> > >
>>> > > You should see something like
>>> > >
>>> > >    vchiq0: <BCM2835 VCHIQ> mem 0x7e00b840-0x7e00b87b irq 54 on simplebus0
>>> > >    vchiq: local ver 8 (min 3), remote ver 8.
>>> > >    pcm0: <VCHIQ audio> on vchiq0
>>> > >
>>> > > in your dmesg output.
>>> > >
>>> > > The file /dev/vchiq should exist, as well as the following sysctl-s (I'm
>>> > > assuming no other audio devices are attached)
>>> > >
>>> > >    % sysctl dev.pcm
>>> > >    dev.pcm.0.trace: 0
>>> > >    ...
>>> > >    dev.pcm.0.dest: 0
>>> > >    ...
>>> > >    dev.pcm.0.%parent: vchiq0
>>> > >    ...
>>> > >    dev.pcm.0.%driver: pcm
>>> > >    dev.pcm.0.%desc: VCHIQ audio
>>> > >    ...
>>> > >
>>> > > Then if you `cat < /dev/random > /dev/dsp` you should hear some static coming
>>> > > out of whatever is connected to HDMI (maybe headphones too? otherwise try
>>> > > setting `sysctl dev.pcm.0.dest=1`)
>>> > >
>>> > > Best,
>>> > > Marco
>>> >
>>> >
>>> > Hi,
>>> >
>>> > Booted the patched 14-CURRENT on the RPI3B+.
>>> >
>>> > dmesg diff:
>>> > +vchiq0: <BCM2835 VCHIQ> mem 0x7e00b840-0x7e00b87b irq 54 on simplebus0
>>> > +vchiq: local ver 8 (min 3), remote ver 8.
>>> > +pcm0: <VCHIQ audio> on vchiq0
>>> >
>>> > [root@rpi3 ~]# cat /dev/sndstat
>>> > Installed devices:
>>> > pcm0: <VCHIQ audio> (play) default
>>> > No devices installed from userspace.
>>> >
>>> > [root@rpi3 ~]# sysctl dev.pcm
>>> > dev.pcm.0.trace: 0
>>> > dev.pcm.0.starved: 0
>>> > dev.pcm.0.freebuffer: 40000
>>> > dev.pcm.0.underruns: 0
>>> > dev.pcm.0.retrieved: 0
>>> > dev.pcm.0.submitted: 0
>>> > dev.pcm.0.callbacks: 0
>>> > dev.pcm.0.dest: 0
>>> > dev.pcm.0.mode: 3
>>> > dev.pcm.0.bitperfect: 0
>>> > dev.pcm.0.buffersize: 0
>>> > dev.pcm.0.play.vchanformat: s16le:2.0
>>> > dev.pcm.0.play.vchanrate: 48000
>>> > dev.pcm.0.play.vchanmode: fixed
>>> > dev.pcm.0.play.vchans: 1
>>> > dev.pcm.0.%parent: vchiq0
>>> > dev.pcm.0.%pnpinfo:
>>> > dev.pcm.0.%location:
>>> > dev.pcm.0.%driver: pcm
>>> > dev.pcm.0.%desc: VCHIQ audio
>>> > dev.pcm.%parent:
>>> >
>>> >
>>> > To play some audio I need to find some headphones first. :-)
>>> >
>>> > Ronald.
>>> >
>>> >
>>> >
>>> > Good morning,
>>> >
>>> > Found headphones with a cable in the attic. Plugged them into the audio jack and played an mp3. Amazing!
>>> >
>>> > Regards,
>>> > Ronald.
>>> >
>>>
>>>
>>
>>
>>
>>


