Date:      Sat, 28 Feb 2015 09:47:31 +0900
From:      "Daisuke Aoyama" <aoyama@peach.ne.jp>
To:        "Lars Engels" <lars.engels@0x20.net>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Ready for NAS4Free on ODROID-C1
Message-ID:  <822338A1698D483190FD602C32DD70F4@ad.peach.ne.jp>
In-Reply-To: <20150224154713.GS52053@e-new.0x20.net>
References:  <1DA948EA255F4963ACBC0EBE7D046401@ad.peach.ne.jp> <34D37D2811D246BEB11080D19F03FECE@ad.peach.ne.jp> <20150224154713.GS52053@e-new.0x20.net>

Hi,

>> I miss previous mail, USB HDD is formatted by NTFS.
>> So, NAS4Free can use NTFS USB drive without any additional tool/step.
>
>That's probably why the write speeds are so slow? Have you tested
>writing to a UFS formatted USB HDD?

Yes. Around 6MB/s is the write cap of dwc_otg.c.
I'm digging into it carefully. So far, I have found two problems in dwc_otg.c.

>>        TAILQ_FOREACH(xfer, &sc->sc_bus.intr_q.head, wait_entry)
>>                dwc_otg_xfer_do_fifo(sc, xfer);

Not dependent on CPU power:
dwc_otg_host_channel_alloc() fails many times, both at the "compute needed TX FIFO size" check and
on the busy path (dwc_otg_enable_sof_irq).
Each failure causes a state reset and a retry after the next SOF (in HS mode a microframe is 125us).

Note: the host has at most 16 endpoints, and the FIFO is not very large.

Dependent on CPU power:
dwc_otg_update_host_transfer_schedule_locked() scans the whole intr_q a few times.
Under heavy load, the function is called on every SOF, i.e. every 125us (8000 times per second)!
I think this is an expensive cost on the RPi.
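
A rough back-of-envelope for that cost can be sketched as below. This is only
an illustration: the number of passes and the queue length are hypothetical
inputs, not values measured in dwc_otg.c.

```c
#include <stdint.h>

/* High-speed USB: one microframe (SOF) every 125 us. */
#define SOF_PERIOD_US 125u

/* SOF events per second: 1,000,000 us / 125 us. */
uint32_t
sof_per_second(void)
{
	return (1000000u / SOF_PERIOD_US);
}

/*
 * Queue entries visited per second if every SOF triggers `passes`
 * full scans of a queue holding `queue_len` transfers.
 * Both parameters are assumptions supplied by the caller.
 */
uint64_t
entries_visited_per_second(uint32_t passes, uint32_t queue_len)
{
	return ((uint64_t)sof_per_second() * passes * queue_len);
}
```

For example, assuming 3 passes over 48 queued transfers, this gives
8000 x 3 x 48 = 1,152,000 entry visits per second before any real work is done.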

Note: a 1MB request becomes 16 SCSI requests of DFLTPHYS (64KB) each,
and 16 SCSI requests become 3 (cmd/data/status) x 16 = 48 USB requests on intr_q.
1 USB request = 512B x 128 transactions.
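
The arithmetic in this note can be written out as a small helper. It is a
sketch for checking the numbers only; `struct fanout` and `request_fanout()`
are hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DFLTPHYS_BYTES	(64u * 1024u)	/* DFLTPHYS: 64KB per SCSI request */
#define USB_PHASES	3u		/* cmd / data / status */
#define PKT_BYTES	512u		/* bytes per transaction, as in the note */

struct fanout {
	uint32_t scsi_requests;		/* SCSI requests per I/O */
	uint32_t usb_requests;		/* USB requests queued on intr_q */
	uint32_t transactions_per_64k;	/* transactions per 64KB data request */
};

/* Break one I/O of io_bytes (a multiple of 64KB) into its parts. */
struct fanout
request_fanout(uint32_t io_bytes)
{
	struct fanout f;

	assert(io_bytes % DFLTPHYS_BYTES == 0);
	f.scsi_requests = io_bytes / DFLTPHYS_BYTES;
	f.usb_requests = f.scsi_requests * USB_PHASES;
	f.transactions_per_64k = DFLTPHYS_BYTES / PKT_BYTES;
	return (f);
}
```

request_fanout(1024 * 1024) yields 16 SCSI requests, 48 USB requests and
128 transactions per 64KB data request, matching the numbers above.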

For the reasons above, dwc_otg.c cannot easily exceed 6MB/s, and its performance is unstable
(it depends on CPU power and system load).

If you want to debug dwc_otg.c, you must do your checking/printing within 125us,
otherwise you will get wrong frame info or miss frames.
You must also take NRZI encoding and bit stuffing in the frame into account.
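
One common way to stay inside such a time budget is to avoid printing in the
hot path and instead record a few words into an in-memory ring that is dumped
later. The following is only a sketch of that general technique, not code from
dwc_otg.c: `trace_log()`, `trace_count()` and `trace_dump()` are hypothetical
names, and locking is left out.

```c
#include <stdint.h>
#include <stdio.h>

#define TRACE_SLOTS 256u	/* power of two so the index can be masked */

struct trace_rec {
	uint32_t frame;		/* (micro)frame number seen at the event */
	uint32_t event;		/* caller-defined event code */
};

static struct trace_rec trace_buf[TRACE_SLOTS];
static uint32_t trace_head;	/* total records written so far */

/* Hot path: two stores and an increment, no formatting, no I/O. */
void
trace_log(uint32_t frame, uint32_t event)
{
	struct trace_rec *r = &trace_buf[trace_head & (TRACE_SLOTS - 1)];

	r->frame = frame;
	r->event = event;
	trace_head++;
}

/* Number of valid records currently held (at most TRACE_SLOTS). */
uint32_t
trace_count(void)
{
	return (trace_head < TRACE_SLOTS ? trace_head : TRACE_SLOTS);
}

/* Call later, outside the time-critical path; prints oldest first. */
void
trace_dump(void)
{
	uint32_t n = trace_count();
	uint32_t i;

	for (i = trace_head - n; i != trace_head; i++) {
		const struct trace_rec *r = &trace_buf[i & (TRACE_SLOTS - 1)];

		printf("frame=%u event=%u\n",
		    (unsigned)r->frame, (unsigned)r->event);
	}
}
```

Older records are overwritten once the ring is full, so the dump always shows
the most recent TRACE_SLOTS events.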

I feel that it is necessary to split intr_q into pending and running queues.
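
Such a split could look roughly like the following user-space sketch built on
<sys/queue.h>. It is not a patch: `struct xfer`, the `xq_*` functions and
`demo_try_start()` are hypothetical names, and the channel counter merely
stands in for the real host channel/FIFO allocation.

```c
#include <stddef.h>
#include <sys/queue.h>

struct xfer {
	int id;
	TAILQ_ENTRY(xfer) link;
};

TAILQ_HEAD(xfer_list, xfer);

struct xfer_queues {
	struct xfer_list pending;	/* waiting for a host channel */
	struct xfer_list running;	/* channel allocated, scanned per SOF */
};

void
xq_init(struct xfer_queues *q)
{
	TAILQ_INIT(&q->pending);
	TAILQ_INIT(&q->running);
}

/* New transfers always start on the pending queue. */
void
xq_submit(struct xfer_queues *q, struct xfer *x)
{
	TAILQ_INSERT_TAIL(&q->pending, x, link);
}

/*
 * Move transfers from pending to running, in order, while try_start()
 * succeeds. Stops at the first failure. Returns the number started.
 */
int
xq_start_pending(struct xfer_queues *q, int (*try_start)(struct xfer *))
{
	struct xfer *x;
	int started = 0;

	while ((x = TAILQ_FIRST(&q->pending)) != NULL) {
		if (!try_start(x))
			break;
		TAILQ_REMOVE(&q->pending, x, link);
		TAILQ_INSERT_TAIL(&q->running, x, link);
		started++;
	}
	return (started);
}

/* A finished transfer leaves the running queue. */
void
xq_complete(struct xfer_queues *q, struct xfer *x)
{
	TAILQ_REMOVE(&q->running, x, link);
}

/* Stand-in for channel allocation: succeeds while channels remain. */
static int demo_free_channels = 2;

int
demo_try_start(struct xfer *x)
{
	(void)x;
	if (demo_free_channels == 0)
		return (0);
	demo_free_channels--;
	return (1);
}
```

With this layout the per-SOF work only has to walk the running queue, while
transfers that could not get a channel wait on the pending queue without being
rescanned.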

Regards,
-- 
Daisuke Aoyama
  


