From: Nick Hibma
To: FreeBSD Hackers Mailing List
Date: Wed, 19 Nov 2008 08:42:58 +0100
Subject: Unicode USB strings conversion
Message-Id: <200811190842.59377.nick@van-laarhoven.org>

In the USB code (and I bet it is the same in the USB4BSD code), Unicode
characters in strings are converted to ASCII in a very crude way. I have a
user on the line who sees rubbish in his logs and when using
usbctl/usbdevs/etc., and I bet this is the cause.

I'd like to try to fix this problem by using libkern/libiconv.

1) Is this the right approach to convert UTF-16 to a printable string in
   the kernel?

2) Is this needed at all in the short-term future? I remember seeing
   attempts at making the kernel use UTF-8.

3) Does anyone know of a good example in the code, so that I don't have to
   hunt through the kernel to find it?

For reference, the code that needs replacing is in usbd_get_string():

    s = buf;
    n = size / 2 - 1;
    for (i = 0; i < n && i < len - 1; i++) {
        c = UGETW(us.bString[i]);
        /* Convert from Unicode, handle buggy strings. */
        if ((c & 0xff00) == 0)
            *s++ = c;
        else if ((c & 0x00ff) == 0 && swap)
            *s++ = c >> 8;
        else
            *s++ = '?';
    }
    *s++ = 0;

I haven't got the USB specs handy, but I believe that this is a simple way
of converting LE and BE UTF-16 to ASCII.

Nick
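
For comparison with the loop quoted above, here is a minimal userland sketch of what a less lossy conversion might look like. It assumes the string descriptor payload is UTF-16LE and emits UTF-8 instead of ASCII. The function name utf16le_to_utf8 and its signature are made up for illustration; this is not an existing kernel or libiconv interface, and it does not attempt the byte-swap workaround for buggy devices that the original code has.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing FreeBSD function.
 * Converts srclen bytes of UTF-16LE (the descriptor payload after its
 * 2-byte header) into NUL-terminated UTF-8 in dst. Output is truncated
 * at a whole-character boundary. Unpaired surrogates become '?'.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
utf16le_to_utf8(const uint8_t *src, size_t srclen, char *dst, size_t dstlen)
{
	size_t i = 0, o = 0, need;
	uint32_t c, lo;

	if (dstlen == 0)
		return (0);
	while (i + 1 < srclen) {
		c = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
		i += 2;
		if (c == 0)
			break;		/* Embedded NUL ends the string. */
		if (c >= 0xd800 && c <= 0xdbff) {
			/* High surrogate: must be followed by a low one. */
			lo = 0;
			if (i + 1 < srclen)
				lo = (uint32_t)src[i] |
				    ((uint32_t)src[i + 1] << 8);
			if (lo >= 0xdc00 && lo <= 0xdfff) {
				c = 0x10000 + ((c - 0xd800) << 10) +
				    (lo - 0xdc00);
				i += 2;
			} else
				c = '?';
		} else if (c >= 0xdc00 && c <= 0xdfff)
			c = '?';	/* Unpaired low surrogate. */

		need = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
		if (o + need >= dstlen)
			break;		/* Keep room for the NUL. */
		if (need == 1) {
			dst[o++] = (char)c;
		} else if (need == 2) {
			dst[o++] = (char)(0xc0 | (c >> 6));
			dst[o++] = (char)(0x80 | (c & 0x3f));
		} else if (need == 3) {
			dst[o++] = (char)(0xe0 | (c >> 12));
			dst[o++] = (char)(0x80 | ((c >> 6) & 0x3f));
			dst[o++] = (char)(0x80 | (c & 0x3f));
		} else {
			dst[o++] = (char)(0xf0 | (c >> 18));
			dst[o++] = (char)(0x80 | ((c >> 12) & 0x3f));
			dst[o++] = (char)(0x80 | ((c >> 6) & 0x3f));
			dst[o++] = (char)(0x80 | (c & 0x3f));
		}
	}
	dst[o] = '\0';
	return (o);
}
```

A caller would pass a pointer to the bytes of us.bString together with the descriptor length minus the two header bytes. Whether UTF-8 is an acceptable output for the kernel log and for usbctl/usbdevs is exactly the open question in 2) above, so treat this as a starting point for discussion and not as a proposed patch.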