From: "Zhang Weiwu"
To: questions@freebsd.org
Date: Mon, 15 Mar 2004 12:52:19 +0800
Subject: [OT?] write C program with UTF16LE
Reply-To: zhangweiwu@realss.com

Hello.

Although I write some PHP and Perl scripts, I don't write C programs. I have a very large text file in UTF-16LE format in which each string is preceded by a number giving its length. For example:

0300 6100 6200 6300 0400 6700 5400 9800 7400 0300 ....

The leading 0300 means that the following 3 characters (6 bytes) form a string, and the next 0400 means that the following 4 characters form another string. I need to read the file and replace every one of these numeric string separators with a linefeed. I have decided to use C, since this is a good chance to get some practice with it.
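To make the layout described above concrete, here is a minimal sketch in C that works on data already held in memory. The names u16le and strip_counts are hypothetical. It assumes that each count is a number of 16-bit code units, and that the linefeed should be written after each string (rather than in the exact position of the count), so that the result has one string per line.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Combine two bytes, low byte first, into one 16-bit value. */
static uint16_t u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: copy every length-prefixed UTF-16LE string from
 * `in` to `out`, dropping the count and writing a UTF-16LE linefeed
 * (bytes 0A 00) after each string.
 * Returns the number of bytes written, or (size_t)-1 if the input is
 * truncated or `out` is too small.
 */
size_t strip_counts(const unsigned char *in, size_t in_len,
                    unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        if (in_len - i < 2)
            return (size_t)-1;              /* half a count left over */

        /* The count is in 16-bit units, so the string is twice as many bytes. */
        size_t nbytes = (size_t)u16le(in + i) * 2;
        i += 2;

        if (in_len - i < nbytes || out_cap - o < nbytes + 2)
            return (size_t)-1;              /* truncated input or full output */

        memcpy(out + o, in + i, nbytes);    /* the string itself, unchanged */
        i += nbytes;
        o += nbytes;

        out[o++] = 0x0A;                    /* linefeed, low byte first */
        out[o++] = 0x00;
    }
    return o;
}
```

With the sample bytes above (03 00 61 00 62 00 63 00 04 00 ...), the first string comes out as 61 00 62 00 63 00 followed by 0A 00.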
The old getc() I learnt at school is not my cup of tea, because I always need to call it twice in a row, and for the separators I need to compute getc()+getc()*256. What is the best practice for dealing with this kind of mixed number/UTF-16 text?

I googled around and found some tutorials, but most i18n tutorials assume I'm already a C expert :( The glibc manual looks like a good learning resource, but I'm the kind of newbie who doesn't know whether I'm using glibc at all. When I just write #include <stdio.h>, am I using the stdio.h from glibc? I think simply pointing me to a tutorial that fits my level would help me most.

Thank you.
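For what it is worth, the two-getc() idea can be wrapped in one small function so the rest of the program only sees 16-bit units. The sketch below is a rough illustration, not a definitive answer. The names read_u16le, write_u16le and convert_stream are hypothetical, and it assumes both files are opened in binary mode and that each count is a number of 16-bit units. One detail: in getc()+getc()*256 the C language does not fix which of the two calls runs first, so the sketch reads the bytes into two variables in separate statements. It uses only standard <stdio.h> functions, so it should not depend on which C library provides that header.

```c
#include <stdio.h>

/*
 * Read one UTF-16LE code unit (low byte first).
 * Returns 0..0xFFFF, or -1 at end of file. A single leftover byte at
 * the very end is also reported as -1.
 */
static long read_u16le(FILE *fp)
{
    int lo = getc(fp);          /* first call: low byte */
    if (lo == EOF)
        return -1;
    int hi = getc(fp);          /* second call: high byte */
    if (hi == EOF)
        return -1;
    return (long)lo | ((long)hi << 8);
}

/* Write one code unit, low byte first. Returns 0, or -1 on error. */
static int write_u16le(FILE *fp, unsigned v)
{
    if (putc((int)(v & 0xFF), fp) == EOF)
        return -1;
    if (putc((int)((v >> 8) & 0xFF), fp) == EOF)
        return -1;
    return 0;
}

/*
 * Hypothetical driver: for each count, copy that many code units from
 * `in` to `out`, then write a UTF-16LE linefeed instead of the count.
 * Returns 0 on success, -1 on a truncated string or an I/O error.
 */
int convert_stream(FILE *in, FILE *out)
{
    long count;

    while ((count = read_u16le(in)) >= 0) {
        for (long k = 0; k < count; k++) {
            long unit = read_u16le(in);
            if (unit < 0)
                return -1;      /* file ended in the middle of a string */
            if (write_u16le(out, (unsigned)unit) < 0)
                return -1;
        }
        if (write_u16le(out, 0x000A) < 0)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

A caller would open the files with fopen(name, "rb") and fopen(name, "wb") and pass the two FILE pointers to convert_stream. Because stdio buffers the reads, calling getc() for every byte is normally not a performance problem even for a very large file.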