No BSD License  

File Size: 1.59 KB | File ID: #10686

unicode2ascii

by Stefan Eireiner

 

10 Apr 2006 (Updated 11 Apr 2006)

Converts Unicode-encoded files to ASCII-encoded files


File Information
Description

UNICODE2ASCII Converts Unicode-encoded files to ASCII-encoded files
  UNICODE2ASCII(FILE)
   Converts the file to ASCII (overwrites the old file!)
  UNICODE2ASCII(SOURCEFILE, DESTINATIONFILE)
   Converts the contents of SOURCEFILE to ASCII and writes it to DESTINATIONFILE
  ASCIISTRING = UNICODE2ASCII('string', UTFSTRING)
   Converts the UTFSTRING to ASCII and returns the string.

The Unicode header (byte order mark) and all 0-bytes are deleted. If the file contains characters with an encoding above FF, the output will still contain garbage, because those characters have no ASCII representation. By the author's estimate, about 99% of files should convert fine.
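
The conversion described above can be illustrated on a byte vector. This is only a sketch of the idea (drop the header, then drop every 0-byte), not the code in unicode2ascii.m. The variable names and the list of byte order marks are assumptions made for the example.

```matlab
% Sketch only: names are illustrative, not taken from unicode2ascii.m.
% UTF-16LE bytes for 'Hi!' preceded by the byte order mark FF FE.
raw = uint8([255 254 72 0 105 0 33 0]);

% Common byte order marks, longest first so that UTF-32LE (FF FE 00 00)
% is not mistaken for UTF-16LE (FF FE).
boms = {uint8([0 0 254 255]), uint8([255 254 0 0]), ...
        uint8([239 187 191]), uint8([254 255]), uint8([255 254])};

body = raw;
for k = 1:numel(boms)
    n = numel(boms{k});
    if numel(raw) >= n && isequal(raw(1:n), boms{k})
        body = raw(n+1:end);   % drop the Unicode header
        break;
    end
end

body = body(body ~= 0);        % delete all 0-bytes
asciiString = char(body);      % 'Hi!'

% A character above FF has no 0-byte to drop, so it stays as garbage:
euro = uint8([172 32]);              % U+20AC in UTF-16LE
garbled = char(euro(euro ~= 0));     % two unrelated characters
```

Deleting 0-bytes works for UTF-16 and UTF-32 text in the range 00 to FF because each such character is one meaningful byte padded with zeros. The euro sign example shows why anything above FF survives as two unrelated characters.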

MATLAB release MATLAB 6.5 (R13)
Comments and Ratings (4)
06 Nov 2007 jf b  
14 Dec 2009 Gordon

combined this with csvimport and it worked a treat

31 Jan 2010 Sergey

Thanks! It's clean, fast, and most important - works )

28 Sep 2010 Mark Patterson

Works well. The included isunicode function does not correctly release the file handle when it returns. The attached diff corrects this.

--- isunicode.m 2010-09-20 18:25:56 +0000
+++ isunicode.m 2010-09-27 21:11:21 +0000
@@ -13,30 +13,22 @@
%
% (c) Version 1.0 by Stefan Eireiner (<a href="mailto:stefan.eireiner@siemens.com?subject=isunicode">stefan.eireiner@siemens.com</a>)
% last change 10.04.2006
-%
-% Downloaded from:
-% http://www.mathworks.com/matlabcentral/fileexchange/10686-unicode2ascii.
-%
-% See also: unicode2ascii.

isuc = false;
-if(nargin == 2)
- if(strcmpi(filename, 'string'))
- firstLine = varargin{1}(1:4);
+if nargin == 2 && strcmpi(filename, 'string')
+ firstLine = varargin{1}(1:4);
+else
+ fileInfo = dir(filename);
+ if(fileInfo.bytes < 4) % a unicode file incl. header can't be smaller than 4 bytes if it shall display at least one char.
+ return;
    end
-end
-
-if(~exist('firstLine', 'var'))
    fin = fopen(filename,'r');
    if (fin == -1) %does the file exist?
        error(['File ' filename ' not found!'])
        return;
    end
- fileInfo = dir(filename);
- if(fileInfo.bytes < 4) % a unicode file incl. header can't be smaller than 4 bytes if it shall display at least one char.
- return;
- end
    firstLine = fread(fin,4)';
+ fclose(fin) ;
end

% assign all possible headers to variables
@@ -58,7 +50,3 @@
elseif(strfind(firstLine, utf32leheader) == 1)
        isuc = 5;
end
-
-if(~exist('firstLine', 'var'))
- fclose(fin);
-end

Updates
11 Apr 2006

added functionality to convert strings.

Tag Activity for this File
Tag Applied By Date/Time
data import Stefan Eireiner 22 Oct 2008 08:21:55
data export Stefan Eireiner 22 Oct 2008 08:21:56
unicode utf ascii ansi converter utf 8 16 32 txt Stefan Eireiner 22 Oct 2008 08:21:56
