Read iso8859_2 file

qlwik

Client
Registered
03.04.2013
Messages
207
Thanks
5
Points
18
Hi,
I have an ISO-8859-2 encoded file. If I use "process files" -> "read text" on it, all national characters are replaced with this: �
How can I solve it?

My ZP version: 5.9.9.1
 

VladZen

Administrator
Forum staff
Registered
05.11.2014
Messages
22 453
Thanks
5 913
Points
113

qlwik

You mean in an external tool? In ZP, if I read the text and save it to a variable, the text in that variable will be wrong.
 

VladZen


qlwik

OK, but I need it for many files, so it must be done from the Windows command line. I found iconv, but it doesn't convert files "in place" (saving to the original file). Do you know any solution that converts files in place from the Windows command line?

Or maybe it can be done with C#?

Furthermore, I can't use > in "run program" because it causes an error and doesn't save the file. For example:
program: C:\Program Files (x86)\GnuWin32\bin\iconv.exe
settings: -f ISO-8859-2 -t UTF-8 inputfile.txt

works fine (prints the result in the command line), but this:
program: C:\Program Files (x86)\GnuWin32\bin\iconv.exe
settings: -f ISO-8859-2 -t UTF-8 inputfile.txt > outputfile.exe

causes an error.
 

PHaRTnONu

Client
Registered
01.10.2016
Messages
340
Thanks
48
Points
28
I found a solution for you in C# HERE:
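// Needs: using System.Collections.Generic; using System.IO; using System.Linq; using System.Text;
// Reads textfile1.txt as ISO-8859-2, drops duplicate lines, and writes the result in the same encoding.
var latin2 = Encoding.GetEncoding("ISO-8859-2");
var previousLines = new HashSet<string>();
File.WriteAllLines("D:\\textfile2.txt",
    File.ReadLines("textfile1.txt", latin2).Where(line => previousLines.Add(line)),
    latin2);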
I also found this article HERE, which I think is a better solution for you:

Where are you writing ASCII.txt? You're writing ANSI.txt in the first line, but that's certainly not ASCII as ASCII doesn't contain any accented characters. The ANSI file won't contain any preamble indicating that it's ANSI rather than ASCII or UTF-8.

You seem to have changed your mind between ASCII and ANSI half way through writing the example, basically.

I'd expect any ASCII file to be "detected" as UTF-8, but the encoding detection relies on the file having a byte order mark for it to be anything other than UTF-8. It's not like it reads the whole file and then guesses at what the encoding is.

From the docs for StreamReader:

This constructor initializes the encoding to UTF8Encoding, the BaseStream property using the stream parameter, and the internal buffer to the default size.

The detectEncodingFromByteOrderMarks parameter detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. See the Encoding.GetPreamble method for more information.

Now File.Copy is just copying the raw bytes from place to place - it shouldn't change anything in terms of character encodings, because it doesn't try to interpret the file as a text file in the first place.

It's not quite clear to me where you see a problem (partly due to the ANSI/ASCII part). I suggest you separate out the issues of "does File.Copy change things?" and "what character encoding is detected by StreamReader?" in both your mind and your question. The answers should be:

  • File.Copy shouldn't change the contents of the file at all
  • StreamReader can only detect UTF-8 and UTF-16; if you need to read a file encoded with any other encoding, you should state that explicitly in the constructor. (I would personally recommend using Encoding.Default instead of Encoding.GetEncoding(0) by the way. I think it's clearer.)
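
To illustrate the point in the quoted answer, here is a minimal sketch of what happens when an ISO-8859-2 file (which has no byte order mark) is read with the UTF-8 default versus an explicitly stated encoding. The class and method names (EncodingDemo, ReadWith) and the sample text are made up for the illustration. It assumes .NET Framework, where ISO-8859-2 is available out of the box; on .NET Core / .NET 5+ the code pages provider has to be registered first, as noted in the comment.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: reads a whole text file, decoding it with the given encoding.
    public static string ReadWith(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Assumption: on .NET Core / .NET 5+ uncomment the next line first,
        // otherwise GetEncoding("ISO-8859-2") is not available there.
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding latin2 = Encoding.GetEncoding("ISO-8859-2");

        // Write a small sample file as raw ISO-8859-2 bytes (no BOM).
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, latin2.GetBytes("zażółć gęślą jaźń"));

        // No BOM, so the reader stays with UTF-8 and the Latin-2 bytes
        // for the national characters do not decode as valid UTF-8.
        Console.WriteLine(ReadWith(path, Encoding.UTF8));

        // Stating the encoding explicitly decodes the same bytes correctly.
        Console.WriteLine(ReadWith(path, latin2));

        File.Delete(path);
    }
}
```

The same bytes are on disk in both cases; only the decoder changes, which is why the fix is to name the encoding when opening the file rather than to rely on detection.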

But the best solution I found is this:
Code:
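// fileName is the ISO-8859-2 input file, outFileName is the UTF-8 output file (use a different path).
// The thread is about ISO-8859-2 (Latin-2), so that encoding is used here instead of iso-8859-1.
using (System.IO.StreamReader reader = new System.IO.StreamReader(fileName,
                                       System.Text.Encoding.GetEncoding("iso-8859-2")))
{
    using (System.IO.StreamWriter writer = new System.IO.StreamWriter(
                                           outFileName, System.Text.Encoding.UTF8))
    {
        // Decode the whole input and write it back out as UTF-8.
        writer.Write(reader.ReadToEnd());
    }
}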
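
The snippet above writes to a separate output file. For the "many files, in place" case asked about earlier in the thread, something along these lines might work. It is only a sketch: the class name InPlaceConverter, the ConvertInPlace helper, the folder argument, and the *.txt filter are all assumptions for the example, and it assumes .NET Framework (on .NET Core / .NET 5+ the code pages provider must be registered before asking for ISO-8859-2).

```csharp
using System;
using System.IO;
using System.Text;

static class InPlaceConverter
{
    // Hypothetical helper: re-encodes one file from sourceEncoding to UTF-8,
    // overwriting the original file.
    public static void ConvertInPlace(string path, Encoding sourceEncoding)
    {
        // ReadAllText loads the whole file and closes it before returning,
        // so the same path can safely be used for the output.
        string text = File.ReadAllText(path, sourceEncoding);

        // UTF8Encoding(false) writes UTF-8 without a byte order mark.
        File.WriteAllText(path, text, new UTF8Encoding(false));
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: InPlaceConverter <folder>");
            return;
        }

        // Assumption: on .NET Core / .NET 5+ uncomment the next line first.
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding latin2 = Encoding.GetEncoding("ISO-8859-2");

        // Convert every .txt file directly inside the given folder.
        foreach (string file in Directory.GetFiles(args[0], "*.txt"))
        {
            ConvertInPlace(file, latin2);
            Console.WriteLine("Converted " + file);
        }
    }
}
```

Note that running it twice on the same folder would treat the already converted UTF-8 files as ISO-8859-2 and garble them, so it is worth trying on a copy of the files first.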
 