Synopsis Read the contents of a location and return it as a string value.
Function str readFile(loc file) throws PathNotFound(loc file), IO(str msg)
Usage import IO;
Description Return the contents of a file location as a single string.
Also see readFileLines.
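A minimal sketch of calling readFile while handling the two exceptions named in the signature. The helper name readOrElse and the fallback parameter are hypothetical and chosen only for illustration; the extra import of Exception is assumed to be needed to match on PathNotFound and IO.

```rascal
import IO;
import Exception;

// Hypothetical helper: return the file contents, or `fallback` if reading fails.
str readOrElse(loc file, str fallback) {
  try {
    return readFile(file);
  }
  catch PathNotFound(_): {
    // The location does not exist.
    return fallback;
  }
  catch IO(str msg): {
    // Any other I/O problem; report it and fall back.
    println("Could not read <file>: <msg>");
    return fallback;
  }
}
```

For example, readOrElse(|file:///tmp/example.txt|, "") would yield the empty string when the (illustrative) path does not exist.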
Encoding
A text file can be encoded in many different character sets; the most common are UTF-8, ISO-8859-1, and ASCII.
If you know the encoding of the file, use the readFileEnc and readFileLinesEnc overloads.
If you do not know the encoding, the library tries to detect it as follows:
- If the implementation of the Location's scheme (e.g. |project:///|) defines the charset of the file, then this is used.
- Otherwise, if the file contains a UTF-8, UTF-16, or UTF-32 BOM, then this is used.
- As a last resort the IO library uses heuristics to determine if UTF-8 or UTF-32 could work:
  - Are the first 32 bytes valid UTF-8? Then use UTF-8.
  - Are the first 32 bytes valid UTF-32? Then use UTF-32.
- Finally, we fall back to the system default (as given by the Java Runtime Environment).
To summarize, we use UTF-8 by default, unless the Location provides charset metadata, the file contains a BOM, or the first 32 bytes of the file are not valid UTF-8.
Pitfalls - The second version of readFile with a string argument is deprecated.
- If the encoding is not known, we estimate it as best we can.
- We default to UTF-8. If the file was not encoded in UTF-8 but its first bytes happen to be valid UTF-8, you might get a decoding error or strange-looking characters.
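One way to avoid a wrong guess is to bypass detection when the encoding is known, using the readFileEnc overload mentioned under Encoding. The following is a minimal sketch: the helper name readLatin1 is hypothetical, and the charset name "ISO-8859-1" is assumed to be accepted as written.

```rascal
import IO;

// Hypothetical helper: read a file that is known to be ISO-8859-1 encoded,
// passing the charset explicitly instead of relying on detection.
str readLatin1(loc file) {
  return readFileEnc(file, "ISO-8859-1");
}
```

The same approach should apply to readFileLinesEnc when the contents are wanted line by line.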