info.monitorenter.cpdetector.io
Class ParsingDetector
java.lang.Object
info.monitorenter.cpdetector.io.AbstractCodepageDetector
info.monitorenter.cpdetector.io.ParsingDetector
- All Implemented Interfaces:
- ICodepageDetector, Serializable, Comparable
public class ParsingDetector
- extends AbstractCodepageDetector
A Façade that internally uses an ANTLR-based parser / lexer.
The underlying lexer is more of a filter: it does not verify lexical correctness
by matching a defined order of tokens, but just filters out
certain tokens. Currently the following tokens are filtered:
- Author:
- Achim Westermann
- See Also:
- Serialized Form
Method Summary
Charset detectCodepage(InputStream in, int length)
Detects the charset encoding from any source (even a
String, which a URL cannot decorate).
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ParsingDetector
public ParsingDetector()
ParsingDetector
public ParsingDetector(boolean verbose)
detectCodepage
public Charset detectCodepage(InputStream in,
int length)
throws IOException
- Description copied from interface:
ICodepageDetector
Detects the charset encoding from any source (even a
String, which a URL cannot decorate).
Note that you cannot reuse the given InputStream unless it supports marking (InputStream.markSupported() ==
true), you mark the initial position with a sufficient readlimit, and you invoke
reset afterwards (without getting any exception).
- Parameters:
in - An InputStream for the document that supports mark with a
readlimit of the argument length.
length - The number of bytes to take into account. This number should not
exceed the number of bytes retrievable from the
InputStream, but should be as large as possible to give the fallback
detection (chardet) more hints to guess.
- Throws:
IOException
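The mark/reset requirement described above can be sketched as follows. This is a minimal illustration and not part of cpdetector: the class MarkResetSketch, the Detector interface and the helper methods are hypothetical names. Detector only mirrors the detectCodepage(InputStream, int) signature so that the sketch compiles without the library; in real use a ParsingDetector instance would presumably take its place.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

class MarkResetSketch {

    /** Hypothetical stand-in that mirrors the detectCodepage(InputStream, int) signature. */
    interface Detector {
        Charset detectCodepage(InputStream in, int length) throws IOException;
    }

    /** Returns a stream that supports mark/reset, wrapping the argument only if needed. */
    static InputStream markable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /**
     * Marks the current position, runs the detection, then rewinds the stream
     * so the caller can read the document from the same position again.
     */
    static Charset detectAndRewind(Detector detector, InputStream in, int length)
            throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(length);                 // readlimit equals the bytes the detector may consume
        Charset result = detector.detectCodepage(in, length);
        in.reset();                      // throws IOException if the mark was invalidated
        return result;
    }
}
```

A caller would first obtain a markable stream with markable(...), keep a reference to it, pass it to detectAndRewind(...), and then continue reading from that same reference. Choosing a readlimit equal to length is an assumption based on the parameter description; a detector that reads beyond it may invalidate the mark.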
Copyleft ㊢ 2003-2004 MPL 1.1, All Rights Footloose.