Package nu.validator.htmlparser.io
Class Driver
java.lang.Object
nu.validator.htmlparser.io.Driver
- All Implemented Interfaces:
EncodingDeclarationHandler
-
Nested Class Summary
Nested Classes -
Field Summary
Fields
Modifier and Type, Field, Description
private boolean allowRewinding
private Encoding characterEncoding
private CharacterHandler[] characterHandlers - Used for NFC checking if non-null, source code capture, etc.
private Confidence confidence
private Heuristics heuristics
private Reader reader - The input UTF-16 code unit stream.
private RewindableInputStream rewindableInputStream - The reference to the rewindable byte stream.
private boolean swallowBom
private final Tokenizer tokenizer
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
void addCharacterHandler(CharacterHandler characterHandler)
private void becomeConfident()
(package private) void dontSwallowBom()
protected Encoding encodingFromExternalDeclaration(String encoding) - Initializes a decoder from external decl.
getCharacterEncoding - Queries the environment for the encoding in use (for error reporting).
getDocumentLocator
boolean internalEncodingDeclaration(String internalCharset) - Indicates that the parser has found an internal encoding declaration with the charset value charset.
boolean isAllowRewinding() - Returns the allowRewinding.
boolean isCheckingNormalization() - Query if checking normalization.
(package private) void notifyAboutMetaBoundary()
private void runStates
void setAllowRewinding(boolean allowRewinding) - Sets the allowRewinding.
void setCheckingNormalization(boolean enable) - Turns NFC checking on or off.
void setCommentPolicy(XmlViolationPolicy commentPolicy)
void setContentNonXmlCharPolicy(XmlViolationPolicy contentNonXmlCharPolicy)
void setContentSpacePolicy(XmlViolationPolicy contentSpacePolicy)
void setEncoding(Encoding encoding, Confidence confidence)
void setErrorHandler(eh)
void setHeuristics(Heuristics heuristics) - Sets the encoding sniffing heuristics.
void setHtml4ModeCompatibleWithXhtml1Schemata(boolean html4ModeCompatibleWithXhtml1Schemata)
void setMappingLangToXmlLang(boolean mappingLangToXmlLang)
void setNamePolicy(XmlViolationPolicy namePolicy)
void setTransitionHandler(TransitionHandler transitionHandler)
void setXmlnsPolicy(XmlViolationPolicy xmlnsPolicy)
void tokenize(InputSource is) - Runs the tokenization.
protected void warnWithoutLocation(String message) - Reports a warning without line/col
protected Encoding whineAboutEncodingAndReturnActual(String encoding, Encoding cs)
-
Field Details
-
reader
The input UTF-16 code unit stream. If a byte stream was given, this object is an instance of HtmlInputStreamReader.
-
rewindableInputStream
The reference to the rewindable byte stream. null if prohibited or no longer needed.
-
swallowBom
private boolean swallowBom -
characterEncoding
-
allowRewinding
private boolean allowRewinding -
heuristics
-
tokenizer
-
confidence
-
characterHandlers
Used for NFC checking if non-null, source code capture, etc.
-
-
Constructor Details
-
Driver
-
-
Method Details
-
isAllowRewinding
public boolean isAllowRewinding()
Returns the value of the allowRewinding flag.
- Returns:
the value of allowRewinding
-
setAllowRewinding
public void setAllowRewinding(boolean allowRewinding)
Sets the allowRewinding flag.
- Parameters:
allowRewinding - the value to set
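The idea of rewinding a byte stream can be illustrated with the mark/reset support of the JDK streams. The following is only a sketch of the general technique: RewindSketch and peek are hypothetical names, and the code does not show how RewindableInputStream or this class actually work.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration; this is not the library's RewindableInputStream. */
final class RewindSketch {

    /**
     * Reads up to limit bytes for inspection, then rewinds the stream so the
     * same bytes can be read again from the marked position.
     */
    static byte[] peek(InputStream in, int limit) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support rewinding");
        }
        try {
            in.mark(limit);                 // remember the current position
            byte[] buf = new byte[limit];
            int n = 0;
            while (n < limit) {
                int r = in.read(buf, n, limit - n);
                if (r < 0) {
                    break;                  // end of stream reached before limit
                }
                n += r;
            }
            in.reset();                     // rewind to the mark
            return Arrays.copyOf(buf, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] doc = "<!DOCTYPE html><meta charset=utf-8>".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(doc));
        byte[] first = peek(in, 9);
        byte[] again = peek(in, 9);         // same bytes, because the stream was rewound
        System.out.println(new String(first, StandardCharsets.US_ASCII));
        System.out.println(Arrays.equals(first, again));
    }
}
```

Rewinding of this kind requires the already-read bytes to be buffered, which is presumably why a parser might want to prohibit it or drop the reference once it is no longer needed.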
-
setCheckingNormalization
public void setCheckingNormalization(boolean enable)
Turns NFC checking on or off.
- Parameters:
enable - true to turn checking on
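For readers unfamiliar with NFC (Unicode Normalization Form C), the check can be illustrated with the JDK's java.text.Normalizer. This is a minimal sketch of what an NFC check means, not this class's implementation; NfcCheckSketch and isNfc are hypothetical names.

```java
import java.text.Normalizer;

/** Hypothetical helper; not part of nu.validator.htmlparser. */
final class NfcCheckSketch {

    /** Returns true if the text is already in Unicode Normalization Form C. */
    static boolean isNfc(CharSequence text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String composed = "caf\u00E9";     // precomposed e with acute accent
        String decomposed = "cafe\u0301";  // plain e followed by a combining acute accent
        System.out.println(isNfc(composed));
        System.out.println(isNfc(decomposed));
        // Normalizing the decomposed form yields the composed one.
        System.out.println(Normalizer.normalize(decomposed, Normalizer.Form.NFC).equals(composed));
    }
}
```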
-
addCharacterHandler
-
isCheckingNormalization
public boolean isCheckingNormalization()
Queries whether normalization checking is on.
- Returns:
true if checking is on
-
tokenize
Runs the tokenization. This is the main entry point.
- Parameters:
is - the input source
- Throws:
SAXException - on fatal error (if configured to treat XML violations as fatal) or if the token handler threw
IOException - if the stream threw
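A SAX InputSource can carry either a character stream or a byte stream, optionally with an externally declared encoding. The sketch below uses only org.xml.sax.InputSource from the JDK and follows the SAX convention that a character stream takes precedence over a byte stream. InputSourceSketch and describe are hypothetical names, and it is an assumption that tokenize inspects its argument in this order.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

/** Hypothetical helper; shows what an InputSource may carry. */
final class InputSourceSketch {

    /** Describes which stream a SAX-style consumer would read, character stream first. */
    static String describe(InputSource is) {
        if (is.getCharacterStream() != null) {
            return "character stream";
        }
        if (is.getByteStream() != null) {
            String declared = is.getEncoding();
            return declared == null
                    ? "byte stream, encoding not declared"
                    : "byte stream, declared " + declared;
        }
        return "no stream";
    }

    public static void main(String[] args) {
        // Already-decoded input: no encoding detection is needed.
        InputSource chars = new InputSource(new StringReader("<p>hi</p>"));
        System.out.println(describe(chars));

        // Raw bytes with an external encoding declaration.
        byte[] raw = "<p>hi</p>".getBytes(StandardCharsets.UTF_8);
        InputSource bytes = new InputSource(new ByteArrayInputStream(raw));
        bytes.setEncoding("UTF-8");
        System.out.println(describe(bytes));
    }
}
```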
-
dontSwallowBom
void dontSwallowBom() -
runStates
- Throws:
SAXException
IOException
-
setEncoding
-
internalEncodingDeclaration
Description copied from interface: EncodingDeclarationHandler
Indicates that the parser has found an internal encoding declaration with the charset value charset.
- Specified by:
internalEncodingDeclaration in interface EncodingDeclarationHandler
- Parameters:
internalCharset - the charset name found.
- Returns:
true if the value of charset was an encoding name for a supported ASCII-superset encoding.
- Throws:
SAXException - if something went wrong
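What "supported ASCII-superset encoding" can mean is sketched below with java.nio.charset.Charset. The test is deliberately naive (it decodes the bytes 0x00 to 0x7F and compares the result with US-ASCII) and is an assumption made for illustration. AsciiSupersetSketch and isSupportedAsciiSuperset are hypothetical names, and this class may decide differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; a naive ASCII-superset test using only the JDK. */
final class AsciiSupersetSketch {

    /**
     * Returns true if the JVM supports the named charset and it decodes the
     * bytes 0x00..0x7F to the same characters as US-ASCII.
     */
    static boolean isSupportedAsciiSuperset(String name) {
        try {
            if (!Charset.isSupported(name)) {
                return false;                       // unknown to this JVM
            }
        } catch (IllegalArgumentException e) {
            return false;                           // null or syntactically illegal name
        }
        Charset charset = Charset.forName(name);
        byte[] ascii = new byte[0x80];
        for (int i = 0; i < ascii.length; i++) {
            ascii[i] = (byte) i;
        }
        String viaCharset = new String(ascii, charset);
        String viaAscii = new String(ascii, StandardCharsets.US_ASCII);
        return viaCharset.equals(viaAscii);
    }

    public static void main(String[] args) {
        System.out.println(isSupportedAsciiSuperset("UTF-8"));
        System.out.println(isSupportedAsciiSuperset("UTF-16"));
        System.out.println(isSupportedAsciiSuperset("no-such-charset"));
    }
}
```

UTF-16 fails the test because each ASCII character occupies two bytes there, so a declaration read while decoding the document as an ASCII-compatible encoding could not have come from a UTF-16 document.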
-
becomeConfident
private void becomeConfident() -
setHeuristics
Sets the encoding sniffing heuristics.
- Parameters:
heuristics - the heuristics to set
-
warnWithoutLocation
Reports a warning without line/column information.
- Parameters:
message - the message
- Throws:
SAXException
-
encodingFromExternalDeclaration
Initializes a decoder from an external declaration.
- Throws:
SAXException
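Creating a decoder from an externally declared encoding name (for example, one taken from an HTTP header) can be sketched with the JDK's charset API. This only illustrates the general idea: ExternalDeclarationSketch, decoderFor and tryDecode are hypothetical names, and the library uses its own Encoding type rather than java.nio.charset.Charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.util.Optional;

/** Hypothetical helper; builds a strict decoder from a declared encoding name. */
final class ExternalDeclarationSketch {

    /**
     * Looks up the declared encoding and returns a decoder that reports
     * malformed input instead of silently replacing it. Throws an unchecked
     * exception if the name is illegal or unsupported.
     */
    static CharsetDecoder decoderFor(String declaredEncoding) {
        Charset charset = Charset.forName(declaredEncoding.trim());
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
    }

    /** Decodes the bytes, or returns an empty Optional if they are not valid in that encoding. */
    static Optional<String> tryDecode(String declaredEncoding, byte[] bytes) {
        try {
            return Optional.of(decoderFor(declaredEncoding).decode(ByteBuffer.wrap(bytes)).toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(decoderFor(" utf-8 ").charset().name());
        System.out.println(tryDecode("UTF-8", new byte[] {'h', 'i'}).isPresent());
        System.out.println(tryDecode("UTF-8", new byte[] {(byte) 0xFF}).isPresent());
    }
}
```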
-
whineAboutEncodingAndReturnActual
protected Encoding whineAboutEncodingAndReturnActual(String encoding, Encoding cs) throws SAXException
- Parameters:
encoding
cs
- Returns:
- Throws:
SAXException
-
notifyAboutMetaBoundary
void notifyAboutMetaBoundary() -
setCommentPolicy
- Parameters:
commentPolicy
-
setContentNonXmlCharPolicy
- Parameters:
contentNonXmlCharPolicy
-
setContentSpacePolicy
- Parameters:
contentSpacePolicy
-
setErrorHandler
- Parameters:
eh
-
setTransitionHandler
-
setHtml4ModeCompatibleWithXhtml1Schemata
public void setHtml4ModeCompatibleWithXhtml1Schemata(boolean html4ModeCompatibleWithXhtml1Schemata)
- Parameters:
html4ModeCompatibleWithXhtml1Schemata
-
setMappingLangToXmlLang
public void setMappingLangToXmlLang(boolean mappingLangToXmlLang)
- Parameters:
mappingLangToXmlLang
-
setNamePolicy
- Parameters:
namePolicy
-
setXmlnsPolicy
- Parameters:
xmlnsPolicy
-
getCharacterEncoding
Description copied from interface: EncodingDeclarationHandler
Queries the environment for the encoding in use (for error reporting).
- Specified by:
getCharacterEncoding in interface EncodingDeclarationHandler
- Returns:
the encoding in use
- Throws:
SAXException - if something went wrong
-
getDocumentLocator
-