Package it.unimi.dsi.parser.callback
Class LinkExtractor
java.lang.Object
  it.unimi.dsi.parser.callback.DefaultCallback
    it.unimi.dsi.parser.callback.LinkExtractor
All Implemented Interfaces: Callback
@Deprecated public class LinkExtractor extends DefaultCallback
Deprecated. This class is obsolete and kept around for backward compatibility only.
A callback extracting links.
This callback extracts the links found in a web page. The links are then accessible in urls (a set of Strings). The iteration order of the set is guaranteed to be exactly the order in which the links were met, although duplicates appear just once.
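The ordering guarantee above is the behaviour of an insertion-ordered set. A minimal sketch, assuming a java.util.LinkedHashSet as a stand-in for urls (this page does not say which Set implementation the class uses); the class name, method name, and example URLs are hypothetical:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of it.unimi.dsi.parser.callback.
class OrderedLinksSketch {
    // Collects links in first-seen order, ignoring later duplicates.
    static Set<String> collect(List<String> linksInDocumentOrder) {
        // Assumption: LinkedHashSet is used only to mimic the documented ordering of urls.
        Set<String> urls = new LinkedHashSet<>();
        for (String link : linksInDocumentOrder) {
            urls.add(link); // add() is a no-op if the link is already present
        }
        return urls;
    }

    public static void main(String[] args) {
        Set<String> urls = collect(Arrays.asList(
            "http://example.com/a",
            "http://example.com/b",
            "http://example.com/a"));
        System.out.println(urls); // prints [http://example.com/a, http://example.com/b]
    }
}
```

The duplicate of the first link is dropped, and the remaining entries keep the order in which they were first met.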
Field Summary
Modifier and Type | Field | Description
java.util.Set<java.lang.String> | urls | Deprecated. The URLs resulting from the parsing process.
Fields inherited from interface it.unimi.dsi.parser.callback.Callback: EMPTY_CALLBACK_ARRAY
Constructor Summary
Constructor | Description
LinkExtractor() | Deprecated.
Method Summary
Modifier and Type | Method | Description
java.lang.String | base() | Deprecated. Returns the URL specified by the BASE element.
void | configure(BulletParser parser) | Deprecated. Configure the parser to parse elements and certain attributes.
java.lang.String | metaLocation() | Deprecated. Returns the URL specified by META HTTP-EQUIV elements of location type.
java.lang.String | metaRefresh() | Deprecated. Returns the URL specified by META HTTP-EQUIV elements of refresh type.
void | startDocument() | Deprecated. Receive notification of the beginning of the document.
boolean | startElement(Element element, java.util.Map<Attribute,MutableString> attrMap) | Deprecated. Receive notification of the start of an element.
Methods inherited from class it.unimi.dsi.parser.callback.DefaultCallback: cdata, characters, endDocument, endElement, getInstance
Method Detail
configure
public void configure(BulletParser parser)
Deprecated.
Configure the parser to parse elements and certain attributes. The required attributes are SRC, HREF, HTTP-EQUIV, and CONTENT.
Specified by: configure in interface Callback
Overrides: configure in class DefaultCallback
startDocument
public void startDocument()
Deprecated.
Description copied from interface: Callback
Receive notification of the beginning of the document. The callback must use this method to reset its internal state so that it can be reused. It must be safe to invoke this method several times.
Specified by: startDocument in interface Callback
Overrides: startDocument in class DefaultCallback
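The reset contract described for startDocument can be sketched as follows. This is only an illustration under stated assumptions: ResettableCollectorSketch and its foundLink helper are hypothetical, and the class deliberately does not implement the real Callback interface so that the example stays self-contained.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a callback; it does not implement the real Callback interface.
class ResettableCollectorSketch {
    // Per-document state, analogous in spirit to the urls field.
    final Set<String> urls = new LinkedHashSet<>();

    // Resets all per-document state. Clearing an already empty set is harmless,
    // so invoking this method several times in a row is safe.
    void startDocument() {
        urls.clear();
    }

    // Hypothetical helper simulating a link found while parsing.
    void foundLink(String url) {
        urls.add(url);
    }

    public static void main(String[] args) {
        ResettableCollectorSketch c = new ResettableCollectorSketch();
        c.startDocument();
        c.foundLink("http://example.com/first");

        c.startDocument(); // second document: links from the first one are discarded
        c.startDocument(); // a repeated call leaves the state unchanged
        c.foundLink("http://example.com/second");

        System.out.println(c.urls); // prints [http://example.com/second]
    }
}
```

Because all state is cleared at the start of each document, one instance can be reused across many parsing runs without results leaking from one document into the next.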
startElement
public boolean startElement(Element element, java.util.Map<Attribute,MutableString> attrMap)
Deprecated.
Description copied from interface: Callback
Receive notification of the start of an element. For simple elements, this is the only notification that the callback will ever receive.
Specified by: startElement in interface Callback
Overrides: startElement in class DefaultCallback
Parameters:
element - the element whose opening tag was found.
attrMap - a map from Attributes to MutableStrings.
Returns: true to keep the parser parsing, false to stop it.
metaLocation
public java.lang.String metaLocation()
Deprecated.
Returns the URL specified by META HTTP-EQUIV elements of location type. More precisely, this method returns a non-null result iff there is at least one META HTTP-EQUIV element specifying a location URL (if there is more than one, we keep the first one).
Returns: the first URL specified by a META HTTP-EQUIV element of location type, or null.
base
public java.lang.String base()
Deprecated.
Returns the URL specified by the BASE element. More precisely, this method returns a non-null result iff there is at least one BASE element specifying a derelativisation URL (if there is more than one, we keep the first one).
Returns: the first URL specified by a BASE element, or null.
metaRefresh
public java.lang.String metaRefresh()
Deprecated.
Returns the URL specified by META HTTP-EQUIV elements of refresh type. More precisely, this method returns a non-null result iff there is at least one META HTTP-EQUIV element specifying a refresh URL (if there is more than one, we keep the first one).
Returns: the first URL specified by a META HTTP-EQUIV element of refresh type, or null.
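The "keep the first one, otherwise null" rule shared by metaLocation, base, and metaRefresh can be sketched as below. This is not the library's code: FirstValueSketch and its offer method are hypothetical names used only to illustrate the documented behaviour.

```java
// Hypothetical illustration of the "first one wins" rule; not the library's implementation.
class FirstValueSketch {
    // Remains null until a first URL is recorded.
    private String metaRefresh;

    // Records a candidate URL only if none has been recorded yet.
    void offer(String url) {
        if (metaRefresh == null && url != null) {
            metaRefresh = url;
        }
    }

    // Returns the first recorded URL, or null if there was none.
    String metaRefresh() {
        return metaRefresh;
    }

    public static void main(String[] args) {
        FirstValueSketch s = new FirstValueSketch();
        System.out.println(s.metaRefresh()); // prints null

        s.offer("http://example.com/one");
        s.offer("http://example.com/two"); // ignored: a URL was already recorded
        System.out.println(s.metaRefresh()); // prints http://example.com/one
    }
}
```

Callers should therefore check the result for null before using it, since a document without the corresponding element yields no URL.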