org.das2.util.filesystem.HtmlUtil
HTML utilities, such as getting a directory listing, where a "file" is a link
below the directory we are listing, and read a URL into a String.
HtmlUtil( )
checkRedirect
checkRedirect( java.net.URLConnection urlConnection ) → URLConnection
check for 301, 302 or 303 redirects, and return a new connection in this case.
This should be called immediately before the urlConnection.connect call,
as this must connect to get the response code.
Parameters
urlConnection - if an HttpURLConnection, check for 301, 302, or 303; any other connection is returned unchanged.
Returns:
a connection, typically the same one as passed in.
See Also:
HttpUtil#checkRedirect(java.net.URLConnection)
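A minimal sketch of the redirect check described above, using only the JDK. It is not the library's implementation: the names RedirectSketch, isRedirect, and followOnce are hypothetical, and returning the original connection when no Location header is present is an assumption.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLConnection;

final class RedirectSketch {

    /** True for the three status codes named in the description above. */
    static boolean isRedirect(int code) {
        return code == HttpURLConnection.HTTP_MOVED_PERM   // 301
            || code == HttpURLConnection.HTTP_MOVED_TEMP   // 302
            || code == HttpURLConnection.HTTP_SEE_OTHER;   // 303
    }

    /**
     * Returns a connection to the redirect target, or the connection
     * passed in when no redirect applies. getResponseCode() connects
     * implicitly, which is why this runs before any explicit connect().
     */
    static URLConnection followOnce(URLConnection conn) throws IOException {
        if (!(conn instanceof HttpURLConnection)) {
            return conn; // e.g. file: or ftp: connections pass through
        }
        HttpURLConnection http = (HttpURLConnection) conn;
        http.setInstanceFollowRedirects(false); // inspect the code ourselves
        if (!isRedirect(http.getResponseCode())) {
            return conn;
        }
        String location = http.getHeaderField("Location");
        if (location == null) {
            return conn; // assumption: nothing to follow without a Location header
        }
        try {
            // Resolve a possibly relative Location against the original URL.
            URL target = conn.getURL().toURI().resolve(location).toURL();
            http.disconnect();
            return target.openConnection();
        } catch (URISyntaxException | IllegalArgumentException e) {
            throw new IOException("bad redirect target: " + location, e);
        }
    }
}
```

The sketch follows a single redirect; a chain of redirects would need a loop with a hop limit.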
consumeStream
consumeStream( java.io.InputStream err ) → void
well-behaved clients consume both the error stream and the normal response stream coming from websites.
This reads everything off of the stream and closes it.
http://docs.oracle.com/javase/1.5.0/docs/guide/net/http-keepalive.html suggests that you "do not abandon connection"
Parameters
err - the input stream
Returns:
void (returns nothing)
See Also:
HttpUtil#consumeStream(java.io.InputStream)
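A minimal sketch of draining and closing a stream as described above, using only the JDK. The names DrainSketch and drain are hypothetical, and returning a byte count and accepting null are assumptions made for illustration (the documented method returns nothing).

```java
import java.io.IOException;
import java.io.InputStream;

final class DrainSketch {

    /**
     * Reads the stream to end-of-file, discards the bytes, and closes it.
     * Returns the number of bytes discarded.
     */
    static long drain(InputStream in) throws IOException {
        if (in == null) {
            // HttpURLConnection.getErrorStream() returns null when there is no error body
            return 0;
        }
        long total = 0;
        byte[] buf = new byte[4096];
        try (InputStream s = in) { // try-with-resources closes the stream
            int n;
            while ((n = s.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }
}
```

A typical call would be DrainSketch.drain(http.getErrorStream()) after a failed request, so that the underlying socket can be reused for keep-alive.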
getDirectoryListing
getDirectoryListing( java.net.URL url, java.io.InputStream urlStream ) → URL[]
Get the listing of the web directory, returning links that are "under" the given URL.
Note this does not handle the case where we must first log into
a website to get online, as is often the case on a hotel network.
This was refactored to support caching of listings by simply writing the content to disk.
Parameters
url - the address.
urlStream - stream containing the URL content, which must be UTF-8 (or US-ASCII)
Returns:
array of the URLs referred to in the page.
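A minimal sketch of what "under" the given URL could mean, using only the JDK. It is one possible reading, not the library's rule: the names ChildFilterSketch and childrenOf are hypothetical, and skipping links that carry a query string (such as the sort-order links some servers add to listings) is an assumption.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class ChildFilterSketch {

    /** Keeps only the links that extend the directory URL. */
    static List<URL> childrenOf(URL directory, List<URL> links) {
        String base = directory.toExternalForm();
        if (!base.endsWith("/")) {
            base = base + "/";
        }
        List<URL> result = new ArrayList<>();
        for (URL link : links) {
            if (link.getQuery() != null) {
                continue; // assumption: links like ?C=N;O=D are not files
            }
            // Compare as strings; URL.equals may resolve host names.
            String s = link.toExternalForm();
            if (s.startsWith(base) && s.length() > base.length()) {
                result.add(link);
            }
        }
        return result;
    }
}
```

Links to the parent directory, to the directory itself, and to other hosts are dropped because they do not start with the directory prefix.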
getDirectoryListing( java.net.URL url, java.io.InputStream urlStream, boolean childCheck ) → URL[]
getDirectoryListing( java.net.URL url ) → URL[]
getInputStream
getInputStream( java.net.URL url ) → InputStream
get the inputStream, following redirects if a 301 or 302 is encountered.
The scientist may be prompted for a password, but only if "user@" is
in the URL.
Note this does not explicitly close the connections
to the server, and Java may not know to release the resources.
TODO: fix this by wrapping the input stream and closing the connection
when the stream is closed. This was done in Autoplot's DataSetURI.downloadResourceAsTempFile.
Parameters
url - a URL
Returns:
input stream
See Also:
org.autoplot.datasource.DataSetURI#downloadResourceAsTempFile
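A minimal sketch of the fix proposed in the TODO above: wrap the input stream so that closing it also releases the connection. It uses only the JDK and is not the Autoplot code referred to; the class name ReleasingInputStream and the Runnable callback are hypothetical choices made for illustration.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReleasingInputStream extends FilterInputStream {

    private final Runnable onClose;
    private boolean closed = false;

    /**
     * @param in the stream to wrap
     * @param onClose run once when the stream is closed,
     *        e.g. httpConnection::disconnect
     */
    ReleasingInputStream(InputStream in, Runnable onClose) {
        super(in);
        this.onClose = onClose;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // a second close must not release twice
        }
        closed = true;
        try {
            super.close();
        } finally {
            onClose.run(); // release even if closing the stream failed
        }
    }
}
```

With an HttpURLConnection named http, the wrapper could be built as new ReleasingInputStream(http.getInputStream(), http::disconnect), so callers only need to close the stream they were given.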
getLinks
getLinks( java.net.URL url, String content ) → List
return the links found in the content, using url as the context.
Parameters
url - null or the url for the context.
content - the html content.
Returns:
a list of URLs.
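A minimal sketch of extracting links from HTML content and resolving them against a context URL, using only the JDK. It is not the library's parser: the names LinkSketch and links are hypothetical, a regular expression stands in for real HTML parsing, and skipping malformed links and unresolvable relative links is an assumption.

```java
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkSketch {

    // Matches href="..." or href='...'; a simplification, not a full HTML parser.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    /** Returns the links in content, resolved against context when it is not null. */
    static List<URL> links(URL context, String content) {
        List<URL> result = new ArrayList<>();
        Matcher m = HREF.matcher(content);
        while (m.find()) {
            try {
                URI link = URI.create(m.group(1));
                if (context != null) {
                    link = context.toURI().resolve(link);
                }
                if (link.isAbsolute()) {
                    result.add(link.toURL());
                }
                // assumption: a relative link with no context is skipped
            } catch (Exception e) {
                // assumption: malformed or unsupported links are skipped, not reported
            }
        }
        return result;
    }
}
```

With the context http://example.com/data/, a link written as href="a.dat" resolves to http://example.com/data/a.dat; with a null context only links that are already absolute are returned.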
getMetadata
getMetadata( java.net.URL url, java.util.Map props ) → Map
return the metadata about a URL. This will support http, https,
and ftp, and will check for redirects. This will
allow caching of head requests.
Parameters
url - ftp, https, or http URL
props - a java.util.Map
Returns:
the metadata
See Also:
HttpUtil#getMetadata(java.net.URL, java.util.Map)
isDirectory
isDirectory( java.net.URL url ) → boolean
Parameters
url - a URL
Returns:
boolean
readToString
readToString( java.net.URL url ) → String
read the contents of the URL into a string, assuming UTF-8 encoding.
Parameters
url - a URL
Returns:
a String
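A minimal sketch of reading a URL into a String as UTF-8, using only the JDK. It is not the library's implementation; the names ReadSketch and readUtf8 are hypothetical, and it opens the URL directly, without the redirect or password handling described for getInputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class ReadSketch {

    /** Reads everything the URL provides and decodes it as UTF-8. */
    static String readUtf8(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            // Decode once at the end so multi-byte characters that
            // straddle a buffer boundary are handled correctly.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

The whole content is held in memory, so this suits small resources such as directory listings rather than large data files.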