Class for containing the elemental parts of a URI, and utility routines for working with URIs. We need a working definition of well-formed and colloquial URIs. This routine knows nothing about the data source that will interpret the URI, so this needs to be established.

= well-formed URIs =
: ? :[ ?] :
* they are valid URIs: they contain no spaces, etc.

== params ==
ampersand-delimited (&) list of name=value pairs, or just value. For example: vap+cdaweb:ds=ac_k0_epm&H_lo&timerange=2010-01

= colloquial URIs =
* these are Strings that can be converted into URIs.
* spaces in file names are converted into %20.
* spaces in parameter lists are converted into pluses.
* pluses in parameter lists are converted into %2B.
* note that if there are pluses but the URI is valid, then pluses may be left alone.
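The colloquial-to-well-formed rules can be illustrated with a small sketch. The class and method names here (ColloquialUriSketch, makeWellFormed) are hypothetical and not part of this class, and the input is assumed to be already divided into a file part and a params part. Treating a params string with no spaces as "already valid" is also an assumption made to illustrate the last rule.

```java
class ColloquialUriSketch {

    /**
     * Hypothetical helper: apply the colloquial-URI rules to a file part and
     * a params part, either of which may be null.
     */
    static String makeWellFormed(String file, String params) {
        StringBuilder sb = new StringBuilder();
        if (file != null) {
            // spaces in file names are converted into %20
            sb.append(file.replace(" ", "%20"));
        }
        if (params != null) {
            String p = params;
            // Assumption: params containing a space are colloquial. Encode the
            // existing pluses first, so that the pluses introduced for spaces
            // are not encoded a second time.
            if (p.contains(" ")) {
                p = p.replace("+", "%2B").replace(" ", "+");
            }
            if (file != null) {
                sb.append('?');
            }
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints file:///tmp/my%20data.dat?title=a+%2B+b
        System.out.println(makeWellFormed("file:///tmp/my data.dat", "title=a + b"));
        // no spaces, so the plus is left alone: prints ds=a+b
        System.out.println(makeWellFormed(null, "ds=a+b"));
    }
}
```

The order of the two replacements matters: encoding pluses after converting spaces would turn every former space into %2B.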
time range subset.
subset of rank 2 data. For example, columns of excel workbook or ascii table. rank2=[3,5] or rank2=Bx-Bz
used for the number of records to read.
first positional parameter, typically interpreted the same as PARAM_ID
typically the dataset id.
some data sources support periodic checks to see if their data have been updated, such as AggregatingDataSource and AbstractDataSources (most of those based on files).
scheme for Autoplot, if provided. e.g. vap+cdf.
scheme for resource, e.g. "file" or "https"
the complete, modified surl, e.g. file:///home/jbf/mydata.qds. This is the resource name, and doesn't contain the vapScheme.
the resource that is handled by the DataSource. This may be null if surl doesn't form a valid uri.
the resource uri up to the authority, e.g. http://autoplot.org
the resource uri including the path part.
contains the resource string up to the query part.
the file/resource extension, like ".cdf" or ".dat".
contains the parameters part, an ampersand-delimited set of parameters. For example, column=field2&rank2.
additional processes to be applied to the URI. For example, slice0(0) means slice the dataset at this point.
position of the caret after modifications to the surl are made. This is with respect to surl, the URI for the datasource, without the "vap" scheme.
position of the caret after modifications to the surl are made. This is with respect to the formatted URI, which probably includes the explicit "vap:" scheme.
format the URI using vapScheme, file and params.
* If file is missing but params is present, then return params, e.g. vap+cdaweb:ds=myds
* If file is present, then format with file and params, e.g. vap+cdf:file://tmp/my.cdf?myVar
* Else, just use the surl that is in there already.
Note if split.params is non-null, it will be appended with a question mark, even if empty.
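The three formatting cases can be sketched as follows. This is a minimal illustration, not the actual implementation: the class name UriFormatSketch and the flat argument list are hypothetical stand-ins for the split fields, and omitting the prefix when vapScheme is null is an assumption.

```java
class UriFormatSketch {

    /**
     * Hypothetical formatter taking the pieces as plain strings; any of them
     * may be null.
     */
    static String format(String vapScheme, String file, String params, String surl) {
        // Assumption: when no vap scheme is given, no prefix is written.
        String prefix = (vapScheme == null) ? "" : vapScheme + ":";
        if (file == null && params != null) {
            // file is missing but params is present
            return prefix + params;
        } else if (file != null) {
            String result = prefix + file;
            if (params != null) {
                // non-null params always gets the question mark, even if empty
                result += "?" + params;
            }
            return result;
        } else {
            // neither file nor params: use the surl as it is
            return surl;
        }
    }

    public static void main(String[] args) {
        // prints vap+cdaweb:ds=myds
        System.out.println(format("vap+cdaweb", null, "ds=myds", null));
        // prints vap+cdf:file://tmp/my.cdf?myVar
        System.out.println(format("vap+cdf", "file://tmp/my.cdf", "myVar", null));
        // empty params still yields the trailing question mark
        System.out.println(format("vap+cdf", "file://tmp/my.cdf", "", null));
    }
}
```

The last call shows why callers may want to set params to null rather than "" before formatting.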
spaces and other URI syntax elements are URL-encoded. Note some calls of this routine should check for an empty string result and then set split.params=null instead of "", to avoid the extraneous question mark.
convenient method for getting a parameter in the URI.
only split on the delimiter when we are not within the exclude delimiters. For example,
x=getDataSet("http://autoplot.org/data/autoplot.cdf?Magnitude&noDep=T")&y=getDataSet('http://autoplot.org/data/autoplot.cdf?BGSEc&slice1=2')&sqrt(x)
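A guarded split of this kind can be sketched with a single pass that tracks whether a quote is currently open. The class and method names (GuardedSplitSketch, guardedSplit) and the choice of two exclude characters are assumptions for illustration. Escaped quotes inside a quoted region are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedSplitSketch {

    /**
     * Split s on delim, except where delim appears between a pair of
     * exclude1 or exclude2 characters (for example double or single quotes).
     */
    static List<String> guardedSplit(String s, char delim, char exclude1, char exclude2) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char open = 0; // the exclude character that is currently open, or 0 for none
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (open != 0) {
                // inside an excluded region: copy everything, watch for the close
                if (c == open) {
                    open = 0;
                }
                current.append(c);
            } else if (c == exclude1 || c == exclude2) {
                open = c;
                current.append(c);
            } else if (c == delim) {
                result.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        result.add(current.toString());
        return result;
    }

    public static void main(String[] args) {
        String s = "x=getDataSet(\"http://autoplot.org/data/autoplot.cdf?Magnitude&noDep=T\")"
                + "&y=getDataSet('http://autoplot.org/data/autoplot.cdf?BGSEc&slice1=2')"
                + "&sqrt(x)";
        // prints three parts; the ampersands inside the quoted URIs are kept
        for (String part : guardedSplit(s, '&', '"', '\'')) {
            System.out.println(part);
        }
    }
}
```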
return the vap scheme in split.vapScheme or the one inferred by the extension. Returns an empty string (not "vap") if one cannot be inferred. For example:
/home/jbf/myfile.jyds --> vap+jyds
vap+txt:/home/jbf/myfile.csv --> vap+txt
This was introduced as part of the effort to get rid of extraneous "vap:"s that would be added to URIs.
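The explicit-or-inferred rule can be sketched as below. The names VapSchemeSketch and explicitOrInferred are hypothetical, and the sketch assumes that any extension ".xyz" maps to "vap+xyz". The real routine may apply a different mapping.

```java
class VapSchemeSketch {

    /**
     * Return the explicit vap scheme if there is one, otherwise infer it from
     * the extension (which includes the leading dot, like ".cdf").
     * Returns an empty string, not "vap", when nothing can be inferred.
     */
    static String explicitOrInferred(String vapScheme, String ext) {
        if (vapScheme != null) {
            return vapScheme;
        }
        if (ext == null || ext.length() < 2 || ext.charAt(0) != '.') {
            return "";
        }
        // Assumption: the scheme is "vap+" followed by the extension without the dot.
        return "vap+" + ext.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(explicitOrInferred(null, ".jyds"));     // prints vap+jyds
        System.out.println(explicitOrInferred("vap+txt", ".csv")); // prints vap+txt
    }
}
```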
We need a standard way to detect if a string has already been URL encoded. The problem is we want valid URIs that are also readable, so just using simple encode/decode logic is not practical. This means:
ensure that the reference, which may be relative, is absolute. NOTE this is only implemented for unix filenames. TODO: Windows. For example:
make the URI canonical, with the vap+<ext>: prefix. This will also now sort the parameters, when this can be done.
make the URI colloquial, e.g. removing "vap+cdf:" from "vap+cdf:file:///tmp/x.cdf". URIs that do not have a resource URI are left alone.
add "file:/" to a resource string that appears to reference the local filesystem. Return the parsed string, or null if the string doesn't appear to be from a file.
added to avoid widespread use of parse(uri.toString). This way it's all done with the same code, and the URI abstraction is kept.
Split the parameters (if any) into name,value pairs. URLEncoded parameters are decoded, but the string may be decoded already. Items without equals (=) are inserted as "arg_N"=name.
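The splitting into name,value pairs can be sketched like this. The class and method names are hypothetical, URL decoding is left out to keep the example short, and numbering the positional items from arg_0 is an assumption. The text only says "arg_N".

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ParseParamsSketch {

    /**
     * Split an ampersand-delimited parameter string into name,value pairs,
     * preserving the order in which they appear. Items without an equals
     * sign are stored under "arg_N", where N counts the positional items.
     */
    static Map<String, String> parseParams(String params) {
        Map<String, String> result = new LinkedHashMap<>();
        if (params == null || params.isEmpty()) {
            return result;
        }
        int argc = 0; // Assumption: positional items are numbered from zero.
        for (String item : params.split("&")) {
            if (item.isEmpty()) {
                continue;
            }
            int eq = item.indexOf('=');
            if (eq == -1) {
                result.put("arg_" + argc, item);
                argc++;
            } else {
                result.put(item.substring(0, eq), item.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {ds=ac_k0_epm, arg_0=H_lo, timerange=2010-01}
        System.out.println(parseParams("ds=ac_k0_epm&H_lo&timerange=2010-01"));
    }
}
```

Only the first equals sign in an item separates the name from the value, so values containing "=" are kept whole.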
Helper method to get the timerange from the URI
convenient method for adding a parameter to the URI, or replacing one already there.
convenient method to remove a parameter (or parameters) from the list of parameters
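Adding, replacing and removing parameters can be sketched directly on the ampersand-delimited params string. The names EditParamsSketch, putParam and removeParam are hypothetical, and working on the params string rather than the whole URI is a simplification made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class EditParamsSketch {

    /** Replace the value of name if it is present, otherwise append name=value. */
    static String putParam(String params, String name, String value) {
        List<String> out = new ArrayList<>();
        boolean replaced = false;
        if (params != null && !params.isEmpty()) {
            for (String item : params.split("&")) {
                if (item.startsWith(name + "=")) {
                    // keep the position of the first occurrence, drop any repeats
                    if (!replaced) {
                        out.add(name + "=" + value);
                        replaced = true;
                    }
                } else {
                    out.add(item);
                }
            }
        }
        if (!replaced) {
            out.add(name + "=" + value);
        }
        return String.join("&", out);
    }

    /** Remove each of the named parameters, whether or not they carry a value. */
    static String removeParam(String params, String... names) {
        List<String> out = new ArrayList<>();
        if (params == null || params.isEmpty()) {
            return "";
        }
        for (String item : params.split("&")) {
            boolean drop = false;
            for (String n : names) {
                if (item.equals(n) || item.startsWith(n + "=")) {
                    drop = true;
                }
            }
            if (!drop) {
                out.add(item);
            }
        }
        return String.join("&", out);
    }

    public static void main(String[] args) {
        // prints ds=ac_k0_epm&timerange=2010-02
        System.out.println(putParam("ds=ac_k0_epm&timerange=2010-01", "timerange", "2010-02"));
        // prints ds=ac_k0_epm
        System.out.println(removeParam("ds=ac_k0_epm&H_lo&timerange=2010-01", "timerange", "H_lo"));
    }
}
```

Note that removeParam returns an empty string when nothing is left. A caller formatting the result back into a URI may prefer null, to avoid the extraneous question mark.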
allow parsing of script:, bookmarks:, pngwalk:, etc
convert "+" to " ", etc, by using URLDecoder and catching the UnsupportedEncodingException that will never occur. We have to be careful for elements like %Y that are not to be decoded. TODO: we need to use standard escape/unescape code, possibly changing %Y to $Y beforehand.
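One way to keep elements like %Y intact is to protect every percent sign that does not begin a %XX hex escape before handing the string to URLDecoder. This is a sketch of that idea only, not the approach the routine necessarily takes, and the class and method names are hypothetical. It has a known limitation: a template field such as %d followed directly by a hex digit (for example "%da") is indistinguishable from a real escape and would still be decoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class UriDecodeSketch {

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    /**
     * Decode "+" to " " and %XX escapes, leaving percent signs that are not
     * followed by two hex digits (like %Y) as they are.
     */
    static String uriDecode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean escape = c == '%'
                    && i + 2 < s.length()
                    && isHex(s.charAt(i + 1))
                    && isHex(s.charAt(i + 2));
            if (c == '%' && !escape) {
                // protect the bare percent so URLDecoder restores it unchanged
                sb.append("%25");
            } else {
                sb.append(c);
            }
        }
        try {
            return URLDecoder.decode(sb.toString(), "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available, so this will never occur
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // prints data_%Y%m%d.dat?x=a b
        System.out.println(uriDecode("data_%Y%m%d.dat?x=a+b"));
    }
}
```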
convert " " to "%20", etc, by looking for and encoding illegal characters. We can't just aggressively convert...