org.das2.qds.DataSetOps

Useful operations for QDataSets, such as slice2, leafTrim. TODO: identify which functions appear here instead of Ops.java.

DataSetOps( )

DS_LENGTH_LIMIT

absolute length limit for plots. This is used to limit the elements used in autoranging, etc.

addElement

addElement( int[] array, int value ) → int[]

adds an element to the array

Parameters

array - length N array
value - the value to append

Returns:

array with the element, length N+1.
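
The append described above can be sketched in plain Java as follows. This is a standalone illustration, not the library source; the class name AppendElementSketch is hypothetical.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; not the DataSetOps implementation.
class AppendElementSketch {
    /** Returns a new array of length N+1 with value appended; the input is left unchanged. */
    static int[] addElement(int[] array, int value) {
        int[] result = Arrays.copyOf(array, array.length + 1); // extra slot is zero-filled
        result[array.length] = value;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(addElement(new int[] {10, 20}, 30))); // prints [10, 20, 30]
    }
}
```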

addElement( int value, int[] array ) → int[]

applyIndex

applyIndex( QDataSet ds, QDataSet indices ) → QDataSet

return the dataset with records rearranged according to indices.

Parameters

ds - rank N dataset, where N>0
indices - rank 1 dataset, length m.

Returns:

length m rank N dataset.
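
A minimal sketch of the rearrangement, using a double[][] in place of a rank 2 QDataSet (an assumption made so the example stands alone). Each output record i is input record indices[i], so the result has m records.

```java
// Hypothetical standalone sketch; a double[][] stands in for a rank 2 QDataSet.
class ApplyIndexSketch {
    /** Rearranges records so that result[i] = ds[indices[i]]. */
    static double[][] applyIndex(double[][] ds, int[] indices) {
        double[][] result = new double[indices.length][];
        for (int i = 0; i < indices.length; i++) {
            result[i] = ds[indices[i]].clone(); // copy the record rather than share it
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] ds = {{1, 2}, {3, 4}, {5, 6}};
        double[][] r = applyIndex(ds, new int[] {2, 0});
        System.out.println(java.util.Arrays.deepToString(r));
    }
}
```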

applyIndexAllLists

applyIndexAllLists( QDataSet rods, QDataSet[] lists ) → org.das2.qds.ArrayDataSet

handle special case where rank 1 datasets are used to index a rank N array. Supports negative indices. This was extracted from PyQDataSet because it should be useful in Java codes as well.

Parameters

rods - the dataset
lists - datasets of rank 0 or rank 1

Returns:

the array extracted.

applyIndexInSitu

applyIndexInSitu( org.das2.qds.WritableDataSet ds, QDataSet sort ) → void

apply the sort to the data on the zeroth dimension. The dataset must be mutable, and the dataset itself is modified. This was introduced to support AggregatingDataSource but should be generally useful.

Parameters

ds - a writable dataset that is still mutable.
sort - the new sort indices.

Returns:

void (returns nothing)

boundsContains

boundsContains( QDataSet bounds, Datum xValue, Datum yValue ) → boolean

return true if the bounds overlap with the x and y values.

Parameters

bounds - bounding box
xValue - the x range
yValue - the y range

Returns:

true if the bounds overlap

bundleNames

bundleNames( QDataSet bundleDs ) → String[]

return the names of the dataset that can be unbundled.

Parameters

bundleDs - a QDataSet

Returns:

an array of the bundle names.

changesDimensions

changesDimensions( String p ) → boolean

indicate if this one operator changes the dimensions. For example, |smooth doesn't change the dimensions, but fftPower and slice do.

Parameters

p - the filter, e.g. "|smooth"

Returns:

true if the dimensions change.

changesDimensions( String c0, String c1 ) → boolean

changesIndependentDimensions

changesIndependentDimensions( String p ) → boolean

indicate if this one operator changes the independent dimensions. For example, |smooth doesn't change the dimensions, and |multiply also doesn't change the independent dimensions.

Parameters

p - the filter, e.g. "|smooth"

Returns:

true if the dimensions change.

changesIndependentDimensions( String c0, String c1 ) → boolean

dbAboveBackgroundDim0

dbAboveBackgroundDim0( QDataSet ds, double level ) → QDataSet

normalize the level-th percentile from:

rank 1: each element (same as removeBackground1)
rank 2: each column of the dataset
rank 3: each column of each rank 2 dataset slice.

There must be at least 10 elements. If the data is already in dB, then the result is a difference. This assumes the units are similar to voltage, not power (we think), since the code contains expressions like 20 * Math.log10( ds / background ).

Parameters

ds - a QDataSet
level - the percentile level, e.g. 10= 10%

Returns:

the result dataset, in dB above background.

dbAboveBackgroundDim1

dbAboveBackgroundDim1( QDataSet ds, double level ) → QDataSet

normalize the nth-level percentile from:

rank 1: each element
rank 2: each row of the dataset
rank 3: each row of each rank 2 dataset slice.

If the data is already in dB, then the result is a difference. This assumes the units are similar to voltage, not power (we think), since the code contains expressions like 20 * Math.log10( ds / background ).

Parameters

ds - a QDataSet
level - the percentile level, e.g. 10= 10%

Returns:

the result dataset, in dB above background.
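
The 20 * Math.log10( ds / background ) expression quoted above can be sketched as follows, assuming the background level has already been found. The class and method names are hypothetical, and the percentile step and fill handling are left out.

```java
// Hypothetical standalone sketch of the dB conversion only; the percentile
// search for the background level and fill handling are not shown.
class DbAboveBackgroundSketch {
    /** Converts voltage-like values to dB above the given background level. */
    static double[] dbAbove(double[] values, double background) {
        double[] result = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = 20 * Math.log10(values[i] / background);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(dbAbove(new double[] {1, 10, 100}, 1)));
    }
}
```

A value equal to the background maps to 0 dB, and each factor of ten above it adds 20 dB.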

dbAboveBackgroundDim1( QDataSet ds, double level, boolean power ) → QDataSet

dependBounds

dependBounds( QDataSet ds ) → QDataSet

return a bounding qube of the independent dimensions containing the dataset. If r is the result of the function, then for

rank 1: r.slice(0) x bounds, r.slice(1) y bounds
rank 2 waveform: r.slice(0) x bounds, r.slice(1) y bounds
rank 2 table: r.slice(0) x bounds, r.slice(1) DEPEND_0 bounds.
rank 3 table: r.slice(0) x bounds, r.slice(1) DEPEND_0 bounds.

Parameters

ds - rank 1,2, or 3 dataset.

Returns:

a bounding qube of the independent dimensions

dependBoundsSimple

dependBoundsSimple( QDataSet ds ) → QDataSet

return a bounding qube of the independent dimensions containing the dataset. If r is the result of the function, then for

rank 1: r.slice(0) x bounds, r.slice(1) y bounds
rank 2 waveform: r.slice(0) x bounds, r.slice(1) y bounds
rank 2 table: r.slice(0) x bounds, r.slice(1) DEPEND_0 bounds.
rank 3 table: r.slice(0) x bounds, r.slice(1) DEPEND_0 bounds.

This does not take DELTA_PLUS and DELTA_MINUS into account. When all the data is fill, ds[0,0] will be positive infinity.

Parameters

ds - a rank 1,2, or 3 dataset.

Returns:

a bounding qube of the independent dimensions

flattenBundleDescriptor

flattenBundleDescriptor( QDataSet bundle1 ) → QDataSet

returns a bundle descriptor roughly equivalent to the BundleDescriptor passed in, but will describe each dataset as if it were rank 1. This is useful for when the client can't work with mixed rank bundles anyway (like display data).

Parameters

bundle1 - a QDataSet

Returns:

a QDataSet

flattenRank2

flattenRank2( QDataSet ds ) → QDataSet

flatten a rank 2 dataset. The result is an n,3 dataset of [x,y,f]. History:

modified for use in PW group.
missing DEPEND_1 resulted in NullPointerException, so just use 0,1,2,..,n instead and always have rank 2 result.

Parameters

ds - rank 2 table dataset

Returns:

rank 2 dataset that is an array of (x,y,f).
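
A minimal sketch of the flattening, assuming plain arrays for the x tags, y tags, and table values. The names are hypothetical, and the row-major output order is an assumption rather than something this page states.

```java
// Hypothetical standalone sketch; arrays stand in for DEPEND_0, DEPEND_1 and the table.
class FlattenRank2Sketch {
    /** Flattens f[i][j] with tags x[i], y[j] into rows of (x, y, f). Row-major order is assumed. */
    static double[][] flatten(double[] x, double[] y, double[][] f) {
        double[][] result = new double[x.length * y.length][3];
        int k = 0;
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < y.length; j++) {
                result[k][0] = x[i];
                result[k][1] = y[j];
                result[k][2] = f[i][j];
                k++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] r = flatten(new double[] {0, 1}, new double[] {10, 20, 30},
                new double[][] {{1, 2, 3}, {4, 5, 6}});
        System.out.println(java.util.Arrays.deepToString(r));
    }
}
```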

flattenRank3

flattenRank3( QDataSet ds ) → QDataSet

flatten a rank 3 dataset. The result is an n,4 dataset of [x,y,z,f], or if there are no tags just rank 1 f. For a rank 3 join (array of tables), the result will be ds[n,3].

Parameters

ds - rank 3 table dataset

Returns:

rank 2 dataset that is array of (x,y,z,f) or rank 1 f.

flattenWaveform

flattenWaveform( QDataSet ds ) → QDataSet

flatten a rank 2 dataset where the y depend variable is just an offset from the xtag. This is a nice example of the advantage of using a class to represent the data: this requires no additional storage to handle the huge waveform. Note the new DEPEND_0 may have different units from ds.property(DEPEND_0).

Parameters

ds - rank 2 waveform with tags for DEPEND_0 and offsets for DEPEND_1

Returns:

rank 1 waveform

getBackgroundLevel

getBackgroundLevel( QDataSet ds, double level ) → QDataSet

Get the background level by sorting the data. The result is rank one less than the input rank.

Parameters

ds - rank 1, 2, or rank 3 join.
level - the level between 0 and 100.

Returns:

a QDataSet

getComponentType

getComponentType( QDataSet ds ) → java.lang.Class

return the class type that can accurately store data in this dataset. This was motivated by DDataSets and FDataSets, but also IndexGenDataSets.

Parameters

ds - the dataset.

Returns:

the class that can store this type. double.class is returned when the class cannot be identified.

getNthPercentileSort

getNthPercentileSort( QDataSet ds, double n ) → QDataSet

returns the value from within a distribution that is the nth percentile division. This returns a fill dataset (Units.dimensionless.getFillDouble()) when the data is all fill.

Parameters

ds - the dataset
n - percent between 0 and 100.

Returns:

a QDataSet
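
One way to find a percentile by sorting is sketched below. The assumptions are labeled: NaN stands in for fill, the index formula (n/100 times the valid count, clamped to the last element) is a guess at a reasonable convention, and the library may interpolate or round differently.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; NaN stands in for fill and the index
// formula is an assumption, not taken from the library.
class PercentileSortSketch {
    /** Returns the value at the nth percentile of the valid data, or NaN if all data is fill. */
    static double nthPercentile(double[] data, double n) {
        double[] valid = Arrays.stream(data).filter(d -> !Double.isNaN(d)).toArray();
        if (valid.length == 0) {
            return Double.NaN; // all fill
        }
        Arrays.sort(valid);
        int idx = (int) (n / 100.0 * valid.length);
        return valid[Math.min(idx, valid.length - 1)];
    }

    public static void main(String[] args) {
        System.out.println(nthPercentile(new double[] {5, Double.NaN, 1, 3, 2, 4}, 50));
    }
}
```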

grid

grid( QDataSet ds ) → QDataSet

takes rank 2 link (x,y,z) and makes a table from it z(x,y)

Parameters

ds - rank 2 link (x,y,z)

Returns:

a table from it z(x,y)

histogram

histogram( QDataSet ds, double min, double max, double binsize ) → QDataSet

returns a rank 1 dataset that is a histogram of the data. Note there will also be in the properties:

count, the total number of valid values.
nonZeroMin, the smallest non-zero, positive number

Parameters

ds - rank N dataset
min - the min of the first bin. If min=-1 and max=-1, then automatically set the min and max.
max - the max of the last bin.
binsize - the size of each bin.

Returns:

a rank 1 dataset with each bin's count. DEPEND_0 indicates the bin locations.
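
The binning can be sketched as below, assuming explicit min, max, and binsize (the automatic min=-1, max=-1 mode is not shown). NaN stands in for invalid values, and values outside [min, max) are ignored; both choices are assumptions of this sketch.

```java
// Hypothetical standalone sketch of fixed-width binning; the automatic range
// mode and the count and nonZeroMin properties are not modeled.
class HistogramSketch {
    /** Counts values into bins of width binsize, the first bin starting at min. */
    static int[] histogram(double[] data, double min, double max, double binsize) {
        int nbins = (int) Math.ceil((max - min) / binsize);
        int[] counts = new int[nbins];
        for (double d : data) {
            if (Double.isNaN(d)) {
                continue; // skip invalid values
            }
            int bin = (int) Math.floor((d - min) / binsize);
            if (bin >= 0 && bin < nbins) {
                counts[bin]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] data = {0.5, 1.5, 1.7, 2.9, 5.0, Double.NaN};
        System.out.println(java.util.Arrays.toString(histogram(data, 0, 3, 1))); // prints [1, 2, 1]
    }
}
```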

indexOfBundledDataSet

indexOfBundledDataSet( QDataSet bundleDs, String name ) → int

return the index of the named bundled dataset. This cleans up the name so that it contains just a Java-style identifier. Also, ch_1 is always implicitly index 1. Last, if safe names created from labels match, then that match is used. For example,

bds=ripplesVectorTimeSeries(100)
2==indexOfBundledDataSet( bds, "Z" )

demonstrates its use. Last, extraneous spaces and underscores are removed to see if this will result in a match.

Parameters

bundleDs - a bundle dataset with the property BUNDLE_1 or DEPEND_1 having EnumerationUnits, (or BUNDLE_0 for a rank 1 dataset).
name - the named dataset.

Returns:

the index or -1 if the name is not found.

isProcessAsync

isProcessAsync( String c ) → boolean

return true if the process described in c is probably a slow process that should be done asynchronously. For example, do a long fft on a different thread and use a progress monitor. Processes that take a trivial, constant amount of time should return false, and may be completed on the event thread, etc.

Parameters

c - process string, as in sprocess.

Returns:

true if the process described in c is probably a slow process

leafTrim

leafTrim( QDataSet ds, int start, int end ) → org.das2.qds.MutablePropertyDataSet

pull out a subset of the dataset by reducing the number of columns in the last dimension. This does not reduce rank. This assumes the dataset has no row with length>end. This is extended to support rank 4 datasets. TODO: This probably doesn't handle bundles property. TODO: slice and trim should probably be implemented here for efficiency.

Parameters

ds - rank 1 or more dataset
start - first index to include.
end - last index, exclusive

Returns:

dataset of the same rank.
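
For a rank 2 case, trimming the last dimension can be sketched as follows. A double[][] stands in for the dataset and the names are hypothetical; the result keeps the same number of rows and each row keeps indices start (inclusive) to end (exclusive).

```java
import java.util.Arrays;

// Hypothetical standalone sketch for rank 2 only; a double[][] stands in for the dataset.
class LeafTrimSketch {
    /** Keeps columns start (inclusive) to end (exclusive) of every row; the rank is unchanged. */
    static double[][] leafTrim(double[][] ds, int start, int end) {
        double[][] result = new double[ds.length][];
        for (int i = 0; i < ds.length; i++) {
            result[i] = Arrays.copyOfRange(ds[i], start, end);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] ds = {{0, 1, 2, 3}, {4, 5, 6, 7}};
        System.out.println(Arrays.deepToString(leafTrim(ds, 1, 3)));
    }
}
```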

makeProcessStringCanonical

makeProcessStringCanonical( String s ) → String

replace any component reference C, to explicit "|unbundle(C)"

Parameters

s - the process string, like "X|fftPower(512,2)"

Returns:

canonical version, like "|unbundle(X)|fftPower(512,2)"
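
A minimal sketch of the rewrite shown in the example strings above, assuming the component reference can only appear before the first "|". The class name is hypothetical, and the library may handle more cases than this.

```java
// Hypothetical standalone sketch; assumes a component reference appears only
// as a leading token before the first '|'.
class CanonicalProcessStringSketch {
    /** Rewrites a leading component reference C as an explicit "|unbundle(C)". */
    static String canonical(String s) {
        if (s.isEmpty() || s.startsWith("|")) {
            return s; // no leading component reference
        }
        int pipe = s.indexOf('|');
        String component = pipe == -1 ? s : s.substring(0, pipe);
        String rest = pipe == -1 ? "" : s.substring(pipe);
        return "|unbundle(" + component + ")" + rest;
    }

    public static void main(String[] args) {
        System.out.println(canonical("X|fftPower(512,2)")); // prints |unbundle(X)|fftPower(512,2)
    }
}
```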

makePropertiesMutable

makePropertiesMutable( QDataSet dataset ) → org.das2.qds.MutablePropertyDataSet

return a dataset that has mutable properties. If the dataset parameter already has mutable properties, then the dataset is returned. If the dataset is a MutablePropertyDataSet but the immutable flag is set, then the dataset is wrapped to make the properties mutable.

Parameters

dataset - dataset

Returns:

a MutablePropertyDataSet that is either a wrapper around the dataset, or the dataset itself.

makeWritable

makeWritable( QDataSet dataset ) → org.das2.qds.WritableDataSet

return a dataset that is writable. If the dataset parameter of this idempotent function is already writable, then the dataset is returned. If the dataset is a WritableDataSet but the immutable flag is set, then a copy is returned.

Parameters

dataset - a QDataSet

Returns:

a WritableDataSet that is either a copy of the read-only dataset provided, or the parameter writable dataset provided.

moment

moment( QDataSet ds ) → org.das2.qds.RankZeroDataSet

performs the moment (mean,variance,etc) on the dataset.

Parameters

ds - rank N QDataSet.

Returns:

rank 0 dataset of the mean. Properties contain other stats:

stddev, RankZeroDataSet
validCount, Integer, the number of valid measurements
invalidCount, Integer, the number of invalid measurements

processDataSet

processDataSet( String c, QDataSet fillDs, ProgressMonitor mon ) → QDataSet

apply process to the data. This is like sprocess, except that the component can be extracted as the first step. In general these can be done on the same thread (like slice1), but some are slow (like fftPower). This is a copy of PlotElementController.processDataSet.

Parameters

c - the process string, like "bgsmx|slice0(9)|histogram()"
fillDs - the input dataset.
mon - a monitor for the processing

Returns:

dataset resulting from filters.

removeElement

removeElement( int[] array, int index ) → int[]

removes the index-th element from the array.

Parameters

array - length N array
index - the index to remove

Returns:

array without the element, length N-1.
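
The removal can be sketched with two array copies, one for the elements before the index and one for those after it. This is a standalone illustration with a hypothetical class name, not the library source.

```java
// Hypothetical standalone sketch; not the DataSetOps implementation.
class RemoveElementSketch {
    /** Returns a new array of length N-1 without the index-th element. */
    static int[] removeElement(int[] array, int index) {
        int[] result = new int[array.length - 1];
        System.arraycopy(array, 0, result, 0, index);                                // elements before index
        System.arraycopy(array, index + 1, result, index, array.length - index - 1); // elements after index
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(removeElement(new int[] {10, 20, 30}, 1))); // prints [10, 30]
    }
}
```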

slice

slice( QDataSet ds, int dimension, int index ) → org.das2.qds.MutablePropertyDataSet

slice on the dimension. This saves from the pain of having this branch all over the code.

Parameters

ds - the rank N data to slice.
dimension - the dimension to slice, 0 is the first.
index - the index to slice at.

Returns:

the rank N-1 result.
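
The branch this method hides can be sketched for a rank 2 qube, with a double[][] standing in for the dataset (an assumption; properties and higher ranks are not modeled).

```java
// Hypothetical standalone sketch for a rank 2 qube; properties are not modeled.
class SliceSketch {
    /** Slices a rank 2 array at index along the given dimension, giving a rank 1 result. */
    static double[] slice(double[][] ds, int dimension, int index) {
        if (dimension == 0) {
            return ds[index].clone(); // one record
        }
        if (dimension == 1) {
            double[] column = new double[ds.length];
            for (int i = 0; i < ds.length; i++) {
                column[i] = ds[i][index]; // index-th element of every record
            }
            return column;
        }
        throw new IllegalArgumentException("dimension must be 0 or 1 for rank 2 data");
    }

    public static void main(String[] args) {
        double[][] ds = {{1, 2, 3}, {4, 5, 6}};
        System.out.println(java.util.Arrays.toString(slice(ds, 1, 2)));
    }
}
```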

slice0

slice0( QDataSet ds, int index ) → org.das2.qds.MutablePropertyDataSet

slice on the first dimension. Note the function ds.slice(index) was added later and will typically be more efficient. This will create a new Slice0DataSet. DO NOT try to optimize this by calling native trim, some native slice implementations call this. TODO: This actually needs a bit more study, because there are codes that talk about not using the native slice because it copies data and they just want metadata. This probably is because Slice0DataSet doesn't check for immutability, and really should be copying. This needs to be fixed, making sure the result of this call is immutable, and the native slice really should be more efficient, always.

Parameters

ds - rank 1 or more dataset
index - the index to slice at

Returns:

rank 0 or more dataset.

slice1

slice1( QDataSet ds, int index ) → org.das2.qds.MutablePropertyDataSet

slice dataset operator assumes a qube dataset by picking the index-th element of dataset's second dimension, without regard to tags.

Parameters

ds - rank 2 or more dataset
index - the index to slice at

Returns:

rank 1 or more dataset.

slice2

slice2( QDataSet ds, int index ) → org.das2.qds.MutablePropertyDataSet

slice dataset operator assumes a qube dataset by picking the index-th element of dataset's third dimension, without regard to tags.

Parameters

ds - rank 3 or more dataset
index - the index to slice at.

Returns:

rank 2 or more dataset.

slice3

slice3( QDataSet ds, int index ) → org.das2.qds.MutablePropertyDataSet

slice dataset operator assumes a qube dataset by picking the index-th element of dataset's fourth dimension, without regard to tags.

Parameters

ds - rank 4 or more dataset.
index - index to slice at

Returns:

rank 3 or more dataset.

sliceProperties

sliceProperties( java.util.Map properties, int sliceDimension ) → java.util.Map

we've sliced a dataset, removing an index, so move the properties accordingly. This was Ops.sliceProperties. For example, after slicing the zeroth dimension (time), what was DEPEND_1 becomes DEPEND_0.

Parameters

properties - the properties to slice.
sliceDimension - the dimension to slice at (0,1,2...QDataSet.MAX_HIGH_RANK)

Returns:

the properties after the slice.
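
The DEPEND_1 to DEPEND_0 shift can be sketched with a plain map. Only DEPEND_i keys are handled here, and the rank limit of 4 is an assumption of the sketch; the real method also deals with other properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical standalone sketch; only DEPEND_i keys are moved, and the
// limit of 4 dimensions is an assumption of this example.
class SlicePropertiesSketch {
    static final int MAX_RANK = 4;

    /** Drops DEPEND_<sliceDimension> and shifts higher DEPEND_i keys down by one. */
    static Map<String, Object> sliceDepends(Map<String, Object> props, int sliceDimension) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < MAX_RANK; i++) {
            Object dep = props.get("DEPEND_" + i);
            if (dep == null || i == sliceDimension) {
                continue; // absent, or belongs to the removed dimension
            }
            int newIndex = i < sliceDimension ? i : i - 1;
            result.put("DEPEND_" + newIndex, dep);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("DEPEND_0", "time");
        props.put("DEPEND_1", "energy");
        System.out.println(sliceDepends(props, 0)); // prints {DEPEND_0=energy}
    }
}
```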

sliceProperties0

sliceProperties0( int index, java.util.Map props ) → java.util.Map

method to help dataset implementations implement slice. History:

2010-09-23: support rank 2 DEPEND_2 and DEPEND_3
2010-09-23: add BINS_1 and BUNDLE_1, Slice0DataSet calls this.
2010-02-24: BUNDLE_0 handled.
2011-03-25: add WEIGHTS_PLANE

Parameters

index - the index to slice at in the zeroth index.
props - the properties to slice.

Returns:

the properties after the slice.

sort

sort( QDataSet ds ) → QDataSet

returns a list of indices that sort the dataset. I don't like this implementation, because it requires that an array of Integers (not int[]) be created. Invalid measurements are not indexed in the returned dataset. If the sort is monotonic, then the property MONOTONIC will be Boolean.TRUE.

Parameters

ds - rank 1 dataset, possibly containing fill.

Returns:

indices that sort the data.
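
A sketch of producing sort indices while skipping invalid values, with NaN standing in for fill (an assumption). Like the implementation described above, this version boxes the indices to Integer in order to sort with a comparator.

```java
import java.util.Comparator;
import java.util.stream.IntStream;

// Hypothetical standalone sketch; NaN stands in for fill values.
class SortIndicesSketch {
    /** Returns the indices of the valid elements, ordered so the referenced values ascend. */
    static int[] sortIndices(double[] data) {
        return IntStream.range(0, data.length)
                .filter(i -> !Double.isNaN(data[i]))               // invalid values are not indexed
                .boxed()
                .sorted(Comparator.comparingDouble(i -> data[i]))  // order by the referenced value
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        int[] idx = sortIndices(new double[] {3, Double.NaN, 1, 2});
        System.out.println(java.util.Arrays.toString(idx)); // prints [2, 3, 0]
    }
}
```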

sprocess

sprocess( String c, QDataSet fillDs, ProgressMonitor mon ) → QDataSet

sprocess implements the poorly-named filters string / process string of Autoplot, allowing clients to "pipe" data through a chain of operations. For example, the filters string "|slice0(9)|histogram()" will slice on the ninth index and then take a histogram of that result. See http://www.papco.org/wiki/index.php/DataReductionSpecs (TODO: wiki page was lost, which could probably be recovered.) There's a big problem here: if the command is not recognized, then it is ignored. We should probably change this, but the change should be at a major version change in case it breaks things.

Parameters

c - process string like "slice0(9)|histogram()"
fillDs - The dataset loaded from the data source controller, with initial filters (like fill) applied.
mon - monitor for the processing.

Returns:

the dataset after the process string is applied.
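
The "pipe" structure of a process string can be illustrated by splitting it into its commands. This is only a sketch of the string format shown above, with hypothetical names; it assumes no command argument itself contains a "|" character and does not execute anything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone sketch of splitting a process string; assumes no
// argument contains '|' and performs no data processing.
class ProcessStringSketch {
    /** Splits "|slice0(9)|histogram()" into ["slice0(9)", "histogram()"]. */
    static List<String> commands(String c) {
        List<String> result = new ArrayList<>();
        for (String part : c.split("\\|")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed); // skip the empty piece before a leading '|'
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(commands("|slice0(9)|histogram()")); // prints [slice0(9), histogram()]
    }
}
```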

suggestFillForComponentType

suggestFillForComponentType( java.lang.Class c ) → double

return a fill value that is representable by the type.

Parameters

c - the class type, including double.class, float.class, etc.

Returns:

a fill value that is representable by the type.

transpose2

transpose2( QDataSet ds ) → QDataSet

transpose the rank 2 qube dataset so the rows are columns and the columns are rows.

Parameters

ds - rank 2 Qube DataSet.

Returns:

rank 2 Qube DataSet
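
The row and column swap can be sketched for a non-empty double[][] qube as follows. The names are hypothetical, and tags such as DEPEND_0 and DEPEND_1, which would presumably swap as well, are not modeled.

```java
// Hypothetical standalone sketch for a non-empty rank 2 qube; properties are not modeled.
class Transpose2Sketch {
    /** Returns t where t[j][i] = ds[i][j]. */
    static double[][] transpose(double[][] ds) {
        int rows = ds.length;
        int cols = ds[0].length; // qube: every row has the same length
        double[][] result = new double[cols][rows];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                result[j][i] = ds[i][j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.deepToString(transpose(new double[][] {{1, 2, 3}, {4, 5, 6}})));
    }
}
```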

trim

trim( QDataSet ds, int offset, int len ) → org.das2.qds.MutablePropertyDataSet

reduce the number of elements in the dataset to the dim 0 indices specified. This does not change the rank of the dataset. DO NOT try to optimize this by calling native trim, some native trim implementations call this.

Parameters

ds - the dataset
offset - the offset
len - the length, (not the stop index!)

Returns:

trimmed dataset
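
Because the third argument is a length and not a stop index, the kept range is offset (inclusive) to offset+len (exclusive). A rank 1 sketch with a plain array makes this concrete; the class name is hypothetical, and this version copies the data, which the library may not do.

```java
import java.util.Arrays;

// Hypothetical standalone sketch for rank 1 data; copies rather than wraps.
class TrimSketch {
    /** Keeps len elements starting at offset, i.e. indices offset .. offset+len-1. */
    static double[] trim(double[] ds, int offset, int len) {
        return Arrays.copyOfRange(ds, offset, offset + len);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(trim(new double[] {0, 1, 2, 3, 4, 5}, 2, 3))); // prints [2.0, 3.0, 4.0]
    }
}
```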

trim( QDataSet dep, int start, int stop, int stride ) → org.das2.qds.MutablePropertyDataSet

unbundle

unbundle( QDataSet bundleDs, String name ) → QDataSet

Extract the named bundled dataset. For example, extract B_x from bundle of components.

Parameters

bundleDs - a bundle of datasets
name - the name of the bundled dataset, or "ch_<i>" where i is the dataset number

Returns:

the named dataset

unbundleDefaultDataSet

unbundleDefaultDataSet( QDataSet bundleDs ) → QDataSet

extract the dataset that is dependent on others, or the last one. For example, the dataset ds[:,"x,y"] → y[:]

Parameters

bundleDs - a bundle of datasets

Returns:

the default dataset
