java.lang.Object
marytts.features.FeatureDefinition
public class FeatureDefinition
A feature definition object represents the "meaning" of feature vectors. It consists of a list of byte-valued, short-valued and continuous features by name and index position in the feature vector; the respective possible feature values (and corresponding byte and short codes); and, optionally, the weights and, for continuous features, weighting functions for each feature.
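The name/index and string/code mappings described above can be illustrated with a small stand-in. This is only a sketch under stated assumptions: `MiniFeatureDefinition` is a hypothetical class, not the MaryTTS implementation, it covers byte-valued features only, and the feature names and values used in `main` are made up for illustration. It mirrors the method names listed on this page so that the lookups are easy to compare.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT marytts.features.FeatureDefinition.
final class MiniFeatureDefinition {
    private final List<String> names = new ArrayList<>();
    private final Map<String, Integer> indexByName = new HashMap<>();
    private final List<String[]> possibleValues = new ArrayList<>();

    // Register a byte-valued feature. Its index is its position in definition order,
    // and each possible value is coded by its position in the values list.
    void addByteFeature(String name, String... values) {
        if (values.length > Byte.MAX_VALUE + 1) {
            throw new IllegalArgumentException("Too many values for a byte-valued feature");
        }
        indexByName.put(name, names.size());
        names.add(name);
        possibleValues.add(values.clone());
    }

    // Feature name -> feature index.
    int getFeatureIndex(String featureName) {
        Integer index = indexByName.get(featureName);
        if (index == null) {
            throw new IllegalArgumentException("Unknown feature: " + featureName);
        }
        return index;
    }

    // Feature index -> feature name.
    String getFeatureName(int index) {
        return names.get(index);
    }

    // String value -> byte code for the named feature.
    byte getFeatureValueAsByte(String featureName, String value) {
        String[] values = possibleValues.get(getFeatureIndex(featureName));
        for (int code = 0; code < values.length; code++) {
            if (values[code].equals(value)) {
                return (byte) code;
            }
        }
        throw new IllegalArgumentException("Illegal value '" + value + "' for " + featureName);
    }

    // Code -> String value for the feature at the given index.
    String getFeatureValueAsString(int featureIndex, int value) {
        return possibleValues.get(featureIndex)[value];
    }

    public static void main(String[] args) {
        MiniFeatureDefinition def = new MiniFeatureDefinition();
        def.addByteFeature("phone", "0", "a", "b");   // illustrative feature
        def.addByteFeature("stressed", "0", "1");     // illustrative feature
        System.out.println(def.getFeatureIndex("stressed"));
        System.out.println(def.getFeatureValueAsByte("phone", "b"));
        System.out.println(def.getFeatureValueAsString(0, 1));
    }
}
```

Storing values by position keeps the code-to-string direction a plain array access, which is the direction a feature vector reader would need most often.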
| Field Summary | |
|---|---|
| static java.lang.String | BYTEFEATURES |
| static java.lang.String | CONTINUOUSFEATURES |
| static java.lang.String | EDGEFEATURE |
| static java.lang.String | EDGEFEATURE_END |
| static java.lang.String | EDGEFEATURE_START |
| static java.lang.String | NULLVALUE |
| static java.lang.String | SHORTFEATURES |
| static char | WEIGHT_SEPARATOR |
| Constructor Summary | |
|---|---|
| FeatureDefinition(java.io.BufferedReader input, boolean readWeights) | Create a feature definition object, reading textual data from the given BufferedReader. |
| FeatureDefinition(java.io.DataInput input) | Create a feature definition object, reading binary data from the given DataInput. |
| Method Summary | |
|---|---|
| FeatureVector | createEdgeFeatureVector(int unitIndex, boolean start): Create a feature vector that marks a start or end of a unit. |
| static int | diff(FeatureVector v1, FeatureVector v2): Compares two feature vectors in terms of how many discrete features they have in common. |
| boolean | equals(FeatureDefinition other): Determine whether two feature definitions are equal, regarding both the actual feature definitions and the weights. |
| boolean | featureEquals(FeatureDefinition other): Determine whether two feature definitions are equal, with respect to number, names, and possible values of the three kinds of features (byte-valued, short-valued, continuous). |
| java.lang.String | featureEqualsAnalyse(FeatureDefinition other): An extension of the previous method. |
| void | generateAllDotDescForWagon(java.io.PrintWriter out): Export this feature definition in the "all.desc" format which can be read by wagon. |
| void | generateAllDotDescForWagon(java.io.PrintWriter out, java.util.Set<java.lang.String> featuresToIgnore): Export this feature definition in the "all.desc" format which can be read by wagon. |
| void | generateFeatureWeightsFile(java.io.PrintWriter out): Print this feature definition plus weights to a .txt file. |
| int | getFeatureIndex(java.lang.String featureName): Translate between a feature name and a feature index. |
| int[] | getFeatureIndexArray(java.lang.String[] featureName): Translate between an array of feature names and an array of feature indexes. |
| java.lang.String | getFeatureName(int index): Translate between a feature index and a feature name. |
| java.lang.String[] | getFeatureNameArray(int[] index): Translate between an array of feature indexes and an array of feature names. |
| java.lang.String | getFeatureNames(): List all feature names, separated by white space, in their order of definition. |
| byte | getFeatureValueAsByte(int featureIndex, java.lang.String value): For the feature with the given index number, translate its String value to its byte value. |
| byte | getFeatureValueAsByte(java.lang.String featureName, java.lang.String value): For the feature with the given name, translate its String value to its byte value. |
| short | getFeatureValueAsShort(int featureIndex, java.lang.String value): For the feature with the given index number, translate its String value to its short value. |
| short | getFeatureValueAsShort(java.lang.String featureName, java.lang.String value): For the feature with the given name, translate its String value to its short value. |
| java.lang.String | getFeatureValueAsString(int featureIndex, int value): For the feature with the given index number, translate its byte or short value to its String value. |
| java.lang.String | getFeatureValueAsString(java.lang.String featureName, FeatureVector fv): Simple access to string-based features. |
| float[] | getFeatureWeights() |
| int | getNumberOfByteFeatures(): Get the number of byte features. |
| int | getNumberOfContinuousFeatures(): Get the number of continuous features. |
| int | getNumberOfFeatures(): Get the total number of features. |
| int | getNumberOfShortFeatures(): Get the number of short features. |
| int | getNumberOfValues(int featureIndex): Get the number of possible values for the feature with the given index number. |
| java.lang.String[] | getPossibleValues(int featureIndex): Get the list of possible String values for the feature with the given index number. |
| float | getWeight(int featureIndex): For the feature with the given index, return the weight. |
| java.lang.String | getWeightFunctionName(int featureIndex): Get the name of any weighting function associated with the given feature index. |
| boolean | hasFeature(java.lang.String name): Indicate whether the feature definition contains the feature with the given name. |
| boolean | isByteFeature(int index): Determine whether the feature with the given index number is a byte feature. |
| boolean | isByteFeature(java.lang.String featureName): Determine whether the feature with the given name is a byte feature. |
| boolean | isContinuousFeature(int index): Determine whether the feature with the given index number is a continuous feature. |
| boolean | isContinuousFeature(java.lang.String featureName): Determine whether the feature with the given name is a continuous feature. |
| boolean | isShortFeature(int index): Determine whether the feature with the given index number is a short feature. |
| boolean | isShortFeature(java.lang.String featureName): Determine whether the feature with the given name is a short feature. |
| FeatureVector | readFeatureVector(int currentUnitIndex, java.io.DataInput input): Create a feature vector consistent with this feature definition by reading the data from the given input. |
| java.lang.String | toFeatureString(FeatureVector fv): Convert a feature vector into a String representation. |
| FeatureVector | toFeatureVector(int unitIndex, byte[] bytes, short[] shorts, float[] floats) |
| FeatureVector | toFeatureVector(int unitIndex, java.lang.String featureString): Create a feature vector consistent with this feature definition by reading the data from a String representation. |
| void | writeBinaryTo(java.io.DataOutput out): Write this feature definition in binary format to the given output. |
| void | writeTo(java.io.PrintWriter out, boolean writeWeights): Export this feature definition in the text format which can also be read by this class. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String BYTEFEATURES
public static final java.lang.String SHORTFEATURES
public static final java.lang.String CONTINUOUSFEATURES
public static final char WEIGHT_SEPARATOR
public static final java.lang.String EDGEFEATURE
public static final java.lang.String EDGEFEATURE_START
public static final java.lang.String EDGEFEATURE_END
public static final java.lang.String NULLVALUE
| Constructor Detail |
|---|
public FeatureDefinition(java.io.BufferedReader input,
boolean readWeights)
throws java.io.IOException
input - a BufferedReader from which a textual feature definition can be read.
readWeights - a boolean indicating whether or not to read weights from input. If weights are read, they will be normalized so that they sum to one.
java.io.IOException - if a reading problem occurs
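The normalization mentioned for readWeights can be sketched as follows. This is a minimal illustration of scaling weights so that they sum to one, not the code used by this class: `WeightNormalizer` and `normalize` are hypothetical names, and rejecting a non-positive sum is an assumption made here to keep the sketch well defined.

```java
// Hypothetical helper for illustration; not part of marytts.features.
final class WeightNormalizer {
    // Return a copy of the weights, scaled so that the entries sum to one.
    static float[] normalize(float[] weights) {
        double sum = 0.0;
        for (float w : weights) {
            sum += w;
        }
        if (sum <= 0.0) {
            // Assumption of this sketch: a non-positive sum cannot be normalized.
            throw new IllegalArgumentException("Weights must have a positive sum");
        }
        float[] normalized = new float[weights.length];
        for (int i = 0; i < weights.length; i++) {
            normalized[i] = (float) (weights[i] / sum);
        }
        return normalized;
    }

    public static void main(String[] args) {
        float[] normalized = normalize(new float[] {2f, 1f, 1f});
        for (float w : normalized) {
            System.out.println(w); // 0.5, then 0.25, then 0.25
        }
    }
}
```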
public FeatureDefinition(java.io.DataInput input)
throws java.io.IOException
input - a DataInputStream or a RandomAccessFile from which
a binary feature definition can be read.
java.io.IOException - if a reading problem occurs
| Method Detail |
|---|
public void writeBinaryTo(java.io.DataOutput out)
throws java.io.IOException
out - a DataOutputStream or RandomAccessFile to which the
FeatureDefinition should be written.
java.io.IOException - if a problem occurs while writing.
public int getNumberOfFeatures()
public int getNumberOfByteFeatures()
public int getNumberOfShortFeatures()
public int getNumberOfContinuousFeatures()
public float getWeight(int featureIndex)
featureIndex - the index number of the feature.
public float[] getFeatureWeights()
public java.lang.String getWeightFunctionName(int featureIndex)
featureIndex - the index number of the feature.
public java.lang.String getFeatureName(int index)
index - a feature index, as could be used to access
a feature value in a FeatureVector.
java.lang.IndexOutOfBoundsException - if index < 0 or index > getNumberOfFeatures()
public java.lang.String[] getFeatureNameArray(int[] index)
index - an array of feature indexes, as could be used to access
a feature value in a FeatureVector.
java.lang.IndexOutOfBoundsException - if any of the indexes is < 0 or > getNumberOfFeatures()
public java.lang.String getFeatureNames()
public boolean hasFeature(java.lang.String name)
name - the feature name in question, e.g. "next_next_phone"
public boolean isByteFeature(java.lang.String featureName)
featureName - the name of the feature.
public boolean isByteFeature(int index)
index - the index number of the feature.
public boolean isShortFeature(java.lang.String featureName)
featureName - the name of the feature.
public boolean isShortFeature(int index)
index - the index number of the feature.
public boolean isContinuousFeature(java.lang.String featureName)
featureName - the name of the feature.
public boolean isContinuousFeature(int index)
index - the index number of the feature.
public int getFeatureIndex(java.lang.String featureName)
featureName - a valid feature name
java.lang.IllegalArgumentException - if the feature name is unknown.
public int[] getFeatureIndexArray(java.lang.String[] featureName)
featureName - an array of valid feature names
java.lang.IllegalArgumentException - if one of the feature names is unknown.
public int getNumberOfValues(int featureIndex)
featureIndex - the index number of the feature.
java.lang.IndexOutOfBoundsException - if featureIndex < 0 or
featureIndex >= getNumberOfByteFeatures() + getNumberOfShortFeatures().
public java.lang.String[] getPossibleValues(int featureIndex)
featureIndex - the index number of the feature.
java.lang.IndexOutOfBoundsException - if featureIndex < 0 or
featureIndex >= getNumberOfByteFeatures() + getNumberOfShortFeatures().
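The bounds above imply that only indexes below getNumberOfByteFeatures() + getNumberOfShortFeatures() have a list of possible values. The sketch below shows one index layout that is consistent with that bound. It assumes that byte features come first, then short features, then continuous features. That ordering is an assumption of this example and is not stated on this page. `FeatureIndexLayout`, `Kind`, `kindOf` and `hasPossibleValues` are hypothetical names.

```java
// Hypothetical helper for illustration; the byte/short/continuous ordering is assumed.
final class FeatureIndexLayout {
    enum Kind { BYTE, SHORT, CONTINUOUS }

    // Classify a feature index, given the number of features of each kind.
    static Kind kindOf(int index, int numByte, int numShort, int numContinuous) {
        int total = numByte + numShort + numContinuous;
        if (index < 0 || index >= total) {
            throw new IndexOutOfBoundsException("Feature index out of range: " + index);
        }
        if (index < numByte) {
            return Kind.BYTE;
        }
        if (index < numByte + numShort) {
            return Kind.SHORT;
        }
        return Kind.CONTINUOUS;
    }

    // Only discrete (byte or short) features have an enumerable list of values.
    static boolean hasPossibleValues(int index, int numByte, int numShort) {
        return index >= 0 && index < numByte + numShort;
    }

    public static void main(String[] args) {
        // Two byte features, one short feature, three continuous features.
        for (int i = 0; i < 6; i++) {
            System.out.println(i + " -> " + kindOf(i, 2, 1, 3));
        }
    }
}
```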
public java.lang.String getFeatureValueAsString(int featureIndex,
int value)
featureIndex - the index number of the feature.
value - the feature value. This must be in the range of acceptable values for
the given feature.
java.lang.IndexOutOfBoundsException - if featureIndex < 0 or
featureIndex >= getNumberOfByteFeatures() + getNumberOfShortFeatures()
java.lang.IndexOutOfBoundsException - if value is not a legal value for this feature
public java.lang.String getFeatureValueAsString(java.lang.String featureName,
FeatureVector fv)
featureName - the name of the feature.
fv - the feature vector from which to read the feature value.
public byte getFeatureValueAsByte(java.lang.String featureName,
java.lang.String value)
featureName - the name of the feature.
value - the feature value. This must be among the acceptable values for
the given feature.
java.lang.IllegalArgumentException - if featureName is not a valid feature name,
or if featureName is not a byte-valued feature.
java.lang.IllegalArgumentException - if value is not a legal value for this feature
public byte getFeatureValueAsByte(int featureIndex,
java.lang.String value)
featureIndex - the index number of the feature.
value - the feature value. This must be among the acceptable values for
the given feature.
java.lang.IllegalArgumentException - if the given feature is not valid,
or is not a byte-valued feature.
java.lang.IllegalArgumentException - if value is not a legal value for this feature
public short getFeatureValueAsShort(java.lang.String featureName,
java.lang.String value)
featureName - the name of the feature.
value - the feature value. This must be among the acceptable values for
the given feature.
java.lang.IllegalArgumentException - if featureName is not a valid feature name,
or if featureName is not a short-valued feature.
java.lang.IllegalArgumentException - if value is not a legal value for this feature
public short getFeatureValueAsShort(int featureIndex,
java.lang.String value)
featureIndex - the index number of the feature.
value - the feature value. This must be among the acceptable values for
the given feature.
java.lang.IllegalArgumentException - if the given feature is not valid,
or is not a short-valued feature.
java.lang.IllegalArgumentException - if value is not a legal value for this feature
public boolean featureEquals(FeatureDefinition other)
other - the feature definition to compare to
public java.lang.String featureEqualsAnalyse(FeatureDefinition other)
public boolean equals(FeatureDefinition other)
other - the feature definition to compare to
See also: featureEquals(FeatureDefinition)
public FeatureVector toFeatureVector(int unitIndex,
java.lang.String featureString)
unitIndex - an index number to assign to the feature vector
featureString - the string representation of a feature vector.
java.lang.IllegalArgumentException - if the feature values listed are not
consistent with the feature definition.
See also: toFeatureString(FeatureVector)
public FeatureVector toFeatureVector(int unitIndex,
byte[] bytes,
short[] shorts,
float[] floats)
public FeatureVector readFeatureVector(int currentUnitIndex,
java.io.DataInput input)
throws java.io.IOException
input - a DataInputStream or RandomAccessFile to read the feature values from.
java.io.IOException
public FeatureVector createEdgeFeatureVector(int unitIndex,
boolean start)
unitIndex - index of the unit
start - true creates a start vector, false creates an end vector.
public java.lang.String toFeatureString(FeatureVector fv)
fv - a feature vector which must be consistent with this feature definition.
java.lang.IllegalArgumentException - if the feature vector is not consistent with this
feature definition
java.lang.IndexOutOfBoundsException - if any value of the feature vector is not consistent with this
feature definition
public void writeTo(java.io.PrintWriter out,
boolean writeWeights)
out - the destination of the data
writeWeights - whether to write weights before every line
public void generateAllDotDescForWagon(java.io.PrintWriter out)
out - the destination of the data
public void generateAllDotDescForWagon(java.io.PrintWriter out,
java.util.Set<java.lang.String> featuresToIgnore)
out - the destination of the data
featuresToIgnore - a set of Strings containing the names of features that
wagon should ignore. Can be null.
public void generateFeatureWeightsFile(java.io.PrintWriter out)
out - the destination of the data
public static int diff(FeatureVector v1,
FeatureVector v2)
v1 - A feature vector.
v2 - Another feature vector to compare v1 with.
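One plausible reading of "how many discrete features they have in common" is a count of positions at which two vectors hold the same discrete value. The sketch below illustrates that reading only. It is not the implementation of diff, whose exact return value is not documented on this page. `DiscreteFeatureOverlap` and `countCommon` are hypothetical names, and each vector is represented here as a plain int[] of discrete codes rather than a FeatureVector.

```java
// Hypothetical helper for illustration; not the implementation of FeatureDefinition.diff.
final class DiscreteFeatureOverlap {
    // Count the positions at which both vectors hold the same discrete code.
    static int countCommon(int[] v1, int[] v2) {
        if (v1.length != v2.length) {
            // Assumption of this sketch: both vectors come from the same feature definition.
            throw new IllegalArgumentException("Vectors must have the same number of features");
        }
        int common = 0;
        for (int i = 0; i < v1.length; i++) {
            if (v1[i] == v2[i]) {
                common++;
            }
        }
        return common;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 0, 3};
        System.out.println(countCommon(a, b)); // prints 2
    }
}
```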