de.dfki.lt.mary.dbselection
Class CoverageDefinition

java.lang.Object
  extended by de.dfki.lt.mary.dbselection.CoverageDefinition

public class CoverageDefinition
extends java.lang.Object

Builds and manages the cover sets

Author:
Anna Hunecke

Constructor Summary
CoverageDefinition(FeatureDefinition featDef, java.lang.String configFile, boolean holdVectorsInMemory, byte[][] vectorArray)
          Build a new coverage definition and read in the config file
 
Method Summary
 byte[][] getVectorArray()
           
 byte getVectorValue(byte[] vectors, int vectorIndex, int valueIndex)
           
 void initialiseCoverage(java.lang.String[] basenames)
          Compute the coverage of the corpus, build and fill the cover sets
 void printResultToLog(java.io.PrintWriter logOut)
           
 void printSelectionDistribution(java.lang.String distributionFile, java.lang.String developmentFile, boolean logDevelopment)
          Print statistics of the selected sentences and a table of coverage development over time
 void printSettings(java.io.PrintWriter out)
          Print the settings of the config file
 void printTextCorpusStatistics(java.lang.String filename)
          Print statistics on the unit distribution in the corpus
 boolean reachedMaxClusteredDiphones()
          Check if cover has maximum clustered diphone coverage
 boolean reachedMaxClusteredProsody()
          Check if cover has maximum clustered prosody coverage
 boolean reachedMaxSimpleDiphones()
          Check if cover has maximum simple diphone coverage
 boolean reachedMaxSimpleProsody()
          Check if cover has maximum simple prosody coverage
 void readCoverageBin(java.lang.String filename, FeatureDefinition featDef, java.lang.String[] basenames)
          Read the cover sets from the given file
 void updateCover(byte[] coveredFVs)
          Add the given feature vectors to the cover
 double usefulnessOfFVs(byte[] featureVectors)
          Get the usefulness of the given feature vectors. The usefulness of a feature vector is defined as the sum of its scores on all levels of the tree.
 void writeCoverageBin(java.lang.String filename)
          Write the cover sets to the given file
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CoverageDefinition

public CoverageDefinition(FeatureDefinition featDef,
                          java.lang.String configFile,
                          boolean holdVectorsInMemory,
                          byte[][] vectorArray)
Build a new coverage definition and read in the config file

Parameters:
readConfigFile - if true, read the config file, else use default values (this parameter does not appear in the constructor signature above)
featDef - the feature definition for the vectors
configFile - the config file name
holdVectorsInMemory - if true, vectors are stored in memory

Method Detail

initialiseCoverage

public void initialiseCoverage(java.lang.String[] basenames)
                        throws java.io.IOException
Compute the coverage of the corpus, build and fill the cover sets

Parameters:
basenames - the list of filenames
Throws:
java.io.IOException

printTextCorpusStatistics

public void printTextCorpusStatistics(java.lang.String filename)
                               throws java.lang.Exception
Print statistics on the unit distribution in the corpus

Parameters:
filename - the file to print to
Throws:
java.lang.Exception

printSettings

public void printSettings(java.io.PrintWriter out)
Print the settings of the config file

Parameters:
out - the PrintWriter to print to

printSelectionDistribution

public void printSelectionDistribution(java.lang.String distributionFile,
                                       java.lang.String developmentFile,
                                       boolean logDevelopment)
                                throws java.lang.Exception
Print statistics of the selected sentences and a table of coverage development over time

Parameters:
distributionFile - the file to print the statistics to
developmentFile - the file to print the coverage development to
logDevelopment - if true, print development file
Throws:
java.lang.Exception

printResultToLog

public void printResultToLog(java.io.PrintWriter logOut)

updateCover

public void updateCover(byte[] coveredFVs)
Add the given feature vectors to the cover

Parameters:
coveredFVs - the feature vectors to add

reachedMaxSimpleDiphones

public boolean reachedMaxSimpleDiphones()
Check if cover has maximum simple diphone coverage

Returns:
true if cover has maximum simple diphone coverage

reachedMaxClusteredDiphones

public boolean reachedMaxClusteredDiphones()
Check if cover has maximum clustered diphone coverage

Returns:
true if cover has maximum clustered diphone coverage

reachedMaxSimpleProsody

public boolean reachedMaxSimpleProsody()
Check if cover has maximum simple prosody coverage

Returns:
true if cover has maximum simple prosody coverage

reachedMaxClusteredProsody

public boolean reachedMaxClusteredProsody()
Check if cover has maximum clustered prosody coverage

Returns:
true if cover has maximum clustered prosody coverage
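
The methods usefulnessOfFVs, updateCover and the reachedMax... checks suggest a greedy selection loop: score every remaining sentence, add the best one to the cover, and stop once the cover is full. The following is only a sketch under that assumption, not the selection code shipped with this package. The Cover interface, the ToyCover class and the select method are hypothetical names introduced for illustration; only the three method signatures in Cover are taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

public class GreedySelectionSketch {

    /** Hypothetical interface: the subset of documented methods used below. */
    interface Cover {
        double usefulnessOfFVs(byte[] featureVectors);
        void updateCover(byte[] coveredFVs);
        boolean reachedMaxSimpleDiphones();
    }

    /** Toy cover (assumption): each distinct byte value is one unit to cover. */
    static final class ToyCover implements Cover {
        private final boolean[] seen = new boolean[256];
        private final int target;
        private int covered = 0;

        ToyCover(int target) { this.target = target; }

        /** Counts the distinct values in the input that are not yet covered. */
        public double usefulnessOfFVs(byte[] featureVectors) {
            boolean[] counted = new boolean[256];
            int fresh = 0;
            for (byte b : featureVectors) {
                int v = b & 0xFF;
                if (!seen[v] && !counted[v]) { counted[v] = true; fresh++; }
            }
            return fresh;
        }

        public void updateCover(byte[] coveredFVs) {
            for (byte b : coveredFVs) {
                int v = b & 0xFF;
                if (!seen[v]) { seen[v] = true; covered++; }
            }
        }

        public boolean reachedMaxSimpleDiphones() { return covered >= target; }
    }

    /** Returns the indices of the selected sentences, in selection order. */
    static List<Integer> select(Cover cover, List<byte[]> sentences, int maxSentences) {
        List<Integer> selected = new ArrayList<>();
        boolean[] used = new boolean[sentences.size()];
        while (selected.size() < maxSentences && !cover.reachedMaxSimpleDiphones()) {
            int best = -1;
            double bestScore = 0.0;
            for (int i = 0; i < sentences.size(); i++) {
                if (used[i]) continue;
                double score = cover.usefulnessOfFVs(sentences.get(i));
                if (score > bestScore) { best = i; bestScore = score; }
            }
            if (best < 0) break; // no remaining sentence adds anything
            cover.updateCover(sentences.get(best));
            used[best] = true;
            selected.add(best);
        }
        return selected;
    }

    public static void main(String[] args) {
        List<byte[]> sentences = new ArrayList<>();
        sentences.add(new byte[] {1, 2});
        sentences.add(new byte[] {1, 2, 3});
        sentences.add(new byte[] {4});
        System.out.println(select(new ToyCover(4), sentences, 10));
    }
}
```

A real caller would pass the CoverageDefinition itself in place of ToyCover and would probably combine several of the reachedMax... checks as its stop criterion.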

usefulnessOfFVs

public double usefulnessOfFVs(byte[] featureVectors)
Get the usefulness of the given feature vectors. The usefulness of a feature vector is defined as the sum of its scores on all levels of the tree. On each level, the score is the product of the two weights of the node. The first weight reflects the frequency or inverted frequency, in the corpus, of the value associated with the node (frequencyWeight). The second weight reflects how much an instance of a feature vector containing that value is wanted in the cover (wantedWeight).

Parameters:
featureVectors - the feature vectors
Returns:
the usefulness
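
The definition above can be read as a sum, over the tree levels a feature vector passes through, of frequencyWeight times wantedWeight. The sketch below illustrates only that arithmetic. The Node class, the list-of-nodes representation and the usefulness helper are hypothetical; this page does not document how the tree is stored or how the weights are derived.

```java
import java.util.Arrays;
import java.util.List;

public class UsefulnessSketch {

    /** Hypothetical stand-in for one node of the cover-set tree. */
    static final class Node {
        final double frequencyWeight; // frequency or inverted frequency of the node's value
        final double wantedWeight;    // how much this value is still wanted in the cover

        Node(double frequencyWeight, double wantedWeight) {
            this.frequencyWeight = frequencyWeight;
            this.wantedWeight = wantedWeight;
        }
    }

    /**
     * Usefulness of one feature vector, given the nodes it visits
     * (one node per tree level): sum of frequencyWeight * wantedWeight.
     */
    static double usefulness(List<Node> nodesPerLevel) {
        double sum = 0.0;
        for (Node n : nodesPerLevel) {
            sum += n.frequencyWeight * n.wantedWeight;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Node> path = Arrays.asList(
                new Node(0.5, 1.0),
                new Node(0.25, 1.0),
                new Node(2.0, 0.5));
        // 0.5*1.0 + 0.25*1.0 + 2.0*0.5 = 1.75
        System.out.println(usefulness(path));
    }
}
```

Under this reading, a level whose wantedWeight has dropped to zero contributes nothing to the score, however frequent its value is in the corpus.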

getVectorArray

public byte[][] getVectorArray()

getVectorValue

public byte getVectorValue(byte[] vectors,
                           int vectorIndex,
                           int valueIndex)

writeCoverageBin

public void writeCoverageBin(java.lang.String filename)
                      throws java.lang.Exception
Write the cover sets to the given file

Parameters:
filename - the file to write to
Throws:
java.lang.Exception

readCoverageBin

public void readCoverageBin(java.lang.String filename,
                            FeatureDefinition featDef,
                            java.lang.String[] basenames)
                     throws java.lang.Exception
Read the cover sets from the given file

Parameters:
filename - the file containing the cover sets
featDef - the feature definition for the features
basenames - the list of basenames
Throws:
java.lang.Exception