java.lang.Object
  edu.isi.karma.modeling.semantictypes.SemanticTypeUtil
public class SemanticTypeUtil
This class provides various utility methods that can be used by the semantic typing module.
| Constructor Summary | |
|---|---|
| SemanticTypeUtil() | |
| Method Summary | |
|---|---|
| static java.util.ArrayList<java.lang.String> | getTrainingExamples(edu.isi.karma.rep.Worksheet worksheet, edu.isi.karma.rep.HNodePath path) Prepares and returns a collection of training examples to be used in semantic types training. |
| static void | identifyOutliers(edu.isi.karma.rep.Worksheet worksheet, java.lang.String predictedType, edu.isi.karma.rep.HNodePath path, edu.isi.karma.rep.metadata.Tag outlierTag, java.util.Map<CRFModelHandler.ColumnFeature,java.util.Collection<java.lang.String>> columnFeatures, CRFModelHandler crfModelHandler) Identifies the outlier nodes (table cells) for a given column. |
| static boolean | populateSemanticTypesUsingCRF(edu.isi.karma.rep.Worksheet worksheet, edu.isi.karma.rep.metadata.Tag outlierTag, CRFModelHandler crfModelHandler) Predicts semantic types for all the columns in a worksheet using the CRF modeling technique developed by Aman Goel. |
| static java.lang.String | removeNamespace(java.lang.String uri) Removes the namespace from a given URI. |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public SemanticTypeUtil()
| Method Detail |
|---|
public static java.util.ArrayList<java.lang.String> getTrainingExamples(edu.isi.karma.rep.Worksheet worksheet,
edu.isi.karma.rep.HNodePath path)
Parameters:
worksheet - The target worksheet
path - Path to the target column
public static boolean populateSemanticTypesUsingCRF(edu.isi.karma.rep.Worksheet worksheet,
edu.isi.karma.rep.metadata.Tag outlierTag,
CRFModelHandler crfModelHandler)
Parameters:
worksheet - The target worksheet
outlierTag - Tag object that stores outlier nodes
crfModelHandler - The CRF Model Handler to use
public static void identifyOutliers(edu.isi.karma.rep.Worksheet worksheet,
java.lang.String predictedType,
edu.isi.karma.rep.HNodePath path,
edu.isi.karma.rep.metadata.Tag outlierTag,
java.util.Map<CRFModelHandler.ColumnFeature,java.util.Collection<java.lang.String>> columnFeatures,
CRFModelHandler crfModelHandler)
Parameters:
worksheet - Target worksheet
predictedType - Type that was user-assigned or predicted by the CRF model for the given column. If the type for a given node differs from predictedType, the node is tagged as an outlier and its ID is stored in the outlier tag object
path - Path to the given column
outlierTag - The outlier tag object, which stores all the outlier node IDs
columnFeatures - Features, such as column name and table name, that the CRF model requires to predict the semantic type for a node (table cell)
crfModelHandler -
public static java.lang.String removeNamespace(java.lang.String uri)
Parameters:
uri - Input URI
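This page does not state how removeNamespace decides where the namespace ends. The following is a minimal standalone sketch of one common convention, assuming the local name is whatever follows the last '#' or '/' in the URI. The class NamespaceSketch and the method stripNamespace are hypothetical names used for illustration and are not part of SemanticTypeUtil; the actual method may behave differently.

```java
// Hypothetical stand-in for illustration; not the Karma implementation.
final class NamespaceSketch {
    private NamespaceSketch() {}

    /**
     * Returns the text after the last '#' or '/' in the URI.
     * Assumption: that separator marks the end of the namespace.
     * If neither character occurs, the input is returned unchanged.
     */
    static String stripNamespace(String uri) {
        int cut = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
        return cut < 0 ? uri : uri.substring(cut + 1);
    }

    public static void main(String[] args) {
        System.out.println(stripNamespace("http://example.org/ontology#Person")); // prints "Person"
        System.out.println(stripNamespace("http://example.org/ontology/City"));   // prints "City"
        System.out.println(stripNamespace("Person"));                             // prints "Person"
    }
}
```

Taking the later of the two separators handles both hash-style and slash-style ontology URIs with one rule.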
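The outlier rule described for identifyOutliers (a node whose type differs from predictedType is tagged as an outlier and its ID is recorded) can be illustrated with plain collections. This is a sketch under stated assumptions, not the Karma implementation: OutlierSketch and findOutliers are hypothetical names, the per-node types are supplied directly in a map rather than predicted through CRFModelHandler and columnFeatures, and the returned set stands in for the Tag object.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not the Karma implementation.
final class OutlierSketch {
    private OutlierSketch() {}

    /**
     * Collects the IDs of nodes whose type differs from the column's predicted type.
     * Assumption: typeByNodeId maps each node (table cell) ID to the type label
     * obtained for that node; in Karma this would come from the CRF model.
     */
    static Set<String> findOutliers(Map<String, String> typeByNodeId, String predictedType) {
        Set<String> outlierIds = new LinkedHashSet<>();
        for (Map.Entry<String, String> entry : typeByNodeId.entrySet()) {
            if (!predictedType.equals(entry.getValue())) {
                outlierIds.add(entry.getKey()); // stands in for storing the ID in the outlier tag
            }
        }
        return outlierIds;
    }

    public static void main(String[] args) {
        Map<String, String> typeByNodeId = new LinkedHashMap<>();
        typeByNodeId.put("N1", "City");
        typeByNodeId.put("N2", "City");
        typeByNodeId.put("N3", "PersonName");
        System.out.println(findOutliers(typeByNodeId, "City")); // prints [N3]
    }
}
```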