Data mining functions

The following table describes the supported data mining functions.

Function Description
CLUSTER_DETAILS Returns cluster details for each row in the selection as an XML string that describes the attributes of either the highest-probability cluster or the specified cluster_id.
CLUSTER_DISTANCE Returns the distance between a row and the centroid of either the highest-probability cluster or the specified cluster_id. The distance is returned as a BINARY_DOUBLE value.
CLUSTER_ID Returns the identifier of the highest-probability cluster for each row in the selection as a NUMBER.
CLUSTER_PROBABILITY Returns the probability that a row belongs to either the highest-probability cluster or the specified cluster_id. The probability is returned as a BINARY_DOUBLE value.
CLUSTER_SET Returns a set of cluster ID–probability pairs for each row in the selection as a varray of objects with fields CLUSTER_ID (NUMBER) and PROBABILITY (BINARY_DOUBLE).
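FEATURE_COMPARE Uses a Feature Extraction model to compare two documents, phrases, or attribute lists for similarity or dissimilarity. It can be applied to text, numeric, or categorical data with algorithms such as Singular Value Decomposition (SVD), Principal Component Analysis (PCA), Non-Negative Matrix Factorization (NMF), or Explicit Semantic Analysis (ESA).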
FEATURE_DETAILS Returns feature details for each row in the selection as an XML string describing the attributes of either the highest-value feature or the specified feature_id.
FEATURE_ID Returns the identifier of the highest-value feature for each row in the selection as a NUMBER.
FEATURE_SET Returns a set of feature ID–value pairs for each row in the selection as a varray of objects with fields FEATURE_ID and VALUE, both of type NUMBER.
FEATURE_VALUE Returns the value of either the highest-value feature or the specified feature_id for each row in the selection. The feature value is returned as a BINARY_DOUBLE.
ORA_DM_PARTITION_NAME Returns the name of the partition associated with the input row. If used on a non-partitioned model, returns NULL.
PREDICTION Returns a prediction for each row in the selection. The return data type depends on whether the model performs Regression, Classification, or Anomaly Detection.
PREDICTION_BOUNDS Applies a Generalized Linear Model (GLM) to predict a class or value for each row in the selection. Returns the upper and lower bounds of each prediction as a varray of objects with fields UPPER and LOWER.
PREDICTION_COST Returns the cost for each row in the selection, either for the lowest-cost class or the specified class. The cost is returned as a BINARY_DOUBLE.
PREDICTION_DETAILS Returns prediction details for each row in the selection as an XML string that describes the attributes of the prediction.
PREDICTION_PROBABILITY Returns the probability that a row belongs to either the highest-probability class or the specified class. The probability is returned as a BINARY_DOUBLE.
PREDICTION_SET Returns a set of predictions with probabilities or costs for each row in the selection as a varray of objects. Each object has fields PREDICTION_ID (with the target’s data type) and either PROBABILITY or COST (BINARY_DOUBLE).
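
As an illustration of how the functions in the table are typically called, the following sketch scores rows with a clustering model and a classification model. The model names (my_clus_model, my_class_model), the table customers_apply, and the column cust_id are hypothetical placeholders and do not come from this document. The sketch assumes both models have already been built and that the table contains the columns the models expect.

```sql
-- Hypothetical names: my_clus_model, my_class_model, customers_apply, cust_id.
-- USING * passes every column of the row source to the model as a predictor.

-- Most likely cluster for each row, with the probability of membership.
SELECT cust_id,
       CLUSTER_ID(my_clus_model USING *)          AS clus_id,
       CLUSTER_PROBABILITY(my_clus_model USING *) AS clus_prob
  FROM customers_apply;

-- CLUSTER_SET returns a varray of objects; TABLE() unnests it into
-- one row per (CLUSTER_ID, PROBABILITY) pair.
SELECT t.cust_id, s.cluster_id, s.probability
  FROM (SELECT cust_id,
               CLUSTER_SET(my_clus_model USING *) AS cset
          FROM customers_apply) t,
       TABLE(t.cset) s;

-- Predicted class for each row, with the probability of that class.
SELECT cust_id,
       PREDICTION(my_class_model USING *)             AS predicted_class,
       PREDICTION_PROBABILITY(my_class_model USING *) AS class_prob
  FROM customers_apply;
```

The other functions that return a varray (FEATURE_SET, PREDICTION_SET) can be unnested with TABLE() in the same way, while the functions that return a scalar or an XML string can be selected directly, as in the first and third queries.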