Bases: object
Canonical format for providing a dataset to the tree-builder.
Constructors are __init__ and fromIterable.
Bases: object
Represents a field of the dataset; usually created by a Dataset constructor.
Dataset.Field objects may be in one of two states: Numpy and Python. The Dataset constructors produce Dataset.Fields in their Numpy representation, which is the form the tree-builder requires. In the Numpy representation, categorical string data are stored as integers from 0 to N-1, where N is the number of unique input strings and each distinct integer stands for a distinct input string. Individual values can be translated in either direction through the intToStr and strToInt dictionaries, or the whole array can be converted to a Pythonic form with the toPython method.
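The integer encoding described above can be illustrated with a small standalone sketch. It does not call titus: `encode_strings` is a hypothetical helper, a plain list stands in for the Numpy array, and the order in which integers are assigned (first-seen order here) is an assumption, since the documentation only states that the codes run from 0 to N-1.

```python
def encode_strings(values):
    """Hypothetical helper: map each distinct string to an integer in 0..N-1.

    Integers are assigned in first-seen order, which is an assumption;
    the real library may order them differently.
    """
    strToInt = {}
    for v in values:
        if v not in strToInt:
            strToInt[v] = len(strToInt)
    # Invert the mapping so codes can be turned back into strings.
    intToStr = {i: s for s, i in strToInt.items()}
    codes = [strToInt[v] for v in values]
    return codes, intToStr, strToInt


raw = ["red", "green", "red", "blue", "green"]
codes, intToStr, strToInt = encode_strings(raw)

# Decoding through intToStr recovers the original strings.
decoded = [intToStr[c] for c in codes]
```

With three distinct strings the codes use only 0, 1 and 2, and the two dictionaries are inverses of each other.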
Creates a new Dataset.Field from this one by applying a boolean-valued Numpy array of the same length.
The new Dataset.Field is independent of the old one (this is a purely functional method).
Assumes that the Dataset.Field is currently in a Numpy representation.
Parameters: | selection (1-d Numpy array of bool) – data points to select |
---|---|
Return type: | 1-d Numpy array |
Returns: | subset of the original self.data |
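The selection step can be sketched with ordinary Numpy boolean indexing. This is not the library's implementation, only an illustration of the operation it describes; the array contents and variable names are made up for the example.

```python
import numpy as np

# Stand-in for a field's data in its Numpy representation.
data = np.array([1.5, 2.5, 3.5, 4.5])

# Boolean mask of the same length: True marks the data points to keep.
selection = np.array([True, False, True, False])

# Boolean indexing returns a new array (a copy), not a view.
subset = data[selection]

# Modifying the subset leaves the original data untouched.
subset[0] = -1.0
```

Because boolean indexing copies the selected elements, the result is independent of the original array, which matches the purely functional behaviour described above.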
Changes this field into a Numpy representation in-place (destructively replaces the old representation).
Changes this field into a Python representation in-place (destructively replaces the old representation).
Constructor for Dataset that takes a Python iterable (rows) of iterables (columns).
Every row must have the same number of fields, and each field must have the same type (numbers.Real or basestring) in every row.
Parameters: |
|
---|---|
Return type: | titus.producer.cart.Dataset |
Returns: | a dataset |
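The row requirements can be checked before the data are handed to the constructor. The sketch below is a hypothetical pre-check, not part of titus: `check_rows` is an invented name, `str` stands in for Python 2's `basestring`, and the reading that types must agree column by column is an assumption.

```python
import numbers


def check_rows(rows):
    """Hypothetical helper: verify that all rows share one layout.

    Returns the per-column kinds ("number" or "string") of the first row,
    or raises if a later row differs in length or in any column's kind.
    """
    def kind(value):
        # Note: bool is a numbers.Real subclass, so it counts as a number here.
        if isinstance(value, numbers.Real):
            return "number"
        if isinstance(value, str):  # stands in for Python 2's basestring
            return "string"
        raise TypeError("unsupported field value: %r" % (value,))

    expected = None
    for index, row in enumerate(rows):
        kinds = [kind(v) for v in row]
        if expected is None:
            expected = kinds
        elif kinds != expected:
            raise ValueError(
                "row %d does not match the first row's field types" % index
            )
    return expected


rows = [(1.0, "a", 3), (2.5, "b", 4)]
kinds = check_rows(rows)
```

Rows that pass this check satisfy the stated constraints under the assumptions above; whether the constructor itself performs an equivalent check is not stated in this documentation.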