titus.genpy.PFAEngine

class titus.genpy.PFAEngine

Bases: object

Base class for a Titus scoring engine.

Create instances using one of PFAEngine’s static methods (fromAst, fromJson, fromPmml, or fromYaml), then call begin once, action once for each datum in the data stream, and end once (if the stream ever ends). The remaining methods are for

examining the scoring engine (config, call graph),

handling log output or emit output with callbacks, and

taking snapshots of the scoring engine’s current state.

Examples:

Load a PFA file as a scoring engine. Note the trailing comma after engine, which unpacks the single scoring engine from the list this function returns.

import json
from titus.genpy import PFAEngine
engine, = PFAEngine.fromJson(json.load(open("myModel.pfa")))

Assuming (and verifying) that method is map, run it over an Avro data stream.

assert engine.config.method == "map"

# Avro files are binary, so open the input in "rb" mode
inputDataStream = engine.avroInputIterator(open("inputData.avro", "rb"))
outputDataStream = engine.avroOutputDataFileWriter("outputData.avro")

engine.begin()
for datum in inputDataStream:
    outputDataStream.append(engine.action(datum))
engine.end()
outputDataStream.close()

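Handle engines whose method is emit, which pass results to an emit callback instead of returning them from action (map and fold engines are handled identically to each other).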

if engine.config.method == "emit":
    def emit(x):
        outputDataStream.append(x)
    engine.emit = emit
    engine.begin()
    for datum in inputDataStream:
        engine.action(datum)
    engine.end()

else:
    engine.begin()
    for datum in inputDataStream:
        outputDataStream.append(engine.action(datum))
    engine.end()
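
The two branches above differ only in where the output comes from, so they can be folded into one helper. The following is a minimal sketch and not part of Titus: run_engine, FakeConfig, and FakeEngine are hypothetical names introduced here so the example runs without a PFA file. A real engine is assumed to expose the begin, action, end, emit, and config.method members described above.

```python
class FakeConfig(object):
    """Hypothetical stand-in for engine.config."""
    def __init__(self, method):
        self.method = method

class FakeEngine(object):
    """Hypothetical stand-in for a scoring engine; it doubles every input."""
    def __init__(self, method):
        self.config = FakeConfig(method)
        self.emit = None
    def begin(self):
        pass
    def action(self, datum):
        if self.config.method == "emit":
            self.emit(datum * 2)   # emit engines report results via the callback
            return None
        return datum * 2           # map and fold engines return their result
    def end(self):
        pass

def run_engine(engine, data):
    """Call begin, action for each datum, then end; collect every output in a list."""
    outputs = []
    is_emit = engine.config.method == "emit"
    if is_emit:
        engine.emit = outputs.append
    engine.begin()
    for datum in data:
        result = engine.action(datum)
        if not is_emit:
            outputs.append(result)
    engine.end()
    return outputs

map_outputs = run_engine(FakeEngine("map"), [1, 2, 3])
emit_outputs = run_engine(FakeEngine("emit"), [1, 2, 3])
```

With a real engine, outputs.append could be swapped for outputDataStream.append to write results to an Avro file instead of a list.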

Take a snapshot of a changing model and write it as a new PFA file.

open("snapshot.pfa", "w").write(engine.snapshot().toJson(lineNumbers=False))

Data format:

Data passed to action or returned by action must have a particular form, which depends on its PFA type:

null: Python None

boolean: Python True or False

int: any Python int or long (no Numpy numbers, for instance)

long: any Python int or long

float: any Python int, long, or float

double: any Python int, long, or float

string: any Python string or unicode

bytes: any Python string

array(X): any Python list or tuple of X

map(X): any Python dict of X

enum: Python string or unicode of one of the symbols in this enumeration

fixed: Python string with length specified by this fixed-length type

record: Python dict with key-value pairs for all fields required by this record

union: if null, a Python None; otherwise, a dict with one key-value pair representing the type and value. For example, None, {"int": 12}, {"double": 12}, {"fully.qualified.record": {"field1": 1, "field2": 2}}, {"array": [1, 2, 3]}, etc.

None of the types above are compiled (since this is Python), so anything can be directly created by the user.
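
Because these are plain Python objects, input for a union type can be built by hand. The sketch below produces the tagged form listed above; tag_union is a hypothetical helper written for this illustration, not a Titus function.

```python
def tag_union(type_name, value):
    """Wrap value in the single-key dict form used for non-null union members."""
    if value is None:
        return None                # the null branch of a union is a bare None
    return {type_name: value}

as_int = tag_union("int", 12)
as_record = tag_union("fully.qualified.record", {"field1": 1, "field2": 2})
as_null = tag_union("null", None)
```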

All of these types are immutable in PFA, but list and dict are mutable in Python. If you modify them, the behavior of the PFA engine is undefined and likely to be wrong. Do not change these objects in place!
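
One way to respect this rule is to copy a value before changing it. A minimal sketch using only the standard library; the datum record is made-up example data standing in for a value passed to or returned by action.

```python
import copy

# Made-up example record standing in for engine input or output
datum = {"features": [1.0, 2.0, 3.0], "label": "a"}

modified = copy.deepcopy(datum)    # independent copy, nested list included
modified["features"].append(4.0)   # the original datum is left untouched
```

A shallow copy such as dict(datum) would not be enough here, because the nested list would still be shared with the original.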

avroInputIterator(inputStream, interpreter='avro')

Create a generator over Avro-serialized input data.

Parameters:	inputStream (open filehandle) – serialized data
Return type:	`avro.datafile.DataFileReader`
Returns:	generator of objects suitable for the `action` method

avroOutputDataFileWriter(fileName)

Create an output stream for Avro-serializing scoring engine output.

Return values from the action method (or outputs captured by an emit callback) are suitable for writing to this stream.

Parameters:	fileName (string) – name of the file that will be overwritten by Avro bytes
Return type:	`avro.datafile.DataFileWriter`
Returns:	an output stream with an `append` method for appending output data objects

callDepth(fcnName, exclude=None, startingDepth=0)

Determine call depth of a function by traversing the callGraph.

Parameters:	fcnName (string) – name of function to look up
	exclude (set of string) – set of functions to exclude
	startingDepth (integer) – used by recursion to count
Return type:	integer or floating-point inf
Returns:	number representing call depth, with positive infinity (which is a `float`) as a possible result
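
To make the two kinds of result concrete, here is a sketch of a depth computation over a call graph stored as a dict from function name to the set of functions it calls. Both that representation and the call_depth function are assumptions made for illustration and are not taken from the Titus source, so the exact depths a real engine reports may differ. The point is only that an acyclic chain gives an integer while a cycle gives positive infinity, which is a float.

```python
def call_depth(call_graph, fcn_name, visiting=frozenset()):
    """Longest chain of calls below fcn_name; float('inf') if a cycle is reachable."""
    if fcn_name in visiting:
        return float("inf")            # fcn_name is already on the current call path
    callees = call_graph.get(fcn_name, set())
    if not callees:
        return 0                       # leaf: calls nothing
    on_path = visiting | {fcn_name}
    return 1 + max(call_depth(call_graph, f, on_path) for f in callees)

# Hypothetical call graph: top calls middle, middle calls leaf, loop calls itself
graph = {
    "top": {"middle"},
    "middle": {"leaf"},
    "leaf": set(),
    "loop": {"loop"},
}

top_depth = call_depth(graph, "top")
loop_depth = call_depth(graph, "loop")
```

Callers that compare the result against a limit should be prepared for the float case, for example by testing `depth == float("inf")` before using it as an integer.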

calledBy(fcnName, exclude=None)

Determine which functions are called by fcnName by traversing the callGraph backward.

Parameters:	fcnName (string) – name of function to look up
	exclude (set of string) – set of functions to exclude
Return type:	set of string
Returns:	set of functions that call `fcnName`

static fromAst(engineConfig, options=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Create a collection of instances of this scoring engine from a PFA abstract syntax tree (titus.pfaast.EngineConfig).

Parameters:	engineConfig (titus.pfaast.EngineConfig) – a parsed, interpreted PFA document, i.e. produced by `titus.reader.jsonToAst`
	options (dict of Pythonized JSON) – options that override those found in the PFA document
	version (string) – PFA version number as a “major.minor.release” string
	sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass `None` to limit sharing to instances of a single PFA file
	multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
	style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
	debug (bool) – if `True`, print the Python code generated by this PFA document before evaluating
Return type:	list of PFAEngine
Returns:	a list of scoring engine instances

static fromJson(src, options=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Create a collection of instances of this scoring engine from a JSON-formatted PFA file.

Parameters:	src (JSON string or Pythonized JSON) – a PFA document in JSON-serialized form; may be a literal JSON string or the kind of Python structure that `json.loads` creates from a JSON string
	options (dict of Pythonized JSON) – options that override those found in the PFA document
	version (string) – PFA version number as a “major.minor.release” string
	sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass `None` to limit sharing to instances of a single PFA file
	multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
	style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
	debug (bool) – if `True`, print the Python code generated by this PFA document before evaluating
Return type:	list of PFAEngine
Returns:	a list of scoring engine instances

static fromPmml(src, pmmlOptions=None, pfaOptions=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Translate some types of PMML documents into PFA and create a collection of scoring engine instances.

Parameters:	src (string) – a PMML document in XML-serialized form; must be a string
	pmmlOptions (dict) – directives for interpreting the PMML document
	pfaOptions (dict) – options that override those found in the PFA document
	version (string) – PFA version number as a “major.minor.release” string
	sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass `None` to limit sharing to instances of a single PFA file
	multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
	style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
	debug (bool) – if `True`, print the Python code generated by this PFA document before evaluating
Return type:	list of PFAEngine
Returns:	a list of scoring engine instances

static fromYaml(src, options=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Create a collection of instances of this scoring engine from a YAML-formatted PFA file.

Parameters:	src (string) – a PFA document in YAML-serialized form; must be a string
	options (dict of Pythonized JSON) – options that override those found in the PFA document
	version (string) – PFA version number as a “major.minor.release” string
	sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass `None` to limit sharing to instances of a single PFA file
	multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
	style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
	debug (bool) – if `True`, print the Python code generated by this PFA document before evaluating
Return type:	list of PFAEngine
Returns:	a list of scoring engine instances

hasRecursive(fcnName)

Determine if the call depth of a function is infinite.

Parameters:	fcnName (string) – name of function to look up
Return type:	bool
Returns:	`True` if the function can eventually call itself through a function that it calls, `False` otherwise

hasSideEffects(fcnName)

Determine if a function modifies the scoring engine’s persistent state.

Parameters:	fcnName (string) – name of function to look up
Return type:	bool
Returns:	`True` if the function can eventually call `(cell-to)` or `(pool-to)` on any cell or pool, `False` otherwise

isRecursive(fcnName)

Determine if a function is directly recursive.

Parameters:	fcnName (string) – name of function to look up
Return type:	bool
Returns:	`True` if the function directly calls itself, `False` otherwise

snapshot()

Take a snapshot of the entire scoring engine (all cells and pools) and represent it as an abstract syntax tree that can be used to make new scoring engines.

Note that you can call toJson on the EngineConfig to get a string that can be written to a PFA file.
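
Putting those two steps together, a snapshot can be written to disk with a small helper. This is a sketch: write_snapshot, FakeAst, and FakeEngine are hypothetical names that exist only so the example runs on its own. The one assumption about a real engine is that snapshot().toJson(lineNumbers=False) returns a string, as in the example near the top of this page.

```python
import os
import tempfile

class FakeAst(object):
    """Hypothetical stand-in for the EngineConfig returned by snapshot()."""
    def toJson(self, lineNumbers=True):
        return '{"input": "double", "output": "double", "action": ["input"]}'

class FakeEngine(object):
    """Hypothetical stand-in for a scoring engine."""
    def snapshot(self):
        return FakeAst()

def write_snapshot(engine, path):
    """Serialize the engine's current state and write it to path."""
    text = engine.snapshot().toJson(lineNumbers=False)
    with open(path, "w") as out:       # "w" mode: the file is created or overwritten
        out.write(text)
    return text

# Write into a fresh temporary directory so the example leaves no stray files behind
snapshot_path = os.path.join(tempfile.mkdtemp(), "snapshot.pfa")
written = write_snapshot(FakeEngine(), snapshot_path)
```

The with block closes the file as soon as the write finishes, so the snapshot is fully flushed before anything else tries to load it.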
