titus.genpy.PFAEngine

class titus.genpy.PFAEngine

Bases: object

Base class for a Titus scoring engine.

Create instances using one of PFAEngine’s static methods, then call begin once, action once for each datum in the data stream, and end once (if the stream ever ends). The remaining methods are for

  • examining the scoring engine (config, call graph),
  • handling log output or emit output with callbacks, and
  • taking snapshots of the scoring engine’s current state.

Examples:

Load a PFA file as a scoring engine. Note the trailing comma after engine: it unpacks the single scoring engine from the list this function returns.

import json
from titus.genpy import PFAEngine
engine, = PFAEngine.fromJson(json.load(open("myModel.pfa")))

Assuming (and verifying) that method is map, run it over an Avro data stream.

assert(engine.config.method == "map")

inputDataStream = engine.avroInputIterator(open("inputData.avro"))
outputDataStream = engine.avroOutputDataFileWriter("outputData.avro")

engine.begin()
for datum in inputDataStream:
    outputDataStream.append(engine.action(datum))
engine.end()
outputDataStream.close()

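Handle engines whose method is emit separately: they deliver output through the emit callback rather than through the return value of action. Engines whose method is map or fold are both driven the same way, by collecting the return value of action.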

if engine.config.method == "emit":
    def emit(x):
        outputDataStream.append(x)
    engine.emit = emit
    engine.begin()
    for datum in inputDataStream:
        engine.action(datum)
    engine.end()

else:
    engine.begin()
    for datum in inputDataStream:
        outputDataStream.append(engine.action(datum))
    engine.end()
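The branching above can be wrapped in a small helper. The sketch below is illustrative only: run_engine, StubMapEngine, and StubEmitEngine are hypothetical names, and the stubs merely imitate the begin, action, end, emit, and config.method surface described here so that the example runs without Titus or an Avro file.

```python
class _Config(object):
    """Minimal stand-in for engine.config (only the method attribute)."""
    def __init__(self, method):
        self.method = method


class StubMapEngine(object):
    """Hypothetical stand-in for a method = map engine: doubles its input."""
    def __init__(self):
        self.config = _Config("map")

    def begin(self):
        pass

    def action(self, datum):
        return datum * 2

    def end(self):
        pass


class StubEmitEngine(object):
    """Hypothetical stand-in for a method = emit engine: emits even inputs only."""
    def __init__(self):
        self.config = _Config("emit")
        self.emit = lambda x: None

    def begin(self):
        pass

    def action(self, datum):
        if datum % 2 == 0:
            self.emit(datum)

    def end(self):
        pass


def run_engine(engine, data, sink):
    """Drive one engine over data, passing every output to sink."""
    is_emit = engine.config.method == "emit"
    if is_emit:
        # Emit engines deliver output through the callback.
        engine.emit = sink
    engine.begin()
    for datum in data:
        result = engine.action(datum)
        if not is_emit:
            # Map and fold engines deliver output as the return value.
            sink(result)
    engine.end()


map_out, emit_out = [], []
run_engine(StubMapEngine(), [1, 2, 3, 4], map_out.append)
run_engine(StubEmitEngine(), [1, 2, 3, 4], emit_out.append)
```

With a real engine, sink would typically be outputDataStream.append.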

Take a snapshot of a changing model and write it as a new PFA file.

with open("snapshot.pfa", "w") as snapshotFile:
    snapshotFile.write(engine.snapshot().toJson(lineNumbers=False))

Data format:

Data passed to action or accepted from action has to satisfy a particular form. That form is:

  • null: Python None
  • boolean: Python True or False
  • int: any Python int or long (no Numpy numbers, for instance)
  • long: any Python int or long
  • float: any Python int, long, or float
  • double: any Python int, long, or float
  • string: any Python string or unicode
  • bytes: any Python string
  • array(X): any Python list or tuple of X
  • map(X): any Python dict of X
  • enum: Python string or unicode of one of the symbols in this enumeration
  • fixed: Python string with length specified by this fixed-length type
  • record: Python dict with key-value pairs for all fields required by this record
  • union: if null, a Python None; otherwise, a dict with one key-value pair representing the type and value. For example, None, {"int": 12}, {"double": 12}, {"fully.qualified.record": {"field1": 1, "field2": 2}}, {"array": [1, 2, 3]}, etc.
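As an illustration of the forms listed above, the sketch below builds a record datum and a few union values from plain Python objects. The helper tag_union and the record field names are hypothetical, introduced only for this example; they are not part of Titus.

```python
def tag_union(type_name, value):
    """Wrap a non-null union value as a one-entry dict keyed by its type name.

    Hypothetical helper: a null member of a union stays a bare None.
    """
    if value is None:
        return None
    return {type_name: value}


# A record is a dict holding every required field (field names are made up).
record_datum = {"field1": 1, "field2": 2}

# Union values: None for null, otherwise {type name: value}.
union_values = [
    tag_union("null", None),
    tag_union("int", 12),
    tag_union("double", 12),
    tag_union("fully.qualified.record", record_datum),
    tag_union("array", [1, 2, 3]),
]
```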

None of the types above are compiled (since this is Python), so anything can be directly created by the user.

All of these types are immutable in PFA, but Python's list and dict are mutable. If you modify one of these objects after passing it to or receiving it from the engine, the behavior of the PFA engine is undefined and likely to be wrong. Do not change these objects in place!
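One way to respect this rule is to copy a value before changing it. The sketch below uses the standard library's copy.deepcopy; the variable result stands in for something returned by action and holds made-up data.

```python
import copy

# Stand-in for a value handed back by the engine (illustrative data only).
result = {"scores": [0.1, 0.7, 0.2], "label": "b"}

# Work on an independent copy so the engine's object is never touched.
mine = copy.deepcopy(result)
mine["scores"].append(0.0)
mine["label"] = "edited"
```

A deep copy is used rather than a shallow one because nested lists and dicts would otherwise still be shared with the original.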

avroInputIterator(inputStream, interpreter='avro')

Create a generator over Avro-serialized input data.

Parameters: inputStream (open filehandle) – serialized data
Return type: avro.datafile.DataFileReader
Returns: generator of objects suitable for the action method

avroOutputDataFileWriter(fileName)

Create an output stream for Avro-serializing scoring engine output.

Return values from the action method (or outputs captured by an emit callback) are suitable for writing to this stream.

Parameters: fileName (string) – name of the file that will be overwritten by Avro bytes
Return type: avro.datafile.DataFileWriter
Returns: an output stream with an append method for appending output data objects

callDepth(fcnName, exclude=None, startingDepth=0)

Determine call depth of a function by traversing the callGraph.

Parameters:
  • fcnName (string) – name of function to look up
  • exclude (set of string) – set of functions to exclude
  • startingDepth (integer) – used by recursion to count
Return type:

integer or floating-point inf

Returns:

number representing call depth, with positive infinity (which is a float) as a possible result
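To make the integer-or-infinity result concrete, here is a minimal sketch of a depth computation over a call graph. It assumes the graph is a dict mapping each function name to the set of names it calls; that representation, the function call_depth, and the sample graph are assumptions for illustration and are not taken from Titus's own implementation.

```python
def call_depth(call_graph, fcn_name, exclude=None, starting_depth=0):
    """Longest chain of calls below fcn_name; float('inf') if a cycle is reachable."""
    if exclude is None:
        exclude = set()
    if fcn_name in exclude:
        # fcn_name is already on the current path, so a cycle was found.
        return float("inf")
    if fcn_name not in call_graph:
        # Names missing from the graph are treated as calling nothing.
        return starting_depth
    on_path = exclude | {fcn_name}
    deepest = starting_depth
    for callee in call_graph[fcn_name]:
        depth = call_depth(call_graph, callee, on_path, starting_depth + 1)
        deepest = max(deepest, depth)
    return deepest


# Made-up call graph: u.a calls u.b, which calls a primitive; u.loop calls itself.
graph = {
    "u.a": {"u.b", "+"},
    "u.b": {"*"},
    "u.loop": {"u.loop"},
}
```

Because float('inf') compares greater than every integer, max propagates an infinite depth upward without special handling.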

calledBy(fcnName, exclude=None)

Determine which functions are called by fcnName by traversing the callGraph backward.

Parameters:
  • fcnName (string) – name of function to look up
  • exclude (set of string) – set of functions to exclude
Return type:

set of string

Returns:

set of functions that call fcnName

static fromAst(engineConfig, options=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Create a collection of instances of this scoring engine from a PFA abstract syntax tree (titus.pfaast.EngineConfig).

Parameters:
  • engineConfig (titus.pfaast.EngineConfig) – a parsed, interpreted PFA document, i.e. produced by titus.reader.jsonToAst
  • options (dict of Pythonized JSON) – options that override those found in the PFA document
  • version (string) – PFA version number as a “major.minor.release” string
  • sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass None to limit sharing to instances of a single PFA file
  • multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
  • style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
  • debug (bool) – if True, print the Python code generated by this PFA document before evaluating
Return type:

list of PFAEngine

Returns:

a list of scoring engine instances

static fromJson(src, options=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Create a collection of instances of this scoring engine from a JSON-formatted PFA file.

Parameters:
  • src (JSON string or Pythonized JSON) – a PFA document in JSON-serialized form; may be a literal JSON string or the kind of Python structure that json.loads creates from a JSON string
  • options (dict of Pythonized JSON) – options that override those found in the PFA document
  • version (string) – PFA version number as a “major.minor.release” string
  • sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass None to limit sharing to instances of a single PFA file
  • multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
  • style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
  • debug (bool) – if True, print the Python code generated by this PFA document before evaluating
Return type:

list of PFAEngine

Returns:

a list of scoring engine instances
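The two accepted forms of src can be illustrated without loading a file. The tiny document below (add 1 to a double) is an assumed minimal example, not taken from this page; the names pfa_dict and pfa_string are likewise illustrative.

```python
import json

# Pythonized JSON: the kind of structure json.loads would produce.
pfa_dict = {
    "input": "double",
    "output": "double",
    "action": [{"+": ["input", 1]}],
}

# Literal JSON string form of the same document.
pfa_string = json.dumps(pfa_dict)

# With Titus installed, either form should be usable as src (not run here):
#   engine, = PFAEngine.fromJson(pfa_dict)
#   engine, = PFAEngine.fromJson(pfa_string)
```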

static fromPmml(src, pmmlOptions=None, pfaOptions=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Translate some types of PMML documents into PFA and create a collection of scoring engine instances.

Parameters:
  • src (string) – a PMML document in XML-serialized form; must be a string
  • pmmlOptions (dict) – directives for interpreting the PMML document
  • pfaOptions (dict) – options that override those found in the PFA document
  • version (string) – PFA version number as a “major.minor.release” string
  • sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass None to limit sharing to instances of a single PFA file
  • multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
  • style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
  • debug (bool) – if True, print the Python code generated by this PFA document before evaluating
Return type:

list of PFAEngine

Returns:

a list of scoring engine instances

static fromYaml(src, options=None, version=None, sharedState=None, multiplicity=1, style='pure', debug=False)

Create a collection of instances of this scoring engine from a YAML-formatted PFA file.

Parameters:
  • src (string) – a PFA document in YAML-serialized form; must be a string
  • options (dict of Pythonized JSON) – options that override those found in the PFA document
  • version (string) – PFA version number as a “major.minor.release” string
  • sharedState (titus.genpy.SharedState) – external state for shared cells and pools to initialize from and modify; pass None to limit sharing to instances of a single PFA file
  • multiplicity (positive integer) – number of instances to return (default is 1; a single-item collection)
  • style (string) – style of scoring engine; only one currently supported: “pure” for pure-Python
  • debug (bool) – if True, print the Python code generated by this PFA document before evaluating
Return type:

list of PFAEngine

Returns:

a list of scoring engine instances

hasRecursive(fcnName)

Determine if the call depth of a function is infinite.

Parameters: fcnName (string) – name of function to look up
Return type: bool
Returns: True if the function can eventually call itself through a function that it calls, False otherwise

hasSideEffects(fcnName)

Determine if a function modifies the scoring engine’s persistent state.

Parameters: fcnName (string) – name of function to look up
Return type: bool
Returns: True if the function can eventually call (cell-to) or (pool-to) on any cell or pool.

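The idea of "eventually calling" a state-modifying function can be sketched as a reachability check. The code below assumes a call graph stored as a dict from function name to the set of names it calls, with writes to cells and pools appearing under the names (cell-to) and (pool-to) as in the Returns line; the helper names and the sample graph are hypothetical.

```python
def reachable(call_graph, fcn_name):
    """All names that fcn_name can eventually call (iterative traversal)."""
    seen = set()
    stack = [fcn_name]
    while stack:
        current = stack.pop()
        for callee in call_graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def has_side_effects(call_graph, fcn_name):
    """True if fcn_name can eventually reach (cell-to) or (pool-to)."""
    return bool(reachable(call_graph, fcn_name) & {"(cell-to)", "(pool-to)"})


# Made-up call graph: u.update writes a cell indirectly, u.pure never does.
graph = {
    "u.update": {"u.helper"},
    "u.helper": {"(cell-to)", "+"},
    "u.pure": {"+", "*"},
}
```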
isRecursive(fcnName)

Determine if a function is directly recursive.

Parameters: fcnName (string) – name of function to look up
Return type: bool
Returns: True if the function directly calls itself, False otherwise

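The difference between direct recursion and recursion through other functions can be shown on a small call graph. This sketch assumes a dict from function name to the set of names it calls; is_recursive and reaches_itself are hypothetical helpers, and reaches_itself follows the wording of the hasRecursive Returns line ("can eventually call itself") rather than any actual Titus code.

```python
def is_recursive(call_graph, fcn_name):
    """True if fcn_name appears among its own direct callees."""
    return fcn_name in call_graph.get(fcn_name, ())


def reaches_itself(call_graph, fcn_name):
    """True if following calls from fcn_name can lead back to fcn_name."""
    seen = set()
    stack = list(call_graph.get(fcn_name, ()))
    while stack:
        current = stack.pop()
        if current == fcn_name:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(call_graph.get(current, ()))
    return False


# Made-up call graph: u.direct calls itself; u.ping and u.pong call each other.
graph = {
    "u.direct": {"u.direct"},
    "u.ping": {"u.pong"},
    "u.pong": {"u.ping"},
    "u.leaf": {"+"},
}
```

Here u.ping is not directly recursive, yet it can reach itself through u.pong.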
snapshot()

Take a snapshot of the entire scoring engine (all cells and pools) and represent it as an abstract syntax tree that can be used to make new scoring engines.

Note that you can call toJson on the EngineConfig to get a string that can be written to a PFA file.