Wednesday, September 11, 2013

Improvement in _extract_pyvals

Currently _extract_pyvals builds the error object with *errobj = Py_BuildValue("NO", PyBytes_FromString(name), retval); and that call alone takes about 10% of the time of every operation. It is better to avoid packing the string: instead, assign the pointers to a struct and stash it in thread-local storage. Then no allocation is needed for errobj on every call, which matters because the object almost always goes unused.


The error object holds the name of the ufunc and the callback pointer.
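typedef struct {
    PyObject_HEAD      /* required because the struct is created with PyObject_New */
    char *name;        /* name of the ufunc */
    PyObject* retval;  /* callback pointer */
} PyErrObject;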

Fast path for consecutive identical operations: it is better to cache the error object.
errvalues = (PyErrObject *)PyDict_GetItem(thedict, PyUFunc_PYVALS_ERROR);
if (errvalues == NULL) {
    /* First call on this thread: allocate once and cache in the dict. */
    errvalues = PyObject_New(PyErrObject, &PyErrObject_Type);
    if (errvalues == NULL) {
        return -1;
    }
    PyDict_SetItem(thedict, PyUFunc_PYVALS_ERROR, (PyObject *)errvalues);
}
/* Cached or newly created, refresh the fields for the current ufunc. */
errvalues->name = name;
errvalues->retval = retval;
*errobj = (PyObject *)errvalues;
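The same idea can be sketched without the Python C API. This is only an illustration under assumptions, not the code from the PR: err_info and get_err_info are hypothetical names, retval is a plain void pointer so the sketch compiles without Python.h, and C11 _Thread_local stands in for the thread-local dict used above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyErrObject: only the two pointers are kept. */
typedef struct {
    const char *name;   /* name of the operation, not copied */
    void *retval;       /* callback pointer, not owned */
} err_info;

/* One slot per thread, reused by every call (C11 thread-local storage). */
static _Thread_local err_info cached_err;

/* Return this thread's err_info, refreshed for the current operation.
   Only two pointer stores happen; nothing is allocated on any call. */
static err_info *get_err_info(const char *name, void *retval)
{
    assert(name != NULL);
    cached_err.name = name;
    cached_err.retval = retval;
    return &cached_err;
}
```

Every call on a given thread returns the same address, so the per-call cost is two pointer assignments instead of building a bytes object and a tuple. The trade-off is that the returned struct is only valid until the next call on that thread.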


Callgraph PyErrObject
x = np.asarray([5.0,1.0]); x+x
The time spent in _extract_pyvals drops from 9.3% to 2%.


The PR for this enhancement is #3686.
