PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Sunday, October 9, 2022

[FIXED] How to calculate moving / running / rolling arbitrary function (e.g. kurtosis & skewness) using NumPy / SciPy

 October 09, 2022     numpy, python, scipy, statistics     No comments   

Issue

I am working on the time-series data. To get features from data I have to calculate moving mean, median, mode, slop, kurtosis, skewness etc. I am familiar with scipy.stat which provides an easy way to calculate these quantities for straight calculation. But for the moving/running part, I have explored the whole internet and got nothing.

Surprisingly moving mean, median and mode are very easy to calculate with numpy. Unfortunately, there is no built-in function for calculating kurtosis and skewness. If someone can help, how to calculate moving kurtosis and skewness with scipy? Many thanks


Solution

Pandas offers a DataFrame.rolling() method which can be used, in combination with its Rolling.apply() method (i.e. df.rolling().apply()) to apply an arbitrary function to the specified rolling window.


If you are looking for NumPy-based solution, you could use FlyingCircus Numeric (disclaimer: I am the main author of it).

There, you could find the following:

  1. flyingcircus_numeric.running_apply(): can apply any function to a 1D array and supports weights, but it is slow;
  2. flyingcircus_numeric.moving_apply(): can apply any function supporting a axis: int parameter to a 1D array and supports weights, and it is fast (but memory-hungry);
  3. flyingcircus_numeric.rolling_apply_nd(): can apply any function supporting a axis: int|Sequence[int] parameter to any ND array and it is fast (and memory-efficient), but it does not support weights.

Based on your requirements, I would suggest to use rolling_apply_nd(), e.g.:

import numpy as np
import scipy as sp
import flyingcircus_numeric as fcn

import scipy.stats


NUM = 30
arr = np.arange(NUM)

window = 4
new_arr = fcn.rolling_apply_nd(arr, window, func=sp.stats.kurtosis)
print(new_arr)
# [-1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36
#  -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36 -1.36
#  -1.36 -1.36 -1.36]

Of course, feel free to inspect the source code, it is open source (GPL).


EDIT

Just to get a feeling of the kind of speed we are talking about, these are the benchmarks for the solutions implemented in FlyingCircus:

benchmark benchmark_zoom

The general approach flyingcircus_numeric.running_apply() is a couple of orders of magnitude slower than either flyingcircus_numeric.rolling_apply_nd() or flyingcircus_numeric.moving_apply(), with the first being approx. one order of magnitude faster than the second. This shows the speed price for generality or support for weighting.

The above plots were obtained using the scripts from here and the following code:

import scipy as sp
import flyingcircus_numeric as fcn 
import scipy.stats


WINDOW = 4
FUNC = sp.stats.kurtosis


def my_rolling_apply_nd(arr, window=WINDOW, func=FUNC):
    return fcn.rolling_apply_nd(arr, window, func=FUNC)


def my_moving_apply(arr, window=WINDOW, func=FUNC):
    return fcn.moving_apply(arr, window, func)


def my_running_apply(arr, window=WINDOW, func=FUNC):
    return fcn.running_apply(arr, window, func)


def equal_output(a, b):
    return np.all(np.isclose(a, b))


input_sizes = (5, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000)
funcs = my_rolling_apply_nd, my_moving_apply, my_running_apply

runtimes, input_sizes, labels, results = benchmark(
    funcs, gen_input=np.random.random, equal_output=equal_output,
    input_sizes=input_sizes)

plot_benchmarks(runtimes, input_sizes, labels, units='s')
plot_benchmarks(runtimes, input_sizes, labels, units='ms', zoom_fastest=8)

(EDITED to reflect some refactoring of FlyingCircus)



Answered By - norok2
Answer Checked By - Katrina (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing