python - Returning two values from pandas.rolling_apply

- May 15, 2013

I am using pandas.rolling_apply to fit data to a distribution and get a value from it, but I also need it to report a rolling goodness of fit (specifically, the p-value). Currently I'm doing this:

    import pandas as pd
    from scipy.stats import genextreme, kstest

    def func(sample):
        fit = genextreme.fit(sample)
        return genextreme.isf(0.9, *fit)

    def p_value(sample):
        fit = genextreme.fit(sample)
        return kstest(sample, 'genextreme', fit)[1]

    values = pd.rolling_apply(data, 30, func)
    p_values = pd.rolling_apply(data, 30, p_value)
    results = pd.DataFrame({'values': values, 'p_value': p_values})

The problem is that I have a lot of data, and the fit function is expensive, so I don't want to call it twice for every sample. I'd rather do something like this:

    def func(sample):
        fit = genextreme.fit(sample)
        value = genextreme.isf(0.9, *fit)
        p_value = kstest(sample, 'genextreme', fit)[1]
        return {'value': value, 'p_value': p_value}

    results = pd.rolling_apply(data, 30, func)

where results is a DataFrame with 2 columns. If I try to run this, I get an exception: TypeError: a float is required. Is it possible to achieve this, and if so, how?

I had the same issue. I solved it by generating a global data frame and filling it from inside the rolling function. In the following example script, I generate random input data. Then I calculate the min, the max and the mean with a single rolling apply call.

    import numpy as np
    import pandas as pd

    def myfunction(array):
        global index
        global outputdf

        # random operation
        outputdf.loc[index, 'min'] = np.nanmin(array)
        outputdf.loc[index, 'max'] = np.nanmax(array)
        outputdf.loc[index, 'mean'] = np.nanmean(array)

        index += 1
        # returning a useless variable (rolling apply expects a float)
        return 0.0

    if __name__ == "__main__":
        # random window size
        windowsize = 10

        # preparing random input data
        inputdf = pd.DataFrame({'randomvalue': np.random.rand(500)})

        # pre-allocate memory
        outputdf = pd.DataFrame({'min': [np.nan] * len(inputdf),
                                 'max': [np.nan] * len(inputdf),
                                 'mean': [np.nan] * len(inputdf)})

        # starting index (due to the window size)
        d = (windowsize - 1) / 2
        index = int(np.floor(d))

        # rolling apply here; raw=True passes each window as a NumPy array
        inputdf['randomvalue'].rolling(window=windowsize, center=True).apply(myfunction, raw=True)

        assert index + int(np.ceil(d)) == len(inputdf), 'length mismatch'

        outputdf.index = inputdf.index

        # optional: clean the nulls
        outputdf.dropna(inplace=True)

        print(outputdf)
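For comparison, here is a minimal sketch of a different approach that avoids global state. It does not use rolling apply at all. It loops over the windows explicitly, calls the function once per window, and builds the DataFrame from the returned dicts. The names rolling_multi and summary are hypothetical helpers, not pandas API. The sketch assumes trailing (not centered) windows and a unique index, and summary is only a cheap stand-in for an expensive fit.

```python
import numpy as np
import pandas as pd


def rolling_multi(series, window, func):
    """Call func once per full trailing window; func must return a dict of scalars.

    Hypothetical helper (not part of pandas). Assumes series has a unique index.
    """
    values = series.to_numpy()
    rows = []
    labels = []
    for end in range(window, len(values) + 1):
        rows.append(func(values[end - window:end]))
        labels.append(series.index[end - 1])
    # Reindex so the result lines up with the input; rows for
    # incomplete windows at the start are filled with NaN.
    return pd.DataFrame(rows, index=labels).reindex(series.index)


def summary(sample):
    # Stand-in for the expensive computation: several outputs from one call.
    return {
        'min': np.nanmin(sample),
        'max': np.nanmax(sample),
        'mean': np.nanmean(sample),
    }


data = pd.Series(np.arange(10, dtype=float))
results = rolling_multi(data, 3, summary)
print(results)
```

Under the same assumptions, the dict-returning func from the question could be passed in place of summary, so the fit would run only once per window. The explicit Python loop is acceptable when the per-window function is expensive, since the function itself then dominates the running time.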
