Python tsfresh features explained

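tsfresh is an open-source Python package for extracting features from time series data. It can extract more than 64 kinds of features, which makes it something of a Swiss Army knife for time series feature extraction. I needed it recently and have been reading through it. There is no Chinese documentation yet, and the meaning of some features is hard to grasp, so I am posting the part I have understood here. For the features I have not understood yet, I list only the signature and will add notes once I do.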

tsfresh.feature_extraction.feature_calculators.abs_energy(x)

tsfresh.feature_extraction.feature_calculators.absolute_sum_of_changes(x)

tsfresh.feature_extraction.feature_calculators.agg_autocorrelation(x, param)

tsfresh.feature_extraction.feature_calculators.agg_linear_trend(x, param)

Splits the series into chunks, aggregates each chunk (max, min, mean, median), and then fits a linear regression to the aggregated values, yielding pvalue, rvalue (correlation coefficient), intercept, slope, and stderr (standard error of the fit).

Parameters:

x (pandas.Series) – the time series to calculate the feature of
param (list) – contains dictionaries {"attr": x, "chunk_len": l, "f_agg": f} with x, f strings and l an int

Returns:

the different feature values

Return type:

pandas.Series
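
The chunk-then-regress idea can be sketched in plain Python. This is only an illustration and not tsfresh's implementation: the helper names chunk_aggregate and linear_fit are hypothetical, f_agg is passed as a callable rather than a string, and only the slope and intercept are computed.

```python
from statistics import mean

def chunk_aggregate(x, chunk_len, f_agg):
    """Aggregate consecutive chunks of length chunk_len with f_agg.
    A shorter trailing chunk is aggregated as well (assumption)."""
    return [f_agg(x[i:i + chunk_len]) for i in range(0, len(x), chunk_len)]

def linear_fit(y):
    """Least-squares fit of y against the positions 0, 1, ..., len(y) - 1."""
    t = range(len(y))
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return slope, y_bar - slope * t_bar

x = [1, 3, 2, 6, 5, 7, 9, 8, 12]
agg = chunk_aggregate(x, chunk_len=3, f_agg=max)  # one maximum per chunk
slope, intercept = linear_fit(agg)
print(agg, slope, intercept)
```

With chunk_len=3 and max as the aggregator, the regression runs on three points only, so the slope describes the trend of the chunk maxima rather than of the raw series.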

tsfresh.feature_extraction.feature_calculators.approximate_entropy(x, m, r)

Approximate entropy, a measure of the regularity and unpredictability of fluctuations in a time series.

tsfresh.feature_extraction.feature_calculators.ar_coefficient(x, param)

Coefficients of an autoregressive (AR) model.

tsfresh.feature_extraction.feature_calculators.augmented_dickey_fuller(x, param)

tsfresh.feature_extraction.feature_calculators.autocorrelation(x, lag)

tsfresh.feature_extraction.feature_calculators.binned_entropy(x, max_bins)

tsfresh.feature_extraction.feature_calculators.c3(x, lag)

tsfresh.feature_extraction.feature_calculators.change_quantiles(x, ql, qh, isabs, f_agg)

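First fixes a corridor given by the ql and qh quantiles of x, then aggregates (with f_agg) the consecutive changes of the series inside this corridor, optionally taking their absolute values.

Parameters:

x (pandas.Series) – the time series
ql (float) – the lower quantile of the corridor
qh (float) – the upper quantile of the corridor
isabs (bool) – whether to take the absolute values of the changes
f_agg (str, name of a numpy function (e.g. mean, var, std, median)) – the numpy aggregation function to apply (mean, variance, standard deviation, median)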
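
A minimal sketch of the corridor idea, under stated assumptions: a change counts only when both of its endpoints lie inside the corridor, both corridor bounds are inclusive, and quantiles use linear interpolation. The names quantile and change_quantiles_sketch are hypothetical, and f_agg is a callable here instead of the name of a numpy function.

```python
import math
from statistics import mean

def quantile(values, q):
    """q-quantile with linear interpolation between sorted neighbours."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def change_quantiles_sketch(x, ql, qh, isabs, f_agg):
    low, high = quantile(x, ql), quantile(x, qh)
    inside = [low <= v <= high for v in x]
    changes = []
    for i in range(len(x) - 1):
        # keep a change only if both endpoints are inside the corridor
        if inside[i] and inside[i + 1]:
            d = x[i + 1] - x[i]
            changes.append(abs(d) if isabs else d)
    return f_agg(changes) if changes else 0

x = [1, 2, 4, 50, 4, 5, 7, -50, 7, 8, 9]
result = change_quantiles_sketch(x, 0.1, 0.9, isabs=True, f_agg=mean)
print(result)
```

The two outliers 50 and -50 fall outside the corridor, so the jumps to and from them do not enter the aggregate.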

tsfresh.feature_extraction.feature_calculators.cid_ce(x, normalize)

tsfresh.feature_extraction.feature_calculators.count_above_mean(x)

The number of values greater than the mean.

tsfresh.feature_extraction.feature_calculators.count_below_mean(x)

The number of values less than the mean.

tsfresh.feature_extraction.feature_calculators.cwt_coefficients(x, param)

tsfresh.feature_extraction.feature_calculators.energy_ratio_by_chunks(x, param)

Calculates the sum of squares of chunk i out of N chunks expressed as a ratio with the sum of squares over the whole series.

Takes as input parameters the number num_segments of segments to divide the series into and segment_focus which is the segment number (starting at zero) to return a feature on.

If the length of the time series is not a multiple of the number of segments, the remaining data points are distributed on the bins starting from the first. For example, if your time series consists of 8 entries, the first two bins will contain 3 and the last two values, e.g. [ 0., 1., 2.], [ 3., 4., 5.] and [ 6., 7.].

Note that the answer for num_segments = 1 is a trivial “1” but we handle this scenario in case somebody calls it. Sum of the ratios should be 1.0.

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
param – contains dictionaries {"num_segments": N, "segment_focus": i} with N, i both ints

Returns:

the feature values

Return type:

list of tuples (index, data)
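
The segmentation rule and the ratios can be illustrated with a short sketch. It follows the description above (leftover points go to the first segments) but is not tsfresh's code; split_into_segments and energy_ratios are hypothetical helper names.

```python
def split_into_segments(x, num_segments):
    """Split x into num_segments parts; the first len(x) % num_segments
    parts receive one extra element."""
    base, extra = divmod(len(x), num_segments)
    segments, start = [], 0
    for i in range(num_segments):
        size = base + (1 if i < extra else 0)
        segments.append(x[start:start + size])
        start += size
    return segments

def energy_ratios(x, num_segments):
    """Sum of squares of each segment divided by the total sum of squares."""
    total = sum(v * v for v in x)
    return [sum(v * v for v in seg) / total
            for seg in split_into_segments(x, num_segments)]

x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
segments = split_into_segments(x, 3)
ratios = energy_ratios(x, 3)
print(segments)
print(ratios, sum(ratios))
```

In this sketch the entry of ratios at index segment_focus corresponds to the feature for that segment.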

tsfresh.feature_extraction.feature_calculators.fft_aggregated(x, param)

Returns the spectral centroid (mean), variance, skew, and kurtosis of the absolute fourier transform spectrum.

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
param (list) – contains dictionaries {"aggtype": s} where s str and in ["centroid", "variance", "skew", "kurtosis"]

Returns:

the different feature values

Return type:

pandas.Series

This function is of type: combiner

tsfresh.feature_extraction.feature_calculators.fft_coefficient(x, param)

Calculates the Fourier coefficients of the one-dimensional discrete Fourier Transform for real input by the fast Fourier transform algorithm.

The resulting coefficients will be complex. This feature calculator can return the real part (attr=="real"), the imaginary part (attr=="imag"), the absolute value (attr=="abs") and the angle in degrees (attr=="angle").

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
param (list) – contains dictionaries {"coeff": x, "attr": s} with x int and x >= 0, s str and in ["real", "imag", "abs", "angle"]

Returns:

the different feature values

Return type:

pandas.Series

This function is of type: combiner
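
To make the four attributes concrete, here is a small sketch that evaluates a single DFT coefficient directly from its definition with the standard library. It is an illustration only: the direct sum is O(n) per coefficient rather than a fast transform, and dft_coefficient and fft_coefficient_sketch are hypothetical names.

```python
import cmath
import math

def dft_coefficient(x, k):
    """k-th coefficient of the discrete Fourier transform of x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(x))

def fft_coefficient_sketch(x, coeff, attr):
    c = dft_coefficient(x, coeff)
    if attr == "real":
        return c.real
    if attr == "imag":
        return c.imag
    if attr == "abs":
        return abs(c)
    if attr == "angle":
        return math.degrees(cmath.phase(c))  # angle in degrees
    raise ValueError(f"unknown attr: {attr}")

x = [1.0, 0.0, -1.0, 0.0]  # one full cosine period
for attr in ("real", "imag", "abs", "angle"):
    print(attr, fft_coefficient_sketch(x, 1, attr))
```

For this cosine-like input the coefficient with index 1 is approximately 2 + 0j, so the real part and absolute value are close to 2 while the imaginary part and angle are close to 0.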

tsfresh.feature_extraction.feature_calculators.first_location_of_maximum(x)

The position of the first occurrence of the maximum, relative to the length of x.

tsfresh.feature_extraction.feature_calculators.first_location_of_minimum(x)

The position of the first occurrence of the minimum, relative to the length of x.

tsfresh.feature_extraction.feature_calculators.friedrich_coefficients(x, param)

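Coefficients of the polynomial h(x) that has been fitted to the deterministic dynamics of a Langevin model, as described by [1].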

For short time-series this method is highly dependent on the parameters.

References

[1] Friedrich et al. (2000): Physics Letters A 271, p. 217-222 Extracting model equations from experimental data

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
param (list) – contains dictionaries {"m": x, "r": y, "coeff": z} with x being a positive integer, the order of the polynomial to fit for estimating fixed points of dynamics, y a positive float, the number of quantiles to use for averaging, and finally z, a positive integer corresponding to the returned coefficient

Returns:

the different feature values

Return type:

pandas.Series

tsfresh.feature_extraction.feature_calculators.has_duplicate(x)

Whether any value in x occurs more than once.

tsfresh.feature_extraction.feature_calculators.has_duplicate_max(x)

Whether the maximum value occurs more than once.

tsfresh.feature_extraction.feature_calculators.has_duplicate_min(x)

Whether the minimum value occurs more than once.

tsfresh.feature_extraction.feature_calculators.index_mass_quantile(x, param)

tsfresh.feature_extraction.feature_calculators.kurtosis(x)

tsfresh.feature_extraction.feature_calculators.large_standard_deviation(x, r)

tsfresh.feature_extraction.feature_calculators.last_location_of_maximum(x)

The position of the last occurrence of the maximum.

tsfresh.feature_extraction.feature_calculators.last_location_of_minimum(x)

The position of the last occurrence of the minimum.

tsfresh.feature_extraction.feature_calculators.length(x)

The length of x.

tsfresh.feature_extraction.feature_calculators.linear_trend(x, param)

Calculate a linear least-squares regression for the values of the time series versus the sequence from 0 to length of the time series minus one. This feature assumes the signal to be uniformly sampled. It will not use the time stamps to fit the model. The parameters control which of the characteristics are returned.

Possible extracted attributes are “pvalue”, “rvalue”, “intercept”, “slope”, “stderr”, see the documentation of linregress for more information.

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
param (list) – contains dictionaries {"attr": x} with x a string, the attribute name of the regression model

Returns:

the different feature values

Return type:

pandas.Series

This function is of type: combiner

tsfresh.feature_extraction.feature_calculators.longest_strike_above_mean(x)

The length of the longest consecutive subsequence of values greater than the mean.

tsfresh.feature_extraction.feature_calculators.longest_strike_below_mean(x)

The length of the longest consecutive subsequence of values less than the mean.
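
Both strike features reduce to finding the longest run of True in a boolean sequence. A minimal standard-library sketch, with hypothetical function names and no claim to match tsfresh's implementation:

```python
from itertools import groupby
from statistics import mean

def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    return max((sum(1 for _ in group)
                for key, group in groupby(flags) if key), default=0)

def longest_strike_above_mean_sketch(x):
    m = mean(x)
    return longest_run(v > m for v in x)

def longest_strike_below_mean_sketch(x):
    m = mean(x)
    return longest_run(v < m for v in x)

x = [1, 2, 5, 6, 7, 1, 8]  # mean is 30 / 7, roughly 4.29
print(longest_strike_above_mean_sketch(x))  # run 5, 6, 7
print(longest_strike_below_mean_sketch(x))  # run 1, 2
```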

tsfresh.feature_extraction.feature_calculators.max_langevin_fixed_point(x, r, m)

Largest fixed point of the deterministic dynamics, estimated from a polynomial fitted to a Langevin model as described by Friedrich et al. (2000): Physics Letters A 271, p. 217-222, Extracting model equations from experimental data.

For short time-series this method is highly dependent on the parameters.

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
m (int) – order of the polynomial to fit for estimating fixed points of dynamics
r (float) – number of quantiles to use for averaging

Returns:

Largest fixed point of deterministic dynamics

Return type:

float

tsfresh.feature_extraction.feature_calculators.maximum(x)

The maximum value.

tsfresh.feature_extraction.feature_calculators.mean(x)

The mean.

tsfresh.feature_extraction.feature_calculators.mean_abs_change(x)

tsfresh.feature_extraction.feature_calculators.mean_change(x)

tsfresh.feature_extraction.feature_calculators.mean_second_derivative_central(x)

tsfresh.feature_extraction.feature_calculators.median(x)

The median.

tsfresh.feature_extraction.feature_calculators.minimum(x)

The minimum value.

tsfresh.feature_extraction.feature_calculators.number_crossing_m(x, m)

Calculates the number of crossings of x on m. A crossing is defined as two sequential values where the first value is lower than m and the next is greater, or vice-versa. If you set m to zero, you will get the number of zero crossings.

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
m (float) – the threshold for the crossing

Returns:

the value of this feature

Return type:

int
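
The crossing count can be sketched by marking each value as above or not above m and counting how often that flag flips between neighbours. How values exactly equal to m are treated is an assumption of this sketch (they count as "not above"), and the function name is hypothetical.

```python
def number_crossing_m_sketch(x, m):
    """Count positions where consecutive values lie on different sides of m."""
    above = [v > m for v in x]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)

x = [1, -1, 2, -2, 3]
print(number_crossing_m_sketch(x, 0))  # the sign flips between every pair
print(number_crossing_m_sketch([1, 2, 3], 0))  # never crosses zero
```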

tsfresh.feature_extraction.feature_calculators.number_cwt_peaks(x, n)

This feature calculator searches for different peaks in x. To do so, x is smoothed by a ricker wavelet for widths ranging from 1 to n. This feature calculator returns the number of peaks that occur at enough width scales and with sufficiently high Signal-to-Noise-Ratio (SNR).

Parameters:

x (numpy.ndarray) – the time series to calculate the feature of
n (int) – maximum width to consider

Returns:

the value of this feature

Return type:

int

tsfresh.feature_extraction.feature_calculators.number_peaks(x, n)

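The number of peaks of support n in x. A peak of support n is a value that is larger than its n neighbours to the left and its n neighbours to the right.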
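
A small sketch of counting peaks of support n, assuming a peak must be strictly larger than its n neighbours on each side and that points closer than n to either end cannot be peaks. The function name is hypothetical and this is not tsfresh's implementation.

```python
def number_peaks_sketch(x, n):
    """Count values strictly larger than their n left and n right neighbours."""
    count = 0
    for i in range(n, len(x) - n):
        left = x[i - n:i]
        right = x[i + 1:i + n + 1]
        if all(x[i] > v for v in left) and all(x[i] > v for v in right):
            count += 1
    return count

x = [3, 0, 0, 4, 0, 0, 13]
for n in (1, 2, 3):
    print(n, number_peaks_sketch(x, n))
```

In this series the value 4 is a peak of support 1 and 2, but not of support 3, because 13 lies within three steps to its right.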

tsfresh.feature_extraction.feature_calculators.partial_autocorrelation(x, param)

tsfresh.feature_extraction.feature_calculators.percentage_of_reoccurring_datapoints_to_all_datapoints(x)

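len(different values occurring more than once) / len(different values): the number of distinct values that occur more than once divided by the number of distinct values (each repeated value is counted only once).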

tsfresh.feature_extraction.feature_calculators.percentage_of_reoccurring_values_to_all_values(x)

The number of values that occur more than once divided by the total number of data points.

tsfresh.feature_extraction.feature_calculators.quantile(x, q)

Returns the q quantile of x, i.e. the value that is greater than a fraction q of the ordered values of x.

tsfresh.feature_extraction.feature_calculators.range_count(x, min, max)

The number of values in x that lie within the interval [min, max).

tsfresh.feature_extraction.feature_calculators.ratio_beyond_r_sigma(x, r)

The ratio of values that are more than r times the standard deviation away from the mean of x.
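
A minimal sketch of this ratio with the standard library, assuming the population standard deviation (statistics.pstdev) and a strict comparison. The function name is hypothetical.

```python
from statistics import mean, pstdev

def ratio_beyond_r_sigma_sketch(x, r):
    """Share of values farther than r standard deviations from the mean."""
    m, s = mean(x), pstdev(x)
    return sum(1 for v in x if abs(v - m) > r * s) / len(x)

x = [0, 0, 0, 0, 10]  # mean 2, population standard deviation 4
print(ratio_beyond_r_sigma_sketch(x, 1))  # only the value 10 is beyond 1 sigma
print(ratio_beyond_r_sigma_sketch(x, 3))  # nothing is beyond 3 sigma
```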

tsfresh.feature_extraction.feature_calculators.ratio_value_number_to_time_series_length(x)

The number of unique values in x divided by the length of x: len(set(x)) / len(x)

tsfresh.feature_extraction.feature_calculators.sample_entropy(x)

The sample entropy of x.

tsfresh.feature_extraction.feature_calculators.set_property(key, value)

tsfresh.feature_extraction.feature_calculators.skewness(x)

tsfresh.feature_extraction.feature_calculators.spkt_welch_density(x, param)

tsfresh.feature_extraction.feature_calculators.standard_deviation(x)

The standard deviation.

tsfresh.feature_extraction.feature_calculators.sum_of_reoccurring_data_points(x)

The sum of all data points that occur more than once (every occurrence is added).

tsfresh.feature_extraction.feature_calculators.sum_of_reoccurring_values(x)

The sum of the values that occur more than once (each distinct value is added only once).
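
The difference between the two reoccurring sums is easiest to see on a tiny example. The sketch below uses collections.Counter and assumes that the data-point variant adds every occurrence of a repeated value while the value variant adds each repeated value once. The function names are hypothetical.

```python
from collections import Counter

def sum_of_reoccurring_data_points_sketch(x):
    """Add every occurrence of each value that appears more than once."""
    counts = Counter(x)
    return sum(value * n for value, n in counts.items() if n > 1)

def sum_of_reoccurring_values_sketch(x):
    """Add each value that appears more than once exactly one time."""
    counts = Counter(x)
    return sum(value for value, n in counts.items() if n > 1)

x = [1, 1, 2, 3, 3, 3]
print(sum_of_reoccurring_data_points_sketch(x))  # 1 + 1 + 3 + 3 + 3
print(sum_of_reoccurring_values_sketch(x))       # 1 + 3
```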

tsfresh.feature_extraction.feature_calculators.sum_values(x)

The sum of all values.

tsfresh.feature_extraction.feature_calculators.symmetry_looking(x, param)

tsfresh.feature_extraction.feature_calculators.time_reversal_asymmetry_statistic(x, lag)

tsfresh.feature_extraction.feature_calculators.value_count(x, value)

The number of entries in x that are equal to value.

tsfresh.feature_extraction.feature_calculators.variance(x)

The variance.

tsfresh.feature_extraction.feature_calculators.variance_larger_than_standard_deviation(x)

Whether the variance is greater than the standard deviation.
