traffic.algorithms.filters
traffic comes with some pre-implemented filters to be passed to the filter() method. The method takes either a FilterBase instance, or a string parameter:
"default"
is a relatively fast option with decent performance on trajectories extracted from the OpenSky database (with their most common glitches)
noisy_landing.filter()
"aggressive"
is a composition of several filters which may result in smoother trajectories.
noisy_landing.filter("aggressive")
KalmanFilter6D
is a Kalman filter applied to the 6D state vector (latitude, longitude, altitude, track angle, groundspeed, vertical rate)
from traffic.algorithms.filters.kalman import KalmanFilter6D
# The Kalman filter first needs a projection to x, y coordinates
from cartes.crs import EuroPP
noisy_landing.compute_xy(EuroPP()).filter(KalmanFilter6D())
KalmanSmoother6D
is a Kalman smoother (a two-pass filter averaging the covariance of the errors on both sides) applied to the 6D state vector (latitude, longitude, altitude, track angle, groundspeed, vertical rate)
from traffic.algorithms.filters.kalman import KalmanSmoother6D
noisy_landing.compute_xy(EuroPP()).filter(KalmanSmoother6D())
API reference
- class traffic.algorithms.filters.FilterAboveSigmaMedian(**kwargs)
Bases:
FilterBase
Filters noisy values above one sigma wrt median filter.
The method first applies a median filter on each feature of the DataFrame. A default kernel size is applied for a number of features (latitude, longitude, altitude, track, groundspeed, IAS, TAS), but other kernel values may be passed as kwargs parameters.
Rather than returning averaged values, the method computes thresholds on sliding windows (as an average of squared differences) and replaces unacceptable values with NaNs.
Then, a strategy may be applied to fill the NaN values, by default a forward/backward fill. Other strategies may be passed, for instance do nothing: None; or interpolate: lambda x: x.interpolate().
Note
This method is often more efficient when applied several times with different kernel values. Kernel values may be passed as integers, or as lists/tuples of integers for a cascade of filters:
# this cascade of filters appears to work well on altitude
flight.filter(altitude=17).filter(altitude=53)
# this is equivalent to the default value
flight.filter(altitude=(17, 53))
- cascaded_filters(df, feature, kernel_size, filt=None)
Produces a mask for data to be discarded.
The filtering applies a low-pass filter (e.g. medfilt) to a signal and measures the difference between the raw and the filtered signal.
The average of the squared differences is then produced (sq_eps) and used as a threshold for filtering.
Errors may be raised if the kernel_size is too large.
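The idea described above can be sketched with pandas alone. This is a minimal illustration, not the library's implementation: the function name sigma_median_mask is hypothetical, a centred rolling median stands in for medfilt, and the threshold is assumed to be a rolling mean of the squared differences over the same window.

```python
import pandas as pd


def sigma_median_mask(signal: pd.Series, kernel_size: int) -> pd.Series:
    """Return True where a sample should be discarded (hypothetical helper)."""
    # Low-pass version of the signal: a centred rolling median.
    smooth = signal.rolling(kernel_size, center=True, min_periods=1).median()
    # Squared difference between the raw and the filtered signal.
    sq_eps = (signal - smooth) ** 2
    # Local threshold: average of the squared differences on a sliding window.
    threshold = sq_eps.rolling(kernel_size, center=True, min_periods=1).mean()
    return sq_eps > threshold


# Level flight at 1000 ft with a single glitch at 5000 ft.
altitude = pd.Series([1000.0] * 4 + [5000.0] + [1000.0] * 4)
mask = sigma_median_mask(altitude, kernel_size=5)

# Replace rejected values with NaN, then apply a forward/backward fill.
cleaned = altitude.mask(mask).ffill().bfill()
```

On noisier data, this strict comparison may also flag samples that deviate only slightly from the median, which is why a fill strategy follows the masking step.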
- class traffic.algorithms.filters.FilterBase(*args, **kwargs)
Bases:
Filter
Base class for filters, providing a | operator for composition.
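As an illustration of how a | operator can provide composition, here is a minimal sketch with a hypothetical ComposableFilter class working on plain lists. It is not the traffic API, and the left-to-right order of application is an assumption.

```python
from typing import Callable, List


class ComposableFilter:
    """Hypothetical stand-in for a composable filter; not the traffic API."""

    def __init__(self, func: Callable[[List[float]], List[float]]) -> None:
        self.func = func

    def apply(self, data: List[float]) -> List[float]:
        return self.func(data)

    def __or__(self, other: "ComposableFilter") -> "ComposableFilter":
        # (a | b) applies a first, then b on the result (assumed order).
        return ComposableFilter(lambda data: other.apply(self.apply(data)))


drop_negative = ComposableFilter(lambda xs: [x for x in xs if x >= 0])
clip_at_100 = ComposableFilter(lambda xs: [min(x, 100.0) for x in xs])

pipeline = drop_negative | clip_at_100
result = pipeline.apply([5.0, -1.0, 250.0])  # [5.0, 100.0]
```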
- class traffic.algorithms.filters.FilterMean(**kwargs)
Bases:
FilterBase
Rolling mean filter.
- Parameters:
kwargs (int) – Each keyword argument is the name of a column, the value is the size of the kernel. Default values are provided for altitudes, vertical rate, ground speed and track angles.
- class traffic.algorithms.filters.FilterMedian(**kwargs)
Bases:
FilterBase
Rolling median filter.
- Parameters:
kwargs (int) – Each keyword argument is the name of a column, the value is the size of the kernel. Default values are provided for altitudes, vertical rate, ground speed and track angles.
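The difference between the rolling mean and the rolling median can be seen on a short example. The sketch below uses pandas directly, with a hypothetical rolling_filter helper taking one kernel size per column as keyword argument. Centred windows and the handling of the edges are assumptions, not necessarily what the library does.

```python
import pandas as pd


def rolling_filter(df: pd.DataFrame, how: str = "median", **kwargs: int) -> pd.DataFrame:
    """Apply a rolling mean or median per column (hypothetical helper)."""
    out = df.copy()
    for column, kernel in kwargs.items():
        window = out[column].rolling(kernel, center=True, min_periods=1)
        out[column] = window.median() if how == "median" else window.mean()
    return out


df = pd.DataFrame({"altitude": [1000.0, 1000.0, 9000.0, 1000.0, 1000.0]})

# The median discards the isolated spike...
median_df = rolling_filter(df, how="median", altitude=3)
# ...whereas the mean spreads it over the neighbouring samples.
mean_df = rolling_filter(df, how="mean", altitude=3)
```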
- class traffic.algorithms.filters.FilterPosition(cascades=2)
Bases:
FilterBase
Basic filter to be deprecated.
Based on the detection of big groundspeed jumps.
- class traffic.algorithms.filters.aggressive.FilterClustering(time_column='timestamp', **kwargs)
Bases:
FilterBase
Filter based on clustering.
The method creates clusters of datapoints based on the difference in time and parameter value. If the cluster is larger than the defined group size the datapoints are kept, otherwise they are removed.
- Parameters:
time_column (str) – the name of the time column (default: “timestamp”)
kwargs (ClusteringParams) – each keyword argument has the name of a feature. The value must be a dictionary with the following keys:
- group_size: minimum size of the cluster to be kept
- value_threshold: within the value threshold, the samples fall in the same cluster
- time_threshold: within the time threshold, the samples fall in the same cluster
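The clustering rule can be sketched as follows for a single feature. The helper clustering_mask is hypothetical and only assumes what is stated above: consecutive samples stay in the same cluster while both the value gap and the time gap remain within their thresholds, and clusters smaller than group_size are discarded. Expressing the time threshold in seconds is also an assumption.

```python
import pandas as pd


def clustering_mask(
    df: pd.DataFrame,
    feature: str,
    group_size: int,
    value_threshold: float,
    time_threshold: float,
    time_column: str = "timestamp",
) -> pd.Series:
    """Return True for samples in clusters smaller than group_size."""
    # A new cluster starts whenever one of the two gaps is too large.
    value_gap = df[feature].diff().abs() > value_threshold
    time_gap = df[time_column].diff().dt.total_seconds() > time_threshold
    cluster_id = (value_gap | time_gap).cumsum()
    cluster_size = cluster_id.map(cluster_id.value_counts())
    return cluster_size < group_size


df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(list(range(9)), unit="s"),
        "altitude": [1000.0, 1010, 1020, 1030, 8000, 8010, 1050, 1060, 1070],
    }
)
mask = clustering_mask(
    df, "altitude", group_size=3, value_threshold=500, time_threshold=5
)
# Only the two samples around 8000 ft form a cluster that is too small.
kept = df[~mask]
```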
- class traffic.algorithms.filters.aggressive.FilterDerivative(time_column='timestamp', **kwargs)
Bases:
FilterBase
Filter based on the 1st and 2nd derivatives of parameters.
The method computes the absolute value of the 1st and 2nd derivatives of the parameters. If the value of the derivatives is above the defined threshold values, the datapoint is removed.
- Parameters:
time_column (str) – the name of the time column (default: “timestamp”)
kwargs (DerivativeParams) – each keyword argument has the name of a feature. The value must be a dictionary with the following keys:
- first: threshold value for the first derivative
- second: threshold value for the second derivative
- kernel: the kernel size in seconds
If two spikes are detected within the width of the kernel, all data points in between are also removed.
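A possible reading of this rule, for one feature, is sketched below. The helper derivative_mask is hypothetical, derivatives are approximated with backward differences (an assumption), and the thresholds in the example are arbitrary.

```python
import numpy as np
import pandas as pd


def derivative_mask(
    df: pd.DataFrame,
    feature: str,
    first: float,
    second: float,
    kernel: float,
    time_column: str = "timestamp",
) -> pd.Series:
    """Return True for samples to be removed (hypothetical helper)."""
    seconds = df[time_column].diff().dt.total_seconds()
    d1 = df[feature].diff() / seconds  # first derivative (per second)
    d2 = d1.diff() / seconds  # second derivative
    spike = (d1.abs() > first) | (d2.abs() > second)

    mask = spike.to_numpy().copy()
    t = df[time_column].to_numpy()
    idx = np.flatnonzero(mask)
    # Two spikes closer than the kernel: remove everything in between.
    for a, b in zip(idx[:-1], idx[1:]):
        if (t[b] - t[a]) / np.timedelta64(1, "s") <= kernel:
            mask[a : b + 1] = True
    return pd.Series(mask, index=df.index)


df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(list(range(10)), unit="s"),
        "altitude": [1000.0, 1000, 1000, 3000, 3000, 3000, 1000, 1000, 1000, 1000],
    }
)
mask = derivative_mask(df, "altitude", first=100, second=100, kernel=5)
```

With backward differences, the samples right after the jump back to 1000 ft are flagged as well, so the removed span (indices 3 to 7) is slightly wider than the erroneous plateau.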
- class traffic.algorithms.filters.consistency.FilterConsistency(horizon=200, backup_exact=True, backup_horizon=2000, exact_when_kept_below_verti=0.7, exact_when_kept_below_track=0.5, exact_when_kept_below_speed=0.6, **kwargs)
Bases:
FilterBase
Filters noisy values, keeping only values consistent with each other.
- Parameters:
Consistencies are checked between points \(i\) and points \(j \in [|i+1;i+\mathrm{horizon}|]\).
Using these consistencies, a graph is built: if \(i\) and \(j\) are consistent, an edge \((i,j)\) is added to the graph. The kept values are those on the longest path in this graph, resulting in a sequence of consistent values.
(In the following, we name \(v\) the ground speed, \(\dot{z}\) the vertical rate, and \(\theta\) the track angle.)
The consistency checked vertically between \(t_i<t_j\) is: \(|(alt_j-alt_i)-(t_j-t_i)(\dot{z}_i+\dot{z}_j)/2| < \mathtt{dalt\_dt\_error} \cdot (t_j-t_i)\), where dalt_dt_error is a threshold that can be specified by the user.
The consistencies checked horizontally between \(t_i<t_j\) are:
\(|(\theta_i+\theta_j)/2-\mathrm{atan2}(lat_j-lat_i,lon_j-lon_i)| < \mathtt{dtrack\_dt\_error} \cdot (t_j-t_i)\)
and
\(|\mathrm{dist}(lat_j,lat_i,lon_j,lon_i) - (v_i+v_j)/2 \cdot (t_j-t_i)| < \mathtt{relative\_error\_on\_dist} \cdot \mathrm{dist}(lat_j,lat_i,lon_j,lon_i)\),
where dtrack_dt_error and relative_error_on_dist are thresholds that can be specified by the user.
In order to compute the longest path faster, a greedy algorithm is used. However, if the ratio of kept points is lower than exact_when_kept_below, then an exact and slower computation is triggered. This computation uses the NetworkX library, or the faster graph-tool library if available.
This filter replaces unacceptable values with NaNs. Then, a strategy may be applied to fill the NaN values, by default a forward/backward fill. Other strategies may be passed, for instance do nothing: None; or interpolate: lambda x: x.interpolate(limit_area='inside').
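To make the graph construction concrete, the sketch below applies the vertical consistency check only and computes the longest path exactly with dynamic programming. It does not reproduce the greedy algorithm mentioned above. The function name is hypothetical, and altitudes and vertical rates are assumed to be in consistent units (ft and ft/s).

```python
from typing import List, Sequence


def longest_consistent_path(
    t: Sequence[float],
    alt: Sequence[float],
    vrate: Sequence[float],
    dalt_dt_error: float,
    horizon: int,
) -> List[int]:
    """Indices of the longest sequence of mutually consistent samples."""
    n = len(t)

    def consistent(i: int, j: int) -> bool:
        dt = t[j] - t[i]
        expected = dt * (vrate[i] + vrate[j]) / 2
        return abs((alt[j] - alt[i]) - expected) < dalt_dt_error * dt

    best = [1] * n  # length of the longest path ending at each node
    prev = [-1] * n  # predecessor on that path
    for j in range(n):
        # Edges (i, j) only exist for i within the horizon before j.
        for i in range(max(0, j - horizon), j):
            if consistent(i, j) and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i

    # Walk back from the end of the longest path.
    node = max(range(n), key=best.__getitem__)
    path = []
    while node != -1:
        path.append(node)
        node = prev[node]
    return path[::-1]


t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
vrate = [10.0] * 6  # steady climb at 10 ft/s
alt = [1000.0, 1010.0, 1020.0, 5000.0, 1040.0, 1050.0]  # index 3 is a glitch
kept = longest_consistent_path(t, alt, vrate, dalt_dt_error=2.0, horizon=3)
```

The glitch at index 3 is consistent with none of its neighbours, so the longest path skips it and keeps indices 0, 1, 2, 4 and 5.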
- class traffic.algorithms.filters.kalman.KalmanFilter6D(reject_sigma=3)
Bases:
ProcessXYZFilterBase
A basic Kalman Filter with 6 components.
The filter requires x, y, z, dx, dy and dz components.
- class traffic.algorithms.filters.kalman.KalmanSmoother6D(reject_sigma=3)
Bases:
ProcessXYZFilterBase
A basic two-pass Kalman smoother with 6 components.
The filter requires x, y, z, dx, dy and dz components.
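For readers unfamiliar with Kalman filtering, the sketch below runs a generic constant-velocity filter on a single coordinate (state: position and speed) with NumPy. It is not the 6-component implementation documented here. The noise parameters are arbitrary, and using reject_sigma to ignore measurements whose innovation exceeds that many standard deviations is an assumption about the role of this parameter.

```python
import numpy as np


def kalman_1d(measurements, dt=1.0, meas_std=10.0, accel_std=1.0, reject_sigma=3.0):
    """Constant-velocity Kalman filter on one axis (illustrative sketch)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # transition for the state [x, dx]
    H = np.array([1.0, 0.0])  # only the position is observed
    Q = accel_std**2 * np.array([[dt**4 / 4, dt**3 / 2], [dt**3 / 2, dt**2]])
    R = meas_std**2

    # Initialise on the first measurement, with a very uncertain speed.
    x = np.array([measurements[0], 0.0])
    P = np.diag([R, 100.0**2])
    filtered = [x[0]]

    for z in measurements[1:]:
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Innovation and its variance
        innovation = z - H @ x
        S = H @ P @ H + R
        # Update step, skipped when the measurement is too far off
        if abs(innovation) <= reject_sigma * np.sqrt(S):
            K = P @ H / S
            x = x + K * innovation
            P = P - np.outer(K, H @ P)
        filtered.append(x[0])
    return np.array(filtered)


# Positions in metres at 100 m/s, with one outlier at index 5.
z = [0.0, 100.0, 200.0, 300.0, 400.0, 5000.0, 600.0, 700.0]
filtered = kalman_1d(z)
```

The outlier is rejected, so the estimate at index 5 is the predicted position, close to 500 m.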
- class traffic.algorithms.filters.kalman.ProcessXYFilterBase(*args, **kwargs)
Bases:
FilterBase
Assistant class to preprocess the dataframe and build features.
Expects x and y features.
Provides x, y, dx and dy features.
Reconstruct groundspeed (in kts) and track angle.
- class traffic.algorithms.filters.kalman.ProcessXYZFilterBase(*args, **kwargs)
Bases:
FilterBase
Assistant class to preprocess the dataframe and build features.
Expects x and y features.
Provides x, y, z, dx, dy and dz features.
Reconstruct vertical rate (in ft/min), groundspeed (in kts) and track.
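The reconstruction of these features from velocity components is a matter of unit conversion. The sketch below assumes dx, dy and dz are expressed in m/s, with x pointing east and y pointing north. These conventions and the function name reconstruct are assumptions for illustration.

```python
import math

KTS_PER_MPS = 3600 / 1852  # 1 knot is 1852 metres per hour
FT_PER_M = 1 / 0.3048  # 1 foot is 0.3048 metres


def reconstruct(dx: float, dy: float, dz: float):
    """Return groundspeed (kts), track angle (degrees), vertical rate (ft/min)."""
    groundspeed = math.hypot(dx, dy) * KTS_PER_MPS
    # Track angle measured clockwise from north, in [0, 360)
    track = math.degrees(math.atan2(dx, dy)) % 360
    vertical_rate = dz * FT_PER_M * 60
    return groundspeed, track, vertical_rate


# Flying due east at 100 m/s while descending at 5.08 m/s
gs, track, vr = reconstruct(dx=100.0, dy=0.0, dz=-5.08)
```

Here the groundspeed is about 194.4 kts, the track is 90 degrees and the vertical rate is about -1000 ft/min.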