How to implement trajectory generation?

(contribution by Adrien Lafage @alafage)

This library provides a Generation class for creating synthetic traffic data. It implements fit() and sample() methods, which call the corresponding methods of the generative model passed as an argument.

You can import this class with the following code:

from traffic.algorithms.generation import Generation

To instantiate such an object, you can pass the following arguments:

  • generation: Any object implementing fit() and sample() methods. It will define the generative model to use.

  • features: The list of features used to represent a trajectory.

  • scaler: An optional scaler, which ensures that each feature carries the same weight during fitting.

If the generative model within your Generation object has not yet been fitted to a Traffic object, you can use the fit() method. Depending on the generative model, fitting can be rather time-consuming, especially with neural network-based models.

Here we load traffic data of landing trajectories at Zurich airport, arriving from the north.

import matplotlib.pyplot as plt
from traffic.data.datasets import landing_zurich_2019
from cartes.crs import EuroPP

t = (
    landing_zurich_2019
    .query("runway == '14' and initial_flow == '162-216'")
    .assign_id()
    .unwrap()
    .resample(100)
    .eval()
)

with plt.style.context("traffic"):
    ax = plt.axes(projection=EuroPP())
    t.plot(ax, alpha=0.05)
    t.centroid(nb_samples=None, projection=EuroPP()).plot(
        ax, color="#f58518"
    )
../_images/generation_2_0.png

Before any fitting, we enrich the Traffic DataFrame with the features we might want to use to generate trajectories. For example, instead of working with longitude and latitude values, we can compute their projection (x and y respectively).

t = t.compute_xy(projection=EuroPP())

To keep track of time, we propose to compute a timedelta feature: for each point of a trajectory, the number of seconds elapsed since the beginning of that trajectory.

import pandas as pd

def compute_timedelta(df: pd.DataFrame) -> pd.Series:
    # Seconds elapsed since the first timestamp of the trajectory
    return (df.timestamp - df.timestamp.min()).dt.total_seconds()

t = t.iterate_lazy().assign(timedelta=compute_timedelta).eval()
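To make the content of this feature concrete, here is a minimal sketch of the same arithmetic on plain Python datetime objects. The timestamps are hypothetical and only serve as an illustration; the library itself works on the pandas column shown above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timestamps for one trajectory, one point every 10 seconds
start = datetime(2019, 10, 1, 12, 0, 0, tzinfo=timezone.utc)
timestamps = [start + timedelta(seconds=10 * i) for i in range(4)]

# Same arithmetic as compute_timedelta: seconds elapsed since the first point
first = min(timestamps)
elapsed = [(ts - first).total_seconds() for ts in timestamps]
print(elapsed)
```

Every trajectory therefore starts at 0 seconds, which makes the time feature comparable from one trajectory to the next.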

Now we can use the fit() method to fit our generative model, here a Gaussian Mixture with two components.

from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import MinMaxScaler

g1 = Generation(
    generation=GaussianMixture(n_components=2),
    features=["x", "y", "altitude", "timedelta"],
    scaler=MinMaxScaler(feature_range=(-1, 1))
).fit(t)

Note

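The same fitting step can also be written as a call on the Traffic object. Note that the model below uses a single component (n_components=1), so g1 and g2 are two different models: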

g2 = t.generation(
    generation=GaussianMixture(n_components=1),
    features=["x", "y", "altitude", "timedelta"],
    scaler=MinMaxScaler(feature_range=(-1, 1))
)

Warning

Make sure the generative model you want to use implements the fit() and sample() methods.
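One way to verify this before a long fitting run is a structural check with typing.Protocol. The sketch below is an illustration only: HasFitAndSample, ConstantModel and FitOnly are hypothetical names, not part of the library, and the check only confirms that both methods exist, not that their signatures match. The toy sample() returns a (samples, labels) tuple to mirror scikit-learn's GaussianMixture.sample().

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class HasFitAndSample(Protocol):
    """Hypothetical protocol: only checks that both methods exist."""

    def fit(self, X: Any) -> Any: ...

    def sample(self, n_samples: int) -> Any: ...


class ConstantModel:
    """Toy model that memorises the first row and repeats it."""

    def fit(self, X):
        self.row_ = list(X[0])
        return self

    def sample(self, n_samples=1):
        # Returns (samples, labels), like GaussianMixture.sample()
        samples = [list(self.row_) for _ in range(n_samples)]
        return samples, [0] * n_samples


class FitOnly:
    """Incomplete model: it has no sample() method."""

    def fit(self, X):
        return self


print(isinstance(ConstantModel(), HasFitAndSample))
print(isinstance(FitOnly(), HasFitAndSample))
```

The first check succeeds and the second fails, since FitOnly does not define sample().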

Then we can sample the fitted model to produce new Traffic data.

t_gen1 = g1.sample(500, projection=EuroPP())
t_gen2 = g2.sample(500, projection=EuroPP())

with plt.style.context("traffic"):
    fig, ax = plt.subplots(1, 2, subplot_kw=dict(projection=EuroPP()))

    t_gen1.plot(ax[0], alpha=0.2)
    t_gen1.centroid(nb_samples=None, projection=EuroPP()).plot(
        ax[0], color="#f58518"
    )

    t_gen2.plot(ax[1], alpha=0.2)
    t_gen2.centroid(nb_samples=None, projection=EuroPP()).plot(
        ax[1], color="#f58518"
    )

Warning

This very naive model does not produce convincing results. More appropriate methods will be provided in the near future.

class traffic.algorithms.generation.Generation(generation, features, scaler=None)

Bases: object

Generation class to handle trajectory generation.

generation: GenerationProtocol

Generation model. It should implement fit() and sample() methods.

features: List[str]

List of features to generate. Example: ['latitude', 'longitude', 'altitude', 'timedelta'].

scaler: ScalerProtocol, default: None

If need be, a scaler applied to the data before fitting the generation model. You may want to consider StandardScaler(). The scaler object should implement fit_transform() and inverse_transform() methods.
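As an illustration of these two methods, here is a minimal pure-Python sketch of a scaler that maps each feature column to the range [-1, 1] and back. MinMaxToUnitRange is a hypothetical name, and the sketch works on plain lists to stay self-contained; a scaler passed to Generation would presumably operate on NumPy arrays, as the scikit-learn scalers do.

```python
class MinMaxToUnitRange:
    """Hypothetical scaler mapping each column to [-1, 1]."""

    def fit_transform(self, X):
        # Column-wise minimum and maximum (columns must not be constant)
        columns = list(zip(*X))
        self.mins_ = [min(c) for c in columns]
        self.maxs_ = [max(c) for c in columns]
        return [
            [
                2 * (v - lo) / (hi - lo) - 1
                for v, lo, hi in zip(row, self.mins_, self.maxs_)
            ]
            for row in X
        ]

    def inverse_transform(self, X_scaled):
        # Undo the scaling to recover values in the original units
        return [
            [
                (v + 1) / 2 * (hi - lo) + lo
                for v, lo, hi in zip(row, self.mins_, self.maxs_)
            ]
            for row in X_scaled
        ]


# Two features with very different magnitudes (made-up values)
X = [[0.0, 2000.0], [50000.0, 6000.0], [100000.0, 10000.0]]
scaler = MinMaxToUnitRange()
X_scaled = scaler.fit_transform(X)
X_back = scaler.inverse_transform(X_scaled)
print(X_scaled)
```

After scaling, both columns span the same range, so neither dominates the fit. inverse_transform() is what brings sampled values back to their original units.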

build_traffic(X, projection=None, coordinates=None, forward=True)

Builds a Traffic object from a NumPy array, according to the list of features self.features.

Return type:

Traffic
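The following sketch illustrates how a flat row of values can be mapped back to named features. It assumes that each row of X holds one trajectory flattened point by point, in the order of the feature list. This layout is an assumption made for illustration and is not stated on this page; the values are made up.

```python
features = ["x", "y", "altitude", "timedelta"]

# One hypothetical flattened trajectory made of two points
row = [0.0, 10.0, 3000.0, 0.0, 5.0, 12.0, 2900.0, 10.0]

# Cut the row into chunks of len(features) values and name each value
n = len(features)
points = [dict(zip(features, row[i:i + n])) for i in range(0, len(row), n)]

for point in points:
    print(point)
```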

sample(n_samples=1, projection=None, coordinates=None, forward=True)

Samples trajectories from the generation model.

Return type:

Traffic

n_samples: int, default: 1

Number of trajectories to sample.

projection: pyproj.Proj, cartopy.Projection, default: None

Required if the generation model uses x and y projections instead of latitude and longitude.

coordinates: Dict[str, float], default: None

Required if the generation model uses track and groundspeed instead of latitude and longitude. It should have 'latitude' and 'longitude' keys. Example: {'latitude': 12.2, 'longitude': 43.5}.

forward: bool, default: True

Indicates whether the coordinates argument corresponds to the first point of each trajectory or to the last one. If True, it is the first; otherwise, it is the last.

Example usage:
# Generate 10 trajectories from track and groundspeed features,
# using the given coordinates as the end point of each trajectory.
# g is a fitted Generation object.
t_gen = g.sample(
    10,
    coordinates={"latitude": 15, "longitude": 15},
    forward=False,
)