How to access ADS-B data from OpenSky history database?

For more advanced requests, or to dig further back in history, you may be eligible for direct access to the history database through their Impala shell or Trino database.

Warning

OpenSky data are subject to particular terms of use. In particular, if you plan to use data for commercial purposes, you should mention it when you ask for access.

The provided functions help you:

  • build appropriate and efficient requests without any SQL knowledge;

  • split requests efficiently and store intermediary results in cache files;

  • parse results with pandas and wrap results in appropriate data structures.

First, put your credentials in your configuration file by editing the following lines:

[opensky]
username =
password =

You can check the path to your configuration file as follows. The path differs across operating systems, so do not assume anything: check the contents of the variable.

>>> import traffic
>>> traffic.config_file
PosixPath('/home/xo/.config/traffic/traffic.conf')
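
As a quick sanity check, the credentials section can be read back with Python's standard configparser module. This is a minimal sketch that only assumes the [opensky] section and the username and password keys shown above; the helper name has_opensky_credentials is hypothetical, the sample values are placeholders, and the way traffic itself parses the file may differ.

```python
import configparser


def has_opensky_credentials(config_text: str) -> bool:
    """Return True if [opensky] holds a non-empty username and password."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("opensky"):
        return False
    section = parser["opensky"]
    username = section.get("username", "").strip()
    password = section.get("password", "").strip()
    return bool(username) and bool(password)


# Placeholder values for illustration only, not real credentials
sample = """
[opensky]
username = my_login
password = my_secret
"""

print(has_opensky_credentials(sample))
```

To check the real file, the text passed to the helper could be read from the path reported by traffic.config_file.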

Historical traffic data

OpenSky.history(start, stop=None, *args, callsign=None, icao24=None, serials=None, bounds=None, departure_airport=None, arrival_airport=None, airport=None, time_buffer=None, cached=True, compress=False, limit=None, selected_columns=(), return_flight=False, **kwargs)

Get Traffic from the OpenSky Trino database.

You may pass requests based on time ranges, callsigns, aircraft, areas, serial numbers for receivers, or airports of departure or arrival.

The method builds appropriate SQL requests, caches results and formats data into a proper pandas DataFrame. Requests are split by hour (by default) in case the connection fails.

Parameters:
  • start (timelike) – a string (default to UTC), epoch or datetime (native Python or pandas)

  • stop (None | timelike) – a string (default to UTC), epoch or datetime (native Python or pandas), by default, one day after start

  • date_delta – a timedelta representing how to split the requests, by default: per hour

Return type:

None | Flight | Traffic
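
The hourly splitting mentioned above can be pictured with the standard library alone. The sketch below is not the library's implementation: split_range is a hypothetical helper, and it does not try to align chunks on full hours, which the real code may or may not do.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Iterator


def split_range(
    start: datetime,
    stop: datetime,
    delta: timedelta = timedelta(hours=1),
) -> Iterator[tuple[datetime, datetime]]:
    """Yield consecutive (begin, end) intervals covering [start, stop)."""
    begin = start
    while begin < stop:
        # the last chunk is truncated so that it never goes past stop
        end = min(begin + delta, stop)
        yield begin, end
        begin = end


start = datetime(2018, 10, 1, 11, 0, tzinfo=timezone.utc)
stop = datetime(2018, 10, 1, 13, 30, tzinfo=timezone.utc)
for begin, end in split_range(start, stop):
    print(begin.isoformat(), "->", end.isoformat())
```

Each interval would map to one request, so a dropped connection only costs the chunk in progress rather than the whole time range.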

More arguments to filter resulting data:

Parameters:
  • callsign (None | str | list[str]) – a string or a list of strings (wildcards accepted, _ for any character, % for any sequence of characters);

  • icao24 (None | str | list[str]) – a string or a list of strings identifying the transponder code of the aircraft;

  • serials (None | int | Iterable[int]) – an integer or a list of integers identifying the sensors receiving the data;

  • bounds (None | str | HasBounds | tuple[float, float, float, float]) – sets a geographical footprint. Either an airspace or shapely shape (requires the bounds attribute); or a tuple of float (west, south, east, north);

  • selected_columns (tuple[InstrumentedAttribute[Any] | str, …]) – specify the columns you want to retrieve. When empty, use all columns of the StateVectorsData4 table. You may escape column names as str. Always escape names from the FlightsData4 table.
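
To get a feel for the callsign wildcards before sending a request, the two SQL LIKE wildcards can be emulated locally with a regular expression. This is an illustrative sketch only: like_to_regex and matches are hypothetical helpers, and the actual matching happens in the database, not in Python.

```python
from __future__ import annotations

import re


def like_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a SQL LIKE pattern (_ and % wildcards) into a regex."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")  # any sequence of characters
        elif char == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts))


def matches(pattern: str, callsign: str) -> bool:
    """Check whether the whole callsign matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(callsign) is not None


print(matches("AIB%", "AIB04LF"))
print(matches("EZY15_T", "EZY158T"))
print(matches("EZY15_T", "EZY1588T"))
```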

Airports

The following options build more complicated requests by merging information from two tables in the Trino database, namely state_vectors_data4 and flights_data4.

Parameters:
  • departure_airport (None | str) – a string for the ICAO identifier of the airport. Selects flights departing from the airport between the two timestamps;

  • arrival_airport (None | str) – a string for the ICAO identifier of the airport. Selects flights arriving at the airport between the two timestamps;

  • airport (None | str) – a string for the ICAO identifier of the airport. Selects flights departing from or arriving at the airport between the two timestamps;

  • time_buffer (None | str | pd.Timedelta) – (default: None) time buffer used to extend time bounds for flights in the OpenSky flight tables: requests will get flights between start - time_buffer and stop + time_buffer. If no airport is specified, the parameter is ignored.
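
The effect of time_buffer on the requested window can be written out explicitly. The sketch below uses datetime.timedelta from the standard library in place of pd.Timedelta, and extend_bounds is a hypothetical helper that only mirrors the rule stated above (start - time_buffer, stop + time_buffer).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def extend_bounds(
    start: datetime,
    stop: datetime,
    time_buffer: Optional[timedelta] = None,
) -> Tuple[datetime, datetime]:
    """Widen [start, stop] by time_buffer on both sides, if one is given."""
    if time_buffer is None:
        return start, stop
    return start - time_buffer, stop + time_buffer


start = datetime(2019, 11, 1, 9, 0, tzinfo=timezone.utc)
stop = datetime(2019, 11, 1, 12, 0, tzinfo=timezone.utc)
lower, upper = extend_bounds(start, stop, timedelta(minutes=30))
print(lower.isoformat(), "->", upper.isoformat())
```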

Warning

  • See pyopensky.trino.flightlist() if you do not need any trajectory information.

  • If both departure_airport and arrival_airport are set, requested timestamps match the arrival time;

  • If airport is set, departure_airport and arrival_airport cannot be specified (a RuntimeException is raised).

Useful options for debug

Parameters:
  • cached (bool) – (default: True) switch to False to force a new request to the database regardless of the cached files. This option also deletes previous cache files;

  • compress (bool) – (default: False) compress cache files. Reduces disk space occupied at the expense of slightly increased time to load.

  • limit (None | int) – maximum number of records requested, LIMIT keyword in SQL.
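
The interplay between cached and compress can be illustrated with a small file cache keyed by a hash of the request. Everything in this sketch is an assumption made for illustration (file naming, hashing scheme, the cached_request helper and the fake backend); it is not how the library lays out its cache, only a picture of the behaviour described above.

```python
import gzip
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, List


def cached_request(
    query: str,
    run_query: Callable[[str], str],
    cache_dir: Path,
    cached: bool = True,
    compress: bool = False,
) -> str:
    """Return the result for query, reusing a cache file when allowed."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    suffix = ".csv.gz" if compress else ".csv"
    cache_file = cache_dir / (digest + suffix)

    if not cached and cache_file.exists():
        cache_file.unlink()  # drop the previous file before requesting again

    opener = gzip.open if compress else open
    if cache_file.exists():
        with opener(cache_file, "rt", encoding="utf-8") as fh:
            return fh.read()

    result = run_query(query)
    with opener(cache_file, "wt", encoding="utf-8") as fh:
        fh.write(result)
    return result


calls: List[str] = []


def fake_backend(query: str) -> str:
    """Stand-in for the database: records the call, returns CSV text."""
    calls.append(query)
    return "icao24,callsign\n406d95,EZY158T\n"


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    first = cached_request("SELECT 1", fake_backend, cache_dir)
    second = cached_request("SELECT 1", fake_backend, cache_dir)
    third = cached_request("SELECT 1", fake_backend, cache_dir, cached=False)

print(len(calls))
```

The second call is served from the cache file, while cached=False removes that file and queries the backend again.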

Examples of requests

First, the opensky instance parses your configuration file upon import:

from traffic.data import opensky

Then you may send requests:

  • based on callsign:

    flight = opensky.history(
        "2017-02-05 15:45",
        stop="2017-02-05 16:45",
        callsign="EZY158T",
        # returns a Flight instead of a Traffic
        return_flight=True
    )
    flight
    

    Flight

    • callsign: EZY158T
    • aircraft: 406d95 · 🇬🇧 G-EZOP (A320)
    • start: 2017-02-05 15:45:00+00:00
    • stop: 2017-02-05 16:45:00+00:00
    • duration: 0 days 01:00:00
    • sampling rate: 1 second(s)
  • based on bounding box:

    from traffic.data import eurofirs
    
    # two hours of traffic over LFBB FIR
    t_lfbb = opensky.history(
        "2018-10-01 11:00",
        "2018-10-01 13:00",
        bounds=eurofirs['LFBB']
    )
    
  • based on airports and callsigns (with wildcard):

    # Airbus test flights from and to Toulouse airport
    t_aib = opensky.history(
        "2019-11-01 09:00",
        "2019-11-01 12:00",
        departure_airport="LFBO",
        arrival_airport="LFBO",
        callsign="AIB%",
    )
    
  • based on airport (with origin/destination ICAO id):

    # assumption: the table classes are importable from pyopensky.schema
    from pyopensky.schema import StateVectorsData4
    
    # flights from and to Zurich airport
    t_lszh = opensky.history(
        start="2024-03-15 09:00",
        stop="2024-03-15 11:00",
        airport="LSZH",
        selected_columns=(
            # columns from StateVectorsData4 (quoted or not)
            StateVectorsData4.time,
            'icao24', 'lat', 'lon', 'velocity', 'heading', 'vertrate',
            'callsign', 'onground', 'alert', 'spi', 'squawk', 'baroaltitude',
            'geoaltitude', 'lastposupdate', 'lastcontact', 'serials', 'hour',
            # (some) columns from FlightsData4: always quoted!
            # returned as columns 'estdepartureairport' and 'estarrivalairport'
            'FlightsData4.estdepartureairport', 'FlightsData4.estarrivalairport'),
    )
    
  • based on a receiver’s identifier (for instance, your own):

    t_sensor = opensky.history(
        "2019-11-11 10:00",
        "2019-11-11 12:00",
        serials=1433801924,
    )
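
The bounds argument used in the bounding-box example accepts either a tuple or any object exposing a bounds attribute. The sketch below shows how both forms reduce to the same (west, south, east, north) tuple. The Box class and the as_bounds helper are hypothetical stand-ins written for this illustration; the HasBounds protocol here is only modelled after the type named in the signature.

```python
from typing import Protocol, Tuple, Union

Bounds = Tuple[float, float, float, float]


class HasBounds(Protocol):
    """Anything exposing a bounds attribute, e.g. a shapely geometry."""

    @property
    def bounds(self) -> Bounds: ...


class Box:
    """Hypothetical stand-in for an airspace or a shapely shape."""

    def __init__(self, west: float, south: float, east: float, north: float):
        self.bounds = (west, south, east, north)


def as_bounds(area: Union[HasBounds, Bounds]) -> Bounds:
    """Return (west, south, east, north) from a tuple or a bounds attribute."""
    west, south, east, north = getattr(area, "bounds", area)
    if south > north:
        raise ValueError("expected the order (west, south, east, north)")
    return (west, south, east, north)


print(as_bounds((-1.5, 43.0, 2.5, 46.5)))
print(as_bounds(Box(-1.5, 43.0, 2.5, 46.5)))
```

The ordering matches what shapely returns for bounds when x is the longitude and y the latitude, which is a common source of swapped-coordinate mistakes when tuples are typed by hand.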
    

Extended Mode-S (EHS)

EHS messages are not automatically decoded for you in the OpenSky Database but you may access them and decode them from your computer.
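
As a first step towards decoding on your side, raw Mode S messages can be sorted by downlink format, which is encoded in the first five bits of the message. EHS registers are carried by Comm-B replies (DF 20 and DF 21). The sketch below only performs this selection; the helper names are hypothetical, the hex strings are synthetic (only their first byte is meaningful), and decoding the register contents is beyond this sketch.

```python
from typing import List


def downlink_format(message_hex: str) -> int:
    """Return the downlink format: the first 5 bits of a raw Mode S message."""
    return int(message_hex[:2], 16) >> 3


def is_comm_b(message_hex: str) -> bool:
    """Comm-B replies (DF 20 and DF 21) are the ones carrying EHS registers."""
    return downlink_format(message_hex) in (20, 21)


# Synthetic 112-bit messages: only the first byte matters here
messages: List[str] = [
    "8D" + "00" * 13,  # DF 17 (ADS-B)
    "A0" + "00" * 13,  # DF 20 (Comm-B, altitude reply)
    "A8" + "00" * 13,  # DF 21 (Comm-B, identity reply)
]

comm_b = [msg for msg in messages if is_comm_b(msg)]
print([downlink_format(msg) for msg in messages])
print(len(comm_b))
```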

Warning

Some examples here may be outdated. To our knowledge at this time, only EHS data after January 1st 2020 are available!

Tip

Flight.query_ehs() also accepts a dataframe argument, which avoids making possibly numerous requests to the OpenSky database.
Consider using opensky.extended() to request all necessary data, then pass the resulting dataframe as an argument.

OpenSky.extended(start, stop=None, *args, icao24=None, serials=None, bounds=None, callsign=None, departure_airport=None, time_after_departure=None, arrival_airport=None, time_before_arrival=None, airport=None, cached=True, compress=False, limit=None, extra_columns=(), **kwargs)

Get raw messages from the OpenSky Trino database.

You may pass requests based on time ranges, callsigns, aircraft, areas, serial numbers for receivers, or airports of departure or arrival.

The method builds appropriate SQL requests, caches results and formats data into a proper pandas DataFrame. Requests are split by hour (by default) in case the connection fails.

Parameters:
  • start (timelike) – a string (default to UTC), epoch or datetime (native Python or pandas)

  • stop (None | timelike) – a string (default to UTC), epoch or datetime (native Python or pandas), by default, one day after start

  • date_delta – a timedelta representing how to split the requests, by default: per hour

Return type:

None | RawData

More arguments to filter resulting data:

Parameters:
  • callsign (None | str | list[str]) – a string or a list of strings (wildcards accepted, _ for any character, % for any sequence of characters);

  • icao24 (None | str | list[str]) – a string or a list of strings identifying the transponder code of the aircraft;

  • serials (None | int | Iterable[int]) – an integer or a list of integers identifying the sensors receiving the data;

  • bounds (None | HasBounds | tuple[float, float, float, float]) – sets a geographical footprint. Either an airspace or shapely shape (requires the bounds attribute); or a tuple of float (west, south, east, north);

Airports

The following options build more complicated requests by merging information from two tables in the Trino database, namely rollcall_replies_data4 and flights_data4.

Parameters:
  • departure_airport (None | str) – a string for the ICAO identifier of the airport. Selects flights departing from the airport between the two timestamps;

  • arrival_airport (None | str) – a string for the ICAO identifier of the airport. Selects flights arriving at the airport between the two timestamps;

  • airport (None | str) – a string for the ICAO identifier of the airport. Selects flights departing from or arriving at the airport between the two timestamps;

Warning

  • If both departure_airport and arrival_airport are set, requested timestamps match the arrival time;

  • If airport is set, departure_airport and arrival_airport cannot be specified (a RuntimeException is raised).

  • It is not possible at the moment to filter both on airports and on geographical bounds (help welcome!).

Useful options for debug

Parameters:
  • cached (bool) – (default: True) switch to False to force a new request to the database regardless of the cached files. This option also deletes previous cache files;

  • compress (bool) – (default: False) compress cache files. Reduces disk space occupied at the expense of slightly increased time to load.

  • limit (None | int) – maximum number of records requested, LIMIT keyword in SQL.

Examples of requests

  • based on transponder identifier (icao24):

    from traffic.data.samples import belevingsvlucht
    
    df = opensky.extended(
        belevingsvlucht.start,
        belevingsvlucht.stop,
        icao24=belevingsvlucht.icao24
    )
    
    enriched = belevingsvlucht.query_ehs(df)
    
  • based on geographical bounds:

    from traffic.data import eurofirs
    from traffic.data.samples import switzerland
    
    df = opensky.extended(
        switzerland.start_time,
        switzerland.end_time,
        bounds=eurofirs['LSAS']
    )
    
    enriched_ch = (
        switzerland
        .filter()
        .query_ehs(df)
        .resample('1s')
        .eval(desc='', max_workers=4)
    )
    
  • based on airports, together with traffic:

    schiphol = opensky.history(
        "2019-11-11 12:00",
        "2019-11-11 14:00",
        airport="EHAM"
    )
    
    df = opensky.extended(
        "2019-11-11 12:00",
        "2019-11-11 14:00",
        airport="EHAM"
    )
    
    enriched_eham = (
        schiphol
        .filter()
        .query_ehs(df)
        .resample('1s')
        .eval(desc='', max_workers=4)
    )
    

Flight list by airport

OpenSky.flightlist(start, stop=None, *args, departure_airport=None, arrival_airport=None, airport=None, callsign=None, icao24=None, cached=True, compress=False, limit=None, **kwargs)

Lists flights departing or arriving at a given airport.

You may pass requests based on time ranges, callsigns, aircraft, or airports of departure or arrival.

The method builds appropriate SQL requests, caches results and formats data into a proper pandas DataFrame. Requests are split by hour (by default) in case the connection fails.

Parameters:
  • start (timelike) – a string (default to UTC), epoch or datetime (native Python or pandas)

  • stop (None | timelike) – a string (default to UTC), epoch or datetime (native Python or pandas), by default, one day after start

Return type:

None | pd.DataFrame

More arguments to filter resulting data:

Parameters:
  • departure_airport (None | str | list[str]) – a string for the ICAO identifier of the airport. Selects flights departing from the airport between the two timestamps;

  • arrival_airport (None | str | list[str]) – a string for the ICAO identifier of the airport. Selects flights arriving at the airport between the two timestamps;

  • airport (None | str | list[str]) – a string for the ICAO identifier of the airport. Selects flights departing from or arriving at the airport between the two timestamps;

  • callsign (None | str | list[str]) – a string or a list of strings (wildcards accepted, _ for any character, % for any sequence of characters);

  • icao24 (None | str | list[str]) – a string or a list of strings identifying the transponder code of the aircraft;

Warning

  • If both departure_airport and arrival_airport are set, requested timestamps match the arrival time;

  • If airport is set, departure_airport and arrival_airport cannot be specified (a RuntimeException is raised).

Useful options for debug

Parameters:
  • cached (bool) – (default: True) switch to False to force a new request to the database regardless of the cached files. This option also deletes previous cache files;

  • compress (bool) – (default: False) compress cache files. Reduces disk space occupied at the expense of slightly increased time to load.

  • limit (None | int) – maximum number of records requested, LIMIT keyword in SQL.

Requests for raw data

OpenSky.rawdata(start, stop=None, *args, icao24=None, serials=None, bounds=None, callsign=None, departure_airport=None, time_after_departure=None, arrival_airport=None, time_before_arrival=None, airport=None, cached=True, compress=False, limit=None, extra_columns=(), **kwargs)

Get raw messages from the OpenSky Trino database.

You may pass requests based on time ranges, callsigns, aircraft, areas, serial numbers for receivers, or airports of departure or arrival.

The method builds appropriate SQL requests, caches results and formats data into a proper pandas DataFrame. Requests are split by hour (by default) in case the connection fails.

Parameters:
  • start (timelike) – a string (default to UTC), epoch or datetime (native Python or pandas)

  • stop (None | timelike) – a string (default to UTC), epoch or datetime (native Python or pandas), by default, one day after start

  • date_delta – a timedelta representing how to split the requests, by default: per hour

Return type:

None | RawData

More arguments to filter resulting data:

Parameters:
  • callsign (None | str | list[str]) – a string or a list of strings (wildcards accepted, _ for any character, % for any sequence of characters);

  • icao24 (None | str | list[str]) – a string or a list of strings identifying the transponder code of the aircraft;

  • serials (None | int | Iterable[int]) – an integer or a list of integers identifying the sensors receiving the data;

  • bounds (None | HasBounds | tuple[float, float, float, float]) – sets a geographical footprint. Either an airspace or shapely shape (requires the bounds attribute); or a tuple of float (west, south, east, north);

Airports

The following options build more complicated requests by merging information from two tables in the Trino database, namely rollcall_replies_data4 and flights_data4.

Parameters:
  • departure_airport (None | str) – a string for the ICAO identifier of the airport. Selects flights departing from the airport between the two timestamps;

  • arrival_airport (None | str) – a string for the ICAO identifier of the airport. Selects flights arriving at the airport between the two timestamps;

  • airport (None | str) – a string for the ICAO identifier of the airport. Selects flights departing from or arriving at the airport between the two timestamps;

Warning

  • If both departure_airport and arrival_airport are set, requested timestamps match the arrival time;

  • If airport is set, departure_airport and arrival_airport cannot be specified (a RuntimeException is raised).

  • It is not possible at the moment to filter both on airports and on geographical bounds (help welcome!).

Useful options for debug

Parameters:
  • cached (bool) – (default: True) switch to False to force a new request to the database regardless of the cached files. This option also deletes previous cache files;

  • compress (bool) – (default: False) compress cache files. Reduces disk space occupied at the expense of slightly increased time to load.

  • limit (None | int) – maximum number of records requested, LIMIT keyword in SQL.