database

Abstraction for calibration databases.

This module defines the objects that Tunax uses to describe a database of observations used for a calibration. By data (Data), we mean the combination of a trajectory and a physical case, which together represent a measurement or a reference such as a Large Eddy Simulation (LES). By observations (Obs), we mean the combination of a trajectory and the Tunax model that corresponds to it. These classes are accessible with the prefix tunax.database. or directly with tunax..

DimsType

Type that represents the dimensions along which the data are loaded from a file.

alias of Tuple[int | None]
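The indexing rule behind DimsType (detailed under the dims parameter of from_jld2) can be illustrated with a minimal pure-Python sketch. The helper select_dims and the alias Dims are hypothetical and are not part of Tunax; they only mimic the rule on nested lists: an axis indexed with None is kept, an axis indexed with an integer is reduced to that index.

```python
from typing import Optional, Tuple

# Hypothetical alias mirroring DimsType: one entry per axis of the raw data.
Dims = Tuple[Optional[int], ...]


def select_dims(raw, dims: Dims):
    """Keep the axes marked None and reduce the axes marked with an int."""
    if not dims:
        return raw
    head, rest = dims[0], dims[1:]
    if head is None:
        # Keep this axis: apply the remaining rules to each sub-array.
        return [select_dims(sub, rest) for sub in raw]
    # Reduce this axis: take the sub-array at index `head`.
    return select_dims(raw[head], rest)


raw = [[1, 2, 3], [4, 5, 6]]  # nested list with shape (2, 3)
first_column = select_dims(raw, (None, 0))  # keep axis 0, take index 0 on axis 1
second_row = select_dims(raw, (1, None))    # take index 1 on axis 0, keep axis 1
```

With array libraries the same rule is usually expressed by indexing with slices for the kept axes; the recursive version is used here only to stay dependency-free.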

class Data

Abstraction that represents an element of the database from the point of view of Tunax.

This abstraction links the time-series of a Trajectory to a physical situation described by a Case. It can optionally contain metadata. Typically, this class is used when importing the elements of a database of observations or simulations. The constructor takes all the attributes as parameters.

trajectory

The time-series of the variables that represent this data.

Type:

Trajectory

case

The physical case that represents this data.

Type:

Case

metadatas

Metadata kept for later use, for example values of the case that will be set by hand later.

Type:

Dict[str, float], default={}

Raises:

ValueError – If the time of the trajectory is not built with constant time-steps.
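The constant time-step condition can be checked before building a Data instance. The function below is a minimal sketch, not Tunax's own check: the name constant_time_step and the relative tolerance are assumptions made for illustration.

```python
import math


def constant_time_step(time, rel_tol=1e-9):
    """Return the common step of `time`, or raise ValueError if it varies.

    Hypothetical helper: the tolerance is an assumption, not a Tunax setting.
    """
    if len(time) < 2:
        raise ValueError("at least two time values are needed")
    # Differences between consecutive time values.
    steps = [later - earlier for earlier, later in zip(time, time[1:])]
    reference = steps[0]
    for step in steps[1:]:
        if not math.isclose(step, reference, rel_tol=rel_tol):
            raise ValueError("the time is not built with constant time-steps")
    return reference
```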

classmethod from_nc_yaml(nc_path, yaml_path, var_names, eos_tracers='t', do_pt=False)

Create a Data instance from a netCDF file and a YAML file.

This class method builds a trajectory from the .nc file nc_path and builds the physical parameters from the configuration file yaml_path. var_names links the Tunax naming convention to the one used in the database.

Parameters:
  • nc_path (str) – Path of the netCDF file that contains the time-series of the observation trajectory. The file should contain at least the three dimensions zr, zw and time. The time-series are created with default values if they are not present in the file (only for space.Trajectory.u and space.Trajectory.v). Otherwise, they must have the dimensions described in Trajectory.

  • yaml_path (str) – Path of the YAML file that contains the parameters and forcings that describe the observation. The parameters should be float numbers, directly accessible from the root of the file with a key. Only the parameters that are described in Case are taken into account.

  • var_names (Dict[str, str]) – Link between the names used in Tunax and the ones used in the database. The keys are the Tunax names and the values are the names in the database. It applies to the variables of Trajectory and to the parameters of Case. It must contain at least entries for zr, zw and time.

  • eos_tracers (str, default='t') – Tracers used for the equation of state, cf. eos_tracers.

  • do_pt (bool, default=False) – Whether to compute a passive tracer, cf. do_pt.

Returns:

data – An object that represents these files.

Return type:

Data
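A var_names mapping for from_nc_yaml might look like the sketch below. The database-side names ("z_rho", "ocean_time", ...) and the file paths in the comment are hypothetical placeholders, and missing_required_names is an illustrative helper rather than a Tunax function; only the three required keys zr, zw and time come from the description above.

```python
# Keys are Tunax names, values are the names used in the database files.
# All the values below are hypothetical placeholders.
var_names = {
    "zr": "z_rho",
    "zw": "z_w",
    "time": "ocean_time",
    "u": "u_velocity",
    "v": "v_velocity",
}

# Entries that the mapping must contain at least.
REQUIRED_NAMES = ("zr", "zw", "time")


def missing_required_names(mapping):
    """Return the required Tunax names that are absent from `mapping`."""
    return [name for name in REQUIRED_NAMES if name not in mapping]


# With Tunax installed, the mapping would then be passed as documented:
# data = Data.from_nc_yaml("obs.nc", "obs.yaml", var_names)  # hypothetical paths
```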

classmethod from_jld2(jld2_path, names_mapping, nz=None, dims=(None,), eos_tracers='t', do_pt=False)

Creates a Data instance from a .jld2 file.

For the scalar parameters, the values must be stored in the file under their path, written as a single string with components separated by /. For the time-series and the time, the array of each time step must be stored under a path that ends with the reference of the time. The other variables are stored under their plain path. The time is approximated to the second.

Parameters:
  • jld2_path (str) – Path of the .jld2 file that contains the time-series of the observation trajectory and the physical parameters and forcings.

  • names_mapping (Dict[str, Dict[str, str]]) –

    Contains the link between the Tunax names of the variables and their paths in the file. There are three top-level entries:

    • variables: for all the variables corresponding to space.Grid and space.Trajectory. For the grid attributes (space.Grid.zr and space.Grid.zw), the path should point directly to the array in the file. For the time and the time-series, the given path points to a group that contains one entry per time reference (a number stored as a string); each entry holds the array of the variable (or the float value of the time) at that time. The 2D arrays are then rebuilt by concatenation. The time references are retrieved from the path of the time data.

    • parameters: for all the scalar entries corresponding directly to the parameters of case.Case.

    • metadatas: for the scalar entries to keep in the metadata for later use.

  • nz (int, optional, default=None) – Expected number of steps of the grid of the water column. The method removes the borders of the raw data from the file and keeps only the middle part of this length. If this parameter is not given, all the raw data are kept.

  • dims (DimsType or Dict[str, DimsType], default=(None,)) – The dimensions along which to look for the right arrays. If it is a dictionary, it works like var_names: the keys are the Tunax names of the variables and the values are the dimensions for each variable. Each value is a tuple of ints or Nones with one entry per axis of the raw data from the file. If an axis is indexed with None, this dimension is kept; if an axis is indexed with an integer, the axis is reduced to the value of the raw data at this index.

  • eos_tracers (str, default='t') – Tracers used for the equation of state, cf. eos_tracers.

  • do_pt (bool, default=False) – Whether to compute a passive tracer, cf. do_pt.
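The effect of nz can be pictured as a centered crop of the water column. The sketch below is an assumption about that behaviour, not the Tunax implementation: keep_middle is a hypothetical helper, and the way an odd number of extra points is split between the two borders is a choice made only for this example.

```python
def keep_middle(column, nz=None):
    """Return the middle `nz` values of `column`, or everything if nz is None."""
    if nz is None:
        return list(column)
    if nz > len(column):
        raise ValueError("nz is larger than the raw data")
    # Assumption: when the surplus is odd, the extra point is dropped at the end.
    start = (len(column) - nz) // 2
    return list(column[start:start + nz])
```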

Returns:

data – An object that represents this file.

Return type:

Data
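The layout expected by from_jld2 can be sketched without reading a real file. In the example below, a plain dict stands in for the content of a .jld2 file, every path is a hypothetical placeholder, and rebuild_series is an illustrative helper that is not part of Tunax. Rounding the time to the nearest second is an assumption about how the approximation is done.

```python
# Hypothetical mapping with the three top-level entries described above.
names_mapping = {
    "variables": {
        "zr": "grid/zr",
        "zw": "grid/zw",
        "time": "timeseries/time",
        "t": "timeseries/temp",
    },
    "parameters": {},  # Case parameter name -> path of the scalar in the file
    "metadatas": {},   # metadata name -> path of the scalar in the file
}

# Plain dict standing in for the file: one entry per time reference.
store = {
    "timeseries/time/0": 0.2,
    "timeseries/time/5": 30.1,
    "timeseries/time/10": 59.9,
    "timeseries/temp/0": [20.0, 19.0],
    "timeseries/temp/5": [20.5, 19.2],
    "timeseries/temp/10": [21.0, 19.4],
}


def rebuild_series(store, var_path, time_path):
    """Collect the per-time-step entries of a variable in time order."""
    prefix = time_path + "/"
    # The time references are read from the path of the time data.
    refs = sorted(
        (key[len(prefix):] for key in store if key.startswith(prefix)),
        key=int,  # numeric order, so that "10" comes after "5"
    )
    times = [round(store[prefix + ref]) for ref in refs]  # assumed rounding
    series = [store[f"{var_path}/{ref}"] for ref in refs]
    return times, series


variables = names_mapping["variables"]
times, temperature = rebuild_series(store, variables["t"], variables["time"])
```

Sorting the references numerically matters here because they are stored as strings; a lexicographic sort would place "10" before "5".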

cut(out_nt_cut)

Cuts the Trajectory in sub-trajectories, cf. space.Trajectory.cut().

Parameters:

out_nt_cut (int) – Number of output steps of the sub-trajectories.

Returns:

traj_list – List of Data instances with the sub-trajectories in chronological order.

Return type:

List[Data]
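The idea of cutting a trajectory into sub-trajectories of out_nt_cut output steps can be sketched on a plain list of time steps. This is only an illustration: cut_steps is a hypothetical helper, and dropping the incomplete remainder at the end is an assumption that may differ from space.Trajectory.cut().

```python
def cut_steps(steps, out_nt_cut):
    """Split `steps` into consecutive chunks of `out_nt_cut` elements."""
    if out_nt_cut < 1:
        raise ValueError("out_nt_cut must be a positive integer")
    n_chunks = len(steps) // out_nt_cut
    # Assumption: steps left over after the last full chunk are discarded.
    return [
        steps[i * out_nt_cut:(i + 1) * out_nt_cut]
        for i in range(n_chunks)
    ]
```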

class Weights

Representation of the weights applied to each variable when computing the loss function. The constructor takes all the attributes as parameters.

weight_u

Weight on zonal velocity.

Type:

float, default=0.

weight_v

Weight on meridional velocity.

Type:

float, default=0.

weight_t

Weight on temperature.

Type:

float, default=0.

weight_s

Weight on salinity.

Type:

float, default=0.

weight_b

Weight on buoyancy.

Type:

float, default=0.

weight_pt

Weight on passive tracer.

Type:

float, default=0.
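The role of these weights can be illustrated with a stand-in dataclass that has the same six fields and defaults. WeightsSketch and weighted_loss are hypothetical: the actual form of the Tunax loss function is not described here, so a weighted sum of per-variable errors is used purely as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightsSketch:
    """Stand-in with the same fields and defaults as Weights."""
    weight_u: float = 0.
    weight_v: float = 0.
    weight_t: float = 0.
    weight_s: float = 0.
    weight_b: float = 0.
    weight_pt: float = 0.


def weighted_loss(weights, errors):
    """Weighted sum of scalar errors keyed by variable name ('u', 't', ...)."""
    return sum(
        getattr(weights, f"weight_{name}") * error
        for name, error in errors.items()
    )


weights = WeightsSketch(weight_t=1.0, weight_u=0.5)
# Salinity has the default weight 0., so its error does not contribute.
loss = weighted_loss(weights, {"t": 2.0, "u": 4.0, "s": 10.0})
```

Because every weight defaults to 0., a variable contributes to the sum only when its weight is set explicitly.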

class Obs

This class represents an element of the database from the point of view of the loss function.

This class prepares everything the loss function needs to compute the loss for this element of the database. It links the Trajectory corresponding to this element, a model (with a grid and time parameters) corresponding to this trajectory, and the weights applied to each variable. The constructor takes all the attributes as parameters.

trajectory

The time-series of the variables that represent this observation.

Type:

Trajectory

model

A model built on this trajectory and on a physical case with the time and geometrical parameters.

Type:

SingleColumnModel

weights

The weights to give to the loss function.

Type:

Weights

classmethod from_data(data, dt, weights, checkpoint=False)

Create an Obs instance from a Data instance by adding Weights and a dt.

This function builds the other time parameters of the model from the trajectory.

Parameters:
  • data (Data) – A Data instance containing the trajectory and the physical case to apply to the model.

  • dt (float) – The integration time-step that we want for our model.

  • weights (Weights) – The weights to give to the loss function.

  • checkpoint (bool, default=False) – Use checkpoint() on the partial run method. This saves memory when computing the gradient, especially on GPUs.
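How time parameters can follow from a trajectory and an integration time-step dt is sketched below. This is an assumption about what "the other time parameters" may be, not the Tunax implementation: the function model_time_parameters and the keys out_dt, p_out and nt are hypothetical names, and requiring the output step to be a multiple of dt is a choice made for the example.

```python
def model_time_parameters(time, dt, tol=1e-9):
    """Derive illustrative time parameters from output times and `dt`."""
    if len(time) < 2:
        raise ValueError("at least two output times are needed")
    out_dt = time[1] - time[0]          # output step of the trajectory
    ratio = out_dt / dt
    p_out = round(ratio)                # integration steps per output step
    if p_out < 1 or abs(ratio - p_out) > tol:
        raise ValueError("the output step must be a multiple of dt")
    nt = p_out * (len(time) - 1)        # total number of integration steps
    return {"out_dt": out_dt, "p_out": p_out, "nt": nt}
```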

class Database

Represents a set of several observations that form a database. The constructor takes all the attributes as parameters.

observations

A list of several observations with potentially various forcings, geometry and time configuration.

Type:

List[Obs]
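A possible way to chain the classes of this module is sketched below: each Data instance is optionally cut, turned into Obs instances, and the result is gathered in a Database. The helper build_database is hypothetical; the classes are passed as arguments so that the sketch does not depend on Tunax being installed, and passing observations as a keyword argument is an assumption based on the attribute name.

```python
def build_database(data_list, dt, weights, obs_cls, database_cls,
                   out_nt_cut=None):
    """Assemble a database from Data-like objects (illustrative sketch)."""
    observations = []
    for data in data_list:
        # Optionally split each element into sub-trajectories first.
        pieces = data.cut(out_nt_cut) if out_nt_cut is not None else [data]
        observations.extend(
            obs_cls.from_data(piece, dt, weights) for piece in pieces
        )
    return database_cls(observations=observations)


# With Tunax installed, a call could look like this (values are placeholders):
# database = build_database(datas, 10., weights, Obs, Database, out_nt_cut=6)
```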