NEDAS.core.state module
- class NEDAS.core.state.State(c: Context)
Bases: object
The State class manages the state variables for the assimilation system.
The analysis is performed on a regular grid.
The entire state has dimensions (member, variable, time, z, y, x), indexed by (mem_id, v, t, k, j, i), with sizes (nens, nv, nt, nz, ny, nx).
To parallelize workload, we group the dimensions into 3 indices:
mem_id indexes the ensemble members
rec_id indexes the unique 2D fields identified by (v, t, k). Since nz and nt may vary between variables, we stack these dimensions into a single ‘record’ dimension with size nrec
par_id indexes the spatial partitions, which are subsets of the 2D grid given by (ist, ied, di, jst, jed, dj). For a complete field fld[j,i], the processor with par_id stores fld[jst:jed:dj, ist:ied:di] locally.
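The record and partition indexing described above can be sketched as follows. This is only an illustration, not the NEDAS implementation: the variable table, its names, and the grid size are hypothetical, and the slice assumes the first array axis is j (y) and the second is i (x).

```python
import numpy as np

# Hypothetical variable table: each variable has its own time steps and levels.
variables = {
    "temp": {"times": [0, 6], "levels": [0, 1, 2]},  # 3D variable, nz = 3
    "ice_conc": {"times": [0, 6], "levels": [0]},    # 2D variable, nz = 1
}

# Stack (v, t, k) into one flat 'record' index, since nt and nz differ per variable.
rec_info = {}
rec_id = 0
for v, spec in variables.items():
    for t in spec["times"]:
        for k in spec["levels"]:
            rec_info[rec_id] = (v, t, k)
            rec_id += 1
nrec = len(rec_info)

# A partition is a strided slice of the 2D grid: (ist, ied, di, jst, jed, dj).
ny, nx = 4, 6
fld = np.arange(ny * nx).reshape(ny, nx)        # complete field fld[j, i]
ist, ied, di, jst, jed, dj = 0, 3, 1, 0, 4, 1   # left half of the grid
chunk = fld[jst:jed:dj, ist:ied:di]             # what one par_id would hold locally
```

Here each rec_id maps back to exactly one (v, t, k) triple, so a 2D field can be addressed by the pair (mem_id, rec_id) regardless of how many levels or time steps its variable has.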
The entire state is distributed across the memory of many processors. At any moment, a processor stores only a subset of the state in its memory: either all of the mem_id,rec_id but only a subset of par_id (we call this ensemble-complete), or all of the par_id but only a subset of mem_id,rec_id (we call this field-complete). It is easier to perform i/o and pre/post processing on the field-complete state, and easier to run assimilation algorithms on the ensemble-complete state.
- rec_list: dict[Annotated[int, 'process id in comm_rec'], list[Annotated[int, 'field record id']]]
- partitions: list
- par_list: dict[Annotated[int, 'process id in comm_mem'], list[Annotated[int, 'partition id']]]
- fields_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']
- fields_z: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']
- state_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']
- state_z: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']
- state_post: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']
- fields_post: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']
- data: dict
- distribute_state_tasks(c: Context) -> dict[int, list[int]]
Distribute rec_id across processors
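One simple way to spread record ids over processes is a round-robin assignment, sketched below. The helper name and the round-robin policy are assumptions made for illustration; the actual method may balance the load differently.

```python
def distribute_round_robin(nrec, nproc):
    """Assign rec_id 0..nrec-1 to process ids 0..nproc-1 in round-robin order.

    Hypothetical helper, not part of NEDAS.
    """
    rec_list = {pid: [] for pid in range(nproc)}
    for rec_id in range(nrec):
        rec_list[rec_id % nproc].append(rec_id)
    return rec_list


# 8 records shared by 3 processes
rec_list = distribute_round_robin(8, 3)
```

The result has the same shape as the rec_list attribute: a dict mapping a process id to the list of record ids that process handles.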
- prepare_state(c: Context) -> None
Main method to collect fields from model to form the complete state (field-complete distributed)
- collect_prior_fields(c: Context) -> None
Collect fields from the prior model state, convert them to the analysis grid, preprocess them (coarse-graining etc.), and save them to fields[mem_id, rec_id], which points to the unique fields
- Parameters:
c (Context) – context object
- Returns:
- dict: fields dictionary [(mem_id, rec_id), fld]
where fld is an np.array defined on c.grid, holding one of the state variable fields
- dict: fields_z dictionary [(mem_id, rec_id), zfld]
where zfld has the same shape as fld and holds the z coordinates corresponding to each field
- Return type:
dict
- output_state(c: Context, tag: str, mem_id_out: int | None = None, rec_id_out: int | None = None) -> None
Output the fields in parallel to the binary state_file
- Parameters:
c (Context) – the runtime context obj
tag (str) – which version of the state this is: ‘prior’, ‘post’, or ‘z’ (z coordinates)
mem_id_out (int, optional) – member id to be output; if None, all available ids are output.
rec_id_out (int, optional) – record id to be output; if None, all available ids are output.
- output_ens_mean(c: Context, tag: str) -> None
Compute the ensemble mean of a field whose members are distributed across all pid_mem, collect the means on pid_mem=0, and output to mean_file
- Parameters:
c (Context) – the runtime context obj
tag (str) – which version of state this is: ‘prior_mean’, ‘post_mean’, or ‘z’
mean_file (str) – path to the output binary file for the ensemble mean
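The reduction behind a distributed ensemble mean can be illustrated in a single process: each process sums the members it owns, and the partial sums are then combined and divided by nens. In this sketch a plain Python sum stands in for the MPI reduction onto pid_mem=0, and the fields_by_pid layout is hypothetical.

```python
import numpy as np

nens = 4
rec_id = 0

# Members of one record as two hypothetical processes would hold them.
fields_by_pid = {
    0: {(0, rec_id): np.full((2, 3), 1.0), (1, rec_id): np.full((2, 3), 2.0)},
    1: {(2, rec_id): np.full((2, 3), 3.0), (3, rec_id): np.full((2, 3), 6.0)},
}

# Step 1: every process sums its own members for this record.
partial_sums = {
    pid: sum(fld for (m, r), fld in flds.items() if r == rec_id)
    for pid, flds in fields_by_pid.items()
}

# Step 2: combine the partial sums (stand-in for a reduce onto pid_mem=0)
# and divide by the ensemble size.
mean_fld = sum(partial_sums.values()) / nens
```

Summing locally before communicating means only one array per process crosses the network, rather than one per member.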
- transpose_to_ensemble_complete(c: Context, fields: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']) -> Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']
Send chunks of the fields owned by one pid to the other pids, so that the field-complete fields are transposed into an ensemble-complete state with keys (mem_id, rec_id) pointing to the partitions in par_list
- Parameters:
c (Context) – the runtime context
fields (FieldEns) – The locally stored field-complete fields with subset of mem_id,rec_id
- Returns:
The locally stored ensemble-complete field chunks on partitions, dict[(mem_id, rec_id), dict[par_id, fld_chk]]
- Return type:
StateEns
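The data layout before and after this transpose can be sketched without any communication. The function below is a hypothetical single-process stand-in: it simply cuts complete fields into the chunks that one process would end up holding, assuming fields are indexed as fld[j, i]. The real method exchanges these chunks between processes.

```python
import numpy as np

ny, nx = 4, 4
# Two partitions splitting the grid in x: (ist, ied, di, jst, jed, dj)
partitions = [(0, 2, 1, 0, 4, 1), (2, 4, 1, 0, 4, 1)]

# Field-complete input: complete 2D fields for three members of record 0.
fields = {(m, 0): np.full((ny, nx), float(m)) for m in range(3)}


def to_ensemble_complete(fields, partitions, my_par_ids):
    """Cut each complete field into the chunks for the given partition ids."""
    state = {}
    for key, fld in fields.items():
        state[key] = {}
        for par_id in my_par_ids:
            ist, ied, di, jst, jed, dj = partitions[par_id]
            state[key][par_id] = fld[jst:jed:dj, ist:ied:di].copy()
    return state


# A process responsible for partition 1 holds that chunk for every member.
state = to_ensemble_complete(fields, partitions, my_par_ids=[1])
```

The output mirrors the documented return structure, dict[(mem_id, rec_id), dict[par_id, fld_chk]].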
- transpose_to_field_complete(c: Context, state: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']) -> Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']
Transpose the state back to field-complete fields
- Parameters:
c (Context) – the runtime context
state (StateEns) – the locally stored ensemble-complete field chunks for subset of par_id
- Returns:
the locally stored field-complete fields for subset of mem_id,rec_id.
- Return type:
FieldEns
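The reverse direction amounts to writing each chunk back into its slice of a complete field. The sketch below again runs in one process with hypothetical names and assumes fld[j, i] indexing; in the real method the chunks arrive from other processes.

```python
import numpy as np

ny, nx = 4, 4
# Two partitions splitting the grid in x: (ist, ied, di, jst, jed, dj)
partitions = [(0, 2, 1, 0, 4, 1), (2, 4, 1, 0, 4, 1)]

# Ensemble-complete input: chunks of one (mem_id, rec_id) keyed by par_id.
state = {(0, 0): {0: np.zeros((4, 2)), 1: np.ones((4, 2))}}


def to_field_complete(state, partitions, ny, nx):
    """Reassemble complete fields from their partition chunks."""
    fields = {}
    for key, chunks in state.items():
        fld = np.full((ny, nx), np.nan)  # NaN marks grid points not yet filled
        for par_id, chk in chunks.items():
            ist, ied, di, jst, jed, dj = partitions[par_id]
            fld[jst:jed:dj, ist:ied:di] = chk
        fields[key] = fld
    return fields


fields = to_field_complete(state, partitions, ny, nx)
```

Initialising with NaN makes it easy to detect a partition that was never received.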
- pack_local_state_data(c: Context, par_id: Annotated[int, 'partition id'], state_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data'], state_z: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']) -> dict
Pack the state dict into arrays that are more easily handled by jitted functions
- unpack_local_state_data(c: Context, par_id: Annotated[int, 'partition id'], state_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data'], data: dict) -> None
Unpack data and write it back to the state dict
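A pack/unpack round trip might look like the following. The array layout (nens, nrec, nloc), the keys of the data dict, and the helper names are all assumptions chosen for illustration; the actual contents of data are not documented here.

```python
import numpy as np

par_id = 0
nens, nrec = 2, 3

# Ensemble-complete state for one partition: each chunk is a small 2x2 array.
state_prior = {
    (m, r): {par_id: np.full((2, 2), 10.0 * m + r)}
    for m in range(nens)
    for r in range(nrec)
}


def pack(state, par_id, nens, nrec):
    """Flatten the chunks of one partition into a dense (nens, nrec, nloc) array."""
    chunk_shape = state[(0, 0)][par_id].shape
    arr = np.empty((nens, nrec, int(np.prod(chunk_shape))))
    for (m, r), chunks in state.items():
        arr[m, r, :] = chunks[par_id].ravel()
    return {"state_prior": arr, "chunk_shape": chunk_shape}


def unpack(state, par_id, data):
    """Write the packed values back into the nested state dict."""
    for (m, r), chunks in state.items():
        chunks[par_id] = data["state_prior"][m, r, :].reshape(data["chunk_shape"])


data = pack(state_prior, par_id, nens, nrec)
data["state_prior"] += 1.0  # stand-in for an analysis update on the packed array
unpack(state_prior, par_id, data)
```

A dense array with fixed axes is the kind of input compiled kernels typically expect, which is the motivation given for packing the dict.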