NEDAS.core.state module

class NEDAS.core.state.State(c: Context)[source]

Bases: object

The State class manages the state variables for the assimilation system.

The analysis is performed on a regular grid.

The entire state has dimensions (member, variable, time, z, y, x), indexed by (mem_id, v, t, k, j, i), with sizes (nens, nv, nt, nz, ny, nx).

To parallelize workload, we group the dimensions into 3 indices:

mem_id indexes the ensemble members

rec_id indexes the unique 2D fields identified by (v, t, k). Since nz and nt may vary between variables, these dimensions are stacked into a single ‘record’ dimension of size nrec.

par_id indexes the spatial partitions, which are subsets of the 2D grid given by (ist, ied, di, jst, jed, dj). For a complete field fld[j,i], the processor with par_id stores fld[jst:jed:dj, ist:ied:di] locally.
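The record and partition indices described above can be illustrated with a small sketch. The variable names, level counts, and two-tile partition layout below are assumptions made up for the example and are not taken from NEDAS; the sketch only assumes that a field is stored as fld[j, i], with rows indexed by j and columns by i.

```python
import numpy as np

# Hypothetical variables with different numbers of levels and time steps.
variables = {
    "temp": {"nz": 3, "nt": 2},
    "ssh": {"nz": 1, "nt": 2},
}

# Stack (v, t, k) into a single 'record' dimension.
rec_info = {}
rec_id = 0
for v, dims in variables.items():
    for t in range(dims["nt"]):
        for k in range(dims["nz"]):
            rec_info[rec_id] = (v, t, k)
            rec_id += 1
nrec = len(rec_info)  # 3*2 + 1*2 records

# Partitions as (ist, ied, di, jst, jed, dj) tuples on an ny-by-nx grid.
ny, nx = 4, 6
partitions = [
    (0, 3, 1, 0, 4, 1),  # left half of the columns
    (3, 6, 1, 0, 4, 1),  # right half of the columns
]

fld = np.arange(ny * nx, dtype=float).reshape(ny, nx)  # complete field fld[j, i]


def get_chunk(fld, partition):
    """Return the part of a complete field that belongs to one partition."""
    ist, ied, di, jst, jed, dj = partition
    return fld[jst:jed:dj, ist:ied:di]


chunks = {par_id: get_chunk(fld, p) for par_id, p in enumerate(partitions)}
```

Here rec_info maps each rec_id back to its (v, t, k) triple, and chunks[par_id] is what a processor owning that partition would keep for this one field.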

The entire state is distributed across the memory of many processors. At any moment, a processor stores only a subset of the state in its memory: either all the mem_id, rec_id but only a subset of par_id (we call this ensemble-complete), or all the par_id but only a subset of mem_id, rec_id (we call this field-complete). I/O and pre/post-processing are easier on the field-complete state, while assimilation algorithms are easier to run on the ensemble-complete state.

info: StateInfo

rec_list: dict[Annotated[int, 'process id in comm_rec'], list[Annotated[int, 'field record id']]]

partitions: list

par_list: dict[Annotated[int, 'process id in comm_mem'], list[Annotated[int, 'partition id']]]

fields_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']

fields_z: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']

state_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']

state_z: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']

state_post: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']

fields_post: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']

data: dict

distribute_state_tasks(c: Context) → dict[int, list[int]][source]: Distribute rec_id across processors
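As a rough illustration of what distributing rec_id across processors can look like, the sketch below splits a list of ids into contiguous, near-equal blocks. The helper name distribute_evenly and the block strategy are assumptions for the example; the actual method may balance the workload differently (for instance by weighting records by cost).

```python
def distribute_evenly(task_ids, nproc):
    """Split task_ids into nproc contiguous, near-equal lists keyed by process id."""
    base, extra = divmod(len(task_ids), nproc)
    task_list = {}
    start = 0
    for pid in range(nproc):
        # The first `extra` processes take one additional task each.
        count = base + (1 if pid < extra else 0)
        task_list[pid] = list(task_ids[start:start + count])
        start += count
    return task_list


# Eight record ids shared among three hypothetical processes in comm_rec.
rec_list = distribute_evenly(list(range(8)), 3)
```

The result has the same shape as the documented return type, dict[int, list[int]], with process ids as keys and lists of rec_id as values.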

prepare_state(c: Context) → None[source]: Main method to collect fields from model to form the complete state (field-complete distributed)

collect_prior_fields(c: Context) → None[source]

Collect fields from the prior model state, convert them to the analysis grid, preprocess them (coarse-graining, etc.), and save them to fields[mem_id, rec_id], whose keys point to the unique fields

Parameters:

c (Context) – context object

Returns:

dict: fields dictionary [(mem_id, rec_id), fld], where fld is an np.array defined on c.grid holding one of the state variable fields
dict: fields_z dictionary [(mem_id, rec_id), zfld], where zfld has the same shape as fld and holds the z coordinates corresponding to each field

Return type:

dict

collect_scalar_variables(c)[source]

output_state(c: Context, tag: str, mem_id_out: int | None = None, rec_id_out: int | None = None) → None[source]

Parallel output the fields to the binary state_file

Parameters:

c (Context) – the runtime context obj
tag (str) – which version of the state this is: ‘prior’, ‘post’, or ‘z’ (the z coordinates)
mem_id_out (int, optional) – member id to be output; if None, all available ids are output.
rec_id_out (int, optional) – record id to be output; if None, all available ids are output.

output_ens_mean(c: Context, tag: str) → None[source]

Compute the ensemble mean of a field stored distributively across all pid_mem, collect the means on pid_mem=0, and output to mean_file

Parameters:

c (Context) – the runtime context obj
tag (str) – which version of state this is: ‘prior_mean’, ‘post_mean’, or ‘z’
mean_file (str) – path to the output binary file for the ensemble mean
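The idea of forming an ensemble mean from members spread over several pid_mem can be sketched in a single process, with a plain dictionary standing in for the processes. The member-to-process assignment mem_list and the use of partial sums are assumptions for the example; the communication pattern actually used (a reduce or point-to-point sends) is not stated here, and no file output is shown.

```python
import numpy as np

nens = 4
rng = np.random.default_rng(0)

# One record's field for every ensemble member.
members = {mem_id: rng.normal(size=(3, 5)) for mem_id in range(nens)}

# Hypothetical assignment of members to two processes in comm_mem.
mem_list = {0: [0, 1], 1: [2, 3]}

# Each process sums only the members it holds locally.
partial_sums = {
    pid: sum(members[mem_id] for mem_id in mem_ids)
    for pid, mem_ids in mem_list.items()
}

# Stand-in for collecting on pid_mem=0: add the partial sums, divide by nens.
ens_mean = sum(partial_sums.values()) / nens
```

Summing locally before communicating means each process sends one field-sized array per record rather than one per member.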

output_ref_z(c: Context)[source]

pack_field_chunk(c: Context, fld, is_vector, dst_pid)[source]

unpack_field_chunk(c, fld, fld_chk, src_pid)[source]

transpose_to_ensemble_complete(c: Context, fields: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data']) → Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data'][source]

Send chunks of the fields owned by one pid to the other pids, so that the field-complete fields are transposed into the ensemble-complete state, with keys (mem_id, rec_id) pointing to the partitions in par_list

Parameters:

c (Context) – the runtime context
fields (FieldEns) – The locally stored field-complete fields with subset of mem_id,rec_id

Returns:

The locally stored ensemble-complete field chunks on partitions, dict[(mem_id, rec_id), dict[par_id, fld_chk]]

Return type:

StateEns
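The data-structure change performed by this transpose can be sketched without MPI. The function below is a hypothetical single-process stand-in: it cuts every complete field into the chunks for a given list of partitions, producing the nested dict[(mem_id, rec_id), dict[par_id, fld_chk]] described above. The partition tuples and the fld[j, i] index order are assumptions for the example, and the real method additionally exchanges chunks between processes.

```python
import numpy as np

ny, nx = 4, 6
# Hypothetical partitions as (ist, ied, di, jst, jed, dj).
partitions = [(0, 3, 1, 0, 4, 1), (3, 6, 1, 0, 4, 1)]


def to_ensemble_complete(fields, partitions, par_ids):
    """Cut complete fields into chunks on the partitions in par_ids."""
    state = {}
    for key, fld in fields.items():
        state[key] = {}
        for par_id in par_ids:
            ist, ied, di, jst, jed, dj = partitions[par_id]
            # Copy so the chunk no longer references the full field.
            state[key][par_id] = fld[jst:jed:dj, ist:ied:di].copy()
    return state


# Field-complete input: each (mem_id, rec_id) holds a full ny-by-nx field.
fields = {
    (mem_id, rec_id): np.full((ny, nx), 10.0 * mem_id + rec_id)
    for mem_id in range(2)
    for rec_id in range(3)
}

# This stand-in "process" owns partition 0 only.
state = to_ensemble_complete(fields, partitions, par_ids=[0])
```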

transpose_to_field_complete(c: Context, state: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']) → Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], ndarray], 'field-complete ensemble data'][source]

Transpose the state back to field-complete fields

Parameters:

c (Context) – the runtime context
state (StateEns) – the locally stored ensemble-complete field chunks for subset of par_id

Returns:

the locally stored field-complete fields for subset of mem_id,rec_id.

Return type:

FieldEns
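The reverse operation can be sketched the same way: chunks are written back into a full-size array at the positions given by their partitions. The function name, the interleaved (strided) partition layout, and the fld[j, i] index order are assumptions for the example; the sketch runs in one process and leaves out the communication that gathers chunks from other processes.

```python
import numpy as np

ny, nx = 4, 6
# Hypothetical strided partitions: even columns and odd columns.
partitions = [(0, 6, 2, 0, 4, 1), (1, 6, 2, 0, 4, 1)]


def to_field_complete(state, partitions, ny, nx):
    """Reassemble complete fields from their partition chunks."""
    fields = {}
    for key, chunks in state.items():
        fld = np.full((ny, nx), np.nan)  # NaN marks points with no chunk
        for par_id, fld_chk in chunks.items():
            ist, ied, di, jst, jed, dj = partitions[par_id]
            fld[jst:jed:dj, ist:ied:di] = fld_chk
        fields[key] = fld
    return fields


original = np.arange(ny * nx, dtype=float).reshape(ny, nx)

# Ensemble-complete input holding both partitions of a single field.
state = {
    (0, 0): {
        par_id: original[jst:jed:dj, ist:ied:di].copy()
        for par_id, (ist, ied, di, jst, jed, dj) in enumerate(partitions)
    }
}

fields = to_field_complete(state, partitions, ny, nx)
```

Initializing with NaN makes a missing chunk easy to spot in the reassembled field.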

pack_local_state_data(c: Context, par_id: Annotated[int, 'partition id'], state_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data'], state_z: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data']) → dict[source]: Pack the state dict into arrays that are easier for jitted functions to handle

unpack_local_state_data(c: Context, par_id: Annotated[int, 'partition id'], state_prior: Annotated[dict[tuple[Annotated[int, 'member id'], Annotated[int, 'field record id']], dict[Annotated[int, 'partition id'], ndarray]], 'state-complete ensemble data'], data: dict) → None[source]: Unpack the data and write it back to the state dict
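A pack and unpack round trip might look like the sketch below. The function names, the keys of the data dict, and the [nens, nrec, npoints] array layout are all assumptions for illustration; the layout used by the real methods is not documented here. The point is only that nested dicts of chunks are flattened into one contiguous array, which compiled (jitted) code handles more easily than Python dicts, and then written back.

```python
import numpy as np


def pack_sketch(par_id, state_prior, mem_ids, rec_ids):
    """Stack the chunks of one partition into an array [nens, nrec, npoints]."""
    chunk_shape = state_prior[mem_ids[0], rec_ids[0]][par_id].shape
    npoints = int(np.prod(chunk_shape))
    arr = np.empty((len(mem_ids), len(rec_ids), npoints))
    for m, mem_id in enumerate(mem_ids):
        for n, rec_id in enumerate(rec_ids):
            arr[m, n, :] = state_prior[mem_id, rec_id][par_id].ravel()
    return {"state_prior": arr, "mem_ids": list(mem_ids),
            "rec_ids": list(rec_ids), "chunk_shape": chunk_shape}


def unpack_sketch(par_id, state_prior, data):
    """Write the packed array back into the nested state dict, in place."""
    for m, mem_id in enumerate(data["mem_ids"]):
        for n, rec_id in enumerate(data["rec_ids"]):
            chunk = data["state_prior"][m, n, :].reshape(data["chunk_shape"])
            state_prior[mem_id, rec_id][par_id] = chunk.copy()


# Three members, two records, one partition with a 2-by-3 chunk.
state_prior = {
    (mem_id, rec_id): {0: np.full((2, 3), float(10 * mem_id + rec_id))}
    for mem_id in range(3)
    for rec_id in range(2)
}

data = pack_sketch(0, state_prior, mem_ids=[0, 1, 2], rec_ids=[0, 1])
# Stand-in for an array-based update: remove the ensemble mean.
data["state_prior"] -= data["state_prior"].mean(axis=0)
unpack_sketch(0, state_prior, data)
```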