Running TsunAWI
If everything is in place (the mesh in MeshPath, tsunami.namelist in the working directory), TsunAWI can be started as follows:
export OMP_NUM_THREADS=4 # adjust to your hardware
ulimit -s unlimited # to prevent spurious seg faults
/path/to/tsunawi/ompTsuna.x
The ulimit -s command shows and sets the stack size. For realistic setups, the default is usually too low for TsunAWI. If ompTsuna.x crashes with a segmentation fault and you are not allowed to choose unlimited, check the current value with ulimit -s and try multiples of this value.
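If unlimited is not allowed on your system, a session could look like the following minimal sketch; the printed value 8192 and the chosen multiple are only illustrations, your system's default will differ:

ulimit -s                      # print the current stack size in kB, e.g. 8192
ulimit -s 32768                # try a multiple of the default, here 4x
/path/to/tsunawi/ompTsuna.x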
Examples and small test cases
A few small examples, complete with mesh and namelist, are available at https://gitlab.awi.de/tsunawi/verification. Please refer to the documentation there.
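To try one of them, cloning the repository is enough; this sketch assumes the standard .git suffix, and the individual test cases are described in the repository's own documentation:

git clone https://gitlab.awi.de/tsunawi/verification.git
cd verification
# pick a test case directory and follow its instructions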
Parallel runs on multicore computers
TsunAWI is parallelized with OpenMP directives and can employ as many cores and hardware threads (hyperthreading) as your computer offers with access to the same main memory.
The number of threads is set by the environment variable OMP_NUM_THREADS. There might be defaults set by the operating system or, in the case of a multi-user cluster, by the batch system.
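To see whether such a default is already in effect, a quick check like this sketch (bash syntax) can help before setting your own value:

echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"   # show an inherited value, if any
export OMP_NUM_THREADS=8                           # then override with your own choice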
TsunAWI usually suffers from some load imbalance, in particular during the early time steps, when the tsunami is still restricted to its small region of origin.
Examples:
- A PC with a 4-core CPU, no hyperthreading, still used for other work at the same time: choose OMP_NUM_THREADS=3 and OMP_SCHEDULE="DYNAMIC,1024".
- A typical two-socket cluster node, each socket with a 40-core CPU and two threads per core (hyperthreading), allows a total of 2 x 40 x 2 = 160 threads: choose OMP_NUM_THREADS=160. As the main memory is shared, but physically divided between the two CPUs, memory locality is an issue that counteracts dynamic scheduling. Compare runtimes for OMP_SCHEDULE="STATIC,0" (usually the default) and OMP_SCHEDULE="DYNAMIC,1024"; see the timing sketch after this list.
- Usually, it is best to pin each OpenMP thread to one core. Refer to your system's documentation. Typical settings are export OMP_PROC_BIND=TRUE or, for Slurm job scripts, srun --cpu-bind=threads; a job script sketch follows below.
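To compare the two schedules for a given setup, run the same case twice and time it. A minimal sketch, assuming a bash shell and the executable path from above:

for sched in "STATIC,0" "DYNAMIC,1024"; do
    export OMP_SCHEDULE="$sched"
    echo "Timing with OMP_SCHEDULE=$sched"
    /usr/bin/time -p /path/to/tsunawi/ompTsuna.x
done

Prefer a realistic case over a toy setup for this comparison, as the load imbalance only shows up once the wave spreads beyond its region of origin.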
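For Slurm clusters, the pieces above can be combined into a job script. This is a sketch for the 160-thread node from the example; job name, core count, and the executable path are placeholders to adapt to your cluster:

#!/bin/bash
#SBATCH --job-name=tsunawi
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=160

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one OpenMP thread per hardware thread
export OMP_PROC_BIND=TRUE                     # pin each thread to one core, see above
ulimit -s unlimited                           # large stack, see above

srun --cpu-bind=threads /path/to/tsunawi/ompTsuna.x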