Restarting from modified checkpoints with addParticles2Checkpoint does not work #5678

@finnolec

Description

While building a minimal reproducible example for this issue, I stumbled upon issue #5675. The error in #5675 appears to be related to the combination of periodic boundaries, the PMLs, and restarting. However, periodic boundaries are enabled by default in the LWFA example but not in my actual simulation, so this is a separate problem.

Setup

  • PIConGPU version 12aa847
  • LaserWakefield example, with density.param modified to return 0 and the incident field laser removed from incidentField.param (i.e. a vacuum simulation in which the electron species named "e" is still defined).
  • Run the simulation for 1 timestep with periodic boundaries deactivated to create an empty checkpoint, using --checkpoint.period 0:0 --checkpoint.openPMD.ext bp5
  • Run the attached create_bunch.py script. The path I add in order to import bunchInit_openPMD_bp points to the PIConGPU source tree I am currently using. The script is based on the mainline Jupyter notebook, with values adjusted for the LWFA example: https://github.com/ComputationalRadiationPhysics/picongpu/blob/dev/lib/python/picongpu/extra/input/createBunch_example.ipynb
  • Create new run directory
  • Symlink modified checkpoint into simOutput
  • Add a checkpoints.txt containing 0 to the checkpoints directory (see "Additional unexpected behavior" below)
  • Start new simulation
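
For anyone trying to reproduce this, the last few steps (new run directory, symlink, checkpoints.txt) can be sketched roughly as follows. This is only an illustration, not the exact commands I ran: the helper name `prepare_restart_dir` is made up, and placing the symlink under simOutput/checkpoints is an assumption about the layout.

```python
from pathlib import Path


def prepare_restart_dir(run_dir, checkpoint_path, step=0):
    """Hypothetical helper: lay out a run directory for a restart.

    Assumption: the restart looks for the checkpoint and for checkpoints.txt
    inside <run_dir>/simOutput/checkpoints.
    """
    run_dir = Path(run_dir)
    checkpoint_path = Path(checkpoint_path).resolve()

    # Create the new run directory including the checkpoints subdirectory.
    checkpoint_dir = run_dir / "simOutput" / "checkpoints"
    checkpoint_dir.mkdir(parents=True, exist_ok=True)

    # Symlink the modified checkpoint instead of copying it (files are large).
    link = checkpoint_dir / checkpoint_path.name
    if not link.is_symlink() and not link.exists():
        link.symlink_to(checkpoint_path)

    # Register the timestep so that the checkpoint is found on restart.
    (checkpoint_dir / "checkpoints.txt").write_text(f"{step}\n")
    return checkpoint_dir
```

It would be called with the new run directory and the path of the modified checkpoint file, both of which depend on the local setup.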

Error

Unloading module cmake/4.0.3
Unloading module python/3.12.4
Unloading module volta
Loading module volta
Loading module python/3.12.4
Loading module cmake/4.0.3

The following have been reloaded with a version change:
  1) numactl/2.0.18 => numactl/2.0.18-GCCcore-13.3.0

ln: failed to create symbolic link 'output': File exists
Note: GPU memory test was skipped as no binary 'cuda_memtest' available or compute node is not exclusively allocated. This does not affect PIConGPU, starting it now
Unhandled exception of type 'St13runtime_error' with message 'Chunk does not reside inside dataset (Dimension on index 0. DS: 48346 - Chunk: 96692)', terminating
srun: error: gv032: task 0: Exited with exit code 1
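
One detail in the exception message: the reported chunk value (96692) is exactly twice the dataset extent (48346). The snippet below is a minimal sketch of a bounds check that would produce such a message. It is not the openPMD-api implementation, and it assumes that "Chunk" in the message means offset + extent of the requested chunk. Which offset and extent were actually requested is not visible in the log.

```python
def verify_chunk(dataset_extent, offset, extent):
    """Illustrative bounds check (assumption: 'Chunk' = offset + extent)."""
    for dim, (ds, off, ext) in enumerate(zip(dataset_extent, offset, extent)):
        if off + ext > ds:
            raise RuntimeError(
                "Chunk does not reside inside dataset "
                f"(Dimension on index {dim}. DS: {ds} - Chunk: {off + ext})"
            )


dataset_size = 48346  # "DS" value from the error message
chunk_end = 96692     # "Chunk" value from the error message

# The requested chunk ends at exactly twice the dataset extent.
assert chunk_end == 2 * dataset_size

try:
    # Offset 0 with extent 96692 is only one possible combination.
    verify_chunk((dataset_size,), (0,), (chunk_end,))
except RuntimeError as err:
    print(err)
```

The factor of two might indicate that the particle count is applied twice somewhere (for example in the particlePatches), but that is a guess and not something the log confirms.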

Verbose openPMD output

> cat stderr | grep OPEN_PATH | awk '{print $3}' | grep -v data | grep -v 0
fields/
E
picongpu_idProvider
Convolutional
Convolutional
Convolutional
Convolutional
Convolutional
Convolutional
Convolutional
Convolutional
Convolutional
Convolutional
B
Convolutional
Convolutional
particles/
e
positionOffset
position
particlePatches
offset
extent
momentum

Additional unexpected behavior

If there is no checkpoints.txt containing 0, the checkpoint is not found, even though I explicitly specified restarting from this timestep.

Checkpoint files

I think I have given you read permissions on rosi @franzpoeschel (otherwise please complain). The files are too large to upload...

Empty checkpoint: /bigdata/hplsim/scratch/carste06/runs/2026_Dev/dev_lwfa_v2_cp
Filled checkpoint: /bigdata/hplsim/scratch/carste06/runs/2026_Dev/dev_lwfa_v2_full
Simulation that tried to restart with error message: /bigdata/hplsim/scratch/carste06/runs/2026_Dev/dev_lwfa_v2_cpr
