Hello,
First of all, thank you for your excellent work on MonoDiff9D and for making the code publicly available! I am a graduate student researching 6D pose estimation and have been trying to reproduce the results from your paper.
I've encountered a puzzling situation that I hope you could help clarify.
The Core Issue:
Your provided pretrained checkpoint (epoch_210.pth) works perfectly. When I run the evaluation script (test.py) with your checkpoint, I can successfully reproduce the high-quality results reported in your paper for the REAL275 dataset.
However, when I train the model from scratch using the exact same codebase and configuration, my trained models consistently fail to match the reported translation accuracy. The rotation metric (10°) comes out nearly on par with your results, but all translation-related metrics are significantly lower. This happens whether I train for 210 epochs or for 300+.
Here are the results from my self-trained model, for comparison with your paper's Table I (REAL275):
- 3D IoU@50: 21.5
- 3D IoU@75: 3.2
- 10 cm: 28.8
- 10°: 59.0
- 10°, 10 cm: 15.8
Verified Steps:
Environment: My setup is built precisely from your environment.yaml file. Since your checkpoint runs correctly, my environment and evaluation pipeline should be correct.
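For completeness, this is how I recreated the environment (I assume a conda-style environment.yaml; the environment name below is a placeholder, not necessarily the one declared in your file):

```shell
# Recreate the environment exactly from the repo's spec
conda env create -f environment.yaml
# Activate it -- "monodiff9d" is a placeholder; use the name declared in the yaml
conda activate monodiff9d
```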
Data Integrity: I generated the depth maps using the prescribed DINOv2-NYU head. I performed a rigorous numerical comparison between my generated .npy files and the samples you provided. The results confirmed they are highly consistent (correlation > 0.9999, max absolute difference ~1e-2).
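In case it is useful, this is roughly the check I ran on each pair of depth maps; the helper `compare_depth_maps` is my own sketch, not something from your codebase:

```python
import numpy as np

def compare_depth_maps(mine_path, ref_path):
    """Compare two depth maps stored as .npy files.

    Returns the Pearson correlation and the maximum absolute
    difference -- the two statistics quoted above.
    """
    mine = np.load(mine_path).astype(np.float64).ravel()
    ref = np.load(ref_path).astype(np.float64).ravel()
    corr = np.corrcoef(mine, ref)[0, 1]
    max_abs_diff = float(np.max(np.abs(mine - ref)))
    return corr, max_abs_diff
```

Running this over my generated files against your provided samples gave correlation > 0.9999 and a max absolute difference on the order of 1e-2 in every case.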
My Question:
Could you please advise if there are any crucial details about the training procedure that might differ from the public code? For instance, specific learning rate schedules, weight initializations, or other hyperparameters that were used to train the successful epoch_210.pth model?
Any insight you could provide would be incredibly helpful in understanding this gap.
Thank you again for your time and for this fantastic contribution to the community!