Training process fails to reproduce translation accuracy, while provided checkpoint works perfectly #4

@DHW74

Hello,

First of all, thank you for your excellent work on MonoDiff9D and for making the code publicly available! I am a graduate student researching 6D pose estimation and have been trying to reproduce the results from your paper.

I've encountered a puzzling situation that I hope you could help clarify.

The Core Issue:

Your provided pretrained checkpoint (epoch_210.pth) works perfectly. When I run the evaluation script (test.py) with your checkpoint, I can successfully reproduce the high-quality results reported in your paper for the REAL275 dataset.

However, when I train the model from scratch using the exact same codebase and configuration, my trained models consistently fail to match the translation accuracy. The rotation accuracy (10°) is nearly perfect, but all translation-related metrics are significantly lower. This happens whether I train for 210 epochs or 300+ epochs.

Here is a comparison between the results from my self-trained model and your paper's results (Table I, REAL275):

  • 3D IoU at 50: 21.5
  • 3D IoU at 75: 3.2
  • 10 cm: 28.8
  • 10 degree: 59.0
  • 10 degree, 10 cm: 15.8
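
For readers less familiar with these metrics, the per-instance check behind a "10 degree, 10 cm" style threshold can be sketched roughly as below. This is a simplified illustration, not the repository's evaluation code: the function names `pose_errors` and `within_threshold` are hypothetical, the translation is assumed to already be in centimetres, and the handling of symmetric object categories and the averaging over detections that a full REAL275 evaluation involves are omitted.

```python
import numpy as np


def pose_errors(R_pred, t_pred, R_gt, t_gt):
    """Return (rotation error in degrees, translation error in the units of t).

    Hypothetical helper: assumes 3x3 rotation matrices and 3-vectors.
    """
    # Relative rotation between prediction and ground truth.
    R_rel = R_pred @ R_gt.T
    # Geodesic angle from the trace; clip guards against rounding outside [-1, 1].
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = float(np.degrees(np.arccos(cos_theta)))
    # Euclidean distance between the predicted and true translations.
    trans_err = float(np.linalg.norm(np.asarray(t_pred) - np.asarray(t_gt)))
    return rot_err_deg, trans_err


def within_threshold(rot_err_deg, trans_err_cm, deg_thresh=10.0, cm_thresh=10.0):
    """True if both errors are inside the thresholds (assumed to be in deg and cm)."""
    return rot_err_deg <= deg_thresh and trans_err_cm <= cm_thresh
```

Because the two errors are computed independently, a model can score well on the rotation-only threshold while failing every threshold that involves translation, which is the pattern described in this issue.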

Verified Steps:

  • Environment: My setup is built precisely from your environment.yaml file. Since your checkpoint runs correctly, my environment and evaluation pipeline should be correct.

  • Data Integrity: I generated the depth maps using the prescribed DINOv2-NYU head. I performed a rigorous numerical comparison between my generated .npy files and the samples you provided. The results confirmed they are highly consistent (correlation > 0.9999, max absolute difference ~1e-2).
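
For reference, the kind of numerical comparison described in the Data Integrity step can be sketched as follows. This is a minimal sketch rather than the exact script used: the function name `compare_depth_maps` and the file paths in the commented usage are hypothetical, and both arrays are assumed to have the same shape.

```python
import numpy as np


def compare_depth_maps(a: np.ndarray, b: np.ndarray) -> dict:
    """Return the Pearson correlation and max absolute difference of two depth maps."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Work in float64 so the statistics are not limited by float16/float32 storage.
    a64 = a.astype(np.float64).ravel()
    b64 = b.astype(np.float64).ravel()
    corr = float(np.corrcoef(a64, b64)[0, 1])
    max_abs_diff = float(np.max(np.abs(a64 - b64)))
    return {"correlation": corr, "max_abs_diff": max_abs_diff}


# Hypothetical usage; the paths below are placeholders, not files from the repository.
# mine = np.load("my_depth/0000.npy")
# ref = np.load("provided_depth/0000.npy")
# print(compare_depth_maps(mine, ref))
```

Correlation is unaffected by a constant scale or offset between the two maps, so the maximum absolute difference is the number that would reveal a depth-scale mismatch, which is the kind of error most likely to affect translation.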

My Question:

Could you please advise if there are any crucial details about the training procedure that might differ from the public code? For instance, specific learning rate schedules, weight initializations, or other hyperparameters that were used to train the successful epoch_210.pth model?

Any insight you could provide would be incredibly helpful in understanding this gap.

Thank you again for your time and for this fantastic contribution to the community!
