Use of F1 score in demos #4

@salocinc

Hi, I'm working with the MMUSED-fallacy dataset and trying to replicate the results reported in the MAMKit paper for this dataset using the demo files. I noticed that the F1 scores calculated and stored by the demos are micro-averaged. I tried changing this to macro averaging using
val_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6, average='macro')}), test_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6, average='macro')}),
instead of
val_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6)}), test_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6)}),
and I'm getting results that are quite different from those reported in the paper.
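
To illustrate why the two averaging modes can diverge, here is a minimal sketch in plain Python (not torchmetrics). The helper name `f1_micro_macro` and the toy labels are hypothetical, chosen only for illustration; they are not taken from the demo files or the dataset. Library implementations may also treat classes with no samples differently than this sketch does.

```python
from collections import Counter


def f1_micro_macro(y_true, y_pred, num_classes):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    # Macro: compute F1 per class, then take the unweighted mean.
    per_class = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumption: a class with no true or predicted samples scores 0.
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / num_classes

    # Micro: pool the counts over all classes, then compute one F1.
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    micro_denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return micro, macro


# Hypothetical imbalanced toy data: the minority class is never predicted.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
micro, macro = f1_micro_macro(y_true, y_pred, num_classes=2)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

In the single-label multiclass case micro F1 equals accuracy, so it is dominated by the frequent classes, while macro F1 weights every class equally. On imbalanced data the two numbers can therefore be far apart.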

Were the results presented in the MAMKit paper obtained from the demo files in this repository?

Thank you in advance.
