Hi, I'm working with the MMUSED-fallacy dataset and trying to replicate the results reported in the MAMKit paper for this dataset using the demo files. I noticed that the F1 scores computed and stored are micro-averaged. I tried changing this to macro averaging using
val_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6, average='macro')}), test_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6, average='macro')}),
instead of
val_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6)}), test_metrics=MetricCollection({'f1': F1Score(task='multiclass', num_classes=6)}),
and I'm getting results quite different from those reported in the paper.
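To show why the two averages can diverge this much, here is a minimal pure-Python sketch that computes both by hand. The helper `f1_scores` and the toy labels are made up for illustration. They are not MAMKit code or MMUSED-fallacy data, and library implementations may treat classes with no samples differently.

```python
from collections import Counter


def f1_scores(y_true, y_pred, num_classes):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions.

    Hypothetical helper for illustration only; not part of MAMKit or torchmetrics.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    # Micro: pool the counts over all classes, then compute a single F1.
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / denom if denom else 0.0

    # Macro: compute F1 per class, then take the unweighted mean.
    # Assumption: a class with no true or predicted samples scores 0.0.
    per_class = []
    for c in range(num_classes):
        d = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / d if d else 0.0)
    macro = sum(per_class) / num_classes

    return micro, macro


# Toy, made-up labels: class 0 dominates and the model always predicts it.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10
micro, macro = f1_scores(y_true, y_pred, num_classes=3)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

On imbalanced data like this, micro F1 follows the majority class while macro F1 is pulled down by the minority classes, so a gap between the two is expected in itself.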
Were the results presented in the MAMKit paper obtained with the demo files in this repository?
Thank you in advance.