modules.metrics

The metrics module contains implementations of metrics commonly used to measure how well models are performing, for example accuracy, vqa_accuracy, and r@1.

To implement your own metric, follow these steps:

1. Create your own metric class that inherits from the BaseMetric class.
2. In the __init__ function of your class, call super().__init__('name'), where 'name' is the name of your metric. If your __init__ function requires parameters, declare them as keyword arguments; the metric constructor will pass them to your class from the config.
3. Implement a calculate function that takes a SampleList and model_output as input and returns a float tensor or number.
4. Register your metric under the key 'name' with the decorator @registry.register_metric('name').

Example:

import torch

from pythia.common.registry import registry
from pythia.modules.metrics import BaseMetric

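# Register the metric under the key "some" so configs can refer to it.
@registry.register_metric("some")
class SomeMetric(BaseMetric):
    def __init__(self, some_param=None):
        super().__init__("some")
        # Keep any parameters passed in from the config (illustrative).
        self.some_param = some_param

    def calculate(self, sample_list, model_output):
        # Placeholder value; a real metric would compare model_output
        # against the targets in sample_list.
        metric = torch.tensor(2, dtype=torch.float)
        return metric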

Example config for above metric:

model_attributes:
    pythia:
        metrics:
        - type: some
          params:
            some_param: a
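
To illustrate how the keys under params could end up as keyword arguments of the metric's __init__, here is a minimal, self-contained sketch. It does not use Pythia: METRIC_REGISTRY, register_metric, and build_metric are hypothetical stand-ins for the registry and the config-driven metric constructor, and the real implementation may differ.

```python
# Hypothetical stand-ins for Pythia's registry and metric constructor.
METRIC_REGISTRY = {}


def register_metric(name):
    """Return a decorator that stores the decorated class under `name`."""
    def wrap(cls):
        METRIC_REGISTRY[name] = cls
        return cls
    return wrap


@register_metric("some")
class SomeMetric:
    def __init__(self, some_param=None):
        self.name = "some"
        self.some_param = some_param


def build_metric(entry):
    """Instantiate a metric from one entry of the `metrics` list."""
    metric_cls = METRIC_REGISTRY[entry["type"]]
    # `params` is optional; its keys become keyword arguments.
    return metric_cls(**entry.get("params", {}))


# The config entry above, written as plain Python data.
metric = build_metric({"type": "some", "params": {"some_param": "a"}})
print(metric.name, metric.some_param)
```

An entry without params falls back to the defaults declared in __init__, which is why keyword arguments with default values are a convenient way to declare metric parameters.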

class pythia.modules.metrics.Accuracy

Metric for calculating accuracy.

Key: accuracy

calculate(sample_list, model_output, *args, **kwargs)

Calculate accuracy and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	accuracy.
Return type:	torch.FloatTensor
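
As a rough illustration of what an accuracy metric computes, the sketch below uses plain Python lists instead of tensors. It assumes the model output holds one score per class and the targets are class indices; how the real class reads these from sample_list and model_output is not shown here.

```python
def accuracy(scores, targets):
    """Fraction of samples whose highest-scoring class matches the target."""
    correct = 0
    for row, target in zip(scores, targets):
        # Index of the largest score in this row (the predicted class).
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == target)
    return correct / len(targets)


# Three samples, two classes; the third prediction is wrong.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
targets = [1, 0, 0]
print(accuracy(scores, targets))
```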

class pythia.modules.metrics.BaseMetric(name, *args, **kwargs)

Base class to be inherited by all metrics registered to Pythia. See the description at the top of this page for more information. Child classes must implement the calculate function.

Parameters:	name (str) – Name of the metric.

calculate(sample_list, model_output, *args, **kwargs)

Abstract method to be implemented by the child class. Takes a SampleList and the dict returned by the model and returns a float tensor or number indicating the value of this metric.

Parameters:	sample_list (SampleList) – SampleList provided by the dataloader for the current iteration.
	model_output (Dict) – Output dict from the model for the current SampleList.
Returns:	Value of the metric.
Return type:	torch.Tensor | float

class pythia.modules.metrics.CaptionBleu4Metric

Metric for calculating caption accuracy using BLEU4 Score.

Key: caption_bleu4

bleu_score = <module 'nltk.translate.bleu_score'>

calculate(sample_list, model_output, *args, **kwargs)

Calculate the BLEU4 score and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	bleu4 score.
Return type:	torch.FloatTensor

class pythia.modules.metrics.MeanRank

Calculate MeanRank, the average rank of the chosen candidate.

Key: mean_r.

calculate(sample_list, model_output, *args, **kwargs)

Calculate Mean Rank and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	mean rank
Return type:	torch.FloatTensor
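
The following sketch shows one way to obtain a mean rank from candidate scores, using plain Python lists. It assumes each sample has one score per candidate and the index of the chosen candidate is known; rank_of and mean_rank are hypothetical helper names, not part of Pythia.

```python
def rank_of(scores, chosen_index):
    """1-based rank of the chosen candidate when sorted by descending score."""
    chosen_score = scores[chosen_index]
    # Every candidate with a strictly higher score pushes the rank down by one.
    return 1 + sum(1 for s in scores if s > chosen_score)


def mean_rank(batch_scores, chosen_indices):
    """Average rank of the chosen candidate over all samples."""
    ranks = [rank_of(s, i) for s, i in zip(batch_scores, chosen_indices)]
    return sum(ranks) / len(ranks)


# The chosen candidates are ranked 1st, 2nd and 3rd respectively.
batch_scores = [[0.9, 0.1, 0.5], [0.2, 0.7, 0.4], [0.3, 0.6, 0.8]]
chosen = [0, 2, 0]
print(mean_rank(batch_scores, chosen))
```

A lower mean rank is better; a value of 1 would mean the chosen candidate was always scored highest.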

class pythia.modules.metrics.MeanReciprocalRank

Calculate the Mean Reciprocal Rank.

Key: mean_rr.

calculate(sample_list, model_output, *args, **kwargs)

Calculate Mean Reciprocal Rank and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	Mean Reciprocal Rank
Return type:	torch.FloatTensor
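
A small worked example may help here. The sketch below uses the standard definition of mean reciprocal rank, the average of 1/rank over all samples; that this matches what the class computes is an assumption, and the function name is hypothetical.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all samples (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Chosen candidates ranked 1st, 2nd and 4th: (1 + 0.5 + 0.25) / 3.
ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(ranks)
print(mrr)
```

Under this definition the value lies in (0, 1], and higher is better.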

class pythia.modules.metrics.Metrics(metric_list)

Used internally by Pythia, Metrics acts as a wrapper that handles calculation of the metrics specified by the model in the config. It initializes all of the metrics and, when called, runs calculate on each of them in turn and returns a dict with appropriately named keys. For example, a dict returned by the Metrics class: {'val/vqa_accuracy': 0.3, 'val/r@1': 0.8}

Parameters:	metric_list (List[ConfigNode]) – List of ConfigNodes where each ConfigNode specifies name and parameters of the metrics used.
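
The wrapper behaviour described above can be sketched in a few lines. This is not Pythia's implementation: ConstantMetric and MetricsSketch are hypothetical classes, and the split argument is an assumption made only to reproduce the 'val/' prefix seen in the example dict.

```python
class ConstantMetric:
    """Hypothetical metric that always returns a fixed value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def calculate(self, sample_list, model_output):
        return self.value


class MetricsSketch:
    """Runs every metric and prefixes each key with the dataset split."""

    def __init__(self, metrics):
        self.metrics = metrics

    def __call__(self, sample_list, model_output, split="val"):
        # One dict entry per metric, keyed as "<split>/<metric name>".
        return {
            f"{split}/{m.name}": m.calculate(sample_list, model_output)
            for m in self.metrics
        }


metrics = MetricsSketch(
    [ConstantMetric("vqa_accuracy", 0.3), ConstantMetric("r@1", 0.8)]
)
result = metrics(sample_list=None, model_output={})
print(result)
```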

class pythia.modules.metrics.RecallAt1

Calculate Recall@1, which specifies how often the chosen candidate was ranked first.

Key: r@1.

calculate(sample_list, model_output, *args, **kwargs)

Calculate Recall@1 and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	Recall@1
Return type:	torch.FloatTensor

class pythia.modules.metrics.RecallAt10

Calculate Recall@10, which specifies how often the chosen candidate was among the first 10 ranks.

Key: r@10.

calculate(sample_list, model_output, *args, **kwargs)

Calculate Recall@10 and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	Recall@10
Return type:	torch.FloatTensor

class pythia.modules.metrics.RecallAt5

Calculate Recall@5, which specifies how often the chosen candidate was among the first 5 ranks.

Key: r@5.

calculate(sample_list, model_output, *args, **kwargs)

Calculate Recall@5 and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	Recall@5
Return type:	torch.FloatTensor

class pythia.modules.metrics.RecallAtK(name='recall@k')

calculate(sample_list, model_output, k, *args, **kwargs)

Abstract method to be implemented by the child class. Takes a SampleList and the dict returned by the model and returns a float tensor or number indicating the value of this metric.

Parameters:	sample_list (SampleList) – SampleList provided by the dataloader for the current iteration.
	model_output (Dict) – Output dict from the model for the current SampleList.
Returns:	Value of the metric.
Return type:	torch.Tensor | float
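
Since calculate takes an extra k argument, a short sketch of recall at k may be useful. It assumes the 1-based rank of the chosen candidate is already known for each sample; recall_at_k is a hypothetical helper, not the class's actual method.

```python
def recall_at_k(ranks, k):
    """Fraction of samples whose chosen candidate has rank <= k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Four samples whose chosen candidates were ranked 1st, 3rd, 7th and 12th.
ranks = [1, 3, 7, 12]
for k in (1, 5, 10):
    print(k, recall_at_k(ranks, k))
```

The same list of ranks yields all three recall variants, which suggests why a shared base class parameterised by k is convenient.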

class pythia.modules.metrics.VQAAccuracy

Calculate VQA accuracy. More information is available in the VQA challenge evaluation documentation.

Key: vqa_accuracy.

calculate(sample_list, model_output, *args, **kwargs)

Calculate VQA accuracy and return it.

Parameters:	sample_list (SampleList) – SampleList provided by the DataLoader for the current iteration.
	model_output (Dict) – Dict returned by the model.
Returns:	VQA Accuracy
Return type:	torch.FloatTensor
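
In the VQA evaluation, an answer is scored as min(number of annotators who gave that answer / 3, 1), so targets are soft scores rather than a single correct class. The sketch below assumes the targets already hold such soft scores per candidate answer and that the prediction is the top-scoring answer; it uses plain lists and a hypothetical function name, and is not the class's actual code.

```python
def vqa_accuracy(scores, soft_targets):
    """Mean soft score of the top-scoring answer for each question."""
    total = 0.0
    for row, target_row in zip(scores, soft_targets):
        # Index of the answer the model scored highest.
        predicted = max(range(len(row)), key=row.__getitem__)
        total += target_row[predicted]
    return total / len(scores)


# Question 1: predicted answer has soft score 1.0.
# Question 2: predicted answer has soft score 0.3.
scores = [[0.2, 0.7, 0.1], [0.6, 0.3, 0.1]]
soft_targets = [[0.0, 1.0, 0.3], [0.3, 0.0, 1.0]]
print(vqa_accuracy(scores, soft_targets))
```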