tasks.processors

Processors exist in Pythia to keep data processing pipelines across datasets as similar as possible while allowing code reuse.

Processors also help maintain proper abstractions so that only what matters stays inside the dataset’s code. This keeps the dataset’s get_item logic clean and free of assumptions about data types. Because of their generic structure, processors can work on both images and text.

To create a new processor, follow these steps:

1. Inherit the BaseProcessor class.
2. Implement the __call__ function, which takes in a dict and returns a dict with the same keys preprocessed, as well as any extra keys that need to be returned.
3. Register the processor with the registry using @registry.register_processor('name'), where ‘name’ will be used to refer to your processor later.

In a processor’s config, you can use the preprocessor option to specify the kind of preprocessor you want applied in your dataset.

Let’s break down a processor config inside a dataset (VQA2.0) to understand the different moving parts.

Config:

task_attributes:
    vqa:
        datasets:
        - vqa2
        dataset_attributes:
            vqa2:
                processors:
                  text_processor:
                    type: vocab
                    params:
                      max_length: 14
                      vocab:
                        type: intersected
                        embedding_name: glove.6B.300d
                        vocab_file: vocabs/vocabulary_100k.txt
                  answer_processor:
                    type: vqa_answer
                    params:
                      num_answers: 10
                      vocab_file: vocabs/answers_vqa.txt
                      preprocessor:
                        type: simple_word
                        params: {}

BaseDataset will initialize the processors, and they will be available inside your dataset under the same attribute name as the key name; for example, text_processor will be available as self.text_processor inside your dataset. As with every module in Pythia, a processor also accepts a ConfigNode with type and params attributes. params defines the custom parameters for each processor. By default, the processor initialization process will also initialize the preprocessor attribute, which can itself be a processor config. The preprocessor can then be accessed inside the processor’s functions.

Example:

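from pythia.common.registry import registry
from pythia.tasks.processors import BaseProcessor


# 'my_processor' is an illustrative name; use it as `type` in the config.
@registry.register_processor('my_processor')
class MyProcessor(BaseProcessor):
    def __init__(self, config, *args, **kwargs):
        # This processor needs no custom parameters.
        return

    def __call__(self, item, *args, **kwargs):
        # Split the text on single spaces and strip each token.
        text = item['text']
        text = [t.strip() for t in text.split(" ")]
        return {"text": text}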
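The initialization flow described above (processors built from `type` and `params`, attached to the dataset under their key name, with an optional nested preprocessor) can be sketched without Pythia installed. Everything in this snippet is a stand-in made up for illustration: the registry dict, `SketchProcessor`, `SketchDataset`, and the two processor types are assumptions, and plain dicts replace ConfigNode. It is not Pythia's actual implementation.

```python
# Stand-in registry mapping a processor "type" string to a class.
# It mimics the idea of registry.register_processor; it is not Pythia's code.
PROCESSOR_REGISTRY = {}


def register_processor(name):
    def wrap(cls):
        PROCESSOR_REGISTRY[name] = cls
        return cls
    return wrap


def build_processor(config):
    """Look up the class for config["type"] and build it from config["params"]."""
    cls = PROCESSOR_REGISTRY[config["type"]]
    return cls(config.get("params", {}))


class SketchProcessor:
    """Hypothetical base class that also builds an optional nested preprocessor."""

    def __init__(self, params):
        self.params = params
        self.preprocessor = None
        if "preprocessor" in params:
            self.preprocessor = build_processor(params["preprocessor"])


@register_processor("lowercase")
class LowercaseProcessor(SketchProcessor):
    def __call__(self, item):
        return {"text": item["text"].lower()}


@register_processor("whitespace_split")
class WhitespaceSplitProcessor(SketchProcessor):
    def __call__(self, item):
        # Run the nested preprocessor first, if one was configured.
        if self.preprocessor is not None:
            item = self.preprocessor(item)
        tokens = item["text"].split()
        return {"text": tokens[: self.params.get("max_length", len(tokens))]}


class SketchDataset:
    """Attaches one attribute per key found under "processors"."""

    def __init__(self, processors_config):
        for name, config in processors_config.items():
            setattr(self, name, build_processor(config))


dataset = SketchDataset({
    "text_processor": {
        "type": "whitespace_split",
        "params": {
            "max_length": 3,
            "preprocessor": {"type": "lowercase", "params": {}},
        },
    }
})
result = dataset.text_processor({"text": "What Color is the Cat"})
print(result)  # {'text': ['what', 'color', 'is']}
```

The key name in the config (`text_processor`) becomes the attribute name on the dataset, and the nested `preprocessor` entry is itself an ordinary processor config.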

class pythia.tasks.processors.BBoxProcessor(config, *args, **kwargs)

Generates bboxes in the proper format. Takes in a dict containing an “info” key, which is a list of dicts containing the following for each bounding box:

Example bbox input:

{
    "info": [
        {
            "bounding_box": {
                "top_left_x": 100,
                "top_left_y": 100,
                "width": 200,
                "height": 300
            }
        },
        ...
    ]
}

This returns a dict with key “bbox” holding a Sample whose coordinates have a last dimension of 4, corresponding to “xyxy”. The Sample will look like the following:

Example Sample:

Sample({
    "coordinates": torch.Size(n, 4),
    "width": List[number], # size n
    "height": List[number], # size n
    "bbox_types": List[str] # size n, either xyxy or xywh.
    # currently only supports xyxy.
})
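The conversion from the input format (top-left corner plus width and height) to “xyxy” coordinates can be illustrated with plain Python lists. This is a minimal sketch under one assumption: the bottom-right corner is taken as top-left plus width/height, with no pixel-offset correction. The function name `xywh_to_xyxy` is made up for this example, and the real processor returns a Sample with tensor coordinates rather than a list.

```python
def xywh_to_xyxy(info):
    """Convert a list of {"bounding_box": {...}} dicts to [x1, y1, x2, y2] rows."""
    coordinates = []
    for entry in info:
        box = entry["bounding_box"]
        x1, y1 = box["top_left_x"], box["top_left_y"]
        # Assumption: x2/y2 are simply the top-left corner shifted by width/height.
        coordinates.append([x1, y1, x1 + box["width"], y1 + box["height"]])
    return coordinates


item = {
    "info": [
        {
            "bounding_box": {
                "top_left_x": 100,
                "top_left_y": 100,
                "width": 200,
                "height": 300,
            }
        }
    ]
}
coords = xywh_to_xyxy(item["info"])
print(coords)  # [[100, 100, 300, 400]]
```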

class pythia.tasks.processors.BaseProcessor(config, *args, **kwargs)

Every processor in Pythia needs to inherit this class for compatibility with Pythia. End users mainly need to implement the __call__ function.

Parameters:	config (ConfigNode) – Config for this processor, containing type and params attributes if available.

class pythia.tasks.processors.CaptionProcessor(config, *args, **kwargs)

Processes a caption with start, end and pad tokens and returns raw string.

Parameters:	config (ConfigNode) – Configuration for caption processor.
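The description above does not spell out the input format. One plausible reading is that the processor receives a sequence of token indices, cuts it at the first end token, drops start and pad tokens, and joins the remaining words into a string. The sketch below illustrates that reading only; the vocabulary, the special-token names and positions, and the function `decode_caption` are all assumptions and not Pythia's API.

```python
# Hypothetical vocabulary; the special-token names and positions are assumptions.
ITOS = ["<pad>", "<s>", "</s>", "<unk>", "a", "dog", "runs"]
PAD_INDEX, SOS_INDEX, EOS_INDEX = 0, 1, 2


def decode_caption(indices):
    """Cut at the first end token, drop start/pad tokens, and join the words."""
    if EOS_INDEX in indices:
        indices = indices[: indices.index(EOS_INDEX)]
    tokens = [ITOS[i] for i in indices if i not in (SOS_INDEX, PAD_INDEX)]
    return {"tokens": tokens, "caption": " ".join(tokens)}


decoded = decode_caption([1, 4, 5, 6, 2, 0, 0])
print(decoded["caption"])  # a dog runs
```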

class pythia.tasks.processors.FastTextProcessor(config, *args, **kwargs)

FastText processor, similar to GloVe processor but returns FastText vectors.

Parameters:	config (ConfigNode) – Configuration values for the processor.

class pythia.tasks.processors.GloVeProcessor(config, *args, **kwargs)

Inherits VocabProcessor and returns GloVe vectors for each word. Maps words to indices using the vocab processor, then gets the GloVe vectors corresponding to those indices.

Parameters:	config (ConfigNode) – Configuration parameters for GloVe same as `VocabProcessor()`.
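The two-stage lookup described above (word to index, then index to vector) can be sketched with a toy embedding table. The vocabulary, the 3-dimensional vectors, and the function `words_to_vectors` are invented for illustration; real GloVe vectors such as glove.6B.300d have 300 dimensions, and the fallback to `<unk>` for unseen words is an assumption about the behaviour.

```python
# Toy vocabulary and embedding table; all values are made up for illustration.
WORD_TO_INDEX = {"<pad>": 0, "<unk>": 1, "cat": 2, "dog": 3}
VECTORS = [
    [0.0, 0.0, 0.0],   # <pad>
    [0.1, 0.1, 0.1],   # <unk>
    [0.5, -0.2, 0.3],  # cat
    [0.4, -0.1, 0.2],  # dog
]


def words_to_vectors(words):
    """Map words to indices first, then look up the vector for each index."""
    unk_index = WORD_TO_INDEX["<unk>"]
    indices = [WORD_TO_INDEX.get(word, unk_index) for word in words]
    return [VECTORS[index] for index in indices]


vectors = words_to_vectors(["cat", "bird"])
print(vectors)  # [[0.5, -0.2, 0.3], [0.1, 0.1, 0.1]]
```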

class pythia.tasks.processors.MultiHotAnswerFromVocabProcessor(config, *args, **kwargs)

compute_answers_scores(answers_indices)

Generate VQA based answer scores for answers_indices.

Parameters:	answers_indices (torch.LongTensor) – tensor containing indices of the answers
Returns:	tensor containing scores.
Return type:	torch.FloatTensor

class pythia.tasks.processors.Processor(config, *args, **kwargs)

Wrapper class used by Pythia to initialize a processor based on its type as passed in the configuration. It retrieves the processor class registered in the registry under the type key and initializes it with the params passed in the configuration. All functions and attributes of the initialized processor are directly available via this class.

Parameters:	config (ConfigNode) – ConfigNode containing `type` of the processor to be initialized and `params` of that processor.
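A wrapper of this kind is commonly built with a registry lookup plus attribute delegation through `__getattr__`. The sketch below shows that pattern under stated assumptions: `REGISTRY`, `UppercaseProcessor`, and `ProcessorWrapper` are hypothetical names, plain dicts stand in for ConfigNode, and Pythia's own wrapper may differ in its details.

```python
REGISTRY = {}  # stand-in for Pythia's registry; the name is an assumption


class UppercaseProcessor:
    """Toy processor used only to demonstrate the wrapper."""

    def __init__(self, params):
        self.suffix = params.get("suffix", "")

    def __call__(self, item):
        return {"text": item["text"].upper() + self.suffix}


REGISTRY["uppercase"] = UppercaseProcessor


class ProcessorWrapper:
    """Builds the processor named by config["type"] and forwards calls to it."""

    def __init__(self, config):
        if "type" not in config:
            raise AttributeError("Config must have a 'type' key")
        processor_class = REGISTRY[config["type"]]
        self.processor = processor_class(config.get("params", {}))

    def __call__(self, item, *args, **kwargs):
        return self.processor(item, *args, **kwargs)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so wrapper attributes win.
        # The guard avoids infinite recursion if `processor` was never set.
        if name == "processor":
            raise AttributeError(name)
        return getattr(self.processor, name)


wrapper = ProcessorWrapper({"type": "uppercase", "params": {"suffix": "!"}})
output = wrapper({"text": "hi"})
print(output)          # {'text': 'HI!'}
print(wrapper.suffix)  # !
```

Delegating through `__getattr__` keeps the wrapper thin: callers can use attributes of the wrapped processor, such as `suffix` here, without the wrapper having to list them.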

class pythia.tasks.processors.SimpleSentenceProcessor(*args, **kwargs)

Tokenizes a sentence and processes it.

tokenizer

Type of tokenizer to be used.

Type:	function

class pythia.tasks.processors.SimpleWordProcessor(*args, **kwargs)

Tokenizes a word and processes it.

tokenizer

Type of tokenizer to be used.

Type:	function

class pythia.tasks.processors.SoftCopyAnswerProcessor(config, *args, **kwargs)

Similar to the answer processor, but adds a soft-copy dynamic answer space to it. Read https://arxiv.org/abs/1904.08920 for more information on soft copy and LoRRA.

Parameters:	config (ConfigNode) – Configuration for soft copy processor.

get_true_vocab_size()

Actual vocab size, which only includes the size of the vocabulary file.

Returns:	Actual size of vocabs.
Return type:	int

get_vocab_size()

Size of Vocab + Size of Dynamic soft-copy based answer space

Returns:	Size of vocab + size of dynamic soft-copy answer space.
Return type:	int
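The relationship between the two sizes can be made concrete with a small sketch. The class `SoftCopySizes` and the parameter name `context_length` (the assumed size of the dynamic soft-copy answer space) are hypothetical and chosen only for this illustration.

```python
class SoftCopySizes:
    """Hypothetical helper showing how the two vocab sizes relate."""

    def __init__(self, vocab_words, context_length):
        self.vocab_words = list(vocab_words)
        # Assumed name for the size of the dynamic soft-copy answer space.
        self.context_length = context_length

    def get_true_vocab_size(self):
        # Only the entries that come from the vocabulary file.
        return len(self.vocab_words)

    def get_vocab_size(self):
        # Vocabulary file entries plus the dynamic answer space.
        return self.get_true_vocab_size() + self.context_length


sizes = SoftCopySizes(["<unk>", "yes", "no"], context_length=50)
print(sizes.get_true_vocab_size())  # 3
print(sizes.get_vocab_size())       # 53
```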

class pythia.tasks.processors.VQAAnswerProcessor(config, *args, **kwargs)

Processor for generating answer scores for the answers passed in, using the VQA accuracy formula. It uses the VocabDict class to represent the answer vocabulary, so the parameters must specify “vocab_file”. “num_answers” in the parameter config specifies the maximum number of answers possible. Takes in a dict containing “answers” or “answers_tokens”. If “answers” are passed, they are preprocessed to generate “answers_tokens”.

Parameters:	config (ConfigNode) – Configuration for the processor

answer_vocab

Class representing answer vocabulary

Type:	VocabDict

compute_answers_scores(answers_indices)

Generate VQA based answer scores for answers_indices.

Parameters:	answers_indices (torch.LongTensor) – tensor containing indices of the answers
Returns:	tensor containing scores.
Return type:	torch.FloatTensor
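The VQA accuracy formula credits an answer with min(1, matches / 3), where matches is the number of human annotators who gave that answer, averaged over every leave-one-annotator-out subset. The sketch below applies that formula to a list of answer indices. It uses plain Python lists instead of torch tensors, and the function name, the `unk_index` default, and the choice to give unknown answers no credit are assumptions, so treat it as an illustration rather than the library's implementation.

```python
def compute_answer_scores(answer_indices, vocab_size, unk_index=0):
    """Leave-one-out VQA soft scores for a list of human answer indices."""
    scores = [0.0] * vocab_size
    indexed = list(enumerate(answer_indices))
    for answer in set(answer_indices):
        if answer == unk_index:
            continue  # assumption: unknown answers receive no credit
        accuracies = []
        for left_out in indexed:
            # Score against all annotators except the one left out.
            others = [pair for pair in indexed if pair != left_out]
            matches = sum(1 for _, idx in others if idx == answer)
            accuracies.append(min(1.0, matches / 3))
        scores[answer] = sum(accuracies) / len(accuracies)
    return scores


# Ten annotators: three said answer 2, two said answer 5, five said answer 7.
scores = compute_answer_scores([2, 2, 2, 5, 5, 7, 7, 7, 7, 7], vocab_size=8)
print([round(s, 2) for s in scores])
```

With these inputs, answer 7 gets full credit, answer 2 gets about 0.9, and answer 5 gets about 0.6, which shows how the formula softens the target instead of picking a single correct answer.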

get_true_vocab_size()

True vocab size can be different from normal vocab size in some cases such as soft copy where dynamic answer space is added.

Returns:	True vocab size.
Return type:	int

get_vocab_size()

Get vocab size of the answer vocabulary. Can also include soft copy dynamic answer space size.

Returns:	size of the answer vocabulary
Return type:	int

idx2word(idx)

Index to word according to the vocabulary.

Parameters:	idx (int) – Index to be converted to the word.
Returns:	Word corresponding to the index.
Return type:	str

word2idx(word)

Convert a word to its index according to the vocabulary.

Parameters:	word (str) – Word to be converted to index.
Returns:	Index of the word.
Return type:	int
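The two lookups are inverses of each other for words that are in the vocabulary. A minimal sketch follows, using a hypothetical `TinyVocab` class that is not Pythia's VocabDict; falling back to the `<unk>` index for unseen words is an assumption about the behaviour.

```python
class TinyVocab:
    """Hypothetical stand-in for an answer vocabulary; not Pythia's VocabDict."""

    def __init__(self, words, unk_token="<unk>"):
        self.words = list(words)
        self.index = {word: i for i, word in enumerate(self.words)}
        self.unk_index = self.index[unk_token]

    def word2idx(self, word):
        # Assumption: words missing from the vocabulary map to <unk>.
        return self.index.get(word, self.unk_index)

    def idx2word(self, idx):
        return self.words[idx]


vocab = TinyVocab(["<unk>", "yes", "no"])
print(vocab.word2idx("no"))     # 2
print(vocab.word2idx("maybe"))  # 0
print(vocab.idx2word(1))        # yes
```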

class pythia.tasks.processors.VocabProcessor(config, *args, **kwargs)

Use VocabProcessor when you have a vocab file and want to convert words to indices. Expects the UNK token to be “<unk>” and pads sentences using the “<pad>” token. Config parameters can have a preprocessor property, which is used to preprocess the item passed, and a max_length property, which sets the maximum length of the sentence/tokens that can be converted to indices. If the sentence is shorter, it will be padded. The “vocab” parameters must be passed.

Key: vocab

Example Config:

task_attributes:
    vqa:
        dataset_attributes:
            vqa2:
                processors:
                  text_processor:
                    type: vocab
                    params:
                      max_length: 14
                      vocab:
                        type: intersected
                        embedding_name: glove.6B.300d
                        vocab_file: vocabs/vocabulary_100k.txt

Parameters:	config (ConfigNode) – node containing configuration parameters of the processor
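The effect of max_length together with the `<unk>` and `<pad>` tokens can be sketched as follows. The function `tokens_to_indices` and the toy vocabulary are made up for illustration, and truncating sentences longer than max_length is an assumption, since the description above only covers the padding case.

```python
PAD_TOKEN, UNK_TOKEN = "<pad>", "<unk>"


def tokens_to_indices(tokens, word_to_index, max_length):
    """Map tokens to indices, using <unk> for unseen words and <pad> as filler."""
    pad_index = word_to_index[PAD_TOKEN]
    unk_index = word_to_index[UNK_TOKEN]
    # Assumption: tokens beyond max_length are dropped.
    indices = [word_to_index.get(t, unk_index) for t in tokens[:max_length]]
    # Pad shorter sentences up to max_length.
    indices += [pad_index] * (max_length - len(indices))
    return indices


word_to_index = {"<pad>": 0, "<unk>": 1, "what": 2, "color": 3, "is": 4}
tokens = ["what", "color", "is", "it"]
padded = tokens_to_indices(tokens, word_to_index, max_length=6)
truncated = tokens_to_indices(tokens, word_to_index, max_length=2)
print(padded)     # [2, 3, 4, 1, 0, 0]
print(truncated)  # [2, 3]
```

Fixing the output length this way lets samples from different questions be stacked into a single batch.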

vocab

Vocab class object which is abstraction over the vocab file passed.

Type:	Vocab

get_pad_index()

Get index of padding <pad> token in vocabulary.

Returns:	index of the padding token.
Return type:	int

get_vocab_size()

Get size of the vocabulary.

Returns:	size of the vocabulary.
Return type:	int