tasks.processors¶
Processors exist in Pythia to make data processing pipelines for different datasets as similar as possible while allowing code reuse. They also help maintain proper abstractions so that only what matters lives inside a dataset's code. This keeps the dataset's get_item logic clean: the dataset does not need to hold opinions about data types. Thanks to their generic structure, processors work on both images and text.
To create a new processor, follow these steps:
- Inherit the BaseProcessor class.
- Implement the __call__ function, which takes in a dict and returns a dict with the same keys preprocessed, as well as any extra keys that need to be returned.
- Register the processor in the registry using @registry.register_processor('name'), where 'name' will be used to refer to your processor later.
In a processor's config, you can use the preprocessor option to specify which preprocessor you want in your dataset.
Let's break down a processor's config inside a dataset (VQA 2.0) to understand the different moving parts.
Config:

```yaml
task_attributes:
  vqa:
    datasets:
    - vqa2
    dataset_attributes:
      vqa2:
        processors:
          text_processor:
            type: vocab
            params:
              max_length: 14
              vocab:
                type: intersected
                embedding_name: glove.6B.300d
                vocab_file: vocabs/vocabulary_100k.txt
          answer_processor:
            type: vqa_answer
            params:
              num_answers: 10
              vocab_file: vocabs/answers_vqa.txt
              preprocessor:
                type: simple_word
                params: {}
```
BaseDataset will init the processors, and they will be available inside your dataset under the same attribute name as the key name; for example, text_processor will be available as self.text_processor inside your dataset. As with every module in Pythia, a processor also accepts a ConfigNode with type and params attributes. params defines the custom parameters for each processor. By default, the processor initialization process will also init a preprocessor attribute, which can be a processor config in itself. The preprocessor can then be accessed inside the processor's functions.
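The attachment of processors as dataset attributes can be sketched standalone. This is a minimal sketch, not Pythia's actual BaseDataset code; FakeVocabProcessor, PROCESSOR_REGISTRY, and SketchDataset are hypothetical stand-ins:

```python
# Sketch of how each key under a "processors" config becomes a dataset
# attribute (self.<key>). Not Pythia's actual implementation.

class FakeVocabProcessor:
    """Hypothetical stand-in for a registered "vocab" processor."""
    def __init__(self, params):
        self.max_length = params.get("max_length", 14)

    def __call__(self, item):
        # Tokenize on whitespace and truncate to max_length.
        tokens = item["text"].split()
        return {"text": tokens[: self.max_length]}

# Stand-in for Pythia's registry mapping "type" names to classes.
PROCESSOR_REGISTRY = {"vocab": FakeVocabProcessor}

class SketchDataset:
    def __init__(self, processors_config):
        # Mirror BaseDataset behavior: each processor is attached under
        # the same attribute name as its config key.
        for key, cfg in processors_config.items():
            cls = PROCESSOR_REGISTRY[cfg["type"]]
            setattr(self, key, cls(cfg.get("params", {})))

ds = SketchDataset({"text_processor": {"type": "vocab",
                                       "params": {"max_length": 3}}})
print(ds.text_processor({"text": "what color is the cat"})["text"])
# ['what', 'color', 'is']
```

Inside get_item, the dataset can then simply call self.text_processor on the raw item.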
Example:

```python
from pythia.common.registry import registry
from pythia.tasks.processors import BaseProcessor

@registry.register_processor("my_processor")
class MyProcessor(BaseProcessor):
    def __init__(self, config, *args, **kwargs):
        return

    def __call__(self, item, *args, **kwargs):
        text = item["text"]
        text = [t.strip() for t in text.split(" ")]
        return {"text": text}
```
-
class
pythia.tasks.processors.
BBoxProcessor
(config, *args, **kwargs)[source]¶ Generates bboxes in proper format. Takes in a dict containing an "info" key, which is a list of dicts with the following for each bounding box:
Example bbox input:
```json
{
    "info": [
        {
            "bounding_box": {
                "top_left_x": 100,
                "top_left_y": 100,
                "width": 200,
                "height": 300
            }
        },
        ...
    ]
}
```
This will further return a Sample in a dict with key "bbox", whose last dimension is 4, corresponding to "xyxy". The sample will look like the following:
Example Sample:
```python
Sample({
    "coordinates": torch.Size(n, 4),
    "width": List[number],   # size n
    "height": List[number],  # size n
    "bbox_types": List[str]  # size n, either xyxy or xywh
                             # (currently only xyxy is supported)
})
```
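The xywh-to-xyxy conversion this processor performs can be sketched in a few lines. This is a standalone sketch, not Pythia's actual code; xywh_to_xyxy is a hypothetical helper:

```python
# Sketch of converting a top-left + width/height box ("xywh") into
# corner coordinates ("xyxy"). Not Pythia's actual implementation.

def xywh_to_xyxy(bounding_box):
    """Return [x1, y1, x2, y2] corner coordinates."""
    x1 = bounding_box["top_left_x"]
    y1 = bounding_box["top_left_y"]
    x2 = x1 + bounding_box["width"]
    y2 = y1 + bounding_box["height"]
    return [x1, y1, x2, y2]

info = [{"bounding_box": {"top_left_x": 100, "top_left_y": 100,
                          "width": 200, "height": 300}}]
coordinates = [xywh_to_xyxy(entry["bounding_box"]) for entry in info]
print(coordinates)  # [[100, 100, 300, 400]]
```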
-
class
pythia.tasks.processors.
BaseProcessor
(config, *args, **kwargs)[source]¶ Every processor in Pythia needs to inherit this class for compatibility with Pythia. The end user mainly needs to implement the __call__ function.
Parameters: config (ConfigNode) – Config for this processor, containing type and params attributes if available.
-
class
pythia.tasks.processors.
CaptionProcessor
(config, *args, **kwargs)[source]¶ Processes a caption with start, end and pad tokens and returns raw string.
Parameters: config (ConfigNode) – Configuration for caption processor.
-
class
pythia.tasks.processors.
FastTextProcessor
(config, *args, **kwargs)[source]¶ FastText processor, similar to GloVe processor but returns FastText vectors.
Parameters: config (ConfigNode) – Configuration values for the processor.
-
class
pythia.tasks.processors.
GloVeProcessor
(config, *args, **kwargs)[source]¶ Inherits VocabProcessor, and returns GloVe vectors for each of the words. Maps them to index using vocab processor, and then gets GloVe vectors corresponding to those indices.
Parameters: config (ConfigNode) – Configuration parameters for GloVe, same as VocabProcessor().
-
class
pythia.tasks.processors.
Processor
(config, *args, **kwargs)[source]¶ Wrapper class used by Pythia to initialize processors based on their type as passed in configuration. It retrieves the processor class registered in the registry corresponding to the type key and initializes it with the params passed in configuration. All functions and attributes of the initialized processor are directly available via this class.
Parameters: config (ConfigNode) – ConfigNode containing the type of the processor to be initialized and the params of that processor.
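The wrapper pattern described above can be sketched standalone. This is a minimal sketch under assumptions, not Pythia's actual Processor class; ProcessorWrapperSketch, UpperCaseProcessor, and toy_registry are all hypothetical:

```python
# Sketch of a wrapper that looks up a processor class by its config "type",
# instantiates it with "params", and forwards attribute access to it.

class ProcessorWrapperSketch:
    def __init__(self, config, registry):
        processor_cls = registry[config["type"]]
        self.processor = processor_cls(config.get("params", {}))

    def __call__(self, item, *args, **kwargs):
        # Calling the wrapper calls the wrapped processor.
        return self.processor(item, *args, **kwargs)

    def __getattr__(self, name):
        # Any attribute not found on the wrapper is fetched from
        # the wrapped processor instance.
        return getattr(self.processor, name)

class UpperCaseProcessor:
    """Hypothetical toy processor."""
    def __init__(self, params):
        self.suffix = params.get("suffix", "")

    def __call__(self, item):
        return {"text": item["text"].upper() + self.suffix}

toy_registry = {"upper": UpperCaseProcessor}
wrapper = ProcessorWrapperSketch({"type": "upper",
                                  "params": {"suffix": "!"}}, toy_registry)
print(wrapper({"text": "hi"}))  # {'text': 'HI!'}
print(wrapper.suffix)           # '!' (forwarded to the wrapped processor)
```

The __getattr__ delegation is what makes "all functions and attributes of the processor directly available" through the wrapper.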
-
class
pythia.tasks.processors.
SimpleSentenceProcessor
(*args, **kwargs)[source]¶ Tokenizes a sentence and processes it.
-
tokenizer
¶ Type of tokenizer to be used.
Type: function
-
-
class
pythia.tasks.processors.
SimpleWordProcessor
(*args, **kwargs)[source]¶ Tokenizes a word and processes it.
-
tokenizer
¶ Type of tokenizer to be used.
Type: function
-
-
class
pythia.tasks.processors.
SoftCopyAnswerProcessor
(config, *args, **kwargs)[source]¶ Similar to VQAAnswerProcessor but adds a soft copy dynamic answer space to it. Read https://arxiv.org/abs/1904.08920 for more information on soft copy and LoRRA.
Parameters: config (ConfigNode) – Configuration for soft copy processor.
-
class
pythia.tasks.processors.
VQAAnswerProcessor
(config, *args, **kwargs)[source]¶ Processor for generating answer scores for the answers passed, using the VQA accuracy formula. Uses the VocabDict class to represent the answer vocabulary, so parameters must specify "vocab_file". "num_answers" in the parameter config specifies the maximum number of answers possible. Takes in a dict containing "answers" or "answers_tokens". "answers" are preprocessed to generate "answers_tokens" if passed.
Parameters: config (ConfigNode) – Configuration for the processor -
answer_vocab
¶ Class representing answer vocabulary
Type: VocabDict
-
compute_answers_scores
(answers_indices)[source]¶ Generate VQA-based answer scores for answers_indices.
Parameters: answers_indices (torch.LongTensor) – tensor containing indices of the answers Returns: tensor containing scores. Return type: torch.FloatTensor
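The scoring rule behind this can be sketched with plain Python. This is a simplified form of the VQA accuracy formula, min(count of matching human answers / 3, 1); the full metric additionally averages over subsets of the 10 annotator answers. vqa_answer_scores is a hypothetical helper, not Pythia's actual method:

```python
from collections import Counter

def vqa_answer_scores(human_answers):
    """Simplified VQA accuracy: an answer scores min(count / 3, 1)."""
    counts = Counter(human_answers)
    return {answer: min(n / 3.0, 1.0) for answer, n in counts.items()}

scores = vqa_answer_scores(["cat"] * 6 + ["dog"] * 3 + ["bird"])
print(scores)  # cat -> 1.0, dog -> 1.0, bird -> ~0.33
```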
-
get_true_vocab_size
()[source]¶ True vocab size can be different from the normal vocab size in some cases, such as soft copy, where a dynamic answer space is added.
Returns: True vocab size. Return type: int
-
get_vocab_size
()[source]¶ Get vocab size of the answer vocabulary. Can also include soft copy dynamic answer space size.
Returns: size of the answer vocabulary Return type: int
-
-
class
pythia.tasks.processors.
VocabProcessor
(config, *args, **kwargs)[source]¶ Use VocabProcessor when you have a vocab file and want to process words to indices. Expects the UNK token as "<unk>" and pads sentences using the "<pad>" token. Config parameters can have a preprocessor property, which is used to preprocess the item passed, and a max_length property, which gives the maximum length of the sentence/tokens that can be converted to indices. If the length is smaller, the sentence will be padded. Parameters for "vocab" must be passed.
Key: vocab
Example Config:
```yaml
task_attributes:
  vqa:
    vqa2:
      processors:
        text_processor:
          type: vocab
          params:
            max_length: 14
            vocab:
              type: intersected
              embedding_name: glove.6B.300d
              vocab_file: vocabs/vocabulary_100k.txt
```
Parameters: config (ConfigNode) – node containing configuration parameters of the processor -
vocab
¶ Vocab class object which is abstraction over the vocab file passed.
Type: Vocab
-
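The lookup-pad-unk behavior described above can be sketched standalone. This is a hypothetical VocabSketch class under assumptions, not Pythia's actual VocabProcessor or Vocab implementation:

```python
# Sketch of word-to-index processing with "<unk>" fallback and "<pad>"
# padding up to max_length. Not Pythia's actual code.

class VocabSketch:
    def __init__(self, words, max_length):
        # Index 0 is reserved for "<pad>", index 1 for "<unk>".
        self.itos = ["<pad>", "<unk>"] + list(words)
        self.stoi = {w: i for i, w in enumerate(self.itos)}
        self.max_length = max_length

    def __call__(self, item):
        tokens = item["text"].lower().split()[: self.max_length]
        indices = [self.stoi.get(t, self.stoi["<unk>"]) for t in tokens]
        # Pad short sentences up to max_length.
        indices += [self.stoi["<pad>"]] * (self.max_length - len(indices))
        return {"text": indices}

proc = VocabSketch(["what", "is", "this"], max_length=5)
print(proc({"text": "What is that"})["text"])  # [2, 3, 1, 0, 0]
```

Here "that" is out of vocabulary, so it maps to the "<unk>" index, and the sentence is padded to max_length with the "<pad>" index.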