[TOC]
PyTorch Dataset and DataLoader
check the video
collate_fn(batch)
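collate_fn determines how the individual samples fetched from the Dataset are merged into one batch. The DataLoader calls it with a list of samples and yields whatever it returns.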
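As an illustration, here is a minimal sketch of a custom collate_fn that pads variable-length sequences to a common length. The ToySequences dataset, its contents, and the pad_collate name are hypothetical and exist only for this example; the padding value 0 is likewise an assumption.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToySequences(Dataset):
    """Hypothetical dataset of (variable-length sequence, label) pairs."""

    def __init__(self):
        self.samples = [
            (torch.tensor([1, 2, 3]), 0),
            (torch.tensor([4, 5]), 1),
            (torch.tensor([6]), 0),
            (torch.tensor([7, 8, 9, 10]), 1),
        ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


def pad_collate(batch):
    # `batch` is a list of whatever __getitem__ returns, here (sequence, label).
    sequences, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in sequences])
    # Pad every sequence in this batch to the length of the longest one.
    padded = pad_sequence(list(sequences), batch_first=True, padding_value=0)
    return padded, lengths, torch.tensor(labels)


loader = DataLoader(ToySequences(), batch_size=2, shuffle=False,
                    collate_fn=pad_collate)
batches = list(loader)
```

Each element of batches is the tuple returned by pad_collate, so the padded width can differ from batch to batch. Without a custom collate_fn, the default collation would fail here because the sequences cannot be stacked.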
ConcatDataset (list(Dataset))
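ConcatDataset is useful when the data is stored in several files: build one Dataset per file and pass the list to ConcatDataset, which exposes them as a single dataset. It only maps each index to the corresponding underlying Dataset, so the data itself is not copied.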
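A minimal sketch of this, assuming two small in-memory TensorDataset objects stand in for the per-file datasets (part_a and part_b are hypothetical names):

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

# Stand-ins for datasets that would normally be loaded from separate files.
part_a = TensorDataset(torch.arange(0, 3))    # samples 0, 1, 2
part_b = TensorDataset(torch.arange(10, 15))  # samples 10, 11, 12, 13, 14

combined = ConcatDataset([part_a, part_b])

total = len(combined)           # sum of the individual lengths
first_of_a = combined[0][0]     # index 0 falls in part_a
first_of_b = combined[3][0]     # index 3 is the first sample of part_b
```

The combined dataset can be passed to a DataLoader like any other map-style Dataset, so shuffling draws samples across all of the underlying files.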
IterableDataset
An iterable-style dataset is an instance of a subclass of IterableDataset that implements the __iter__() protocol and represents an iterable over data samples. This type of dataset is particularly suitable when random reads are expensive or even impractical, and when the batch size depends on the fetched data.
For example, calling iter(dataset) on such a dataset could return a stream of data read from a database, a remote server, or even logs generated in real time.
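The idea can be sketched as follows, assuming a plain Python generator stands in for the real stream. RecordStream and fake_log_source are hypothetical names used only for this example.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset


def fake_log_source():
    # Stand-in for a database cursor, network stream, or live log reader.
    for i in range(5):
        yield [float(i), float(i) * 2]


class RecordStream(IterableDataset):
    """Hypothetical iterable-style dataset wrapping a record generator."""

    def __init__(self, source_factory):
        # Store a factory so every call to iter() starts a fresh stream.
        self.source_factory = source_factory

    def __iter__(self):
        for record in self.source_factory():
            yield torch.tensor(record, dtype=torch.float32)


# No sampler or shuffle: samples arrive in the order the stream produces them.
loader = DataLoader(RecordStream(fake_log_source), batch_size=2)
batches = list(loader)
```

The last batch is smaller because the stream ends after five records. This sketch uses the default num_workers=0; with several workers, each worker would iterate its own copy of the stream, so the dataset would need to split the work itself (for example with torch.utils.data.get_worker_info()) to avoid duplicated samples.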
Dataset for pickle