Basic usage#
This guide covers basic usage of bioimageloader. The examples below load a dataset from bioimageloader.collections and transform images with an image augmentation library called albumentations. It also shows how to load images with multi-processing, which helps when you need a computationally heavy set of augmentations.
Load a dataset from collections#
Let’s load the bioimageloader.collections.DSB2018 (2018 Data Science Bowl) dataset, for instance. Each iteration yields a dictionary with strings as keys and numpy arrays as values. You can control the output type through the output parameter.
import numpy
from bioimageloader.collections import DSB2018

dsb2018 = DSB2018('./data/DSB2018')

# iterate over the dataset; each entry is a dict of numpy arrays
data: dict[str, numpy.ndarray]
for data in dsb2018:
    image = data['image']
    mask = data['mask']
    do_something(image, mask)
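If you only need one of the targets, you can skip the other through the output parameter. Below is a minimal sketch; the value 'image' is an assumption here, so check the DSB2018 API reference for the exact accepted values.

from bioimageloader.collections import DSB2018

# 'image' is assumed to be a valid value of `output`; see the API reference
dsb2018_images = DSB2018('./data/DSB2018', output='image')
for data in dsb2018_images:
    image = data['image']
    do_something(image)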
Data augmentation with albumentations#
Use the transforms keyword argument. The example below composes random crop, horizontal flip, and random brightness/contrast augmentations. Check out the list of transforms and their supported targets in the albumentations documentation.
Applying transformations often goes together with random shuffling, and you may want to draw more samples than a dataset contains. Once you pass num_samples, the dataset is shuffled automatically and the sampling number is set to num_samples.
import albumentations as A
import numpy
from bioimageloader.collections import DSB2018

transforms = A.Compose([
    A.RandomCrop(width=256, height=256),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])
num_samples = 2000  # DSB2018 training set has 670 images

dsb2018 = DSB2018('./data/DSB2018', transforms=transforms, num_samples=num_samples)

# iterate transformed images
data: dict[str, numpy.ndarray]
for data in dsb2018:
    image = data['image']
    mask = data['mask']
    # these assertions will not throw AssertionError
    assert image.shape[0] == 256 and image.shape[1] == 256
    assert mask.shape[0] == 256 and mask.shape[1] == 256
    do_something(image, mask)
Batch loading with multi-processing#
Batch loading is essential when your set of augmentations requires heavy computation, or when you run deep neural network models that benefit from batched input on a GPU.
Wrap a dataset with bioimageloader.BatchDataloader and specify a batch size as well as the number of worker processes.
import albumentations as A
import numpy
from bioimageloader.collections import DSB2018
from bioimageloader import BatchDataloader

heavy_transforms = A.Compose([
    A.RandomCrop(width=256, height=256),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])
# construct a dataset with transforms
dsb2018 = DSB2018('./data/DSB2018', transforms=heavy_transforms)
batch_loader = BatchDataloader(dsb2018,
                               batch_size=16,
                               drop_last=True,
                               num_workers=8)

# iterate batches of transformed images
data: dict[str, numpy.ndarray]
for data in batch_loader:
    image = data['image']
    mask = data['mask']
    # these assertions will not throw AssertionError
    assert image.shape[0] == 16 and mask.shape[0] == 16
    assert image.shape[1] == 256 and image.shape[2] == 256
    assert mask.shape[1] == 256 and mask.shape[2] == 256
    do_something(image, mask)
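The loaded batches are plain numpy arrays, so they can go straight into a deep learning framework. Below is a minimal sketch assuming PyTorch and channel-last RGB images; neither is part of bioimageloader itself.

import torch

for data in batch_loader:
    # assuming channel-last RGB batches of shape (16, 256, 256, 3),
    # permute to the (N, C, H, W) layout most PyTorch models expect
    image = torch.from_numpy(data['image']).permute(0, 3, 1, 2).float()
    mask = torch.from_numpy(data['mask']).float()
    # move to GPU and feed your model, e.g. predictions = model(image.cuda())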