More: Modifying existing collections
All collections are Python classes, so you can subclass any of them to modify default behavior or to implement new methods and attributes.
Let's say you would like to use the bioimageloader.collections.BBBC007 dataset to train a U-Net. BBBC007 does not provide the mask annotation that a U-Net typically requires, but it does come with outline annotation, which we can easily convert into mask annotation.
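The outline-to-mask conversion can be tried on a synthetic array before touching the dataset. The sketch below is only an illustration: the 7 x 7 ring stands in for a real outline annotation, and the variable names are made up for this example.

```python
import numpy as np
import scipy.ndimage as ndi

# Synthetic stand-in for an outline annotation: a one-pixel-wide square ring
outline = np.zeros((7, 7), dtype=np.uint8)
outline[1:6, 1:6] = 1   # solid 5 x 5 square
outline[2:5, 2:5] = 0   # hollow out the interior, leaving only the border

# binary_fill_holes() fills background regions enclosed by foreground
# and returns a boolean array
filled = ndi.binary_fill_holes(outline)

# Cast to float32 so that downstream tools do not receive a bool array
mask = filled.astype(np.float32)

print(int(outline.sum()), int(mask.sum()))
```

The ring has 16 foreground pixels; after filling, the whole 5 x 5 square (25 pixels) is foreground.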
All you have to do is copy the class definition from the BBBC007 source and override the get_mask() method. Go to the source link and compare it with the code below. In summary:
1. Copy the class source
2. Change the class name to something else
3. Override the get_mask() method
4. (optional) Add new argument(s) and write docs
The code below uses scipy.ndimage.binary_fill_holes() from the SciPy package.
from pathlib import Path
from typing import List, Optional, Sequence, Union

import albumentations
import numpy as np
import tifffile

from bioimageloader.collections import BBBC007
from bioimageloader.utils import stack_channels

# ndi.binary_fill_holes() will fill outline annotation
import scipy.ndimage as ndi


class BBBC007variant(BBBC007):
    """Drosophila Kc167 cells (VARIANT)

    NU-Net needs outlines to be filled and only needs DNA annotation.

    Outline annotation

    Images were acquired using a motorized Zeiss Axioplan 2 and an Axiocam MRm
    camera, and are provided courtesy of the laboratory of David Sabatini at the
    Whitehead Institute for Biomedical Research. Each image is roughly 512 x 512
    pixels, with cells roughly 25 pixels in diameter, and 80 cells per image on
    average. The two channels (DNA and actin) of each image are stored in
    separate gray-scale 8-bit TIFF files.

    Notes
    -----
    - [4, 5, 11, 14, 15] have 3 channels but they are just all gray scale
      images. Extra work is required in get_image().

    .. [1] Jones et al., in the Proceedings of the ICCV Workshop on Computer
       Vision for Biomedical Image Applications (CVBIA), 2005.
    .. [2] [BBBC007](https://bbbc.broadinstitute.org/BBBC007)
    """
    # Dataset's acronym
    acronym = 'BBBC007'

    def __init__(
        self,
        root_dir: str,
        *,
        output: str = 'both',
        transforms: Optional[albumentations.Compose] = None,
        num_samples: Optional[int] = None,
        # specific to this dataset
        image_ch: Sequence[str] = ('DNA', 'actin',),
        anno_ch: Sequence[str] = ('DNA',),
        # arguments for VARIANT
        fill_holes: bool = True,
        **kwargs
    ):
        """
        Parameters
        ----------
        root_dir : str
            Path to root directory
        output : {'image', 'mask', 'both'} (default: 'both')
            Change outputs. 'both' returns {'image': image, 'mask': mask}.
        transforms : albumentations.Compose, optional
            An instance of Compose (albumentations pkg) that defines
            augmentation in sequence.
        num_samples : int, optional
            Useful when ``transforms`` is set. Define the total length of the
            dataset. If it is set, it overwrites ``__len__``.
        image_ch : {'DNA', 'actin'} (default: ('DNA', 'actin'))
            Which channel(s) to load as image. Make sure to give it as a
            Sequence when choosing a single channel.
        anno_ch : {'DNA', 'actin'} (default: ('DNA',))
            Which channel(s) to load as annotation. Make sure to give it as a
            Sequence when choosing a single channel.
        fill_holes : bool (default: True)
            Fill outline annotation using `scipy.ndimage.binary_fill_holes()`

        See Also
        --------
        BBBC007 : Super class
        MaskDataset : Super class
        DatasetInterface : Interface
        """
        # Pass existing arguments to its super class
        super().__init__(
            root_dir=root_dir,
            output=output,
            transforms=transforms,
            num_samples=num_samples,
            image_ch=image_ch,
            anno_ch=anno_ch,
            **kwargs
        )
        # arguments for VARIANT
        self.fill_holes = fill_holes

    # override
    def get_mask(self, p: Union[Path, List[Path]]) -> np.ndarray:
        # A single path is read directly; several paths are stacked as channels
        if isinstance(p, Path):
            mask = tifffile.imread(p)
        else:
            mask = stack_channels(tifffile.imread, p)
        # VARIANT behavior
        if self.fill_holes:
            mask = ndi.binary_fill_holes(mask)
        # output.dtype=bool and bool is not well handled by albumentations
        return mask.astype(np.float32)
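One caveat is worth checking if you pass more than one channel to anno_ch. binary_fill_holes() treats its input as a single N-dimensional volume, so on a stacked array with a short channel axis every background pixel touches the array border along that axis and nothing gets filled. Below is a minimal sketch of a per-channel alternative. It assumes that stack_channels() stacks channels along the last axis, which is not verified here, and fill_holes_per_channel is a hypothetical helper, not part of bioimageloader.

```python
import numpy as np
import scipy.ndimage as ndi


def fill_holes_per_channel(mask: np.ndarray) -> np.ndarray:
    """Fill a 2D outline, or each channel of an (H, W, C) outline stack.

    Hypothetical helper for illustration; assumes channels-last layout.
    """
    if mask.ndim == 2:
        return ndi.binary_fill_holes(mask).astype(np.float32)
    # Fill every 2D channel independently, then restack along the last axis
    filled = [ndi.binary_fill_holes(mask[..., c]) for c in range(mask.shape[-1])]
    return np.stack(filled, axis=-1).astype(np.float32)


# Synthetic two-channel outline: the same square ring in both channels
ring = np.zeros((7, 7), dtype=np.uint8)
ring[1:6, 1:6] = 1
ring[2:5, 2:5] = 0
stacked = np.stack([ring, ring], axis=-1)  # shape (7, 7, 2)

# Filling the stack as one 3D volume leaves the ring interiors empty
naive = ndi.binary_fill_holes(stacked)

# Filling channel by channel closes each ring
per_channel = fill_holes_per_channel(stacked)

print(int(naive.sum()), int(per_channel.sum()))
```

With the default anno_ch=('DNA',), get_mask() presumably receives a single 2D image and the original one-line fill is enough; the helper only matters for multi-channel annotation.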