Read/write AcquisitionData and ImageData#

NeXus#

The CCPi Framework provides classes to read and write AcquisitionData and ImageData as NeXuS files.

# imports
from cil.io import NEXUSDataWriter, NEXUSDataReader

# initialise NEXUS Writer
writer = NEXUSDataWriter()
writer.set_up(data=my_data,
              file_name='tmp_nexus.nxs')
# write data
writer.write()

# read data
# initialise NEXUS reader
reader = NEXUSDataReader()
reader.set_up(file_name='tmp_nexus.nxs')
# load data
ad1 = reader.read()
# get AcquisitionGeometry
ag1 = reader.get_geometry()
class cil.io.NEXUSDataReader(file_name=None)[source]#

Create a reader for NeXus files.

Parameters:

file_name (str) – the full path to the NeXus file to read.

set_up(file_name=None)[source]#

Initialise reader.

Parameters:

file_name (str) – Full path to NeXus file

get_geometry()[source]#

Parse NEXUS file and return acquisition or reconstructed volume parameters, depending on file type.

Returns:

Acquisition or reconstructed volume parameters. Exact type depends on file content.

Return type:

AcquisitionGeometry or ImageGeometry

get_data_scale()[source]#

Parse NEXUS file and return the scale factor applied to compress the dataset.

Returns:

scale – The scale factor applied to compress the dataset

Return type:

float

get_data_offset()[source]#

Parse NEXUS file and return the offset factor applied to compress the dataset.

Returns:

offset – The offset factor applied to compress the dataset

Return type:

float

read_as_original()[source]#

Returns the compressed data from the file.

Returns:

output – The raw, compressed data. Exact type depends on file content.

Return type:

ImageData or AcquisitionData

read()[source]#

Returns the uncompressed data as numpy.float32.

Returns:

output – The uncompressed data. Exact type depends on file content.

Return type:

ImageData or AcquisitionData

load_data()[source]#

Alias of read.

See also

read

class cil.io.NEXUSDataWriter(data=None, file_name=None, compression=None)[source]#

Create a writer for NEXUS files.

Parameters:
  • data (AcquisitionData or ImageData) – The dataset to write to file

  • file_name (os.path or string, default None) – The file name to write

  • compression (str, {'uint8', 'uint16', None}, default None) – The lossy compression to apply. The default None will not compress the data; 'uint8' or 'uint16' will compress to 8-bit and 16-bit dtypes respectively.

set_up(data=None, file_name=None, compression=None)[source]#

Set up the writer

data: AcquisitionData or ImageData

The dataset to write to file

file_name: os.path or string, default None

The file name to write

compression: str, {'uint8', 'uint16', None}, default None

The lossy compression to apply. The default None will not compress the data; 'uint8' or 'uint16' will compress to 8-bit and 16-bit dtypes respectively.

write()[source]#

Write the dataset to disk
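The lossy compression maps the floating-point data linearly onto an unsigned integer range. A minimal numpy sketch of the idea (the exact scale and offset CIL computes are an implementation detail, so treat this as an illustration only):

```python
import numpy as np

data = np.linspace(-1.0, 5.0, 100, dtype=np.float32)

# map the data range onto the uint8 range [0, 255]
scale = 255.0 / (data.max() - data.min())
offset = -data.min() * scale
compressed = (data * scale + offset).astype(np.uint8)

# recover an approximation: original ~= (compressed - offset) / scale
recovered = (compressed.astype(np.float32) - offset) / scale

# the error is bounded by one quantisation step
assert np.abs(recovered - data).max() < 1.0 / scale
```

The recovery formula matches the one used by the readers: original_data = (compressed_data - offset) / scale.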


Nikon#

class cil.io.NikonDataReader(file_name=None, roi=None, normalise=True, mode='bin', fliplr=False)[source]#

Basic reader for xtekct files

Parameters:
  • file_name (str) – full path to .xtekct file

  • roi (dict, default=None) – dictionary with roi to load: {'angle': (start, end, step), 'horizontal': (start, end, step), 'vertical': (start, end, step)}

  • normalise (bool, default=True) – normalises loaded projections by detector white level (I_0)

  • fliplr (bool, default=False) – flip projections in the left-right direction (about the vertical axis)

  • mode (str, {'bin', 'slice'}, default='bin') – In 'bin' mode, 'step' number of pixels are binned together and the values of the resulting binned pixels are calculated as the average. In 'slice' mode, 'step' defines standard numpy slicing. Note: in general the output array size in 'bin' mode != the output array size in 'slice' mode.

Notes

roi behaviour:

Files are stacked along axis_0. axis_1 and axis_2 correspond to row and column dimensions, respectively.

Files are stacked in alphabetic order.

To skip projections or to change the number of projections to load, adjust 'angle'. For instance, 'angle': (100, 300) will skip the first 100 projections and load 200 projections.

'angle': -1 is a shortcut to load all elements along the axis.

start and end can be specified as None, which is equivalent to start = 0 and end = load everything to the end, respectively. start and end can also be negative.
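The difference between the two modes can be sketched with plain numpy: with step = 2, 'slice' keeps every second pixel while 'bin' averages pairs of pixels, and the output sizes differ in general (hypothetical 1D data, for illustration only):

```python
import numpy as np

row = np.arange(7, dtype=np.float32)  # 7 pixels along one axis

# 'slice' mode with step=2: standard numpy slicing
sliced = row[0:7:2]  # 4 elements: [0., 2., 4., 6.]

# 'bin' mode with step=2: pairs of pixels averaged together
n = (row.size // 2) * 2  # drop the pixel that does not fill a bin
binned = row[:n].reshape(-1, 2).mean(axis=1)  # 3 elements: [0.5, 2.5, 4.5]

# output sizes differ between the two modes, as the note says
assert sliced.size != binned.size
```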

get_geometry()[source]#

Return AcquisitionGeometry object

get_roi()[source]#

Returns the roi

read()[source]#

Reads projections and returns AcquisitionData with corresponding geometry, arranged as ['angle', 'horizontal'] if a single slice is loaded and ['vertical', 'angle', 'horizontal'] if more than one slice is loaded.

load_projections()[source]#

Alias of read for backward compatibility

ZEISS#

class cil.io.ZEISSDataReader(file_name=None, roi=None)[source]#

Create a reader for ZEISS files

Parameters:
  • file_name (str) – file name to read

  • roi (dict, default None) – dictionary with roi to load for each axis: {'axis_labels_1': (start, end, step),'axis_labels_2': (start, end, step)}. axis_labels are defined by ImageGeometry and AcquisitionGeometry dimension labels.

Notes

roi behaviour:

For ImageData to skip files or to change number of files to load, adjust vertical. E.g. 'vertical': (100, 300) will skip first 100 files and will load 200 files.

'axis_label': -1 is a shortcut to load all elements along axis.

start and end can be specified as None which is equivalent to start = 0 and end = load everything to the end, respectively.

set_up(file_name, roi=None)[source]#

Set up the reader

Parameters:
  • file_name (str) – file name to read

  • roi (dict, default None) – dictionary with roi to load for each axis: {'axis_labels_1': (start, end, step),'axis_labels_2': (start, end, step)}. axis_labels are defined by ImageGeometry and AcquisitionGeometry dimension labels.

Notes

roi behaviour:

'axis_label': -1 is a shortcut to load all elements along axis.

start and end can be specified as None which is equivalent to start = 0 and end = load everything to the end, respectively.

Acquisition Data

The axis labels in the roi dict for AcquisitionData will be: {'angle':(...),'vertical':(...),'horizontal':(...)}

Image Data

The axis labels in the roi dict for ImageData will be: {'vertical':(...),'horizontal_x':(...),'horizontal_y':(...)}

To skip files or to change number of files to load, adjust vertical. E.g. 'vertical': (100, 300) will skip first 100 files and will load 200 files.

slice_metadata(metadata)[source]#

Slices metadata to configure geometry before reading data

read()[source]#

Reads projections and returns an AcquisitionData (TXRM) or ImageData (TXM) container

get_geometry()[source]#

Returns an AcquisitionGeometry (TXRM) or ImageGeometry (TXM) object

get_metadata()[source]#

Returns the metadata of the file

TIFF Reader/Writer#

class cil.io.TIFFStackReader(file_name=None, roi=None, transpose=False, mode='bin', dtype=<class 'numpy.float32'>)[source]#

Basic TIFF reader which loops through all tiff files in a specific folder and loads them in alphabetic order

Parameters:
  • file_name (str, abspath to folder, list) – Path to folder with tiff files, list of paths of tiffs, or single tiff file

  • roi (dictionary, default None) – dictionary with roi to load: {'axis_0': (start, end, step), 'axis_1': (start, end, step), 'axis_2': (start, end, step)}. roi is specified for axes before transpose.

  • transpose (bool, default False) – Whether to transpose loaded images

  • mode (str, {'bin', 'slice'}, default 'bin'.) –

    Defines the ‘step’ in the roi parameter:

    In bin mode, ‘step’ number of pixels are binned together, values of resulting binned pixels are calculated as average.

    In ‘slice’ mode ‘step’ defines standard numpy slicing.

    Note: in general output array size in bin mode != output array size in slice mode

  • dtype (numpy type, string, default np.float32) – Requested type of the read image. If set to None it defaults to the type of the saved file.

Notes:#

roi behaviour:

Files are stacked along axis_0, in alphabetical order.

axis_1 and axis_2 correspond to row and column dimensions, respectively.

To skip files or to change number of files to load, adjust axis_0. For instance, 'axis_0': (100, 300) will skip first 100 files and will load 200 files.

'axis_0': -1 is a shortcut to load all elements along axis 0.

start and end can be specified as None, which is equivalent to start = 0 and end = load everything to the end, respectively. start and end can also be negative.

roi is specified for axes before transpose.

Example:#

You can rescale the read data as rescaled_data = (read_data - offset)/scale with the following code:

>>> reader = TIFFStackReader(file_name = '/path/to/folder')
>>> rescaled_data = reader.read_rescaled(scale, offset)

Alternatively, if TIFFWriter has been used to save data with lossy compression, then you can rescale the read data to approximately the original data with the following code:

>>> writer = TIFFWriter(file_name = '/path/to/folder', compression='uint8')
>>> writer.write(original_data)
>>> reader = TIFFStackReader(file_name = '/path/to/folder')
>>> about_original_data = reader.read_rescaled()
read()[source]#

Reads the images and returns a numpy array

read_as_ImageData(image_geometry)[source]#

Reads the TIFF stack as an ImageData with the provided geometry

Note that the data will be reshaped to match the requested geometry, but there is no guarantee that the data will be read in the right order! In fact, you can reshape a (2,3,4) array as (3,4,2); we do not check whether the reshape leads to sensible data.

read_as_AcquisitionData(acquisition_geometry)[source]#

Reads the TIFF stack as an AcquisitionData with the provided geometry

Note that the data will be reshaped to match the requested geometry, but there is no guarantee that the data will be read in the right order! In fact, you can reshape a (2,3,4) array as (3,4,2); we do not check whether the reshape leads to sensible data.

read_scale_offset()[source]#

Reads the scale and offset from a json file in the same folder as the tiff stack

This is a courtesy method that will only work if the TIFF stack was saved with TIFFWriter

Returns:#

tuple: (scale, offset)

read_rescaled(scale=None, offset=None)[source]#

Reads the TIFF stack and rescales it with the provided scale and offset, or with the ones in the json file if not provided

This is a courtesy method that will only work if the TIFF stack was saved with TIFFWriter

Parameters:#

scale: float, default None

scale to apply to the data. If None, the scale will be read from the json file saved by TIFFWriter.

offset: float, default None

offset to apply to the data. If None, the offset will be read from the json file saved by TIFFWriter.

Returns:#

numpy.ndarray in float32

class cil.io.TIFFWriter(data=None, file_name=None, counter_offset=0, compression=None)[source]#

Write a DataSet to disk as a TIFF file or stack of TIFF files

Parameters:
  • data (DataContainer, AcquisitionData or ImageData) – This represents the data to save to TIFF file(s)

  • file_name (string) – This defines the file name prefix, i.e. the file name without the extension.

  • counter_offset (int, default 0) – counter_offset indicates the number at which the ordinal index should start. For instance, if you save 10 files the index goes from 0 to 9 by default; with counter_offset the index runs from counter_offset to 9 + counter_offset

  • compression (str, default None. Accepted values None, 'uint8', 'uint16') – The lossy compression to apply. The default None will not compress data. 'uint8' or 'uint16' will compress to unsigned int 8 and 16 bit respectively.

Note

If compression 'uint8' or 'uint16' is used, the scale and offset used to compress the data are saved in a file called scaleoffset.json in the same directory as the TIFF file(s).

The original data can be obtained by: original_data = (compressed_data - offset) / scale
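If the CIL readers are not available, the rescale can also be done by hand from scaleoffset.json with the standard library (the exact key names in the json file are an assumption here; inspect the file written for your data):

```python
import json

# a scaleoffset.json as TIFFWriter might produce it (contents assumed)
text = '{"scale": 42.5, "offset": 42.5}'
meta = json.loads(text)

compressed_value = 255  # a pixel from the uint8 TIFF
original_value = (compressed_value - meta["offset"]) / meta["scale"]
assert original_value == 5.0
```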

Note

In the case of 3D or 4D data this writer will save the data as a stack of multiple TIFF files, not as a single multi-page TIFF file.

write()[source]#

Write data to disk

RAW File Writer#

class cil.io.RAWFileWriter(data, file_name, compression=None)[source]#

Writer to write DataContainer (or subclass AcquisitionData, ImageData) to disk as a binary blob

Parameters:
  • data (DataContainer, AcquisitionData or ImageData) – The data to save to the raw file

  • file_name (string) – This defines the file name prefix, i.e. the file name without the extension.

  • compression (str, default None. Accepted values None, 'uint8', 'uint16') – The lossy compression to apply. The default None will not compress data. 'uint8' or 'uint16' will compress to unsigned int 8 and 16 bit respectively.

This writer will also write a text file with the minimal information necessary to read the data back in. The text file is written to, and must remain in, the same directory as the raw file.

The text file will look something like this:

[MINIMAL INFO]
file_name = filename.raw
data_type = <u2
shape = (6, 5, 4)
is_fortran = False

[COMPRESSION]
scale = 550.7142857142857
offset = -0.0

The data_type describes the data layout when packing and unpacking data. This can be read as numpy dtype with np.dtype('<u2').
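The data_type string is a standard numpy dtype specifier: '<' means little-endian and 'u2' an unsigned 2-byte integer. Together with shape and is_fortran it is enough to unpack the blob; a sketch using an in-memory stand-in for the file:

```python
import numpy as np

# write side: a (6, 5, 4) array dumped in C order, as the writer does
arr = np.arange(120, dtype=np.dtype('<u2')).reshape(6, 5, 4)
blob = arr.tobytes()  # in-memory equivalent of ndarray.tofile

# read side: dtype and shape as recovered from the [MINIMAL INFO] section
read_back = np.frombuffer(blob, dtype=np.dtype('<u2')).reshape(6, 5, 4)

assert np.array_equal(arr, read_back)
assert np.dtype('<u2').itemsize == 2  # 2 bytes per element
```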

Example

Example of using the writer with compression to uint8:

>>> from cil.io import RAWFileWriter
>>> writer = RAWFileWriter(data=data, file_name=fname, compression='uint8')
>>> writer.write()

Example

Example of reading the data from the ini file:

>>> import configparser
>>> import numpy as np
>>> config = configparser.ConfigParser()
>>> inifname = "file_name.ini"
>>> config.read(inifname)
>>> read_dtype = config['MINIMAL INFO']['data_type']
>>> dtype = np.dtype(read_dtype)
>>> fname = config['MINIMAL INFO']['file_name']
>>> read_array = np.fromfile(fname, dtype=dtype)
>>> read_shape = eval(config['MINIMAL INFO']['shape'])
>>> scale = float(config['COMPRESSION']['scale'])
>>> offset = float(config['COMPRESSION']['offset'])

Note

If compression 'uint8' or 'uint16' is used, the scale and offset used to compress the data are saved in the ini file in the same directory as the raw file, in the 'COMPRESSION' section.

The original data can be obtained by: original_data = (compressed_data - offset) / scale

Note

Data is always written in 'C' order, independent of the order of the original data; see numpy.ndarray.tofile: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.tofile.html#numpy.ndarray.tofile

write()[source]#

Write data to disk


HDF5 Utilities#

Utility functions to browse HDF5 files. These allow you to browse groups and read in datasets as numpy.ndarrays.

A CIL geometry and dataset must be constructed manually from the array and metadata.

class cil.io.utilities.HDF5_utilities[source]#

Utility methods to read in from a generic HDF5 file and extract the relevant data

static print_metadata(filename, group='/', depth=-1)[source]#

Prints the file metadata

Parameters:
  • filename (str) – The full path to the HDF5 file

  • group (str, default '/') – a specific group to print the metadata for; this defaults to the root group

  • depth (int, default -1) – depth of group to output the metadata for, -1 is fully recursive

static get_dataset_metadata(filename, dset_path)[source]#

Returns the dataset metadata as a dictionary

Parameters:
  • filename (str) – The full path to the HDF5 file

  • dset_path (str) – The internal path to the requested dataset

Returns:

ndim, shape, size, dtype, nbytes, compression, chunks, is_virtual

Return type:

A dictionary containing the keys listed above; values are None if an attribute cannot be read.

static read(filename, dset_path, source_sel=None, dtype=<class 'numpy.float32'>)[source]#

Reads a dataset entry and returns a numpy array with the requested data

Parameters:
  • filename (str) – The full path to the HDF5 file

  • dset_path (str) – The internal path to the requested dataset

  • source_sel (tuple of slice objects, optional) – The selection of slices in each source dimension to return

  • dtype (numpy type, default np.float32) – the numpy data type for the returned array

Returns:

The requested data

Return type:

numpy.ndarray

Note

source_sel takes a tuple of slice objects defining the crop and slicing behaviour

This can be constructed using numpy indexing, i.e. the following lines are equivalent.

>>> source_sel = (slice(2, 4, None), slice(2, 10, 2))
>>> source_sel = np.s_[2:4,2:10:2]
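The equivalence can be checked directly, and the resulting tuple of slices applied to any array:

```python
import numpy as np

sel_a = (slice(2, 4, None), slice(2, 10, 2))
sel_b = np.s_[2:4, 2:10:2]
assert sel_a == sel_b  # np.s_ is just a convenient slice builder

# applied to a 10x10 array: rows 2-3, columns 2, 4, 6, 8
arr = np.arange(100).reshape(10, 10)
assert arr[sel_a].shape == (2, 4)
```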
static read_to(filename, dset_path, out, source_sel=None, dest_sel=None)[source]#

Reads a dataset entry and directly fills a numpy array with the requested data

Parameters:
  • filename (str) – The full path to the HDF5 file

  • dset_path (str) – The internal path to the requested dataset

  • out (numpy.ndarray) – The output array to be filled

  • source_sel (tuple of slice objects, optional) – The selection of slices in each source dimension to return

  • dest_sel (tuple of slice objects, optional) – The selection of slices in each destination dimension to fill

Note

source_sel and dest_sel take a tuple of slice objects defining the crop and slicing behaviour

This can be constructed using numpy indexing, i.e. the following lines are equivalent.

>>> source_sel = (slice(2, 4, None), slice(2, 10, 2))
>>> source_sel = np.s_[2:4,2:10:2]
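The combined effect of source_sel and dest_sel can be sketched with plain numpy, using an in-memory array as a stand-in for the HDF5 dataset (illustration only):

```python
import numpy as np

dset = np.arange(100, dtype=np.float32).reshape(10, 10)  # stand-in dataset
out = np.zeros((2, 4), dtype=np.float32)

source_sel = np.s_[2:4, 2:10:2]  # crop taken from the source
dest_sel = np.s_[0:2, 0:4]       # region of `out` to fill

out[dest_sel] = dset[source_sel]
assert out[0, 0] == dset[2, 2]
```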
