diffractem.proc2d module

diffractem.proc2d.analyze_and_correct(imgs, opts, correct_non_hits=False, reference=None, pxmask=None)

Analyzes a diffraction pattern (centering and peak finding), and immediately applies a correction.

This function combines get_pattern_info and correct_image, but unlike them it does not handle any lazy or parallel computation itself: it simply loops over a numpy array. It is hence especially useful for checking that the preprocessing pipeline works on a small set, or for embedding into dask.delayed objects for parallel execution outside the function, which may be faster than get_pattern_info + correct_image (see example below).

Parameters

imgs (np.ndarray) – Input image stack as numpy array
opts (PreProcOpts) – pre-processing options
correct_non_hits (bool, optional) – Apply correction also to images that do not have sufficient Bragg spots in them (as defined by opts.min_peaks). Defaults to False.
reference (Union[None, Union[np.ndarray, str]], optional) – Reference image as numpy array or TIF file name. If None, read file defined in options. Defaults to None.
pxmask (Union[None, Union[np.ndarray, str]], optional) – Pixel mask image as numpy array or TIF file name. If None, read file defined in options. Defaults to None.

Returns

Corrected image stack and pattern info structure, as returned by correct_image and get_pattern_info, respectively.

Return type

Tuple[np.ndarray, dict]

Example

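To run a parallel computation efficiently, wrap this function in dask.delayed, with one task per chunk of the dask array img_stack (opts is a PreProcOpts object):

>>> import dask
>>> from diffractem import proc2d
>>> results = [dask.delayed(proc2d.analyze_and_correct)(img_chunk, opts)
...            for img_chunk in img_stack.to_delayed().ravel()]
>>> dask.compute(results)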

diffractem.proc2d.apply_flatfield(img, reference, keep_type=True, ref_smooth_range=None, normalize_reference=False)

Corrects the detector response by dividing the image (or each image in a stack) by a reference image (gain reference image), which should vary around 1.

Parameters

img (Union[np.ndarray, da.Array]) – Input image
reference (Union[np.ndarray, str]) – array containing the reference image, or filename of a TIF file containing the reference image
keep_type (bool, optional) – Keep the image data type, that is, round the pixel values back to integers if the input is an integer image. If False, the output image will always be a float. Defaults to True.
ref_smooth_range (Optional[float], optional) – If not None, applies a Gaussian blur of this width to the reference image before correction. Defaults to None.
normalize_reference (bool, optional) – Re-normalize the reference image such that its average value is exactly 1. Defaults to False.

Returns

flatfield-corrected image

Return type

np.ndarray
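
Example

The correction amounts to a pixel-wise division by the gain reference. The following is a minimal NumPy-only sketch of that idea, not the library's implementation: the helper name flatfield_sketch is hypothetical, reference smoothing is left out, and rounding integer inputs back to their original dtype is an assumption about how keep_type behaves.

```python
import numpy as np

def flatfield_sketch(img, reference, keep_type=True, normalize_reference=False):
    """Divide an image (or stack) by a gain reference that varies around 1."""
    img = np.asarray(img)
    ref = np.asarray(reference, dtype=float)
    if normalize_reference:
        # Rescale the reference so that its mean gain is exactly 1
        ref = ref / np.nanmean(ref)
    corrected = img / ref  # broadcasts over a leading stack axis
    if keep_type and np.issubdtype(img.dtype, np.integer):
        # Assumption: integer inputs are rounded back to their original dtype
        corrected = np.rint(corrected).astype(img.dtype)
    return corrected

img = np.array([[100, 220], [90, 200]], dtype=np.uint16)
ref = np.array([[1.0, 1.1], [0.9, 1.0]])
as_int = flatfield_sketch(img, ref)                     # stays uint16
as_float = flatfield_sketch(img, ref, keep_type=False)  # becomes float
```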

diffractem.proc2d.apply_saturation_correction(img, exp_time, dead_time=0.0019, gap_factor=2)

Applies the detector saturation (dead-time) correction to an image. Ideally, this is done even before the flat-field correction. It uses a 5th-order polynomial approximation to the Lambert W function, which is appropriate for a paralyzable detector, up to the point where its signal starts inverting (beyond which nothing can be done anymore).

The default dead time value of 1.9 microseconds has been determined for a Medipix3 sensor.

Parameters

img (np.ndarray) – Input image or image stack
exp_time (float) – Exposure time in ms
dead_time (float, optional) – Dead time of detector in ms. Defaults to 1.9e-3.
gap_factor (float, optional) – Factor to scale dead time for gap pixels. Defaults to 2.
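
Example

For a paralyzable detector with dead time tau, a true count rate n is recorded as m = n * exp(-n * tau), so that n = -W(-m * tau) / tau with the Lambert W function. The sketch below inverts this relation by fixed-point iteration instead of the 5th-order polynomial mentioned above; it is an illustration under stated assumptions, not the library's code. The name desaturate_sketch and the n_iter parameter are hypothetical, and gap pixels (gap_factor) are not handled.

```python
import numpy as np

def desaturate_sketch(img, exp_time, dead_time=1.9e-3, n_iter=50):
    """Invert m = n * exp(-n * dead_time) for the true counts.

    img holds measured counts; exp_time and dead_time are in ms.
    """
    measured_rate = np.asarray(img, dtype=float) / exp_time
    true_rate = measured_rate.copy()
    for _ in range(n_iter):
        # Fixed-point form of the paralyzable model: n = m * exp(n * tau).
        # Converges only below the inversion point, i.e. for n * tau < 1.
        true_rate = measured_rate * np.exp(true_rate * dead_time)
    return true_rate * exp_time

exp_time = 2.0                                     # ms
true_counts = np.array([10.0, 100.0, 200.0])
rate = true_counts / exp_time                      # counts per ms
measured = true_counts * np.exp(-rate * 1.9e-3)    # simulate dead-time loss
recovered = desaturate_sketch(measured, exp_time)
```

The iteration recovers the simulated true counts from the attenuated ones; close to the inversion point it converges slowly, which is one reason a closed-form approximation can be preferable.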

diffractem.proc2d.apply_virtual_detector(img, r_inner, r_outer, x0=None, y0=None)

Apply a “virtual STEM detector” to stack, with given inner and outer radii. Returns the mean value of all pixels that fall inside this annulus.

Parameters

img (np.ndarray) – input image (or stack thereof)
r_inner (float) – Inner radius
r_outer (float) – Outer radius
x0 (float) – Beam center position along x, following CXI convention, i.e. relative to pixel center, not corner. If None, assumes center of image. Defaults to None.
y0 (float) – Beam center position along y, analogous to x0. Defaults to None.

Returns

mean value of pixels inside the annulus defined by r_inner and r_outer

Return type

float
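
Example

The annular mean described above can be sketched with plain NumPy as follows. This is an illustration only: virtual_detector_sketch is a hypothetical name, the half-open radius interval r_inner <= r < r_outer is an assumption, and so is the choice of (n - 1) / 2 as image center in pixel-center convention.

```python
import numpy as np

def virtual_detector_sketch(img, r_inner, r_outer, x0=None, y0=None):
    """Mean of all pixels with r_inner <= r < r_outer around (x0, y0)."""
    img = np.asarray(img, dtype=float)
    ny, nx = img.shape[-2:]
    # Assumption: image center in pixel-center convention is (n - 1) / 2
    x0 = (nx - 1) / 2 if x0 is None else x0
    y0 = (ny - 1) / 2 if y0 is None else y0
    yy, xx = np.indices((ny, nx))
    r = np.hypot(xx - x0, yy - y0)
    annulus = (r >= r_inner) & (r < r_outer)
    # Boolean indexing on the last two axes keeps a leading stack axis intact
    return img[..., annulus].mean(axis=-1)

# Synthetic pattern: a ring of value 2 between radii 3 and 5, zero elsewhere
yy, xx = np.indices((11, 11))
r = np.hypot(xx - 5, yy - 5)
ring = np.where((r >= 3) & (r < 5), 2.0, 0.0)
signal = virtual_detector_sketch(ring, 3, 5)
per_image = virtual_detector_sketch(np.stack([ring, 3 * ring]), 3, 5)
```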

diffractem.proc2d.center_image(imgs, x0, y0, xsize, ysize, padval=None, parallel=True)

Shifts a stack of images, such that the original image coordinates x0, y0 are in the center of the output image, which has a size of xsize, ysize.

This function is typically used to change diffraction images such that the zero-order beam sits in the center of the image. The output image should be sufficiently large so as not to truncate the shifted diffraction pattern.

Note

The coordinates in this function refer to pixel centers (CXI convention), not pixel corners (CrystFEL convention). I.e., if shifting based on CrystFEL output or similar, the shifts must be increased by 0.5.

Parameters

imgs (Union[np.ndarray, da.Array]) – Input image stack
x0 (Union[np.ndarray, da.Array]) – x position in input image to be shifted to the center of the output image
y0 (Union[np.ndarray, da.Array]) – y position in input image to be shifted to the center of the output image
xsize (int) – x size of the output image
ysize (int) – y size of the output image
padval (Union[float, int, None], optional) – value of the pixels used to pad the output image. If None, use nan for float images and -1 for integer images. Defaults to None.
parallel (bool, optional) – execute operation in parallel. Defaults to True.

Returns

output image stack of size (ysize, xsize) with centered diffraction patterns

Return type

Union[np.ndarray, da.Array]
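
Example

To illustrate what centering means geometrically, here is a minimal single-image sketch that shifts by whole pixels into a padded output frame. It is not the library's implementation: center_single_sketch is a hypothetical helper, the shift is rounded to integers, and the output center is assumed to be (size - 1) / 2 in pixel-center convention.

```python
import numpy as np

def center_single_sketch(img, x0, y0, xsize, ysize, padval=np.nan):
    """Shift one image so that (x0, y0) lands on the output center."""
    img = np.asarray(img, dtype=float)
    ny, nx = img.shape
    # Assumption: the output center is (size - 1) / 2 in pixel-center convention
    dx = int(round((xsize - 1) / 2 - x0))
    dy = int(round((ysize - 1) / 2 - y0))
    out = np.full((ysize, xsize), padval, dtype=float)
    # Part of the input that still falls inside the output after shifting
    xa, xb = max(0, -dx), min(nx, xsize - dx)
    ya, yb = max(0, -dy), min(ny, ysize - dy)
    if xb > xa and yb > ya:
        out[ya + dy:yb + dy, xa + dx:xb + dx] = img[ya:yb, xa:xb]
    return out

img = np.zeros((5, 5))
img[1, 3] = 7.0                       # "beam" at x0=3, y0=1
centered = center_single_sketch(img, x0=3, y0=1, xsize=9, ysize=9)
```

With a 9 x 9 output the whole 5 x 5 input survives the shift; a smaller output frame would truncate it, which is why the output size should be chosen generously.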

diffractem.proc2d.center_of_mass(img, threshold=0.0)

Returns the center of mass of an image, using only the pixels with values larger than the threshold. Fast for sparse images; for more crowded ones, center_of_mass2 may be faster.

Parameters

img (np.ndarray) – Input image
threshold (float, optional) – minimum pixel value to include. Defaults to 0.0.

Returns

[x0, y0] -> image center of mass

Return type

np.ndarray
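
Example

A thresholded center of mass can be sketched as below. This only illustrates the calculation and the [x0, y0] output order; center_of_mass_sketch is a hypothetical name and makes no claim about how the library achieves its speed.

```python
import numpy as np

def center_of_mass_sketch(img, threshold=0.0):
    """Intensity-weighted mean position of all pixels above threshold."""
    img = np.asarray(img, dtype=float)
    yy, xx = np.nonzero(img > threshold)   # coordinates of contributing pixels
    weights = img[yy, xx]
    total = weights.sum()                  # zero if no pixel exceeds threshold
    return np.array([(xx * weights).sum() / total,
                     (yy * weights).sum() / total])

img = np.zeros((5, 7))
img[1, 2] = 1.0      # weak pixel at x=2, y=1
img[3, 6] = 3.0      # strong pixel at x=6, y=3
com_all = center_of_mass_sketch(img)
com_strong = center_of_mass_sketch(img, threshold=2.0)
```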

diffractem.proc2d.center_of_mass2(img, threshold=None)

Returns the center of mass of an image, using only the pixels with values larger than the threshold. Can be faster than center_of_mass for crowded images (just try it out).

Parameters

img (np.ndarray) – Input image
threshold (float, optional) – minimum pixel value to include. If None, does not apply a threshold. Defaults to None.

Returns

[x0, y0] -> image center of mass

Return type

np.ndarray

diffractem.proc2d.correct_dead_pixels(img, pxmask, strategy='interpolate', interp_range=1, replace_val=None, mask_gaps=False, edge_mask_x=(100, 30), edge_mask_y=0, invert_mask=False)

Corrects a set of images for dead pixels by either replacing values with a constant, or interpolation from a Gaussian-smoothed version of the image. It requires a binary array (pxmask) which is 1 (or 255 or True) for dead pixels. The function accepts a 3D array where the first dimension corresponds to a stack/movie.

Parameters

img (np.ndarray) – the image or image stack (first dimension is stack). For strategy==’replace’ it can be a dask or numpy array, otherwise numpy only.
pxmask (Union[np.ndarray, str]) – pixel mask with values as described above, or name of a TIF file containing the pixel mask
strategy (str, optional) – ‘interpolate’ or ‘replace’. Defaults to ‘interpolate’.
interp_range (int, optional) – range of interpolation for ‘interpolate’ strategy, in pixels. Defaults to 1.
replace_val (Union[float, int], optional) – replacement value for ‘replace’ strategy. If None, use -1 for integer images and nan for float images. Defaults to None.
mask_gaps (bool, optional) – mask gaps between detector panels as returned by the gap_pixels() function. Defaults to False.
edge_mask_x (int, optional) – Declare this number of pixels near the edges along x as invalid and replace them with replace_val. Defaults to (100, 30).
edge_mask_y (int, optional) – Declare this number of pixels near the edges along y as invalid and replace them with replace_val. Defaults to 0.
invert_mask (bool, optional) – invert the pixel mask, i.e., invalid pixels are zero/False. Defaults to False.

Returns

dead-pixel corrected image. Can be da.Array for ‘replace’ strategy.

Return type

np.ndarray
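
Example

The two strategies can be illustrated on a single image as follows. This is a sketch under stated assumptions rather than the library's method: fix_dead_pixels_sketch is a hypothetical name, and the interpolation uses a plain box average over valid neighbours instead of the Gaussian-smoothed image described above. Gap and edge masking are left out.

```python
import numpy as np

def fix_dead_pixels_sketch(img, pxmask, strategy="interpolate",
                           interp_range=1, replace_val=np.nan):
    """Replace dead pixels by a constant or by the mean of valid neighbours."""
    img = np.asarray(img, dtype=float)
    dead = np.asarray(pxmask) != 0          # 1, 255 or True all mark dead pixels
    if strategy == "replace":
        return np.where(dead, replace_val, img)
    # Zero-padded copies of the valid values and of the validity weights
    vals = np.pad(np.where(dead, 0.0, img), interp_range)
    wts = np.pad((~dead).astype(float), interp_range)
    ny, nx = img.shape
    num = np.zeros_like(img)
    den = np.zeros_like(img)
    for dy in range(2 * interp_range + 1):
        for dx in range(2 * interp_range + 1):
            num += vals[dy:dy + ny, dx:dx + nx]
            den += wts[dy:dy + ny, dx:dx + nx]
    # Mean over valid neighbours; NaN where no valid neighbour exists
    fill = np.divide(num, den, out=np.full_like(img, np.nan), where=den > 0)
    return np.where(dead, fill, img)

img = np.full((3, 3), 5.0)
img[1, 1] = 1000.0                       # hot pixel
mask = np.zeros((3, 3), dtype=np.uint8)
mask[1, 1] = 255
interpolated = fix_dead_pixels_sketch(img, mask)
replaced = fix_dead_pixels_sketch(img, mask, strategy="replace")
```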

diffractem.proc2d.correct_image(img, opts, x0=None, y0=None, peakinfo=None, reference=None, pxmask=None)

Runs correction pipeline on stack of diffraction images (numpy or dask).

The correction pipeline comprises flat-field, saturation and dead-pixel correction, as well as background subtraction, optionally including exclusion of diffraction peaks for computation of the background (recommended).

Note

This function essentially wraps proc2d._get_corr_image with smart features to take care of dask input arrays. If you want to change the correction pipeline, that is the function to modify.

Parameters

img (Union[np.ndarray, da.Array]) – Diffraction pattern stack
opts (PreProcOpts) – Pre-processing options. Options used are: (…)
x0 (Union[None, np.ndarray, da.Array, pd.Series], optional) – Pattern X centers (None: use image center). Defaults to None.
y0 (Union[None, np.ndarray, da.Array, pd.Series], optional) – Pattern Y centers (None: use image center). Defaults to None.
peakinfo (Union[None, Dict[str, Union[np.ndarray, da.Array]]], optional) – Diffraction peak dict in CXI format (None: no peak exclusion during background subtraction). Defaults to None.
reference (Union[None, Union[np.ndarray, str]], optional) – Flat-field reference (None: use reference file specified in options). Defaults to None.
pxmask (Union[None, Union[np.ndarray, str]], optional) – Pixel mask reference (None: use reference file specified in options). Defaults to None.

Returns

Corrected image stack of identical dimension as input stack.

Return type

Union[np.ndarray, da.Array]

diffractem.proc2d.cut_peaks(img, nPeaks, peakXPosRaw, peakYPosRaw, radius=2, replaceval=None)

Cuts peaks out of an image and replaces them with replaceval. Peak positions are provided in CXI format.

This function is mainly interesting for calculation of radial profiles, ignoring Bragg peaks.

Parameters

img (np.ndarray) – Input image (or stack thereof)
nPeaks (np.ndarray) – number of peaks
peakXPosRaw (np.ndarray) – peak X positions
peakYPosRaw (np.ndarray) – peak Y positions
radius (int, optional) – Radius of circle within which image values are replaced around each peak. Defaults to 2.
replaceval (Union[int, float, None], optional) – Value to paint into the circles. If None, uses -1 on integer images and np.nan otherwise. Defaults to None.

Returns

Image with cut-out peaks.

Return type

np.ndarray
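
Example

The peak cutting can be pictured as painting a filled circle of replaceval around each of the first nPeaks positions. The sketch below does this for one image; cut_peaks_sketch is a hypothetical name, the output is always float, and the inclusive circle boundary is an assumption.

```python
import numpy as np

def cut_peaks_sketch(img, n_peaks, peak_x, peak_y, radius=2, replaceval=np.nan):
    """Replace a disc around each of the first n_peaks peaks with replaceval."""
    out = np.asarray(img, dtype=float).copy()
    yy, xx = np.indices(out.shape)
    # Peak arrays are padded to a fixed length; only the first n_peaks are valid
    for x, y in zip(peak_x[:n_peaks], peak_y[:n_peaks]):
        out[(xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2] = replaceval
    return out

img = np.ones((7, 7))
peak_x = np.array([3.0, 0.0])   # second entry is padding
peak_y = np.array([3.0, 0.0])
cut = cut_peaks_sketch(img, 1, peak_x, peak_y, radius=1)
```

A radial profile computed afterwards with a NaN-aware function such as np.nanmean then ignores the masked discs.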

diffractem.proc2d.func_lorentz(p, x, y)

Function that returns a Student’s t distribution or generalised Cauchy distribution in two dimensions (x, y):

amp * [(1 + ((x-x_0)/scale)**2) + (1 + ((y-y_0)/scale)**2)] ** (-shape/2)

Parameters

p (Union[list, tuple, np.ndarray]) – Parameter array: [amp, x_0, y_0, scale, shape]
x (Union[float, np.ndarray]) – x coordinate(s)
y (Union[float, np.ndarray]) – y coordinate(s)

Returns

function value at (x, y)

Return type

Union[float, np.ndarray]

diffractem.proc2d.get_pattern_info(img, opts, client=None, reference=None, pxmask=None, centers=None, lorentz_fit=True, lazy=False, sync=True, errors='raise', via_array=False, output_file=None, shots=None, dummy_stack_name='corrected')

‘Macro’ function for getting information about diffraction patterns.

get_pattern_info finds diffraction peaks and computes information such as pattern center on a given diffraction pattern or stack thereof. By default (lazy=False and sync=True) it will return a pandas DataFrame containing general information on each pattern, and a dict holding the found peaks in CXI format.

The options for preprocessing are passed as a PreProcOpts object.

Note

This function is essentially a smart wrapper around proc2d._generate_pattern_info. If you’d like to change what is actually calculated and how, that is the function to modify!

Note

As this function is computationally heavy, it is very advisable to use a dask.distributed cluster for computation, with a client object supplied to the function call.

Parameters

img (Union[np.ndarray, da.Array]) – stack of diffraction patterns, typically a dask array
opts (PreProcOpts) – pre-processing options.
client (Optional[Client], optional) – Client object for dask.distributed cluster. If None, runs computation by simply calling compute on the stack dask array (discouraged). Defaults to None.
reference (Optional[np.ndarray], optional) – Flat-field reference image. If None, load the one specified in preprocessing options. Defaults to None.
pxmask (Optional[np.ndarray], optional) – Pixel mask image. If None, load the one specified in preprocessing options. Defaults to None.
centers (np.array or da.Array, optional) – N x 2 matrix with known centers of all diffraction patterns. If set, the center-of-mass and Lorentz fit steps are skipped. Depending on the setting of opts.friedel_refine, Friedel-mate center refinement is still performed. Defaults to None.
lazy (bool, optional) – Return dask.delayed objects for pattern info generation tasks instead of the final results. Mostly useful for debugging or embedding into more complex workflows. Defaults to False.
sync (bool, optional) – Immediately compute pattern info. If False, returns futures to pattern info dictionaries instead of DataFrame and peak dict. Defaults to True.
errors (str, optional) – Behavior if errors arise during eager computation (i.e., lazy=False, sync=True). If ‘raise’, errors are raised, if ‘skip’, they are skipped, and the final data is missing the corresponding shots, which needs to be handled downstream to avoid making a mess. Defaults to ‘raise’.
via_array (bool, optional) – Modify calculation such that it avoids dask.delayed objects. This drastically improves the scheduling behavior for large datasets. It is also required if you supply the pattern centers to the function. However, it precludes the use of lazy and sync. Defaults to False.
output_file (str, optional) – Filename to store calculation results into. The file will be a valid diffractem-type data file that can be loaded using Dataset objects.
shots (pd.DataFrame, optional) – Dataframe of shot data of same height as the image array. If not None, its columns will be joined to those of the shot data for storing the results into the output file.
dummy_stack_name (str, optional) – Name of virtual data set to be written into the output file in order to fake data, if required by another program (e.g. CrystFEL). Defaults to ‘corrected’.

Returns

pandas DataFrame holding general pattern information, and dict holding peaks in CXI format. Note that the return values are different when using lazy=True or sync=False (see above).

Return type

Tuple[pd.DataFrame, dict]

diffractem.proc2d.get_peaks(img, x0, y0, max_peaks=500, pxmask=None, min_snr=4.0, threshold=7.0, min_pix_count=2, max_pix_count=20, local_bg_radius=3, min_res=0, max_res=500, as_dict=True, extended_info=False)

Find peaks in diffraction pattern using the peakfinder8 algorithm as used in CrystFEL, OnDA and Cheetah. For explanation of the finding parameters, please consult the CrystFEL documentation (or just run man indexamajig).

Parameters

img (np.ndarray) – image stack
x0 (float) – image stack x center
y0 (float) – image stack y center
max_peaks (int, optional) – maximum number of peaks. Defaults to 500.
pxmask (Optional[np.ndarray], optional) – pixel mask. Defaults to None.
min_snr (float, optional) – minimum peak SNR. Defaults to 4.0.
threshold (float, optional) – count threshold. Defaults to 7.0.
min_pix_count (int, optional) – minimum number of pixels in peak. Defaults to 2.
max_pix_count (int, optional) – maximum number of pixels in peak. Defaults to 20.
local_bg_radius (int, optional) – radius for peak background estimation. Defaults to 3.
min_res (int, optional) – minimum resolution (= radial range) in pixels. Defaults to 0.
max_res (int, optional) – maximum resolution (= radial range) in pixels. Defaults to 500.
as_dict (bool, optional) – return results as a dictionary instead of a single numpy array. Defaults to True.

Returns

CXI-format peak information. If as_dict=False, instead returns a 1D array of size (3 * max_peaks + 1), which contains the x positions, y positions, intensities, and number of peaks, concatenated.

Return type

dict

Note

The returned peak positions follow CXI convention, that is, they refer to pixel centers, not corners (as in CrystFEL). For CrystFEL-convention you have to add 0.5 to the returned peak positions.
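
Example

The flat layout returned for as_dict=False and the convention shift can be illustrated with a small unpacking helper. This is a sketch, not part of the module: unpack_peaks_sketch is a hypothetical name, the dictionary key peakTotalIntensity is assumed from the common CXI naming, and the arrays are trimmed to the number of found peaks for convenience.

```python
import numpy as np

def unpack_peaks_sketch(flat, max_peaks=500, crystfel_convention=False):
    """Split [x..., y..., intensity..., n_peaks] into a peak dictionary."""
    flat = np.asarray(flat, dtype=float)
    if flat.size != 3 * max_peaks + 1:
        raise ValueError("expected an array of size 3 * max_peaks + 1")
    n = int(flat[-1])
    # Pixel centers (CXI) to pixel corners (CrystFEL): add 0.5
    shift = 0.5 if crystfel_convention else 0.0
    return {
        "nPeaks": n,
        "peakXPosRaw": flat[:max_peaks][:n] + shift,
        "peakYPosRaw": flat[max_peaks:2 * max_peaks][:n] + shift,
        "peakTotalIntensity": flat[2 * max_peaks:3 * max_peaks][:n],
    }

# Two peaks in an array laid out for max_peaks=3 (unused slots are zero)
flat = np.array([10, 20, 0, 1, 2, 0, 100, 200, 0, 2], dtype=float)
cxi = unpack_peaks_sketch(flat, max_peaks=3)
crystfel = unpack_peaks_sketch(flat, max_peaks=3, crystfel_convention=True)
```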

diffractem.proc2d.loop_over_stack(fun)

Decorator to (sequentially) loop a 2D processing function over a stack.

In brief, if you have a function that either modifies a (single) image or extracts some reduced data from it, this decorator wraps it such that it can operate on a whole stack of images.

Works on all functions with signature fun(imgs: np.ndarray, *args, **kwargs), where imgs is a numpy 3D stack or a 2D single image. The function has to return either a numpy array, in which case the wrapper returns a stacked array of the function output, or a collection containing numpy arrays, each of which is stacked individually. If any of the positional/named arguments is an iterable of the same length as the image stack, it is distributed over the function calls for each image.

Note

loop_over_stack only works on functions eating numpy arrays, not dask arrays. If you want to apply a function to a dask-array image stack, you have to additionally wrap it in dask.array.map_blocks, diffractem.dataset._map_sub_blocks, diffractem.compute.map_reduction_func or similar.

Parameters

fun (Callable) – function to be decorated

Returns

function that loops over an image stack automatically

Return type

Callable
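
Example

The behavior described above can be sketched with a small decorator. This is a simplified stand-in, not the module's code: loop_over_stack_sketch is a hypothetical name, and it only covers functions returning a single numpy array, not collections of arrays.

```python
import functools
import numpy as np

def loop_over_stack_sketch(fun):
    """Let a single-image function accept a 3D stack as well."""
    @functools.wraps(fun)
    def wrapper(imgs, *args, **kwargs):
        imgs = np.asarray(imgs)
        if imgs.ndim == 2:
            return fun(imgs, *args, **kwargs)   # single image: call directly
        n = imgs.shape[0]

        def pick(arg, i):
            # Arguments as long as the stack are distributed image by image
            if np.ndim(arg) >= 1 and len(arg) == n:
                return arg[i]
            return arg

        results = [fun(imgs[i],
                       *(pick(a, i) for a in args),
                       **{k: pick(v, i) for k, v in kwargs.items()})
                   for i in range(n)]
        return np.stack(results)
    return wrapper

@loop_over_stack_sketch
def scaled_sum(img, factor=1.0):
    return np.asarray(img.sum() * factor)

stack = np.ones((3, 2, 2))
per_image = scaled_sum(stack, factor=[1, 2, 3])   # one factor per image
shared = scaled_sum(stack, factor=2)              # same factor for all images
```

Note the inherent ambiguity of this scheme: an argument that happens to have the same length as the stack is always split up, whether or not that was intended.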

diffractem.proc2d.lorentz_fast(img, x_0=None, y_0=None, amp=None, scale=5.0, radius=None, limit=None, threshold=0, threads=False, verbose=False)

Fast Lorentzian fit for finding the beam center; especially suited for refinement after a reasonable estimate (i.e. to within a couple of pixels) has been made by another method such as truncated COM. Compared to the other fits, it always assumes a shape parameter of 2 (i.e. a standard Lorentzian with asymptotic x^-2). It can restrict the fit to only a small region around the initial value for the beam center, which massively speeds up the function. Also, it auto-estimates the initial parameters somewhat reasonably if nothing else is given.

Parameters

img (np.ndarray) – input image or image stack. If a stack is supplied, it is looped over serially. Dask arrays are not accepted directly.
x_0 (float, optional) – estimated x beam center. If None, is assumed to be in the center of the image. Defaults to None.
y_0 (float, optional) – analogous. Defaults to None.
amp (float, optional) – estimated peak amplitude. If None, is set to the 99.99th percentile of img. Defaults to None.
scale (float, optional) – peak HWHM estimate in pixels. Defaults to 5.0.
radius (float, optional) – radius of a box around x_0, y_0 where the fit is actually done. If None, the entire image is used. Defaults to None.
limit (float, optional) – If not None, the fit result is discarded if the found beam_center is further away than this value from the initial estimate. Defaults to None.
threshold (int, optional) – pixel value threshold below which pixels are ignored. Defaults to 0.
threads (bool, optional) – if True, uses scipy.optimize.least_squares, which for larger arrays (radius more than around 15) uses multithreaded function evaluation. For radius < 50 in particular, this may be slower than single-threaded evaluation, in which case it is best set to False. Defaults to False.
verbose (bool, optional) – if True, a message is printed on some occasions. Defaults to False.

Returns

numpy array of refined parameters [amp, x0, y0, scale]

Return type

np.ndarray

diffractem.proc2d.lorentz_fit(img, amp=1.0, x_0=0.0, y_0=0.0, scale=5.0, shape=2.0, threshold=0)

Fits a Lorentz profile to find the center (x_0, y_0) of a diffraction pattern, ignoring any pixels with values < threshold.

The fit function is based on:

amp * [(1 + ((x-x_0)/scale)**2) + (1 + ((y-y_0)/scale)**2)] ** (-shape/2)

Built upon the scipy.optimize.least_squares function, which is thread-safe (note: scipy.optimize.leastsq is not). An analytical Jacobian has been added.

Note

If possible (i.e. you leave shape at 2.0), do not use this function, it’s really slow. Instead use lorentz_fast.

Parameters

See above (fit function).

Returns

result of optimization

Return type

OptimizeResult

diffractem.proc2d.mean_clip(c, sigma=2.0)

Iteratively keeps only the values of the array that satisfy 0 < c < c_mean + sigma*std, and returns the mean of the remaining values. Assumes that the array contains positive entries; if it does not, or if the array is empty, returns -1.

Parameters

c (np.ndarray) – input value array
sigma (float, optional) – number of standard deviations away from the mean that is used for mean calculation. Defaults to 2.0.

Returns

Mean of clipped values

Return type

float
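
Example

The iterative clipping can be sketched as follows. This is an illustration of the rule stated above, not the library's code: mean_clip_sketch and max_iter are hypothetical names, and stopping once no value is rejected any more is an assumption about the termination criterion.

```python
import numpy as np

def mean_clip_sketch(c, sigma=2.0, max_iter=100):
    """Mean of the positive values after iteratively rejecting high outliers."""
    c = np.asarray(c, dtype=float).ravel()
    c = c[c > 0]
    if c.size == 0:
        return -1.0                       # no positive entries at all
    for _ in range(max_iter):
        keep = c < c.mean() + sigma * c.std()
        if keep.all() or not keep.any():
            break                         # converged (or constant array)
        c = c[keep]
    return float(c.mean())

background = mean_clip_sketch([1, 1, 1, 1, 1, 1, 1, 1, 1, 100])
```

In this example the single bright value (a Bragg peak, say) is rejected in the first pass, so the result reflects the background level only.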

diffractem.proc2d.radial_proj(img, x0=None, y0=None, scale=1, scale_axis=0, my_func=<function nanmean>, min_size=600, max_size=850, filter_len=1)

Applies a function to azimuthal bins of the image around the center (x0, y0) for each integer radius and returns the result in a np.array of size max_size, yielding a radial profile. Skips values that are set to -1 or nan.

Optionally, a median filter can be applied to the output.

Parameters

img (np.ndarray) – input image or stack
x0 (Optional[float], optional) – x center of pattern. If None, use the center of the image. Defaults to None.
y0 (Optional[float], optional) – y center of pattern. If None, use the center of the image. Defaults to None.
my_func (Union[Callable[[np.ndarray], np.ndarray], List[Callable[[np.ndarray], np.ndarray]]], optional) – function to call on all pixel values at a given radius, or iterable thereof. Defaults to np.nanmean.
min_size (int, optional) – Minimum length of the output profile. Defaults to 600.
max_size (int, optional) – Maximum length of the output profile. Defaults to 850.
filter_len (int, optional) – Kernel size of median filter applied after profile calculation. filter_len must be odd, and filtering is at the moment incompatible with multiple functions. Defaults to 1.

Returns

radial profile calculated using my_func

Return type

np.ndarray

Note

The median filter currently only works if a single function is used.
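
Example

A radial profile over integer radius bins can be sketched with np.bincount. The sketch below only covers the mean (not arbitrary my_func), a single image, and no median filter; radial_profile_sketch and its size parameter are hypothetical, and the default center (n - 1) / 2 is an assumption.

```python
import numpy as np

def radial_profile_sketch(img, x0=None, y0=None, size=None):
    """Azimuthal mean per integer radius, skipping -1 and NaN pixels."""
    img = np.asarray(img, dtype=float)
    ny, nx = img.shape
    # Assumption: image center in pixel-center convention is (n - 1) / 2
    x0 = (nx - 1) / 2 if x0 is None else x0
    y0 = (ny - 1) / 2 if y0 is None else y0
    yy, xx = np.indices(img.shape)
    r = np.rint(np.hypot(xx - x0, yy - y0)).astype(int)
    size = int(r.max()) + 1 if size is None else size
    valid = np.isfinite(img) & (img != -1) & (r < size)
    sums = np.bincount(r[valid], weights=img[valid], minlength=size)
    counts = np.bincount(r[valid], minlength=size)
    profile = np.full(size, np.nan)        # radii without valid pixels stay NaN
    profile[counts > 0] = sums[counts > 0] / counts[counts > 0]
    return profile

# Test pattern whose value equals the (rounded) radius, with two invalid pixels
yy, xx = np.indices((21, 21))
img = np.rint(np.hypot(xx - 10, yy - 10))
img[10, 11] = -1
img[0, 0] = np.nan
profile = radial_profile_sketch(img, 10, 10)
```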

diffractem.proc2d.remove_background(img, x0=None, y0=None, nPeaks=None, peakXPosRaw=None, peakYPosRaw=None, peak_radius=3, filter_len=5, rfunc=<function nanmean>, pxmask=None, truncate=False, offset=0)

Combines radial_proj, cut_peaks and strip_img into a background-removal protocol for diffraction patterns, assuming radial symmetry of the background.

The diffraction pattern is first azimuthally integrated, excluding Bragg peaks, and the resulting radial profile is further smoothed. The profile is then re-projected to the full image and subtracted. This procedure usually works very well, provided that the peak finding has been done carefully. If there are serious issues with peak finding, it might be worth setting rfunc=np.nanmedian.

Peaks have to be provided in CXI format and convention.

Parameters

img (np.ndarray) – Input image or stack thereof
x0 (Optional[float], optional) – Diffraction pattern center along x. If None, use the image center. Defaults to None.
y0 (Optional[float], optional) – Diffraction pattern center along y. If None, use the image center. Defaults to None.
nPeaks (Optional[np.ndarray], optional) – Number of peaks. Defaults to None.
peakXPosRaw (Optional[np.ndarray], optional) – peak X positions. Defaults to None.
peakYPosRaw (Optional[np.ndarray], optional) – peak Y positions. Defaults to None.
peak_radius (int, optional) – Radius around each peak excluded from background calculation. Defaults to 3.
filter_len (int, optional) – Range of median filter applied to radial profile. Defaults to 5.
rfunc (Callable[[np.ndarray], np.ndarray], optional) – Function for calculation of the radial profile through azimuthal averaging. Defaults to np.nanmean.
pxmask (Optional[np.ndarray], optional) – Pixel mask to be applied after correction. Defaults to None.
truncate (bool, optional) – Set all pixels of value < offset to 0. Defaults to False.
offset (int, optional) – Offset for the output image. Defaults to 0.

Returns

Image (or stack thereof) with the radially symmetric background removed.

Return type

np.ndarray

diffractem.proc2d.stack_nested(data_list, func=<function stack>)

Applies a numpy/dask concatenation/stacking function recursively to a recursive python collection (tuple/list/dict) containing numpy or dask arrays on the lowest level.

Parameters

data_list (Union[tuple, list, dict]) – Collection of numpy arrays (can be recursive)
func (Callable, optional) – Concatenation function to apply. Defaults to np.stack.

Returns

tuple/list/dict with concatenated/stacked numpy arrays

Return type

same as data_list
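
Example

One plausible reading of the description above is that data_list holds one nested result per image, and that arrays at matching positions are stacked. The following sketch implements that reading; it is an assumption about the intended semantics, and stack_nested_sketch is a hypothetical name.

```python
import numpy as np

def stack_nested_sketch(data_list, func=np.stack):
    """Stack arrays found at matching positions of equally nested items."""
    first = data_list[0]
    if isinstance(first, dict):
        return {key: stack_nested_sketch([item[key] for item in data_list], func)
                for key in first}
    if isinstance(first, (tuple, list)):
        return type(first)(
            stack_nested_sketch([item[i] for item in data_list], func)
            for i in range(len(first)))
    return func(data_list)                # lowest level: plain arrays

per_image = [
    {"center": np.array([1.0, 2.0]), "peaks": (np.array(3), np.array([0.5]))},
    {"center": np.array([3.0, 4.0]), "peaks": (np.array(5), np.array([1.5]))},
]
stacked = stack_nested_sketch(per_image)
```

Passing func=np.concatenate instead joins the arrays along an existing axis rather than creating a new leading one.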

diffractem.proc2d.strip_img(img, prof, x0=None, y0=None, pxmask=None, truncate=False, offset=0, keep_edge_offset=False, replaceval=None, interp=True, dtype=None)

Subtract a radial profile from a diffraction pattern, assuming radial symmetry of the background.

Parameters

img (np.ndarray) – Input image (or stack thereof)
prof (np.ndarray) – Radial profile to be subtracted
x0 (float, optional) – Diffraction pattern center along x. If None, use the image center. Defaults to None.
y0 (float, optional) – Diffraction pattern center along y. If None, use the image center. Defaults to None.
pxmask (Optional[np.ndarray], optional) – Pixel mask to apply after subtraction. Defaults to None.
truncate (bool, optional) – Replace all values below the offset by replaceval. Defaults to False.
offset (Union[float, int], optional) – Offset to apply to the output image. Required if you want to keep positive pixel values. Defaults to 0.
keep_edge_offset (bool, optional) – [description]. Defaults to False.
replaceval (Optional[float], optional) – Replace value for pixels falling below offset. Defaults to None.
interp (bool, optional) – Interpolate background pixel values, otherwise use nearest neighbour. Defaults to True.
dtype (Optional[np.dtype], optional) – If not None, convert output image to this data type. Defaults to None.

Returns

Image with subtracted radial profile.

Return type

np.ndarray
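
Example

Subtracting a radial profile means evaluating the profile at each pixel's distance from the pattern center and subtracting the resulting background image. A minimal single-image sketch follows; strip_img_sketch is a hypothetical name, masking, truncation and dtype handling are left out, and the default center (n - 1) / 2 is an assumption.

```python
import numpy as np

def strip_img_sketch(img, prof, x0=None, y0=None, offset=0.0, interp=True):
    """Subtract a radially symmetric background given by prof[radius]."""
    img = np.asarray(img, dtype=float)
    prof = np.asarray(prof, dtype=float)
    ny, nx = img.shape
    # Assumption: image center in pixel-center convention is (n - 1) / 2
    x0 = (nx - 1) / 2 if x0 is None else x0
    y0 = (ny - 1) / 2 if y0 is None else y0
    yy, xx = np.indices(img.shape)
    r = np.hypot(xx - x0, yy - y0)
    if interp:
        # Linear interpolation between integer-radius profile samples
        background = np.interp(r.ravel(), np.arange(prof.size), prof).reshape(r.shape)
    else:
        # Nearest-neighbour lookup
        idx = np.clip(np.rint(r).astype(int), 0, prof.size - 1)
        background = prof[idx]
    return img - background + offset

# Synthetic pattern: background rising linearly with radius plus a constant 3
yy, xx = np.indices((11, 11))
r = np.hypot(xx - 5, yy - 5)
img = 2.0 * r + 3.0
prof = 2.0 * np.arange(10)
stripped = strip_img_sketch(img, prof)
```

Because the synthetic background is linear in the radius, linear interpolation removes it exactly and only the constant remains.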