IMC data transformation and normalisation across ROI

acoenen
Posts: 1
Joined: Thu Aug 22, 2019 3:20 pm

Post by acoenen » Fri Aug 23, 2019 10:09 am

Hi, I am not sure what the policy on this forum is, but I have a question relating to this exact topic, so I will post it here. I can open a new discussion if that is preferred.

I have received a flat file with the median marker intensity per cell, derived from segmented IMC data. I am trying to integrate this data into an R pipeline I have developed for suspension-cell CyTOF. Currently I convert the data from each ROI into an FCS file and then assemble the files into a flowSet. I would like to cluster the data with FlowSOM (or PhenoGraph, if that is preferred) and use t-SNE/UMAP to identify immune cell types.

My question is: what is the best way of transforming the median cell intensity data for downstream analysis, given that it is highly skewed? I have tried 99th percentile normalisation, but perhaps Denis could advise whether this is the correct way of doing it:
- calculate the 99th percentile for all ROI combined for each channel
- divide the intensities for each cell in each ROI by the 99th percentile of the respective channel
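
The two steps above can be sketched in a few lines. This is only a minimal illustration in plain Python (the pipeline in question is in R), and the data layout, the function names and the marker name "CD45" are assumptions made for the example, not part of any package:

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) of a non-empty list, with linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def normalise_by_pooled_percentile(rois, q=99.0):
    """Divide every cell intensity by the q-th percentile of its channel,
    where the percentile is computed over the cells of ALL ROIs combined.

    rois: {roi_name: {channel_name: [per-cell intensities]}}  (assumed layout)
    Returns (normalised data in the same layout, {channel_name: scale}).
    """
    channels = set()
    for per_channel in rois.values():
        channels.update(per_channel)

    # Step 1: one scaling factor per channel, pooled over all ROIs.
    scale = {}
    for ch in channels:
        pooled = [v for per_channel in rois.values() for v in per_channel.get(ch, [])]
        scale[ch] = percentile(pooled, q)

    # Step 2: divide each cell's intensity by the factor of its channel.
    normalised = {
        roi: {
            ch: [v / scale[ch] if scale[ch] > 0 else 0.0 for v in vals]
            for ch, vals in per_channel.items()
        }
        for roi, per_channel in rois.items()
    }
    return normalised, scale


# Toy data: two ROIs, one channel.
rois = {
    "roi1": {"CD45": [float(v) for v in range(0, 50)]},
    "roi2": {"CD45": [float(v) for v in range(50, 101)]},
}
normalised, scale = normalise_by_pooled_percentile(rois)
print(scale["CD45"], max(normalised["roi2"]["CD45"]))
```

Because the scaling factor is shared by all ROIs, relative differences between ROIs are preserved; values above the percentile end up slightly above 1 unless they are clipped afterwards.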

Would this be the ideal way of transforming the data (I was using arcsinh for suspension cells)? Is any other normalisation needed before combining the ROIs?

Thanks a lot in advance,
Anna

EDIT: The forum administrators prefer individual topics per question and therefore moved this post into a new topic.
The post was originally posted in the following topic: viewtopic.php?f=4&t=23.
DenisSchapiro
Posts: 36
Joined: Wed Nov 29, 2017 11:34 am

Re: IMC data transformation and normalisation across ROI

Post by DenisSchapiro » Wed Sep 04, 2019 3:38 pm

Dear Anna,

Your question is highly dependent on the underlying dataset/distribution and therefore difficult to answer in general.

- We are currently using raw values for IMC and a log-transform for immunofluorescence, based on the underlying dynamic range.
- We have used the 99th percentile for tSNE and PhenoGraph, computed across all ROIs combined for each channel of interest.

Best

Denis
votti
Posts: 24
Joined: Mon Nov 27, 2017 12:46 pm

Re: IMC data transformation and normalisation across ROI

Post by votti » Sat Sep 07, 2019 8:15 am

Hi there,

I find it difficult to give universal recommendations for this kind of question, as the answer really depends on the data quality and on the planned downstream analysis, which can make various assumptions about the data structure. Different downstream analysis methods therefore call for different transformations.

As a general point: I would recommend considering MeanIntensity per cell rather than MedianIntensity. The reason is that I would expect many markers to be localized non-homogeneously over the cell, e.g. specifically in the nucleus, the membrane or another organelle. Median intensity only works robustly if the marker is distributed over the pixels of a cell area in a unimodal or homogeneous fashion. This is often not the case if the marker is specifically localized: a common case is a marker that is high in the membrane pixels only and close to absent in the nuclear pixels. Taking the median in this case gives a really unstable, non-quantitative result: if >50% of the pixels in the cell are membranous the median will be quite high, whereas if >50% of the pixels are nuclear the median will be really low. Phospho-S6 is a marker where this often happens.
MeanIntensity over the cell pixels is in such cases a much more stable and quantitative readout. The main drawback is that it can be affected by single outlier-pixel artifacts, so I recommend filtering outlier pixels before image quantification using MeanIntensity.
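
A small worked example of the median instability described above. The pixel values are made up for illustration (membrane pixels at 10 counts, nuclear pixels at 0), and the two hypothetical cells differ only in the fraction of membrane pixels that segmentation assigned to them:

```python
from statistics import mean, median

MEMBRANE, NUCLEAR = 10, 0  # made-up per-pixel counts for a membrane marker


def cell_pixels(n_membrane, n_nuclear):
    """Pixel intensities of one segmented cell (hypothetical helper)."""
    return [MEMBRANE] * n_membrane + [NUCLEAR] * n_nuclear


# Two cells with the same staining, differing only in membrane pixel fraction.
cell_a = cell_pixels(n_membrane=45, n_nuclear=55)  # 45% membrane pixels
cell_b = cell_pixels(n_membrane=55, n_nuclear=45)  # 55% membrane pixels

for name, px in (("A", cell_a), ("B", cell_b)):
    print(name, "median:", median(px), "mean:", mean(px))
```

The median flips from 0 to 10 between the two cells, while the mean moves only from 4.5 to 5.5, which tracks the actual change in membrane fraction.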

On the transformation: to me the data really looks mostly log-normal, which is also what theoretical considerations would suggest. Transforming MeanIntensity with a log(x + small value) transform makes the data look Gaussian, which is an assumption of many methods. While there are methods to estimate the 'small value', log(x+0.01) usually looks fine for MeanIntensity IMC data in most channels.
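
As a rough sketch of the log(x+0.01) transform: the example below uses simulated log-normal values in place of real MeanIntensity data, assumes the natural logarithm (the base is not specified above), and uses sample skewness as a crude check of how symmetric the data are before and after. The function names are made up for the example.

```python
import math
import random


def log_transform(values, offset=0.01):
    """log(x + offset); the offset keeps zero intensities finite."""
    return [math.log(v + offset) for v in values]


def skewness(values):
    """Sample skewness (third standardised moment); about 0 for symmetric data."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Simulated stand-in for per-cell intensities of one channel (not real IMC data).
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
transformed = log_transform(raw)

print("skewness raw:", round(skewness(raw), 2))
print("skewness log(x+0.01):", round(skewness(transformed), 2))
```

On real data, a histogram per channel before and after the transform is a more informative check than a single skewness number.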

As for outlier removal: given how nicely normal my data often looks after log transformation, censoring 1% of the data often seems excessive to me, and I usually censor at the 99.99th percentile (i.e. the top 0.01%).
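
To make the difference between the two cut-offs concrete, here is a minimal sketch. It interprets "censoring" as clipping values above the percentile down to the percentile value, which is an assumption (removing those cells instead would be the other reading), and the data are synthetic:

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) of a non-empty list, with linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def censor_above(values, q):
    """Clip values above the q-th percentile; returns (clipped values, cut-off)."""
    cutoff = percentile(values, q)
    return [min(v, cutoff) for v in values], cutoff


def n_changed(original, clipped):
    """Number of values altered by the clipping."""
    return sum(1 for a, b in zip(original, clipped) if a != b)


# Synthetic channel: 9999 well-behaved cells plus one extreme outlier.
values = [float(v) for v in range(1, 10000)] + [1e6]

clipped_99, cut_99 = censor_above(values, 99.0)
clipped_9999, cut_9999 = censor_above(values, 99.99)

print("99th percentile:    ", n_changed(values, clipped_99), "cells changed")
print("99.99th percentile: ", n_changed(values, clipped_9999), "cells changed")
```

With 10,000 cells, the 99th percentile cut-off alters 100 values, whereas the 99.99th percentile cut-off alters only the single extreme outlier.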

But as said: what is 'best' really depends on the actual data and what question you are addressing with the analysis.

Cheers, Vito