I've been testing Rphenograph, but I have a few doubts:
In what way is this a "simple" implementation of the PhenoGraph algorithm? Are there major differences compared to the Matlab and Python implementations described in the original paper (https://www.ncbi.nlm.nih.gov/pubmed/26095251)?
In fact, I noticed that, unlike those implementations, which have a random component and require setting a seed to reproduce the results, the R version returns absolutely identical results every time I run it. I even tried explicitly setting a different seed, but nothing changed. Why is this the case?
Elisa
Rphenograph

 Posts: 8
 Joined: Mon Oct 08, 2018 3:02 pm
Re: Rphenograph
Hi there,
I remember comparing the two packages, Python and Rphenograph, some time ago and found no major difference between them.
To my understanding, the implementation is 'simple' in the sense of 'easy to read code'.
The nondeterministic part of `phenograph` is usually the `Louvain method`, a greedy optimization algorithm whose result can depend on random choices such as the order in which nodes are visited. The original Python/Matlab code uses the original Louvain implementation: https://sites.google.com/site/findcommunities/
`Rphenograph`, however, uses the Louvain implementation from `igraph`, a highly regarded open-source network analysis package. This implementation uses a deterministic routine, so there is no need to set a random seed. Note that this does not make the results more 'correct': it still only provides a locally optimal clustering.
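To make the order dependence concrete, here is a minimal Python sketch of a greedy local-moving phase in the spirit of Louvain. It is not the `igraph` code or the original Louvain code; `local_moving`, `partition_of` and the toy graph are names made up for this illustration. A fixed visiting order always gives the same partition, while shuffled orders may or may not give different ones.

```python
import random
from collections import defaultdict


def local_moving(edges, order):
    """Greedy modularity optimisation: move nodes one by one, in the given
    order, to the neighbouring community with the best modularity gain."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    m = len(edges)
    degree = {n: len(adj[n]) for n in adj}
    community = {n: n for n in adj}   # every node starts in its own community
    comm_degree = dict(degree)        # total degree per community

    improved = True
    while improved:
        improved = False
        for node in order:
            current = community[node]
            # Count edges from this node into each neighbouring community.
            links = defaultdict(int)
            for nb in adj[node]:
                links[community[nb]] += 1
            # Take the node out of its community before scoring the options.
            comm_degree[current] -= degree[node]
            # Gain is proportional to 2m * k_in - sigma_tot * k (integers only).
            best = current
            best_gain = 2 * m * links[current] - comm_degree[current] * degree[node]
            for c in sorted(links):
                gain = 2 * m * links[c] - comm_degree[c] * degree[node]
                if gain > best_gain:
                    best, best_gain = c, gain
            comm_degree[best] += degree[node]
            if best != current:
                community[node] = best
                improved = True
    return community


def partition_of(community):
    """Label-independent view of a clustering, so runs can be compared."""
    groups = defaultdict(set)
    for node, label in community.items():
        groups[label].add(node)
    return frozenset(frozenset(g) for g in groups.values())


# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
nodes = sorted({n for e in edges for n in e})

# Fixed visiting order: repeated runs are identical, no seed involved.
fixed_a = partition_of(local_moving(edges, nodes))
fixed_b = partition_of(local_moving(edges, nodes))
print(fixed_a == fixed_b)

# Shuffled visiting order: the seed now decides which local optimum is found.
seen = set()
for seed in range(10):
    order = nodes[:]
    random.Random(seed).shuffle(order)
    seen.add(partition_of(local_moving(edges, order)))
print(len(seen), "distinct partition(s) over 10 shuffled orders")
```

On a graph this small the shuffled runs may well all agree; on large kNN graphs, differences between visiting orders are more likely to show up.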
To see how robust the results of a clustering algorithm really are, one can either add a tiny amount of noise to the data or subsample a larger dataset, and then rerun the algorithm many times.
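A rough sketch of the subsampling idea in plain Python, in case it helps. Everything here is an assumption for illustration: `toy_cluster` is a trivial stand-in for the real clustering call (e.g. Rphenograph or the Python phenograph), `subsample_stability` is a made-up helper, and agreement is scored with the adjusted Rand index, which is only one of several reasonable choices.

```python
import random
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Agreement between two labelings of the same points (1.0 = identical)."""
    n = len(labels_a)
    pairs = Counter(zip(labels_a, labels_b))
    index = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (index - expected) / (maximum - expected)


def subsample_stability(data, cluster_fn, n_runs=20, fraction=0.8, seed=0):
    """Cluster the full data once, then recluster random subsamples and
    score each against the full-data labels of the same points."""
    rng = random.Random(seed)
    reference = cluster_fn(data)
    size = int(len(data) * fraction)
    scores = []
    for _ in range(n_runs):
        idx = sorted(rng.sample(range(len(data)), size))
        sub_labels = cluster_fn([data[i] for i in idx])
        scores.append(adjusted_rand_index([reference[i] for i in idx], sub_labels))
    return scores


def toy_cluster(values):
    """Hypothetical stand-in for the real algorithm: split 1-D data at the mean."""
    mean = sum(values) / len(values)
    return [int(v > mean) for v in values]


# Toy data (an assumption): two well-separated 1-D groups of 20 points each.
rng = random.Random(1)
data = [rng.uniform(0, 1) for _ in range(20)] + [rng.uniform(10, 11) for _ in range(20)]

scores = subsample_stability(data, toy_cluster)
print(min(scores), sum(scores) / len(scores))
```

Scores that stay close to 1 across runs suggest the clusters are stable; a wide spread suggests that the particular solution you got depends heavily on which cells happened to be included.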
If still in doubt, you can also ask the JinmiaoChenLab directly on GitHub. Alternatively, with the R package `reticulate` one can nowadays also use the Python phenograph from R directly (https://cran.rproject.org/web/packages ... ython.html).

Re: Rphenograph
Thank you very much, I couldn't have asked for a more complete answer!
Elisa