API Docs

pymarkovclustering

easymcl

easymcl(
    edges: str | Path | list[tuple[str, str, float]],
    /,
    *,
    inflation: float = 2.0,
    max_iter: int = 100,
    quiet: bool = True,
) -> list[list[str]]

Run Markov Clustering from edges file or list of tuples

easymcl automates three steps: loading edges as a matrix, running MCL, and extracting clusters

PARAMETER DESCRIPTION
edges

Edges file with (source, target, weight) columns, or list of (source, target, weight) tuples

TYPE: str | Path | list[tuple[str, str, float]]

inflation

Inflation factor

TYPE: float DEFAULT: 2.0

max_iter

Max number of iterations

TYPE: int DEFAULT: 100

quiet

If True, suppress log output on screen

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
clusters

List of clusters, each cluster is a list of node labels

TYPE: list[list[str]]

Notes

This function is designed to produce MCL results similar to the command $ mcl abc.tsv --abc -I 2.0. See https://micans.org/mcl/man/mcl.html for details.

References

MCL - a cluster algorithm for graphs (https://micans.org/mcl/)

Examples:

>>> import pymarkovclustering as pymcl
>>> # easymcl automates loading edges as a matrix, running MCL, and extracting clusters
>>> clusters = pymcl.easymcl("edges.tsv")
>>> # easymcl is equivalent to the code below
>>> matrix, labels = pymcl.edges_to_sparse_matrix("edges.tsv")
>>> mcl_matrix = pymcl.mcl(matrix)
>>> clusters = pymcl.extract_clusters(mcl_matrix, labels)

mcl

mcl(
    matrix: ndarray | csr_matrix,
    /,
    *,
    inflation: float = 2.0,
    max_iter: int = 100,
    quiet: bool = True,
) -> csr_matrix

Run Markov Clustering

PARAMETER DESCRIPTION
matrix

Adjacency matrix

TYPE: ndarray | csr_matrix

inflation

Inflation factor

TYPE: float DEFAULT: 2.0

max_iter

Max number of iterations

TYPE: int DEFAULT: 100

quiet

If True, suppress log output on screen

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
mcl_matrix

MCL result matrix

TYPE: csr_matrix

Notes

This function is designed to produce MCL results similar to the command $ mcl abc.tsv --abc -I 2.0. See https://micans.org/mcl/man/mcl.html for details.

References

MCL - a cluster algorithm for graphs (https://micans.org/mcl/)
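
The MCL loop alternates an expansion step (matrix squaring) with an inflation step (element-wise power followed by column normalization), controlled here by the inflation and max_iter parameters. The following is a minimal pure-Python sketch of that idea on a dense list-of-lists matrix. It is not the library's implementation: the helper names, the added self-loops, and the `tol` convergence threshold are assumptions of this sketch, and it omits the pruning and sparse-matrix handling that a production implementation would need.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [
        [m[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def expand(m):
    """Expansion step: square the matrix."""
    n = len(m)
    return [
        [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def inflate(m, inflation):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[v**inflation for v in row] for row in m])


def mcl_sketch(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Toy dense MCL loop; `tol` is a hypothetical convergence threshold."""
    n = len(adjacency)
    # Add self-loops (assumption of this sketch), then normalize columns
    m = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    m = normalize_columns(m)
    for _ in range(max_iter):
        prev = m
        m = inflate(expand(m), inflation)
        diff = max(abs(m[i][j] - prev[i][j]) for i in range(n) for j in range(n))
        if diff < tol:
            break  # matrix stopped changing
    return m


# Two disconnected pairs of nodes: 0-1 and 2-3
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
result = mcl_sketch(adjacency)
```

Because the two pairs share no edges, every entry linking one pair to the other stays zero in `result`, which is what later allows the pairs to be read out as separate clusters.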

edges_to_sparse_matrix

edges_to_sparse_matrix(
    edges: str | Path | list[tuple[str, str, float]],
) -> tuple[csr_matrix, list[str]]

Convert edges file or list of tuples to sparse matrix

PARAMETER DESCRIPTION
edges

Edges file with (source, target, weight) columns, or list of (source, target, weight) tuples

TYPE: str | Path | list[tuple[str, str, float]]

RETURNS DESCRIPTION
matrix

Sparse matrix representation of edges

TYPE: csr_matrix

nodes

List of node labels

TYPE: list[str]
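
To illustrate the relationship between an edge list and the returned (matrix, labels) pair, here is a small pure-Python sketch that builds a dense matrix instead of a csr_matrix. The helper name `edges_to_dense_matrix`, the first-appearance label order, and the symmetric (undirected) fill are assumptions of this sketch. The library's actual ordering and handling of edge direction may differ.

```python
def edges_to_dense_matrix(edges):
    """Return (matrix, labels) for a list of (source, target, weight) tuples."""
    # Assign indices in order of first appearance (assumption of this sketch)
    index = {}
    for source, target, _ in edges:
        for node in (source, target):
            index.setdefault(node, len(index))
    labels = list(index)
    n = len(labels)
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, weight in edges:
        i, j = index[source], index[target]
        # Treat edges as undirected by filling both directions (assumption)
        matrix[i][j] = weight
        matrix[j][i] = weight
    return matrix, labels


edges = [("A", "B", 1.0), ("B", "C", 0.5), ("D", "E", 0.8)]
matrix, labels = edges_to_dense_matrix(edges)
print(labels)
```

Row and column i of the matrix correspond to labels[i], which is the mapping extract_clusters relies on when it turns matrix indices back into node labels.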

extract_clusters

extract_clusters(matrix: csr_matrix, labels: list[str] | None = None) -> list[list[str]]

Extract clusters from MCL result matrix and map them to labels

PARAMETER DESCRIPTION
matrix

MCL result matrix

TYPE: csr_matrix

labels

List of labels corresponding to matrix indices. If None, the index labels '0', '1', '2', ... are used.

TYPE: list[str] | None DEFAULT: None

RETURNS DESCRIPTION
clusters

List of clusters, where each cluster is a list of labels

TYPE: list[list[str]]
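
In the usual interpretation of a converged MCL matrix, a row with a positive diagonal entry belongs to an attractor node, and the columns with positive entries in that row are the members of its cluster. The sketch below applies that reading to a dense list-of-lists matrix. It is an illustration under those assumptions, with a hypothetical helper name, and is not taken from the library's source.

```python
def extract_clusters_sketch(matrix, labels=None):
    """Read clusters from a converged MCL matrix given as a list of rows."""
    n = len(matrix)
    if labels is None:
        labels = [str(i) for i in range(n)]  # '0', '1', '2', ...
    clusters = []
    seen = set()
    for i in range(n):
        if matrix[i][i] <= 0:
            continue  # not an attractor row
        members = tuple(j for j in range(n) if matrix[i][j] > 0)
        if members in seen:
            continue  # several attractors can describe the same cluster
        seen.add(members)
        clusters.append([labels[j] for j in members])
    return clusters


# Converged matrix: nodes 0, 1, 2 attracted to node 0; nodes 3, 4 to node 3
converged = [
    [1.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
default_clusters = extract_clusters_sketch(converged)
named_clusters = extract_clusters_sketch(converged, ["a", "b", "c", "d", "e"])
print(default_clusters)  # [['0', '1', '2'], ['3', '4']]
print(named_clusters)    # [['a', 'b', 'c'], ['d', 'e']]
```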

random_edges

random_edges(
    node_size: int = 100,
    /,
    *,
    min_cluster_size: int = 1,
    max_cluster_size: int = 10,
    min_weight: float = 0.0,
    max_weight: float = 1.0,
    random_add_rate: float = 0,
    random_remove_rate: float = 0,
    seed: int | None = 0,
) -> list[tuple[str, str, float]]

Simple random edges generator for clustering test

Node names have the form {ClusterID}_{NodeID in Cluster}, e.g. 18_2

PARAMETER DESCRIPTION
node_size

Total number of nodes to generate

TYPE: int DEFAULT: 100

min_cluster_size

Minimum size of each cluster

TYPE: int DEFAULT: 1

max_cluster_size

Maximum size of each cluster

TYPE: int DEFAULT: 10

min_weight

Minimum weight for edges

TYPE: float DEFAULT: 0.0

max_weight

Maximum weight for edges

TYPE: float DEFAULT: 1.0

random_add_rate

Rate of randomly added edges, for generating a noisy dataset

TYPE: float DEFAULT: 0

random_remove_rate

Rate of randomly removed edges, for generating a noisy dataset

TYPE: float DEFAULT: 0

seed

Random seed for reproducibility

TYPE: int | None DEFAULT: 0

RETURNS DESCRIPTION
edges

List of (source, target, weight) edge tuples

TYPE: list[tuple[str, str, float]]
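
As a rough picture of what such a generator can look like, the sketch below builds fully connected clusters of random size and names nodes {ClusterID}_{NodeID in Cluster}. Everything beyond the naming pattern and the parameter names is an assumption: the zero-based IDs, the self-edge for single-node clusters, and the omission of random_add_rate and random_remove_rate are choices of this sketch, not documented behavior of random_edges.

```python
import random


def random_edges_sketch(
    node_size=100,
    min_cluster_size=1,
    max_cluster_size=10,
    min_weight=0.0,
    max_weight=1.0,
    seed=0,
):
    """Generate fully connected clusters with random sizes and weights."""
    rng = random.Random(seed)  # local generator keeps results reproducible
    edges = []
    remaining = node_size
    cluster_id = 0
    while remaining > 0:
        # Pick a cluster size, capped by the number of nodes still to place
        size = min(rng.randint(min_cluster_size, max_cluster_size), remaining)
        nodes = [f"{cluster_id}_{node_id}" for node_id in range(size)]
        if size == 1:
            # Self-edge so a single-node cluster still appears (assumption)
            weight = rng.uniform(min_weight, max_weight)
            edges.append((nodes[0], nodes[0], weight))
        for a in range(size):
            for b in range(a + 1, size):
                weight = rng.uniform(min_weight, max_weight)
                edges.append((nodes[a], nodes[b], weight))
        remaining -= size
        cluster_id += 1
    return edges


edges = random_edges_sketch(10, seed=0)
for source, target, weight in edges[:3]:
    print(source, target, round(weight, 3))
```

Passing the same seed twice yields the same edge list, which is useful when a clustering test needs a stable input.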

write_clusters

write_clusters(outfile: str | Path, clusters: list[list[str]]) -> None

Write clusters to file

PARAMETER DESCRIPTION
outfile

Output tab-delimited clusters file

TYPE: str | Path

clusters

List of clusters, each cluster is a list of node labels

TYPE: list[list[str]]
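
The exact file layout is not spelled out above. A common convention, and the one used by the mcl command's output, is one cluster per line with node labels separated by tabs. The sketch below writes that layout using only the standard library; the helper name and the one-cluster-per-line layout are assumptions, not confirmed behavior of write_clusters.

```python
import tempfile
from pathlib import Path


def write_clusters_sketch(outfile, clusters):
    """Write one cluster per line, with node labels separated by tabs."""
    with open(outfile, "w", encoding="utf-8") as f:
        for cluster in clusters:
            f.write("\t".join(cluster) + "\n")


clusters = [["A", "B", "C"], ["D", "E"]]
with tempfile.TemporaryDirectory() as tmpdir:
    outfile = Path(tmpdir) / "clusters.tsv"
    write_clusters_sketch(outfile, clusters)
    content = outfile.read_text(encoding="utf-8")

# Reading the file back recovers the original clusters
restored = [line.split("\t") for line in content.splitlines()]
```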

write_edges

write_edges(outfile: str | Path, edges: list[tuple[str, str, float]]) -> None

Write edges to file

PARAMETER DESCRIPTION
outfile

Output tab-delimited edges file

TYPE: str | Path

edges

List of (source, target, weight) edge tuples

TYPE: list[tuple[str, str, float]]
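
For reference, a tab-delimited edges file with three columns (source, target, weight) can be written and parsed back with the standard csv module. The sketch below shows such a round trip. The helper names are hypothetical, and the assumption that the file has no header row and exactly three columns comes from the (source, target, weight) description rather than from the library's source.

```python
import csv
import tempfile
from pathlib import Path


def write_edges_sketch(outfile, edges):
    """Write (source, target, weight) tuples as tab-delimited rows."""
    with open(outfile, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerows(edges)


def read_edges_sketch(infile):
    """Parse tab-delimited rows back into (source, target, weight) tuples."""
    with open(infile, newline="", encoding="utf-8") as f:
        return [(s, t, float(w)) for s, t, w in csv.reader(f, delimiter="\t")]


edges = [("A", "B", 1.0), ("B", "C", 0.5), ("D", "E", 0.8)]
with tempfile.TemporaryDirectory() as tmpdir:
    edges_file = Path(tmpdir) / "edges.tsv"
    write_edges_sketch(edges_file, edges)
    raw_text = edges_file.read_text(encoding="utf-8")
    round_trip = read_edges_sketch(edges_file)
```

The parsed list has the same shape as the in-memory list of tuples accepted by the edges parameter, so either form describes the same graph.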

easymclviz

easymclviz(
    edges: str | Path | list[tuple[str, str, float]],
    /,
    *,
    inflation: float = 2.0,
    max_iter: int = 100,
    quiet: bool = True,
    ax: Axes | None = None,
    node_size: int = 20,
    node_cmap: str = "gist_rainbow",
    node_alpha: float = 1.0,
    edge_width: float = 1.0,
    edge_color: str = "lightgray",
    show_label: bool = False,
    label_va: str = "bottom",
    font_size: int = 8,
) -> Figure

Run Markov Clustering and visualize clusters using networkx

easymclviz automates four steps: loading edges as a matrix, running MCL, extracting clusters, and visualizing them

PARAMETER DESCRIPTION
edges

Edges file with (source, target, weight) columns, or list of (source, target, weight) tuples

TYPE: str | Path | list[tuple[str, str, float]]

inflation

Inflation factor

TYPE: float DEFAULT: 2.0

max_iter

Max number of iterations

TYPE: int DEFAULT: 100

quiet

If True, suppress log output on screen

TYPE: bool DEFAULT: True

ax

Matplotlib axes. If None, auto created.

TYPE: Axes | None DEFAULT: None

node_size

Node plot size

TYPE: int DEFAULT: 20

node_cmap

Node colormap (e.g. gist_rainbow, jet, viridis, tab20)

TYPE: str DEFAULT: 'gist_rainbow'

node_alpha

Node color alpha parameter

TYPE: float DEFAULT: 1.0

edge_width

Edge line width

TYPE: float DEFAULT: 1.0

edge_color

Edge color

TYPE: str DEFAULT: 'lightgray'

show_label

If True, show node label

TYPE: bool DEFAULT: False

label_va

Node label vertical alignment (top|center|bottom|baseline|center_baseline)

TYPE: str DEFAULT: 'bottom'

font_size

Node label size

TYPE: int DEFAULT: 8

RETURNS DESCRIPTION
fig

Matplotlib figure

TYPE: Figure

Notes

Visualizing MCL clusters requires networkx and matplotlib to be installed in addition. For a better position layout, installing the extra packages pygraphviz, pydot, and lxml is recommended. See https://networkx.org/documentation/stable/install.html for details.

Examples:

>>> import pymarkovclustering as pymcl
>>> # easymclviz automates loading edges as a matrix, running MCL, extracting clusters, and visualization
>>> fig = pymcl.easymclviz("edges.tsv")
>>> # easymclviz is equivalent to the code below
>>> matrix, labels = pymcl.edges_to_sparse_matrix("edges.tsv")
>>> mcl_matrix = pymcl.mcl(matrix)
>>> clusters = pymcl.extract_clusters(mcl_matrix, labels)
>>> fig = pymcl.mclviz(matrix, labels, clusters)

mclviz

mclviz(
    matrix: ndarray | csr_matrix,
    labels: list[str],
    clusters: list[list[str]],
    /,
    *,
    ax: Axes | None = None,
    node_size: int = 20,
    node_cmap: str = "gist_rainbow",
    node_alpha: float = 1.0,
    edge_width: float = 1.0,
    edge_color: str = "lightgray",
    show_label: bool = False,
    label_va: str = "bottom",
    font_size: int = 8,
) -> Figure

Visualize Markov Clustering clusters using networkx

PARAMETER DESCRIPTION
matrix

Adjacency matrix used as MCL input

TYPE: ndarray | csr_matrix

labels

Matrix labels

TYPE: list[str]

clusters

MCL clusters

TYPE: list[list[str]]

ax

Matplotlib axes. If None, auto created.

TYPE: Axes | None DEFAULT: None

node_size

Node plot size

TYPE: int DEFAULT: 20

node_cmap

Node colormap (e.g. gist_rainbow, jet, viridis)

TYPE: str DEFAULT: 'gist_rainbow'

node_alpha

Node color alpha parameter

TYPE: float DEFAULT: 1.0

edge_width

Edge line width

TYPE: float DEFAULT: 1.0

edge_color

Edge color

TYPE: str DEFAULT: 'lightgray'

show_label

If True, show node label

TYPE: bool DEFAULT: False

label_va

Node label vertical alignment (top|center|bottom|baseline|center_baseline)

TYPE: str DEFAULT: 'bottom'

font_size

Node label size

TYPE: int DEFAULT: 8

RETURNS DESCRIPTION
fig

Matplotlib figure

TYPE: Figure

Notes

Visualizing MCL clusters requires networkx and matplotlib to be installed in addition. For a better position layout, installing the extra packages pygraphviz, pydot, and lxml is recommended. See https://networkx.org/documentation/stable/install.html for details.
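
When coloring nodes by cluster with a colormap such as the one named by node_cmap, each node needs a numeric value shared with the other members of its cluster. One possible way to derive such values from labels and clusters is sketched below. This is an assumption about a reasonable approach, with a hypothetical helper name, and does not describe how mclviz assigns colors internally.

```python
def cluster_color_values(labels, clusters):
    """Map each label to a value in [0, 1] shared by all nodes of its cluster."""
    n_clusters = len(clusters)
    node_to_value = {}
    for cluster_idx, cluster in enumerate(clusters):
        # Spread cluster indices evenly over [0, 1]; avoid division by zero
        value = cluster_idx / max(n_clusters - 1, 1)
        for node in cluster:
            node_to_value[node] = value
    # Return values in matrix order so they line up with the labels list
    return [node_to_value[label] for label in labels]


labels = ["A", "B", "C", "D", "E"]
clusters = [["A", "B", "C"], ["D", "E"]]
values = cluster_color_values(labels, clusters)
print(values)  # [0.0, 0.0, 0.0, 1.0, 1.0]
```

A list like this can be passed to a matplotlib colormap, which maps values in [0, 1] to colors, so that nodes in the same cluster receive the same color.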