RetroSearch Browse

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Showing content from https://github.com/jafarilab/NIMAA/tree/v0.1.0 below:

GitHub - jafarilab/NIMAA at v0.1.0

NIMAA

The goal of NIMAA is to use bipartite graphs for nominal data mining.

It can select a larger sub-matrix with no missing values in a matrix containing missing data, and then use the matrix to generate a bipartite graph and cluster on two projections. In addition, NIMAA can also impute the missing data, verify and score according to the previous clustering results obtained from the sub-matrix, and give suggestions on which imputation method is better.

You can install the released version of NIMAA from CRAN with:

install.packages("NIMAA")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("jafarilab/NIMAA")

library(NIMAA)
## load the beatAML data
beatAML_data <- NIMAA::beatAML

# plot the original data
beatAML_incidence_matrix <- plotInput(
  x = beatAML_data, # original data with 3 columns
  index_nominal = c(2,1), # the first two columns are nominal data
  index_numeric = 3,  # the third column inumeric data
  print_skim = FALSE, # if you want to check the skim output, set this as TRUE(Default)
  plot_weight = TRUE, # when plotting the figure, show the weights
  )
#> 
#> Na/missing values Proportion:     0.2603

beatAML dataset as incidence matrix

Plot the bipartite graph of the original data

graph <- plotBipartite(inc_mat = beatAML_incidence_matrix)

Extract the sub-matrices without missing data

extractSubMatrix() will extract the sub-matrices which have no missing value inside or with specific proportion of missing values inside (not for elements-max matrix), depends on the user’s input.

sub_matrices <- extractSubMatrix(
  beatAML_incidence_matrix,
  shape = c("Square", "Rectangular_element_max"), # the shapes you want to extract
  row.vars = "patient_id",
  col.vars = "inhibitor",
  plot_weight = T,
  )

Row-wise arrangement

Column-wise arrangement

Do clustering based on sub-matrices

findCluster() will perform optional pre-processing on the input incidence matrix, such as normalization. Then use the matrix to perform bipartite graph projection, and perform optional pre-processing in one of the specified parts, such as removing edges with lower weights, that is, weak edges.

cls <- findCluster(
  sub_matrices$Rectangular_element_max,
  dim = 1,
  method = "all", # clustering mehod
  normalization = TRUE, # normalize the input matrix
  rm_weak_edges = TRUE, # remove the weak edges in graph
  rm_method = 'delete', # removing method is deleting the edges
  threshold = 'median', # edges with weights under the median of all edges' weight are weak edges
  set_remaining_to_1 = TRUE, # set the weights of remaining edges to 1
  )

The imputeMissingValue() function can impute the missing values in the matrix, we only need to select which methods are needed. The result will be a list, each element is a matrix with no missing values.

it will perform a variety of numerical imputation according to the user’s input, and return all the data that does not contain any missing data, a list of matrices.

‘median’ will replace the missing values with the median of each rows (observations)
‘knn’ is the method in package
‘als’ and ‘svd’ are methods from package
‘CA’, ‘PCA’ and ‘FAMD’ are from package
others are from the famous package.

imputations <- imputeMissingValue(
  inc_mat = beatAML_incidence_matrix,
  method = c('svd','median','als','CA')
  )

RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4