To compare the networks of communities, pairwise distances between adjacency matrices, which contain all connection information, are calculated. By subtracting the adjacency matrix (A) from the degree matrix (D), the Laplacian matrix is obtained, and its eigenvectors and eigenvalues are calculated. The spectral distance is then defined as the Euclidean distance between the first k eigenvalues. Alternatively, the Jaccard distance between matrices is implemented as the sum of the matrix contrast divided by the sum of the larger absolute values between the two adjacency matrices.
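The two distances described above can be sketched in base R as follows. This is an illustration only, not the package's internal code: the helper names spectral_dist() and jaccard_dist() are hypothetical, "first k eigenvalues" is assumed to mean the k largest, the degree is assumed to be the row sum of the adjacency matrix, and "matrix contrast" is assumed to mean the element-wise absolute difference.

```r
# Hypothetical helpers for illustration; assumes symmetric adjacency matrices.

# Spectral distance: Euclidean distance between the first k Laplacian eigenvalues.
spectral_dist <- function(a1, a2, k = 3) {
  lap_eigen <- function(a) {
    d <- diag(rowSums(a))  # degree matrix D (assumed: row sums of A)
    l <- d - a             # Laplacian L = D - A
    # for symmetric input, eigen() returns eigenvalues in decreasing order,
    # so this keeps the k largest (an assumption about "first k")
    eigen(l, symmetric = TRUE, only.values = TRUE)$values[seq_len(k)]
  }
  sqrt(sum((lap_eigen(a1) - lap_eigen(a2))^2))
}

# Jaccard distance: sum of the contrast over the sum of the larger absolute values.
jaccard_dist <- function(a1, a2) {
  contrast <- abs(a1 - a2)            # assumed meaning of "matrix contrast"
  larger <- pmax(abs(a1), abs(a2))    # element-wise larger absolute value
  sum(contrast) / sum(larger)
}

# Toy networks: a 3-node path and a 3-node triangle.
path <- matrix(c(0, 1, 0,
                 1, 0, 1,
                 0, 1, 0), nrow = 3, byrow = TRUE)
tri <- matrix(c(0, 1, 1,
                1, 0, 1,
                1, 1, 0), nrow = 3, byrow = TRUE)

spectral_dist(path, tri, k = 3)
jaccard_dist(path, tri)
```

Both functions return 0 for identical matrices, and the Jaccard variant stays between 0 and 1 because each contrast entry can never exceed twice the larger absolute value only when signs differ; for non-negative weights it is bounded by 1.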
Bootstrap-permutation based network construction
To test the significance of the distances between matrices, a bootstrap-permutation based method was developed. By subsampling and bootstrapping, true correlation adjacency matrices are constructed from subsets of the original data. The metadata of the samples is then randomly swapped to create permuted datasets, from which the pseudo correlation coefficients are calculated. By comparing the true adjacency matrices with the pseudo ones, the significance of the distance is obtained.
# compare the networks from different compartments
maize <- fit_tabs(maize)
maize <- bs_pm(maize, group = "Compartment")
# only get the distance, no significance test
maize <- bs_pm(maize, group = "Compartment", sig = FALSE)
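As a rough illustration of the idea behind the procedure, the base R sketch below builds one "true" and one "pseudo" pair of correlation adjacency matrices from toy data. It is not the implementation of bs_pm(): the helper sub_adj(), the toy objects quant and group, the use of Pearson correlation via cor(), and sampling without replacement are all assumptions made for the example.

```r
set.seed(1)

# Toy data: 6 compositions (rows) x 40 samples (columns), two groups of 20.
quant <- matrix(rnorm(6 * 40), nrow = 6)
group <- rep(c("g1", "g2"), each = 20)

# Hypothetical helper: correlation adjacency matrix from s_size samples
# drawn (without replacement) among the samples carrying the target label.
sub_adj <- function(quant, labels, target, s_size) {
  idx <- sample(which(labels == target), s_size)
  cor(t(quant[, idx]))  # compositions x compositions
}

s_size <- 10

# "True" networks: subsample within the real groups.
true_g1 <- sub_adj(quant, group, "g1", s_size)
true_g2 <- sub_adj(quant, group, "g2", s_size)

# "Pseudo" networks: shuffle the group labels first, then subsample.
perm <- sample(group)
pseudo_g1 <- sub_adj(quant, perm, "g1", s_size)
pseudo_g2 <- sub_adj(quant, perm, "g2", s_size)
```

Repeating the subsampling step gives a set of true matrices, and repeating the label shuffle gives a set of pseudo matrices; the distances between the two sets are what the significance test later compares.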
When the number of compositions is large, the bootstrap-permutation can take a very long time, so pre-filtering is needed. g_size is the minimum number of samples for the groups defined by group. Conditions with fewer than g_size samples are removed from later analysis; this is set to 88 by default. s_size is the sub-sampling size for bootstrap and permutation, 30 by default. s_size must be smaller than g_size, and preferably smaller than half of it. Compositions that appear in less than a given percentage of samples can also be filtered by setting the occupancy threshold per and rm. By default, compositions present in less than 10% of samples are filtered. When the quantitative matrix is too big, one can choose to output the bootstrap and permutation results separately for each comparison.
# set the minimum group size to remove conditions with fewer samples;
# a larger s_size gives more stable results but consumes more
# computation and time
maize <- bs_pm(maize, group = "Compartment", g_size = 200, s_size = 80)
# remove the compositions that appear in less than 20% of samples
maize <- bs_pm(maize, group = "Compartment", per = 0.2)
# set the number of bootstraps and permutations; the more bootstraps
# and permutations, the more reliable the significance, at the cost of
# more computation and time
maize <- bs_pm(maize, group = "Compartment", bs = 11, pm = 11)
# output the comparison separately to the defined directory
bs_pm(maize, group = "Compartment", bs = 6, pm = 6,
      individual = TRUE, out_dir = out_dir)
Network distance calculation and significance test
After obtaining the true and pseudo adjacency matrices, the spectral and Jaccard distances defined above are calculated, and the p value is obtained by comparing F (the real distance) with Fp (the pseudo distance) following the formula: \(p = \frac{C_{F_p > F} + 1}{N_{dis} + 1}\), where \(C_{F_p > F}\) is the number of comparisons in which Fp is larger than F and \(N_{dis}\) is the total number of comparisons. For network comparison results that were generated individually, the distance calculation is implemented by the function net_dis_indi(). The same methods are available.
# check the available methods
? net_dis_method_list
# calculate the distances between matrices
maize <- net_dis(maize, method = "spectra")
maize <- net_dis(maize, method = "Jaccard")
# check the distance results and significance (if applicable)
dis_stat(maize)
# the comparisons stored separately in the previous step
ja <- net_dis_indi(out_dir, method = "Jaccard")
dis_stat(ja)
spectra <- net_dis_indi(out_dir, method = "spectra")
dis_stat(spectra)
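The p value formula given in this section can be worked through with a small base R sketch. The helper perm_p() and the toy distance vectors are hypothetical, and reading N_dis as the number of pairwise comparisons between real and pseudo distances is an assumption; the package may count comparisons differently.

```r
# Hypothetical helper: p = (C_{Fp > F} + 1) / (N_dis + 1)
perm_p <- function(f, fp) {
  # compare every pseudo distance with every real distance
  cmp <- outer(fp, f, ">")
  # count of Fp > F, plus one, over the number of comparisons, plus one
  (sum(cmp) + 1) / (length(cmp) + 1)
}

f  <- c(0.80, 0.90)        # toy real distances (F)
fp <- c(0.10, 0.85, 0.95)  # toy pseudo distances (Fp)

perm_p(f, fp)  # (3 + 1) / (6 + 1)
```

Adding one to both the numerator and the denominator keeps the p value above zero even when no pseudo distance exceeds a real one.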