Introduction
DIVAS (Data Integration via Analysis of Subspaces) is an R package for multi-modal data integration. Based on the DIVAS methodology proposed by Prothero et al. (2024), it provides tools for signal extraction, noise estimation, and joint structure identification from high-dimensional datasets, useful for multi-omics and other complex data types.
Documentation website: https://github.com/ByronSyun/DIVAS_Develop/tree/main
Repository Structure
.
├── pkg/ # R package source code
├── docs/ # Generated documentation website
├── man/ # Package manual and figures
├── papers/ # Related publications
└── sourceCode/ # Development code and examples
    ├── examples/ # Example R scripts
    └── matlab/ # Original MATLAB implementation
Installation
Dependencies
The DIVAS package requires a recent version of the CVXR package to ensure compatibility with underlying solvers such as SCS. We strongly recommend installing the latest stable version from CRAN.
# Install devtools (if not already installed)
install.packages("devtools")
# Install the latest version of CVXR from CRAN
# This is critical to avoid issues with solver status recognition (e.g., for SCS)
install.packages("CVXR")
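To confirm that the dependency is in place before installing DIVAS, a quick check along the following lines may help. This is a minimal sketch, not part of the DIVAS package: it only reports the installed CVXR version and whether CVXR lists SCS among its available solvers. The suggestion to install the `scs` package is an assumption about how SCS is typically supplied.

```r
# Report the installed CVXR version, if any.
if (requireNamespace("CVXR", quietly = TRUE)) {
  message("CVXR version: ", as.character(utils::packageVersion("CVXR")))

  # installed_solvers() lists the solvers CVXR can currently use.
  solvers <- CVXR::installed_solvers()
  if (!"SCS" %in% solvers) {
    # Assumption: SCS is provided by the CRAN package 'scs'.
    warning("SCS is not among the installed CVXR solvers; try install.packages('scs').")
  }
} else {
  message("CVXR is not installed; run install.packages('CVXR') first.")
}
```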
Installing the DIVAS package
You can install the development version of DIVAS from GitHub using devtools:
# Install DIVAS package from the main branch on GitHub
devtools::install_github("ByronSyun/DIVAS_Develop/pkg", ref = "main")
# Or install from a local folder if you have cloned the repository
# devtools::install("path/to/DIVAS-main/pkg")
Usage Examples
The DIVAS package supports analysis of various data formats. The example below uses a toy dataset stored in MATLAB format.
Example 1: Using MATLAB data
# Load necessary libraries (devtools is only needed for installation)
library(R.matlab) # provides readMat()
library(DIVAS)
# Construct the path to the data file within the package
data_path <- system.file("extdata", "toyDataThreeWay.mat", package = "DIVAS")
# Read MATLAB data
data <- readMat(data_path)
# Prepare data blocks
datablock <- list(
X1 = data$datablock[1,1][[1]][[1]],
X2 = data$datablock[1,2][[1]][[1]],
X3 = data$datablock[1,3][[1]][[1]]
)
# Run DIVAS main function
result <- DIVASmain(datablock)
# Visualize results
dataname <- paste0("DataBlock_", seq_along(datablock))
plots <- DJIVEAngleDiagnosticJP(datablock, dataname, result, 566, "Demo")
print(plots)
See the documentation website for more detailed examples and tutorials.
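The same list-of-matrices input can also be assembled directly in R, without a `.mat` file. The sketch below simulates three blocks and runs a few sanity checks before analysis. It assumes that each block is a numeric matrix with features in rows and the shared samples in columns; please confirm the expected orientation in `?DIVASmain`. The helper `check_datablock()` is hypothetical and not part of DIVAS.

```r
# Simulate three blocks that share the same 50 samples (columns).
# Assumption: blocks are features x samples; check ?DIVASmain to confirm.
set.seed(1)
n_samples <- 50
datablock <- list(
  X1 = matrix(rnorm(40 * n_samples), nrow = 40, ncol = n_samples),
  X2 = matrix(rnorm(30 * n_samples), nrow = 30, ncol = n_samples),
  X3 = matrix(rnorm(20 * n_samples), nrow = 20, ncol = n_samples)
)

# Hypothetical helper (not part of DIVAS): basic checks on the input list.
check_datablock <- function(blocks) {
  stopifnot(is.list(blocks), length(blocks) >= 2)
  is_num_matrix <- vapply(blocks, function(x) is.matrix(x) && is.numeric(x), logical(1))
  if (!all(is_num_matrix)) stop("every block must be a numeric matrix")
  n_cols <- vapply(blocks, ncol, integer(1))
  if (length(unique(n_cols)) != 1) stop("blocks must share the same number of columns")
  if (any(vapply(blocks, anyNA, logical(1)))) stop("blocks must not contain missing values")
  invisible(TRUE)
}

check_datablock(datablock)
# result <- DIVASmain(datablock) # uncomment once DIVAS is installed
```

Running the checks first gives a clearer error message than a failure deep inside the optimisation step.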
Available Datasets
Dataset | Brief Description | Vignette Link | Format | Primary Reference |
---|---|---|---|---|
toyDataThreeWay.mat | Synthetic 3-block data with known joint structures | Toy Dataset Example | .mat | Prothero et al. (2024) |
gnp_imputed.qs | GNP economic time series data | GNP Dataset Example | .qs | Stock & Watson (2016) |
covid_multi_omics.qs | Multi-omics COVID-19 patient data (plasma proteins, metabolites, PBMC transcriptomics) | Coming Soon | .qs | Su et al. (2020) |
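The `.qs` datasets in the table can be read with the `qs` package. The snippet below is a sketch under two assumptions: that the `.qs` files are installed under `extdata` in the same way as `toyDataThreeWay.mat`, and that the `qs` package is available. It only inspects the loaded object, since the internal layout of each file is not described here.

```r
# Assumption: the .qs files live under extdata, like the .mat file.
gnp_path <- system.file("extdata", "gnp_imputed.qs", package = "DIVAS")

if (nzchar(gnp_path) && requireNamespace("qs", quietly = TRUE)) {
  gnp <- qs::qread(gnp_path)
  # Inspect the structure before passing anything to DIVASmain().
  str(gnp, max.level = 1)
} else {
  message("Dataset or the 'qs' package not found; nothing loaded.")
}
```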
Case Study: COVID-19 Multi-Omics Analysis
This case study applies the DIVAS package to a multi-omics dataset from a COVID-19 patient cohort (Su et al., 2020). It demonstrates the full data processing and analysis workflow, from raw data cleaning to final DIVAS results.
Developers
Core Team
- Jiadong Mao - Lead Developer, Maintainer
- Yinuo Sun - Package Developer, Maintainer
References
Prothero, J., et al. (2024). Data integration via analysis of subspaces (DIVAS).
Su, Y., Chen, D., Yuan, D., et al. (2020). Multi-Omics Resolves a Sharp Disease-State Shift between Mild and Moderate COVID-19. Cell, 183(6), 1479-1495. https://doi.org/10.1016/j.cell.2020.10.037
Contributing
We welcome contributions to the DIVAS package. Please see our contributing guidelines for more information.