This function implements a robust procedure for obtaining a list of genes associated with a given clinical variable. It is based on the elastic net algorithm; a bootstrap strategy combined with ensemble methods improves the robustness and reproducibility of the selected gene subset.
genePheno(
seData,
DEgenes,
vectorGroups,
vectorSampleID,
iter = 100,
numberOfFolds = 5
)
- seData
: SummarizedExperiment object with the normalized expression data and the phenotypic data in colData.
- DEgenes
: Vector containing the genes to be used. Expected to be in the same format as the row names of assay(seData). Usually this vector is the result of running prefilterSAM().
- vectorGroups
: Clinical or phenotypic variable to be tested. It must be provided as a numeric binary vector.
- vectorSampleID
: Vector containing the sample names in the same order as in assay(seData).
- iter
: Number of bootstrap iterations (default: 100; reduce it if the function takes too long to execute).
- numberOfFolds
: Number of folds used in the nested cross-validation (default: 5).
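The requirements on vectorGroups and vectorSampleID can be checked before calling the function. The following is a minimal sketch in base R; the labels, the sample IDs and the 0/1 coding are illustrative assumptions, not values taken from the package or its data.

```r
# Hypothetical phenotype labels and sample IDs (not taken from seBRCA).
status <- factor(c("Positive", "Negative", "Positive", "Negative"))
sampleIDs <- c("S1", "S2", "S3", "S4")

# as.numeric() applied to a factor returns the level codes (1, 2), not 0/1,
# so the recoding is written out explicitly here (0/1 coding is an assumption).
vectorGroups <- as.numeric(status == "Positive")

# Basic sanity checks on the two vectors.
stopifnot(
  is.numeric(vectorGroups),
  all(vectorGroups %in% c(0, 1)),
  length(unique(vectorGroups)) == 2,
  length(vectorGroups) == length(sampleIDs),
  !anyDuplicated(sampleIDs)
)
```

Making the recoding explicit avoids relying on how a factor or character column happens to be converted by as.numeric().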
A list containing the following elements:
- genes
: A list of genes ranked according to the degree of
association with the clinical or phenotypic variable tested.
- listCoeff
: A list with the beta regression coefficients and the
AUC score for each bootstrap iteration.
- stability
: Gene selection probability estimated by bootstrap
(the fraction of the bootstrap iterations in which the gene was selected).
- betasMedian
: Median of the beta coefficients over the B replicates.
- betasMean
: Mean of the beta coefficients over the B replicates.
- betasTable
: Table of genes ordered by decreasing value of the
stability coefficient. Contains several metrics: the stability index,
the mean and the median of the beta coefficients.
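To illustrate how the summary elements above relate to each other, the sketch below aggregates a small, made-up matrix of per-iteration coefficients in base R. The matrix, the gene names and the choice to include zero coefficients in the mean and median are assumptions made for illustration; the package may compute these quantities differently.

```r
# Hypothetical coefficients: rows = bootstrap iterations, columns = genes.
# A zero means the gene was not selected in that iteration.
betas <- matrix(
  c(0.8, 0.0, -0.3,
    0.6, 0.0,  0.0,
    0.9, 0.1, -0.2,
    0.7, 0.0, -0.4),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("geneA", "geneB", "geneC"))
)

# Selection probability: fraction of iterations with a non-zero coefficient.
stability <- colMeans(betas != 0)

# Median and mean of the coefficients across iterations (zeros included).
betasMedian <- apply(betas, 2, median)
betasMean <- colMeans(betas)

# Table ordered by decreasing stability.
betasTable <- data.frame(stability, betasMedian, betasMean)
betasTable <- betasTable[order(betasTable$stability, decreasing = TRUE), ]
print(betasTable)
```

In this toy example geneA is selected in every iteration, so it is ranked first, followed by geneC and geneB.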
This function implements a robust version of the elastic net algorithm proposed by Tibshirani (Tibshirani et al., 2009). The algorithm adds a penalty term to avoid overfitting; the penalty is a convex combination of the L2 norm (ridge regression) and the L1 norm (Lasso regression). When the alpha parameter is 1, the penalty reduces to the Lasso penalty and favors solutions with few non-zero coefficients. If a subset of features is highly correlated, Lasso tends to keep only one of them, chosen more or less arbitrarily. To avoid this extreme behavior, alpha is set to 0.75, which retains more relevant variables than Lasso and improves prediction accuracy. This choice also helps to improve the stability and reduce the variance of the feature selection process.

To improve the robustness and reproducibility of the discovered gene signature, a bootstrap strategy is implemented. The patients are resampled with replacement, giving rise to B replicates. For each replicate, a gene signature is obtained using double nested cross-validation to avoid overfitting. The final gene list is built as an ensemble of lists, considering several metrics that evaluate the stability, the robustness and the predictive power of each gene. See (Martinez-Romero et al., 2018) for more details.
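For reference, the elastic net penalty can be written as follows. This uses the parameterization of the glmnet package, which is an assumption here, since the text does not state which implementation or scaling is used internally; \beta denotes the vector of regression coefficients and \lambda the overall regularization strength.

```latex
P_{\alpha}(\beta) = \lambda \left[ \frac{1 - \alpha}{2} \, \lVert \beta \rVert_2^2
  + \alpha \, \lVert \beta \rVert_1 \right], \qquad 0 \le \alpha \le 1
```

Under this convention, \alpha = 1 gives the Lasso penalty, \alpha = 0 gives the ridge penalty, and \alpha = 0.75 weights the L1 term by 0.75 and the squared L2 term by 0.125.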
martinezromero2018asuri BuenoFortes2023asuri
data(seBRCA)
# prefilterSAM ---
groupsVector <- SummarizedExperiment::colData(seBRCA)$ER.IHC
set.seed(5)
DE_list_genes <- prefilterSAM(seBRCA, groupsVector)
#> 2025-07-08 12:14:52.171525
#> Warning: The spline based estimation of pi0 results in a non-positive value of pi0.
#> Therefore, pi0 is estimated by using lambda = 0.5.
#> 2025-07-08 12:16:20.721493
# genePheno ---
vectorSampleID <- rownames(SummarizedExperiment::colData(seBRCA))
vectorGroups <- SummarizedExperiment::colData(seBRCA)$ER.IHC |> as.numeric()
Pred_ER.IHC <- genePheno(seBRCA, DE_list_genes, vectorGroups, vectorSampleID)
# Pred_ER.IHC is the output object containing the genes that show a
# significant association with the clinical variable. Because a bootstrap is
# performed, the fraction of iterations in which each gene is found
# significant is reported as *stability* (on a 0-1 scale, where 1 = 100%),
# and the *beta values* of the regression across iterations are summarized
# as *betasMedian* and *betasMean*:
names(Pred_ER.IHC)
#> [1] "genes" "listCoeff" "stability" "betasMedian" "betasMean"
#> [6] "betasTable"