Classification and Feature Selection for microRNA/mRNA Interactions



Documentation for package ‘feamiR’ version 0.1.0

Help Pages

decisiontree Decision tree. Trains a decision tree on the given training dataset and uses it to predict the classification of the test dataset. The resulting accuracy, sensitivity and specificity are returned, as well as a tree summary.
dtreevoting Decision tree voting scheme. Implements a feature selection approach based on decision trees, using a voting scheme across the top levels of trees trained on multiple subsamples.
eGA Embryonic Genetic Algorithm. Feature selection based on embryonic genetic algorithms. Performs feature selection by maintaining an ongoing set of 'good' features, which is improved run by run. It outputs training and test accuracy, sensitivity and specificity, and a list of <=k features.
feamiR feamiR: Classification and feature selection for microRNA/mRNA interactions
forwardfeatureselection Forward Feature Selection. Performs forward feature selection on the given list of features, ordering them by discriminative power using a given model on the given dataset, up to the accuracy plateau.
geneticalgorithm Standard Genetic Algorithm. Implements a standard genetic algorithm using the GA package (ga) with a fitness function specialised for feature selection.
preparedataset Dataset preparation. Performs all the preparation necessary for feamiR analysis, taking a set of mRNAs, a set of miRNAs and an interaction dataset, and creating corresponding positive and negative datasets for ML modelling.
randomforest Random Forest. Trains a random forest on the training dataset and uses it to predict the classification of the test dataset. The resulting accuracy, sensitivity and specificity are returned, as well as a summary of the importance of features in the dataset.
rfgini Random Forest cumulative MeanDecreaseGini feature selection. Implements a feature selection approach based on cumulative MeanDecreaseGini, using random forests trained on multiple subsamples.
runallmodels Run all models. Trains and tests Decision Tree, Random Forest and SVM models on 100 subsamples and provides a summary of the results, to select the best model. The number of trees and kernel chosen by selectsvmkernel and selectrfnumtrees should be used for SVM and Random Forest respectively. We can use this function to inform feature selection, using a Decision Tree voting scheme and a Random Forest measure based on the Gini index.
selectrfnumtrees Tuning the number-of-trees hyperparameter. Trains random forests with a range of numbers of trees, using cross-validation, so the optimal number can be identified from the resulting plot.
selectsvmkernel Tuning SVM kernel. Trains SVMs with a range of kernels (linear, polynomial of degree 2, 3 and 4, radial and sigmoid) using cross-validation, so the optimal kernel can be chosen from the resulting plots. If showplots=FALSE is specified, the plots are saved as JPEGs.
svm SVM
svmlinear Linear SVM. Implements a linear SVM using the general svm function (for ease of use in feature selection).
svmpolynomial2 Polynomial degree 2 SVM. Implements a polynomial degree 2 SVM using the general svm function (for ease of use in feature selection).
svmpolynomial3 Polynomial degree 3 SVM. Implements a polynomial degree 3 SVM using the general svm function (for ease of use in feature selection).
svmpolynomial4 Polynomial degree 4 SVM. Implements a polynomial degree 4 SVM using the general svm function (for ease of use in feature selection).
svmradial Radial SVM. Implements a radial SVM using the general svm function (for ease of use in feature selection).
svmsigmoid Sigmoid SVM. Implements a sigmoid SVM using the general svm function (for ease of use in feature selection).
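The entries above form a natural pipeline: prepare the datasets, tune hyperparameters, then train and compare models. A minimal R sketch of that workflow follows; the argument names and file paths are illustrative assumptions only, not the package's exact signatures, so consult each help page (e.g. ?preparedataset, ?runallmodels) for the real interfaces.

```r
# Hedged workflow sketch for feamiR. Function names come from the index above;
# argument names and file paths below are placeholders, NOT verified signatures.
library(feamiR)

# 1. Prepare positive and negative datasets from mRNA, miRNA and interaction
#    files (paths are hypothetical; see ?preparedataset for the real arguments).
# preparedataset(mRNA = "mrna.fasta", miRNA = "mirna.fasta",
#                interactions = "interactions.csv")

# Assume 'data' is a data.frame with a binary classification column plus
# feature columns, as produced by the preparation step.
data <- read.csv("feamiR_dataset.csv")

# 2. Tune hyperparameters with cross-validation, inspecting the resulting
#    plots: SVM kernel choice and random forest number of trees.
# selectsvmkernel(data)
# selectrfnumtrees(data)

# 3. Train and compare Decision Tree, Random Forest and SVM on 100 subsamples;
#    the summary also informs feature selection via the decision-tree voting
#    scheme (dtreevoting) and cumulative MeanDecreaseGini (rfgini).
# runallmodels(data)
```

The tuning step is deliberately placed before runallmodels, since the package notes that the kernel and number of trees chosen by selectsvmkernel and selectrfnumtrees should be used for the SVM and random forest models respectively.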