Trimming samples or features whose prevalence is below a threshold

Trims samples or features in a profile by prevalence: samples or features whose prevalence does not pass the cutoff are discarded.

trim_prevalence(
    object,
    level = c(NULL, "Kingdom", "Phylum", "Class",
           "Order", "Family", "Genus",
           "Species", "Strain", "unique"),
    cutoff = 0.1,
    group = NULL,
    trim = c("none",
      "both",
      "feature", "feature_group",
      "sample"),
    at_least_one = FALSE)

Arguments

object

(Required). A matrix, otu_table, phyloseq::phyloseq or SummarizedExperiment object.

level

(Optional). Character. Taxonomic level at which to summarize the data before trimming: one of Kingdom, Phylum, Class, Order, Family, Genus, Species, Strain or unique (default: NULL, which uses the top-level rank of the phyloseq object).

cutoff

(Optional). Numeric. The prevalence threshold (default: 0.1).
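
To make the threshold concrete, here is a minimal base R sketch of prevalence-based feature filtering on a toy matrix. It assumes that prevalence means the fraction of samples in which a feature has a non-zero value, and that rows are features and columns are samples; the page does not spell out the exact definition, so this illustrates the idea and is not the package's implementation.

```r
# Toy abundance matrix (assumed layout: rows = features, columns = samples).
mat <- matrix(
  c(5, 0, 0, 0,
    3, 2, 0, 1,
    0, 0, 0, 0,
    7, 8, 9, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feat", 1:4), paste0("s", 1:4))
)

# Assumed definition: prevalence = fraction of samples with a non-zero value.
feature_prev <- rowMeans(mat > 0)

# Keep features whose prevalence is more than the cutoff.
cutoff <- 0.1
trimmed <- mat[feature_prev > cutoff, , drop = FALSE]
rownames(trimmed)
```

In this toy data feat3 is zero in every sample, so it is the only feature that fails a cutoff of 0.1.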

group

(Optional). Character. The group variable used to filter features or samples by group (default: NULL).

trim

(Optional). Character. The trimming to apply. The options are:

"none", returns the original data without any changes.
"both", keeps features and samples whose prevalence is more than cutoff.
"feature", keeps features whose prevalence is more than cutoff.
"feature_group", keeps features whose prevalence, computed within groups, is more than cutoff.
"sample", keeps samples whose prevalence is more than cutoff.
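
The ungrouped options can be illustrated on a plain matrix with a short base R sketch. The helper `trim_sketch()` is hypothetical and is not part of the package. It assumes that prevalence is the fraction of non-zero entries along a row (feature) or column (sample), and that "both" evaluates the two filters on the original matrix before subsetting; the package may order these steps differently.

```r
# Hypothetical helper, for illustration only (not the package's code).
trim_sketch <- function(mat, cutoff = 0.1,
                        trim = c("none", "both", "feature", "sample")) {
  trim <- match.arg(trim)
  feat_keep <- rowMeans(mat > 0) > cutoff  # feature prevalence across samples
  samp_keep <- colMeans(mat > 0) > cutoff  # sample prevalence across features
  switch(trim,
    none    = mat,
    feature = mat[feat_keep, , drop = FALSE],
    sample  = mat[, samp_keep, drop = FALSE],
    both    = mat[feat_keep, samp_keep, drop = FALSE]
  )
}

# Toy data: feature f2 and sample s2 are entirely zero.
mat <- matrix(
  c(1, 0, 1,
    0, 0, 0,
    2, 0, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("f1", "f2", "f3"), c("s1", "s2", "s3"))
)

dim(trim_sketch(mat, cutoff = 0.5, trim = "feature"))
dim(trim_sketch(mat, cutoff = 0.5, trim = "sample"))
dim(trim_sketch(mat, cutoff = 0.5, trim = "both"))
```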

at_least_one

(Optional). Logical. Whether it is enough for the prevalence in at least one group to meet the cutoff (FALSE means all groups must meet the cutoff; default: FALSE).
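
The difference between the two settings can be sketched in base R with group-wise prevalence on a toy matrix. The group labels and the definition of prevalence as the fraction of non-zero samples within each group are assumptions for illustration; this is not the package's implementation.

```r
# Toy data: four samples split into two groups (hypothetical labels).
mat <- matrix(
  c(1, 1, 0, 0,   # featA: present only in the "ctrl" samples
    1, 1, 1, 1,   # featB: present in every sample
    0, 0, 0, 0),  # featC: absent everywhere
  nrow = 3, byrow = TRUE,
  dimnames = list(c("featA", "featB", "featC"), paste0("s", 1:4))
)
group <- c("ctrl", "ctrl", "case", "case")

# Prevalence of each feature within each group (rows = features, cols = groups).
prev_by_group <- sapply(
  split(seq_len(ncol(mat)), group),
  function(idx) rowMeans(mat[, idx, drop = FALSE] > 0)
)

cutoff <- 0.1
pass <- prev_by_group > cutoff

keep_any <- apply(pass, 1, any)  # analogous to at_least_one = TRUE
keep_all <- apply(pass, 1, all)  # analogous to at_least_one = FALSE

names(which(keep_any))
names(which(keep_all))
```

Here featA passes the cutoff only in the "ctrl" group, so it survives under the at-least-one rule but is dropped when every group must pass.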

Value

A trimmed object that retains only the features or samples whose prevalence is more than cutoff.

Author

Created by Hua Zou (11/30/2021 Shenzhen China)

Examples

# \donttest{
# phyloseq object
data("Zeybel_2022_gut")
trim_prevalence(
  Zeybel_2022_gut,
  level = "Phylum",
  cutoff = 0.1,
  trim = "feature")
#> phyloseq-class experiment-level object
#> otu_table()   OTU Table:         [ 6 taxa and 42 samples ]
#> sample_data() Sample Data:       [ 42 samples by 46 sample variables ]
#> tax_table()   Taxonomy Table:    [ 6 taxa by 3 taxonomic ranks ]

# SummarizedExperiment object
data("Zeybel_2022_protein")
trim_prevalence(
  Zeybel_2022_protein,
  level = NULL,
  cutoff = 0.1,
  trim = "feature")
#> class: SummarizedExperiment 
#> dim: 72 54 
#> metadata(0):
#> assays(1): ''
#> rownames(72): IL8 VEGFA ... TNFB CSF_1
#> rowData names(3): ProteinID LOD prop
#> colnames(54): P101001 P101003 ... P101095 P101096
#> colData names(47): PatientID Gender ... Right_leg_fat_free_mass
#>   Right_leg_total_body_water
# }