Rxivist indexes articles from bioRxiv, a free preprint server operated by Cold Spring Harbor Laboratory. This package is a client for the Rxivist API and can be used to access the metadata that Rxivist indexes.
To install the rxivistr package, run:
devtools::install_github("ikodvanj/rxivistr")
Load the package using the library() function.
The package contains the following functions:
rxivist_search - retrieves articles with the matching description.
article_details - retrieves data about a single paper and all of its authors.
article_downloads - retrieves monthly download statistics for articles.
authors_rank - retrieves the top 200 authors in the specified category.
author - provides information about the specified author.
category_list - retrieves a list of all categories.
rxivist_stats - retrieves basic statistics about the number of articles indexed by Rxivist.
In the following text, examples are provided for each function.
The following call retrieves the 5 most downloaded articles related to COVID-19:
res <- rxivist_search(search_phrase = "COVID-19", from = "alltime", sortby = "downloads", limit = 5)
dplyr::glimpse(res)
#> Rows: 5
#> Columns: 10
#> $ id <int> 81793, 77469, 76533, 78358, 83296
#> $ metric <int> 219628, 164766, 102134, 98929, 86680
#> $ title <chr> "Spike mutation pipeline reveals the emergence of a more…
#> $ url <chr> "https://api.rxivist.org/v1/papers/81793", "https://api.…
#> $ biorxiv_url <chr> "https://www.biorxiv.org/content/10.1101/2020.04.29.0690…
#> $ doi <chr> "10.1101/2020.04.29.069054", "10.1101/2020.03.22.002386"…
#> $ category <chr> "evolutionary-biology", "systems-biology", "microbiology…
#> $ first_posted <chr> "2020-04-30", "2020-03-22", "2020-03-12", "2020-03-31", …
#> $ abstract <chr> "We have developed an analysis pipeline to facilitate re…
#> $ authors <list> [<data.frame[17 x 2]>, <data.frame[100 x 2]>, <data.fra…
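The `authors` column in the output above is a list-column holding one data frame per article. A minimal sketch of how such a column could be unpacked with base R is shown here. It uses a small mock object instead of a live API call, and the column names `id` and `name` inside each author data frame are assumptions (the output only shows that each has 2 columns).

```r
# Mock of a search result with two articles. The real result comes from
# rxivist_search(); the `id` and `name` author columns are assumed.
mock_res <- data.frame(id = c(1L, 2L), stringsAsFactors = FALSE)
mock_res$authors <- list(
  data.frame(id = c(10L, 11L), name = c("Author A", "Author B"),
             stringsAsFactors = FALSE),
  data.frame(id = 12L, name = "Author C", stringsAsFactors = FALSE)
)

# Number of authors per article.
mock_res$n_authors <- vapply(mock_res$authors, nrow, integer(1))

# One row per (article, author) pair; the article id is recycled across
# the rows of each author data frame.
authors_long <- do.call(rbind, Map(
  function(article_id, a) {
    data.frame(article_id = article_id, a, stringsAsFactors = FALSE)
  },
  mock_res$id, mock_res$authors
))
```

The long format makes it straightforward to count or filter by author across all returned articles.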
At the time of writing this vignette, the most downloaded article had the id 72514. The following call retrieves information about this article:
res <- article_details(72514)
dplyr::glimpse(res)
#> List of 11
#> $ id : chr "72514"
#> $ doi : chr "10.1101/2020.01.30.927871"
#> $ first_posted: chr "2020-01-31"
#> $ biorxiv_url : chr "https://www.biorxiv.org/content/10.1101/2020.01.30.927871v2"
#> $ url : chr "https://api.rxivist.org/v1/papers/72514"
#> $ title : chr "Uncanny similarity of unique inserts in the 2019-nCoV spike protein to HIV-1 gp120 and Gag"
#> $ category : chr "evolutionary-biology"
#> $ abstract : chr "This paper has been withdrawn by its authors. They intend to revise it in response to comments received from th"| __truncated__
#> $ authors :'data.frame': 9 obs. of 4 variables:
#> ..$ id : int [1:9] 580441 580442 580443 580444 580445 582554 295517 580447 580448
#> ..$ name : chr [1:9] "Prashant Pradhan" "Ashutosh Kumar Pandey" "Akhilesh Mishra" "Parul Gupta" ...
#> ..$ institution: chr [1:9] "Indian Institute of Technology Delhi" "Indian Institute of Technology Delhi" "Indian Institute of Technology, New Delhi" "Indian Institute of Technology Delhi" ...
#> ..$ orcid : logi [1:9] NA NA NA NA NA NA ...
#> $ ranks :List of 4
#> ..$ alltime :List of 4
#> .. ..$ downloads: int 962296
#> .. ..$ rank : int 1
#> .. ..$ out_of : int 94912
#> .. ..$ tie : logi FALSE
#> ..$ ytd :List of 4
#> .. ..$ downloads: int 962296
#> .. ..$ rank : int 1
#> .. ..$ out_of : int 94912
#> .. ..$ tie : logi FALSE
#> ..$ lastmonth:List of 4
#> .. ..$ downloads: int 14987
#> .. ..$ rank : int 3
#> .. ..$ out_of : int 94912
#> .. ..$ tie : logi FALSE
#> ..$ category :List of 4
#> .. ..$ downloads: int 962296
#> .. ..$ rank : int 1
#> .. ..$ out_of : int 5736
#> .. ..$ tie : logi FALSE
#> $ publication : Named list()
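The nested `ranks` element can be awkward to read as a list. As a sketch, it could be flattened into one row per ranking window with base R. The mock list below copies the values from the output above so that the snippet runs without an API call; the variable names `ranks` and `rank_table` are only illustrative.

```r
# Mock of the `ranks` element printed above (values copied from the output).
ranks <- list(
  alltime   = list(downloads = 962296L, rank = 1L, out_of = 94912L, tie = FALSE),
  ytd       = list(downloads = 962296L, rank = 1L, out_of = 94912L, tie = FALSE),
  lastmonth = list(downloads = 14987L,  rank = 3L, out_of = 94912L, tie = FALSE),
  category  = list(downloads = 962296L, rank = 1L, out_of = 5736L,  tie = FALSE)
)

# Build one single-row data frame per ranking window and stack them.
rank_table <- do.call(rbind, lapply(names(ranks), function(window) {
  data.frame(window = window, ranks[[window]], stringsAsFactors = FALSE)
}))

rank_table
```

With a real result, `ranks` would be replaced by `res$ranks` from `article_details()`.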
The article_downloads function can be used to investigate the number of downloads.
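A minimal sketch of such a call follows. It assumes that `article_downloads()` accepts an article id as its first argument in the same way `article_details()` does; that signature is an assumption and should be checked against `?article_downloads`. The call is guarded so the snippet still runs when the package is not installed or the API cannot be reached.

```r
downloads <- NULL

# Assumption: article_downloads() takes the article id as its first argument,
# mirroring article_details(72514) used earlier.
if (requireNamespace("rxivistr", quietly = TRUE)) {
  downloads <- tryCatch(
    rxivistr::article_downloads(72514),
    error = function(e) NULL  # e.g. no network access or API unavailable
  )
}

# Inspect the structure only if something was returned.
if (!is.null(downloads)) str(downloads)
```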
The category_list function returns a list of all categories into which articles are classified:
category_list()
#> $results
#> [1] "animal-behavior-and-cognition"
#> [2] "biochemistry"
#> [3] "bioengineering"
#> [4] "bioinformatics"
#> [5] "biophysics"
#> [6] "cancer-biology"
#> [7] "cell-biology"
#> [8] "clinical-trials"
#> [9] "developmental-biology"
#> [10] "ecology"
#> [11] "epidemiology"
#> [12] "evolutionary-biology"
#> [13] "genetics"
#> [14] "genomics"
#> [15] "immunology"
#> [16] "microbiology"
#> [17] "molecular-biology"
#> [18] "neuroscience"
#> [19] "paleontology"
#> [20] "pathology"
#> [21] "pharmacology-and-toxicology"
#> [22] "physiology"
#> [23] "plant-biology"
#> [24] "scientific-communication-and-education"
#> [25] "synthetic-biology"
#> [26] "systems-biology"
#> [27] "zoology"
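Because other functions take a category name, the `$results` vector can serve as a lookup for validating input before a request is made. The sketch below hard-codes a few of the categories printed above so that it runs offline; `check_category()` is a hypothetical helper, not part of the package.

```r
# Subset of the `$results` vector printed above; in practice this would
# come from category_list().
categories <- list(results = c("bioinformatics", "genomics", "neuroscience"))

# Hypothetical helper: returns the category unchanged, or stops with a
# message if it is not among the known categories.
check_category <- function(category, known = categories$results) {
  if (!category %in% known) {
    stop("Unknown category: ", category, call. = FALSE)
  }
  category
}

check_category("neuroscience")
```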
The rxivist_stats function returns information about the number of articles indexed by Rxivist:
res <- rxivist_stats()
dplyr::glimpse(res)
#> List of 8
#> $ papers_indexed : int 94912
#> $ authors_indexed : int 404161
#> $ missing_abstract : int 1
#> $ missing_date : int 0
#> $ outdated_count :List of 28
#> ..$ animal-behavior-and-cognition : int 1486
#> ..$ biochemistry : int 3211
#> ..$ bioengineering : int 2155
#> ..$ bioinformatics : int 8837
#> ..$ biophysics : int 4144
#> ..$ cancer-biology : int 3367
#> ..$ cell-biology : int 4894
#> ..$ clinical-trials : int 99
#> ..$ developmental-biology : int 2812
#> ..$ ecology : int 4171
#> ..$ epidemiology : int 1556
#> ..$ evolutionary-biology : int 5736
#> ..$ genetics : int 4824
#> ..$ genomics : int 5954
#> ..$ immunology : int 2857
#> ..$ microbiology : int 8320
#> ..$ molecular-biology : int 3265
#> ..$ neuroscience : int 16862
#> ..$ paleontology : int 127
#> ..$ pathology : int 550
#> ..$ pharmacology-and-toxicology : int 909
#> ..$ physiology : int 1326
#> ..$ plant-biology : int 2941
#> ..$ scientific-communication-and-education: int 645
#> ..$ synthetic-biology : int 885
#> ..$ systems-biology : int 2425
#> ..$ zoology : int 494
#> ..$ null : int 50
#> $ missing_authors : int 63
#> $ missing_category : int 50
#> $ authors_no_papers: int 23
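The per-category counts in `outdated_count` are returned as a named list, which is easier to compare once converted to a sorted vector. The sketch below uses a handful of values copied from the output above rather than a live call; the variable names are only illustrative.

```r
# A few entries of `outdated_count`, copied from the output above.
outdated_count <- list(
  "bioinformatics"       = 8837L,
  "clinical-trials"      = 99L,
  "evolutionary-biology" = 5736L,
  "genomics"             = 5954L,
  "microbiology"         = 8320L,
  "neuroscience"         = 16862L
)

# Flatten the list to a named integer vector and sort it, largest first.
counts <- sort(unlist(outdated_count), decreasing = TRUE)

# The three categories with the highest counts in this subset.
top3 <- head(counts, 3)
top3
```

With a real result, `outdated_count` would be replaced by `res$outdated_count` from `rxivist_stats()`.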