BiocCheck
Summary
library(BiocCheck)
BiocCheck encapsulates Bioconductor package guidelines and best practices, analyzing packages and reporting three categories of issues: ERROR, WARNING, and NOTE. (BiocCheck will continue past an ERROR, so it is possible to have more than one, but it will exit with an error code if run from the OS command line.)
Using BiocCheck
BiocCheck is meant to run within R on a directory containing an R package, or a source tarball (.tar.gz file):
BiocCheck("<packageDirOrTarball>")
BiocCheck takes options which can be seen with ?BiocCheck.
Note that the --new-package option is turned on in the Single Package Builder (SPB) during the new package submission process.
When should BiocCheck be run
BiocCheck should always be run after R CMD check.
Note that BiocCheck is not a replacement for R CMD check; it is complementary. It should be run after R CMD check completes successfully.
BiocCheck can also be run via GitHub Actions, a continuous integration system on GitHub. This service allows automatic testing of R packages in a controlled build environment.
See the biocthis package for more details.
Installing BiocCheck
BiocCheck should be installed as follows:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("BiocCheck")
Interpreting BiocCheck output
Actual BiocCheck output is shown below in bold.
Checking for deprecated package usage…
Can be disabled with --no-check-deprecated.
At present, this looks to see whether your package has a dependency on the multicore package (ERROR).
Our recommendation is to use BiocParallel. Note that "fork" clusters do not provide any gain from parallelizing code on Windows. Socket clusters work on all operating systems.
Also checks Deprecated Packages currently specified in release and devel versions of Bioconductor (ERROR).
Checking for remote package usage…
Can be disabled with --no-check-remotes.
Bioconductor only allows dependencies that are hosted on CRAN or Bioconductor. The use of Remotes: in the DESCRIPTION to specify a unique remote location is not allowed.
Checking for "LazyData: true" usage…
For packages that include data, we recommend not including LazyData: TRUE. This rarely proves to be a good thing. In our experience it only slows down the loading of packages with large data (NOTE).
Can be disabled with --no-check-version-num and --no-check-R-ver.
Checking version number…
Checks that the package version of the tarball (if you are checking a tarball) matches the Version: field in your DESCRIPTION file. If it doesn't, it usually means you did not build the tarball with R CMD build (ERROR).
Checks for a 99 "y" version in the x.y.z versioning scheme (ERROR). Package versions starting with a non-zero value will get flagged with a warning. Typical new package submissions start with a zero "x" version (e.g., 0.99.*; WARNING). This is only done if the --new-package option is supplied. A nonzero "x" will only be accepted if the package was pre-released or published under such a case.
[...] (ERROR).
If you specify an R version in the Depends: field of your DESCRIPTION file, BiocCheck checks to make sure that the R version specified matches the version currently used in Bioconductor. This helps to prevent mixing of Bioconductor release and devel versions (esp. when R versions differ), which is a frequent source of confusion and errors (NOTE).
For more information on package versions, see the Version Numbering HOWTO.
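The x.y.z scheme described above can be inspected from R. The following is a minimal sketch using base R only; describe_version is a hypothetical helper written for illustration, not a BiocCheck function, and BiocCheck's own checks may be implemented differently.

```r
# Hypothetical helper (not part of BiocCheck): split an x.y.z version
# string into its components and report the properties discussed above.
describe_version <- function(version) {
    parts <- unlist(package_version(version))
    stopifnot(length(parts) == 3L)
    list(
        x = parts[[1]],
        y = parts[[2]],
        z = parts[[3]],
        zero_x = parts[[1]] == 0L,   # typical for new submissions
        y_is_99 = parts[[2]] == 99L  # e.g., 0.99.*
    )
}

new_pkg <- describe_version("0.99.3")
old_pkg <- describe_version("1.2.0")
```

Here new_pkg describes a version in the typical new-submission form, while old_pkg has a nonzero "x".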
Package and File Size Check
Can be disabled with --no-check-pkg-size and --no-check-file-size.
Checking package size
Checks that the package size meets Bioconductor requirements. The current package size limit is 5 MB for Software packages. Experiment Data and Annotation packages are excluded from this check. This check is only run if checking a source tarball (ERROR).
Checking individual file sizes
The current size limit for all individual files is 5 MB. Checks inspect both package-wide files and data files found in the data, inst/extdata, and data-raw folders (WARNING).
It may be necessary to remove large files from your Git history; see Remove Large Data Files and Clean Git Tree.
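To look for oversized files before running the check, something like the following sketch may help. find_large_files is a hypothetical helper, not part of BiocCheck, and it assumes that 5 MB means 5 * 1024^2 bytes; BiocCheck may count sizes differently.

```r
# Hypothetical helper (not part of BiocCheck): list files under `dir`
# that are larger than `limit` bytes.
# Assumption: 5 MB is taken here as 5 * 1024^2 bytes.
find_large_files <- function(dir, limit = 5 * 1024^2) {
    files <- list.files(dir, recursive = TRUE, full.names = TRUE)
    sizes <- file.size(files)
    files[!is.na(sizes) & sizes > limit]
}

# Demonstration on a temporary directory holding one small file
demo_dir <- tempfile("pkg")
dir.create(file.path(demo_dir, "data"), recursive = TRUE)
writeLines("small", file.path(demo_dir, "data", "tiny.txt"))

over_limit <- find_large_files(demo_dir)                  # nothing over 5 MB
over_tiny_limit <- find_large_files(demo_dir, limit = 1)  # artificially low limit
```

On a real package, calling find_large_files("path/to/YourPackage") would list candidate files to shrink or move elsewhere.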
biocViews Checks
These can be disabled with the --no-check-bioc-views option, which might be useful when checking non-Bioconductor packages (since biocViews is a concept unique to Bioconductor).
Checking biocViews…
Can be disabled with --no-check-bioc-views.
Checks that a biocViews field is present in the DESCRIPTION file (ERROR).
[...] (ERROR).
[...] (WARNING).
[...] (WARNING).
Uses the recommendBiocViews() function from biocViews to automatically suggest some biocViews for your package.
More information about biocViews is available in the Using biocViews HOWTO.
Build System Compatibility Checks
The Bioconductor Build System (BBS) is our nightly build system and it has certain requirements. Packages which don't meet these requirements can be silently skipped by BBS, so it's important to make sure that every package meets the requirements.
Can be disabled with --no-check-bbs.
Checking build system compatibility…
Checking for blank lines in DESCRIPTION… Checks to make sure there are no blank lines in the DESCRIPTION file (ERROR).
Checking if DESCRIPTION is well formatted… Checks if the DESCRIPTION can be parsed with read.dcf (ERROR).
Checking Description: field length… Checks that the Description field in the DESCRIPTION file has a minimum number of characters (WARNING if less than 50), words (WARNING if less than 20), and sentences (NOTE if less than 3).
Checking for whitespace in DESCRIPTION field names… Checks to make sure there is no whitespace in DESCRIPTION file field names (ERROR).
Checking that Package field matches dir/tarball name… Checks to make sure that the Package field of the DESCRIPTION file matches the directory or tarball name (ERROR).
Checking for Version field… Checks to make sure a Version field is present in the DESCRIPTION file (ERROR).
Checking for valid maintainer… Checks to make sure the DESCRIPTION file has a valid Authors@R field which resolves to a valid Maintainer (ERROR).
A valid Authors@R field consists of:
A valid R object of class person.
A person with the cre (creator) role.
A family or given name defined for that person.
[...] (NOTE if not).
Suggests that the maintainer provide an ORCID iD in the Authors@R field as an argument in the person function, e.g., comment = c(ORCID = ...) (NOTE).
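As an illustration of the Authors@R requirements above, the sketch below builds person objects with utils::person() and inspects their roles. The names, email address, and ORCID iD are placeholders, and the inspection code is only an assumption about how such a field could be examined, not BiocCheck's implementation.

```r
library(utils)

# Example contributors; names, email, and ORCID iD are placeholders
authors <- c(
    person("Jane", "Doe",
        email = "jane.doe@example.org",
        role = c("aut", "cre"),
        comment = c(ORCID = "0000-0002-1825-0097")
    ),
    person("John", "Smith", role = "aut")
)

# Underneath, a person object is a list of per-person field lists
fields <- unclass(authors)
is_maintainer <- vapply(fields, function(p) "cre" %in% p$role, logical(1))
has_name <- vapply(
    fields,
    function(p) !is.null(p$given) || !is.null(p$family),
    logical(1)
)
n_maintainers <- sum(is_maintainer)
```

In a DESCRIPTION file the same c(person(...), ...) expression would be the value of the Authors@R field.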
Checks that the License: field in the DESCRIPTION file does not restrict use, e.g., to academic use only (ERROR). Licenses are compared to R's internal database provided at $R_HOME/share/licenses/license.db and read with read.dcf. Licenses not listed in the database or with spelling deviations, e.g., GPL-3.0 vs GPL-3, are flagged with a NOTE. A NOTE is also generated if the license is not a valid SPDX license identifier (with the exception of those already in the database file) or if the license cannot be verified in the database. A NOTE is also generated if the License: field is malformed, or the database cannot be located. We also recommend that developers browse choosealicense and the SPDX License List website to find a suitable license for their package.
Checks for dependencies pinned to an exact version with == in the DESCRIPTION file (ERROR).
Checks the DESCRIPTION file to see whether recommended fields, i.e., "URL", "Date", and "BugReports", are populated (NOTE). The Date field is checked for the format YYYY-MM-DD.
[...] Depends: and Imports: fields; if none (WARNING).
Checks whether the Authors@R field has a person or organization with the fnd (funder) role (NOTE).
[...] .Rbuildignore file (ERROR).
Checks that the <package_name>.BiocCheck folder, a byproduct of running BiocCheck(".") locally, does not get included in the package directory (ERROR).
[...] R CMD build; therefore, the inst/doc folder is not needed (ERROR).
Can be disabled with --no-check-vignettes.
Checking vignette directory…
Checks that the vignettes directory exists (ERROR).
Checks that the vignettes directory only contains vignette sources (.Rmd, .qmd, .Rnw, .Rrst, .Rhtml, .Rtex) (ERROR).
For .Rnw vignettes, if any are found, suggests RMarkdown (.Rmd) vignettes instead (WARNING).
For qmd vignettes, if any are found, a "SystemRequirements" field must be present in the DESCRIPTION file (WARNING). The field should list quarto as a system requirement.
[...] (ERROR).
[...] (WARNING).
[...] (ERROR).
[...] (WARNING).
[...] eval=FALSE chunks is more than 50% of the total (WARNING).
[...] eval=FALSE. The majority of vignette code is expected to be evaluated (WARNING).
[...] BiocInstaller code (WARNING).
[...] sessionInfo() or session_info() for reproducibility (NOTE).
[...] (ERROR). Function calls in the vignette that match install.* will produce a warning (WARNING).
[...] (ERROR).
[...] (NOTE).
Checking whether vignette is built with "R CMD build"…
Only run when --build-output-file is specified.
Analyzes the output of R CMD build to see if vignettes are built. It simply looks for a line that starts:
* creating vignettes ...
If this line is not present, it means R has not detected that a vignette needs to be built (ERROR).
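The line-matching idea described above can be sketched in a few lines of base R. vignettes_were_built is a hypothetical helper and the log lines are made up for the example; the actual BiocCheck implementation may differ.

```r
# Hypothetical helper (not part of BiocCheck): TRUE if any line of the
# captured R CMD build output starts with the marker quoted above.
vignettes_were_built <- function(build_output) {
    any(startsWith(build_output, "* creating vignettes ..."))
}

# Illustrative build logs (made up for this example)
log_with_vignettes <- c(
    "* checking for file 'examplePkg/DESCRIPTION' ... OK",
    "* creating vignettes ... OK",
    "* building 'examplePkg_0.99.0.tar.gz'"
)
log_without_vignettes <- log_with_vignettes[-2]

built <- vignettes_were_built(log_with_vignettes)
not_built <- vignettes_were_built(log_without_vignettes)
```

With a real build, the character vector would typically come from readLines() on the saved output file.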
If you have vignette sources yet still get this message, there could be several causes:
Missing or invalid VignetteBuilder line in the DESCRIPTION file.
Missing or invalid VignetteEngine line in the vignette source.
See knitr's package vignette page, or the Non-Sweave vignettes section of "Writing R Extensions" for more information.
Can be disabled with --no-check-library-calls and --no-check-install-self.
Checks for use of functions that install or update packages (NOTE). This list currently includes the use of install, install.packages, install_packages, update.packages, or biocLite.
It is not necessary to call library() or require() on your own package within code in the R directory or in man page examples. In these contexts, your package is already loaded (ERROR).
Can be disabled with --no-check-coding-practices.
Checking coding practices…
Checks to see whether certain programming practices are found in the R directory.
We recommend that vapply() be used instead of sapply(). Problems arise when the X argument to sapply() has length 0; the return type is then a list() rather than a vector or array (NOTE).
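The zero-length problem is easy to reproduce. The short sketch below uses only base R; the variable names are arbitrary.

```r
empty <- character(0)

# sapply() cannot infer a return type from zero elements and returns a list
res_sapply <- sapply(empty, nchar)

# vapply() is told the expected type up front and returns an integer vector
res_vapply <- vapply(empty, nchar, integer(1))
```

Code that later does arithmetic on the result works with res_vapply but can fail on res_sapply, which is why the type template in vapply() is worth the extra argument.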
We recommend that seq_len() or seq_along() be used instead of 1:.... This is because the case 1:0 creates the sequence c(1, 0), which may be an unexpected or unwanted result (NOTE).
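A small sketch of the difference, using a loop counter to make the effect visible (count_iterations is a throwaway helper for this example):

```r
# Count how many times a for loop body runs over the given index vector
count_iterations <- function(idx) {
    k <- 0L
    for (i in idx) k <- k + 1L
    k
}

n <- 0L
with_colon <- count_iterations(1:n)          # iterates over c(1, 0)
with_seq_len <- count_iterations(seq_len(n)) # iterates over integer(0)
```

With n equal to zero, the colon form still runs the loop body twice, while seq_len() skips it entirely.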
Single colon typos are checked for when a user inputs "package:function" instead of using double colons ("::") to import a function (ERROR).
Users should not download data from external hosting platforms. This means avoiding references to major platforms such as GitHub, GitLab, and BitBucket. External data can be unstable and not well maintained, which is the same reason we do not import GitHub packages. Maintainers should re-use data already available in Bioconductor or contribute an ExperimentHub, AnnotationHub, or similar package (ERROR).
A package should not download files at the time of loading or attaching, i.e., using library. Using download.file and download should be avoided and when found, an ERROR will be emitted.
paste and paste0 function calls within signaling functions such as message, warning, and stop are redundant and should be avoided (NOTE). paste calls with the collapse argument are ignored.
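The redundancy follows from the fact that message(), warning(), and stop() already concatenate their arguments. The sketch below captures the text of both forms to show that they are the same; capture_message is a helper defined only for this example.

```r
# Helper for this example: return the text of a message instead of printing it
capture_message <- function(expr) {
    tryCatch(expr, message = function(m) conditionMessage(m))
}

n <- 3L

# Style that gets flagged: paste0() inside message()
with_paste <- capture_message(message(paste0("Processed ", n, " files")))

# Preferred: pass the pieces directly
without_paste <- capture_message(message("Processed ", n, " files"))
```

Both calls produce identical text, so the paste0() wrapper adds nothing.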
When notifying users, message should be used. When cat and print are used, users will get a note saying that these should only be used in show methods for classes (NOTE).
message, warn*, and error keywords should not be included in signal condition functions: message, warning, and stop. This is redundant and should be avoided (NOTE).
It is preferable to use the assignment arrow ("<-") over the equals assignment ("=") for clarity and legibility of the code. Any use of = will be flagged with a NOTE.
New submissions should not use any .Deprecated, .Defunct, lifeCycle, deprecate_warn, or deprecate_stop function calls (WARNING). Existing packages should evolve these functions after a Bioconductor release according to the package guidelines.
Checking for T… Checking for F… It is bad practice to use T and F for TRUE and FALSE. This is because T and F are ordinary variables whose value can be altered, leading to unexpected results, whereas the value of TRUE and FALSE cannot be changed (WARNING).
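The following sketch shows why T is fragile while TRUE is not. The reassignment happens inside a function so it does not leak into the calling environment.

```r
# T is an ordinary variable, so it can be reassigned
check_flag <- function() {
    T <- FALSE
    if (T) "enabled" else "disabled"
}
result <- check_flag()

# TRUE is a reserved word; trying to assign to it fails
attempt <- try(eval(parse(text = "TRUE <- FALSE")), silent = TRUE)
assignment_failed <- inherits(attempt, "try-error")
```

Any code that relies on T meaning TRUE silently changes behavior once T has been redefined, as check_flag() illustrates.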
Avoid class membership checks with class() / is() and == / !=. Developers should use is(x, 'class') for S4 classes (WARNING).
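One reason for this recommendation is that equality comparisons ignore inheritance. A minimal sketch with two illustrative S4 classes (the class names are hypothetical):

```r
library(methods)

# Illustrative S4 classes (hypothetical names)
setClass("BaseRecord", representation(id = "character"))
setClass("DerivedRecord", contains = "BaseRecord")
obj <- new("DerivedRecord", id = "a1")

# Comparing class() with == misses inheritance
by_equality <- isTRUE(class(obj) == "BaseRecord")

# is() respects the class hierarchy
by_is <- is(obj, "BaseRecord")
```

The object is a BaseRecord by inheritance, which only the is() form detects.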
Use system2() over system(). system2 is a more portable and flexible interface than system (NOTE).
Use of set.seed() in R code. set.seed should not be called directly in R functions. The user should always have the option to set the seed and know when it is being invoked (WARNING).
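A minimal sketch of the pattern this implies: the function draws random numbers but leaves the seed to the caller. simulate_draws is a hypothetical function used only for illustration.

```r
# The function draws random numbers but does not touch the seed itself
simulate_draws <- function(n) {
    stats::rnorm(n)
}

# The caller decides whether and when to fix the seed
set.seed(2024)
first <- simulate_draws(3)
set.seed(2024)
second <- simulate_draws(3)
```

Reproducibility is still available, but it is under the user's control rather than hidden inside package code.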
Checking parsed R code in R directory, examples, vignettes…
BiocCheck parses the code in your package's R directory, and in evaluated man page and vignette examples, to look for various symbols, which result in issues of varying severity.
BiocCheck checks for direct slot access (via @ or slot()) to S4 objects in vignette and example code. This code should always use accessors to interact with S4 classes. Since you may be using S4 classes (which don't provide accessors) from another package, the severity is only NOTE. But if the S4 object is defined in your package, it's mandatory to write accessors for it and to use them (instead of direct slot access) in all vignette and example code (NOTE).
browser() causes the command-line R debugger to be invoked, and should not be used in production code (though it's OK to wrap such calls in a conditional that evaluates to TRUE if some debugging option is set) (WARNING).
Use of <<- is bad practice. It can overwrite user-defined symbols, and introduces non-linear paths of evaluation that are difficult to debug (NOTE).
[...] Sys.setenv function (ERROR).
Use of suppressWarnings and suppressMessages is problematic as it usually indicates a larger underlying issue with the fragility of the package codebase (NOTE).
Can be disabled with --no-check-function-len.
Checking function lengths…
BiocCheck prints an informative message about the length (in lines) of your five longest functions (this includes functions in your R directory and in evaluated man page and vignette examples).
If there are functions longer than 50 lines, BiocCheck outputs a NOTE. You may want to consider breaking up long functions into smaller ones. This is a basic refactoring technique that results in code that's easier to read, debug, test, reuse, and maintain.
Can be disabled with --no-check-man-doc.
Checking man page documentation…
It can be handy to generate man page skeletons with prompt() and/or RStudio. These skeletons contain comments that look like this:
%% ~~ A concise (1-5 lines) description of the dataset. ~~
BiocCheck asks you to remove such comments (NOTE).
Every man page must have a non-empty \value section (WARNING). Rd pages without \usage sections or documenting data sets are excluded.
Checking exported objects have runnable examples…
BiocCheck looks at all man pages which document exported objects and lists the ones that don't contain runnable examples (either because there is no examples section or because its examples are tagged with dontrun or donttest). Runnable examples are a key part of literate programming and help ensure that your code does what you say it does.
[...] (ERROR).
[...] BiocCheck lists the missing ones and asks you to add runnable examples to them (NOTE).
[...] dontrun or donttest. Use of these functions is not recommended and should be justified (NOTE). If an exception is made, the recommended usage is donttest over dontrun (NOTE), as donttest requires valid R code.
Can be disabled with --no-check-news.
Checking package NEWS…
BiocCheck looks to see if there is a valid NEWS file either in the "inst" directory or in the top-level directory of your package, and checks whether it is properly formatted (NOTE).
The location and format of the NEWS file must be consistent with ?news, meaning the file can be one of the following four options:
inst/NEWS.Rd
./NEWS.md
./NEWS
inst/NEWS
NEWS files are a good way to keep users up-to-date on changes to your package. Excerpts from properly formatted NEWS files will be included in Bioconductor release announcements to tell users what has changed in your package in the last release. In order for this to happen, your NEWS file must be formatted in a specific way; you may want to consider using an inst/NEWS.Rd file instead, as the format is better defined. A malformatted NEWS file produces a WARNING.
More information on NEWS files is available in the help topic ?news.
Can be disabled with --no-check-unit-tests.
Checking unit tests…
We strongly recommend unit tests, though we do not at present require them. For more on what unit tests are, why they are helpful, and how to implement them, read our Unit Testing HOWTO.
At present we just check to see whether unit tests are present, and if not, urge you to add them (NOTE).
Checking skip_on_bioc() in tests…
Can be disabled with --no-check-skip-bioc-tests.
Finds the flag for skipping tests in the Bioconductor environment (NOTE).
Can be disabled with --no-check-formatting.
Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and vignette source…
There is no 100% correct way to format code. These checks adhere to the Bioconductor Style Guide (NOTE).
We think it's important to avoid very long lines in code. Note that some text editors do not wrap text automatically, requiring horizontal scrolling in order to read it. Also note that R syntax is very flexible and whitespace can be inserted almost anywhere in an expression, making it easy to break up long lines.
These checks are run against not just R code, but the DESCRIPTION and NAMESPACE files as well as man pages and vignette source files. All of these files allow long lines to be broken up.
The output of this check includes the first 6 offending lines of code; see more with BiocCheck:::checkFormatting("path/to/YourPackage", nlines=Inf).
There are several helpful packages that can be used for formatting of R code to particular coding standards, such as formatR and styler, as well as the "Reformat code" button in RStudio Desktop. Each solution has its advantages, though styler works with roxygen2 examples and is actively maintained. You can re-format your code using styler as shown below:
## Install styler if necessary
if (!requireNamespace("styler", quietly = TRUE)) {
    install.packages("styler")
}
## Automatically re-format the R code in your package
styler::style_pkg(transformers = styler::tidyverse_style(indent_by = 4))
If you are working with RStudio Desktop, you can also use the "Reformat code" button, which will help you break long lines of code. Alternatively, use formatR, though beware that it can break valid R code involving both types of quotation marks (" and ') and does not support re-formatting roxygen2 examples. In general, it is best to version control your code before applying any automatic re-formatting solutions and to implement unit tests to verify that your code runs as intended after you re-format it.
Checking if package already exists in CRAN… This can be disabled with the --no-check-CRAN option. A package with the same name (case differences are ignored) cannot exist on CRAN. Packages submitted to Bioconductor must be removed from CRAN before the next Bioconductor release (WARNING).
Checking if new package already exists in Bioconductor… Only run if the --new-package flag is turned on. A package with the same name (case differences are ignored) cannot exist in Bioconductor (ERROR).
Checking for bioc-devel mailing list subscription…
This only applies if BiocCheck is run on the Bioconductor build machines, because this step requires special authorization. This can be disabled with the --no-check-bioc-help option.
Check that the email address in the Maintainer (or Authors@R) field is subscribed to the bioc-devel mailing list (ERROR).
All maintainers must subscribe to the bioc-devel mailing list, with the email address used in the DESCRIPTION file. You can subscribe here.
Checking for support site registration…
Check that the package maintainer is registered at our support site using the same email address that is in the Maintainer field of their package DESCRIPTION file (ERROR). This can be disabled with the --no-check-bioc-help option.
The main place people ask questions about Bioconductor packages is the support site. Please register and then include your package name in the list of watched tags. When a question is asked and tagged with your package name, you'll get an email.
Having the package name in the support site watched tags is now a requirement.
BiocCheckGitClone
BiocCheckGitClone provides a few additional Bioconductor package checks that should only be run on an open source directory (raw Git clone), NOT a tarball. Results are reported in the same three categories discussed above:
ERROR.
WARNING.
NOTE.
Using BiocCheckGitClone
BiocCheckGitClone is meant to run within R on a directory containing an R package:
BiocCheckGitClone("packageDir")
Installing BiocCheckGitClone
Please see the previous Installing BiocCheck section.
Interpreting BiocCheckGitClone output
Actual BiocCheckGitClone output is shown below in bold.
Checking valid files
There are a number of files that should not be Git tracked. This check notifies if any of these files are present (ERROR).
The current list of files checked is given by this internal constant:
BiocCheck:::.HIDDEN_FILE_EXTS
## [1] ".renviron" ".rprofile" ".rproj" ".rproj.user"
## [5] ".rhistory" ".rapp.history" ".o" ".sl"
## [9] ".so" ".dylib" ".a" ".dll"
## [13] ".def" ".ds_store" "unsrturl.bst" ".log"
## [17] ".aux" ".backups" ".cproject" ".directory"
## [21] ".dropbox" ".exrc" ".gdb.history" ".gitattributes"
## [25] ".gitmodules" ".hgtags" ".project" ".seed"
## [29] ".settings" ".tm_properties" ".rdata"
These files may be included in your personal directories but should be added to a .gitignore file so they are not Git tracked.
Checking DESCRIPTION
Default R CMD build behavior will format the DESCRIPTION file; after this occurs, it is hard to determine certain aspects of the original DESCRIPTION file. An example would be how the Authors and Maintainers are specified. The DESCRIPTION file is therefore checked in its raw original form.
Checking if DESCRIPTION is well formatted
The DESCRIPTION file must be properly formatted and able to be read in with read.dcf() in order to function properly on the Bioconductor build machines. This check attempts read.dcf("DESCRIPTION") and throws an ERROR if the file is malformatted (ERROR).
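A rough way to try this locally is sketched below with base R. description_is_well_formed is a hypothetical helper, not a BiocCheck function; treating "more than one record" as a failure is an assumption made here to also catch stray blank lines, and the file contents are made up.

```r
# Hypothetical helper (not part of BiocCheck): TRUE if the file parses
# with read.dcf() and contains exactly one record.
description_is_well_formed <- function(path) {
    parsed <- tryCatch(read.dcf(path), error = function(e) NULL)
    !is.null(parsed) && nrow(parsed) == 1L
}

# A well-formed file
good <- tempfile()
writeLines(c("Package: examplePkg", "Version: 0.99.0"), good)

# A line without a "Field: value" structure makes read.dcf() fail
bad <- tempfile()
writeLines(c("Package: examplePkg", "this line has no field name"), bad)

# A blank line splits the file into two records
blank <- tempfile()
writeLines(c("Package: examplePkg", "", "Version: 0.99.0"), blank)

results <- c(
    good = description_is_well_formed(good),
    bad = description_is_well_formed(bad),
    blank = description_is_well_formed(blank)
)
```

On a package checkout, description_is_well_formed("DESCRIPTION") would give a quick first indication before running the full check.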
Checking for valid maintainer
While in the past using the Author and Maintainer fields was acceptable, R has moved towards using the Authors@R standard for listing package contributors. This checks that Authors@R is utilized and that there are no instances of Author or Maintainer in the DESCRIPTION (ERROR).
Checking that CITATION file is correctly formatted
BiocCheck tries to read the provided CITATION file (i.e., not the one automatically generated by each package) with readCitationFile(); this file is expected to be in the inst folder (NOTE). readCitationFile() needs to work properly without the package being installed. The most common causes of failure occur when trying to use helper functions like packageVersion() or packageDate() instead of using meta$Version or meta$Date. See R documentation for more information.
Here is an example of a formatted CITATION file. See the GenomicRanges package CITATION file for details.
library(utils)
readCitationFile(
system.file("CITATION", package = "GenomicRanges")
)
## Lawrence M, Huber W, Pag\`es H, Aboyoun P, Carlson M, et al. (2013)
## Software for Computing and Annotating Genomic Ranges. PLoS Comput Biol
## 9(8): e1003118. doi:10.1371/journal.pcbi.1003118
##
## A BibTeX entry for LaTeX users is
##
## @Article{,
## title = {Software for Computing and Annotating Genomic Ranges},
## author = {Michael Lawrence and Wolfgang Huber and Herv\'e Pag\`es and Patrick Aboyoun and Marc Carlson and Robert Gentleman and Martin Morgan and Vincent Carey},
## year = {2013},
## journal = {{PLoS} Computational Biology},
## volume = {9},
## issue = {8},
## doi = {10.1371/journal.pcbi.1003118},
## url = {http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118},
## }
CITATION files are expected to contain a doi input within the bibentry() function call. When a doi input is not present, a WARNING is emitted as most modern publications should have an assigned DOI.
Note that citEntry() should be updated to bibentry() as seen with R CMD check --as-cran.
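To make these points concrete, the sketch below writes a small CITATION file that uses bibentry(), includes a doi, and refers to meta$Version instead of packageVersion(), then reads it back with readCitationFile(). Every bibliographic detail, including the DOI, is a placeholder, and the meta list is supplied by hand to stand in for the DESCRIPTION fields.

```r
library(utils)

# Write an illustrative CITATION file; all bibliographic details are
# placeholders, including the DOI.
citation_file <- tempfile()
writeLines(c(
    "bibentry(",
    "    bibtype = \"Manual\",",
    "    title = \"examplePkg: An Illustrative Package\",",
    "    author = person(\"Jane\", \"Doe\"),",
    "    year = \"2024\",",
    "    note = paste(\"R package version\", meta$Version),",
    "    doi = \"10.0000/placeholder\"",
    ")"
), citation_file)

# Supplying `meta` stands in for the DESCRIPTION fields, so the file can
# be read without the package being installed
cit <- readCitationFile(citation_file, meta = list(Version = "0.99.0"))
```

Because the file only depends on meta, it can be read from a plain source directory, which is the situation the check exercises.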
Bioconductor packages are not required to have a CITATION file, but it is useful both for users and for tracking Bioconductor project-wide metrics. Maintainers should update the CITATION file once a preprint or publication is released. Packages that do not have a CITATION file are flagged with a NOTE.
BiocCheck
We make an effort to reduce package reviewer burden and to increase the quality of Bioconductor submissions via automated checks; therefore, BiocCheck is a continually evolving package. Contributions are most welcome. Consider opening a pull request on GitHub with unit tests and updates to both the NEWS file and vignette. Thank you for being part of the community!