In the beginning …
… there was a 9-track tape shipped from Bell Labs. S would be installed for “unix”. Extensions could be programmed in Fortran. I started using this around 1985.
S REPORT presaged Sweave and knitr: troff, eqn, pic could be interlarded with S computations.
Here’s a taste of inputs and outputs for S REPORT. In the left hand panel of Figure 1, .PP starts a paragraph, .EQ and .EN offset an equation markup, and backslash-left-bracket is used to start a block of S code that will be executed when the report is compiled. A right-bracket closes the code block. This markup generates the second paragraph in the right hand panel.
Fast forward:
In May 2002, Bioconductor 1.0 was released with 15 packages, developed for version 1.5 of R. By April of 2025 the collection has grown to over 2300 software packages, developed for version 4.5 of R. To give a sense of the components and processes in play, I made a sketch:
The “infinity signs” are used with “refresh” circular arrows to indicate that the associated resources can be modified at any time, while Bioconductor’s release is nominally “stable” for six months.
In sum
Figure 2 depicts many aspects of Bioconductor that are invisible to users as they do their work. “Ask and you shall receive” is not an overt commitment of Bioconductor’s core. I found it interesting to reflect on the ways in which new capabilities are introduced into the system. The user experience, particularly for those doing statistical analysis of numerical data, is not dramatically different from that of the user of S in the 1980s. The number of technologies involved, and the maintenance efforts required for those technologies and their interactions, is much greater.
It is natural, in my view, to refer to Bioconductor as an ecosystem, as it forms a home for many beings and processes engaged in biological data science. “Ecosystem”, for me, leads to “ecology”, and in my view, ecology is not just a study of interactions between beings and their environments, but includes a normative component. An ecology of Bioconductor will address how to manage its growth and change, identifying principles for expanding along new frontiers, for forming new connections, and for shedding materials that have lost value.
Upcoming
My purpose here is to set the stage for a series of posts examining some relatively new but underappreciated aspects of the project, including
- biocViews, EDAM, and the description of Bioconductor resources
- approaching GPU computing for genomics with Bioconductor
- containers and binaries to speed environment creation
- subcommunity dynamics: scverse, spatialdata, Bioconductor … how to relate?
- python interop
- OME-Zarr
- values derived from excursions involving WebAssembly
- “vendoring” sources of important libraries
- curating genomic annotation
- large arrays: sparsity and indexing
© 2025 Bioconductor. Content is published under Creative Commons CC-BY-4.0 License for the text and BSD 3-Clause License for any code. | R-Bloggers