Open source has its roots in the earliest days of the computer revolution. Pioneers at organizations such as Bell Labs and MIT believed that sharing program code was essential to advancing computer technology.
It can be argued that these ideals crystallized in Richard Stallman’s GNU Project. The goal of GNU (a recursive acronym for GNU’s Not Unix) was to create a complete UNIX-compatible operating system. In 1989, the GNU General Public License (GPL) was born. This license, which is still in use today, gives users the right to run, share, and modify the source code.
GNU’s vision of an open source, UNIX-compatible operating system was realized with the advent of Linux. Although Linux was not created by the GNU Project, it is licensed under the GPL. Very quickly, open source offerings such as the Apache web server and programming languages such as Python and Perl were combined to create a viable alternative to the previously dominant commercial software stacks.
Open source features
I like to think of open source adoption from an evolutionary perspective. Viruses such as SARS-CoV-2 succeed initially because they are transmissible – they move quickly from one host to another. Over time, they also succeed because they evolve – the delta variant of the coronavirus has a stronger (more contagious) feature set than the alpha variant – so newer versions quickly replace older ones.
Open source software has many of the same features. It is highly transmissible because developers can adopt it without paying any license fee. It also tends to evolve faster: if developers want a new feature, they can code it themselves and contribute it back to the project. These two features allowed open source projects to outpace their closed source alternatives during the 2000s.
The growth of web-based software also helped drive this shift. LAMP – Linux, Apache, MySQL, PHP/Perl/Python – offered a completely free, integrated software stack that could be used to build second-generation web applications (“Web 2.0”) economically. As web-based applications increasingly replaced desktop Windows applications, usage shifted towards open source.
As open source products mature, they also become attractive to organizations for two reasons. First, they tend to be less expensive than their closed source counterparts. Second, the open source approach avoids vendor lock-in because, at least in theory, an organization can keep running the software without the original vendor’s help, or obtain support from another organization that provides product support services.
Open source databases
Open source databases came on the scene in the mid-to-late 1990s, shortly after the release of Linux.
Postgres has its roots in the relational database research projects led by Michael Stonebraker at the University of California, Berkeley, in the mid-1980s. But when Postgres95 came out in 1995, with SQL support and a permissive license, it saw massive adoption.
MySQL appeared around the same time. While Postgres grew out of academia, MySQL arose from a practical need for an easy-to-use SQL engine. Where Postgres emphasized correctness, MySQL emphasized pragmatism. Although MySQL was technically less sophisticated than Postgres, developers found its ease of use compelling, and it found a natural home as the “M” in the LAMP stack.
Although databases like MySQL and Postgres changed the market landscape during the 2000s, they did not change the technical outlook. These open source databases implemented subsets of the features found in the large commercial relational databases, but they rarely offered unique features of their own.
However, when the biggest revolution in database technology since the advent of the relational model did occur, it was enabled directly by open source.
The open source database revolution
By the mid-2000s, the relational model had completely dominated the database market for more than 20 years. But by the end of that decade, an astonishing proliferation of alternative database models had emerged – almost all driven by open source.
The main drivers of this NoSQL trend were the demand for a new breed of globally distributed, always-available databases and the increasing value and volume of data (the rise of “Big Data”). But if these were the drivers of the revolution, open source was the enabler. The ability for developers to quickly iterate and build new database offerings on open source foundations allowed the creation of dozens of new databases. Some of them – such as Cassandra, Neo4j, and MongoDB – are still around today and have been hugely successful. Others, such as Project Voldemort, Tokyo Cabinet, Dynomite, and Riak, are now gone. In this “Cambrian explosion” of new databases in the late 2000s and early 2010s, survival of the fittest left the strongest open source databases standing.
This explosion of new open source database technologies dramatically demonstrated the innovation advantage of open source. Commercial vendors such as Oracle and Microsoft seemed frozen in place while the open source database community moved at the speed of light. A time-travelling database specialist from 1995 would have no trouble recognizing the Oracle RDBMS and SQL Server feature sets of 2020, but the feature sets of Cassandra, Neo4j, and MongoDB would be completely unfamiliar.
Within a decade, open source databases had become completely mainstream. Of the top five database platforms ranked by DB-Engines as of 2018, three – MongoDB, MySQL, and Postgres – were open source.
In recent years, however, some open source database platforms have come under attack from massive cloud vendors, most notably Amazon. Amazon has successfully monetized several open source database products in its own cloud: it offers PostgreSQL and MySQL as commercial services, generating revenue from these open source products without having to invest in database research and development. Amazon’s hosted versions of Elasticsearch and Redis have likewise proved popular.