Document databases like MongoDB have won fast and strong support from application developers for many reasons. But one of those reasons is clearly their support for flexible schemas.
In the RDBMS world, the schema – the definition of the tables and columns that make up a database – is relatively static. Applications cannot simply add columns to tables or – usually – quickly create new tables.
In contrast, MongoDB allows developers to create collections implicitly and add attributes to documents on demand. This works well with rapid development and DevOps, because when the database structure is defined in code, changes to the database schema can be implemented simply by committing new code to the version control system.
Schema design in MongoDB
However, it is a misconception to think that schema design is less important in MongoDB than in the RDBMS world. The performance limits of a MongoDB application are largely determined by the document model that the application implements. The amount of work an application needs to do to retrieve or process information depends primarily on how that information is distributed across documents. In addition, the size of the documents will limit how many documents MongoDB can cache in memory. These and many other trade-offs determine how much physical work the database must do to satisfy a given request.
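As a rough back-of-the-envelope illustration of the caching trade-off, the sketch below estimates a document's size from its JSON encoding and from that how many such documents would fit in a cache of a given size. The order document, its fields, and the 1 GiB cache figure are all hypothetical; real BSON sizes differ somewhat from JSON sizes, but the order of magnitude is the same.

```python
import json

# Hypothetical order document with 100 embedded line items.
order = {
    "customer": "Jane Doe",
    "items": [{"sku": f"SKU-{i}", "qty": 1, "price": 9.99} for i in range(100)],
}

# Approximate the document's size via its JSON encoding (a rough
# proxy for the BSON size actually stored by MongoDB).
size_bytes = len(json.dumps(order).encode("utf-8"))

# If documents are this large, only so many fit in a cache of a
# given size (here an assumed 1 GiB of cache memory).
cache_bytes = 1 * 1024**3
docs_in_cache = cache_bytes // size_bytes
print(size_bytes, docs_in_cache)
```

The larger each document grows (for example, by embedding ever more line items), the fewer documents fit in memory, and the more physical I/O the database must perform for the same workload.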
Therefore, designing a schema in MongoDB is just as important as in an RDBMS. In fact, schema design can be more complex in MongoDB. In SQL databases we at least have "third normal form", which provides a starting point for a well-designed first-cut data model. In MongoDB we have more options, but as a result, we have more potential pitfalls.
There are a variety of MongoDB schema design patterns, but they all vary in how they combine two basic approaches: embed everything in a single document, or link collections using references to data in other collections. Most applications will use a combination of linking and embedding.
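The contrast between the two approaches can be sketched with plain document structures. The blog-post and comment documents below are hypothetical examples (not from the original text); they are written as Python dictionaries so the sketch runs without a database, but the shapes are exactly what would be stored in MongoDB collections.

```python
# Embedding: the post and its comments live in ONE document.
# A single read retrieves everything.
embedded_post = {
    "_id": 1,
    "title": "Schema design in MongoDB",
    "comments": [
        {"author": "alice", "text": "Great post!"},
        {"author": "bob", "text": "Thanks for sharing."},
    ],
}

# Linking: comments live in their own collection and reference the
# post by its _id, much like a foreign key in an RDBMS.
linked_post = {"_id": 1, "title": "Schema design in MongoDB"}
linked_comments = [
    {"_id": 101, "post_id": 1, "author": "alice", "text": "Great post!"},
    {"_id": 102, "post_id": 1, "author": "bob", "text": "Thanks for sharing."},
]

# With linking, reassembling the post requires a second lookup –
# either an application-side join like this one, or an aggregation
# stage such as $lookup on the server.
comments_for_post = [
    c for c in linked_comments if c["post_id"] == linked_post["_id"]
]
```

Embedding optimizes for reading the whole entity in one operation; linking keeps documents small and avoids duplicating data that is shared or updated independently.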
Create the perfect data model
It takes judgment, experience, and experimentation to create the perfect data model, and unfortunately, there are very few tools to help the MongoDB data designer.
The third-party data modeling tool Hackolade has grown in popularity over the past few years. It started as a MongoDB-specific tool to help visualize and implement MongoDB data models. In recent years it has added support for other NoSQL databases, for relational databases that implement JSON support, and even for API definitions. Hackolade is especially popular in larger companies, where enterprise data teams seek to understand all enterprise data assets across many disparate databases.
Flexible schemas create an additional challenge for application designers. In a static-schema database, all data must correspond to the current version of the schema. In MongoDB, however, a collection may include documents that are structured differently depending on when they were created. As the schema evolves in code, there may be "old" documents that still reflect the earlier design.
There is no single solution to these versioning challenges. In some cases a bulk migration to the new schema may be justified – even if it causes downtime or degraded performance. In other cases, documents can be upgraded dynamically in the code layer, or the code can simply know how to handle older schema versions. Each of these approaches has advantages, but it is essential that application engineers understand which approach will be used.
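The "upgrade dynamically in the code layer" option is often implemented as a lazy, on-read migration keyed off a version field. The sketch below assumes a hypothetical user collection where version-1 documents stored a single name field and version-2 documents split it in two; the field names and the `schema_version` marker are illustrative conventions, not a MongoDB feature.

```python
def upgrade_user(doc):
    """Normalize a user document to the latest (v2) shape on read.

    v1 documents have a single "name" field and no version marker;
    v2 documents carry "first_name"/"last_name" and schema_version=2.
    """
    version = doc.get("schema_version", 1)
    if version == 1:
        first, _, last = doc.get("name", "").partition(" ")
        doc = {
            "_id": doc["_id"],
            "first_name": first,
            "last_name": last,
            "schema_version": 2,
        }
    return doc

old_doc = {"_id": 1, "name": "Ada Lovelace"}  # written by v1 code
new_doc = {"_id": 2, "first_name": "Grace",
           "last_name": "Hopper", "schema_version": 2}  # written by v2 code

# Application code sees a uniform shape regardless of document age.
upgraded = [upgrade_user(d) for d in (old_doc, new_doc)]
```

In a real application, the upgraded document would typically also be written back to the collection, so the migration completes incrementally as documents are touched.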
MongoDB’s flexible schema has a lot of advantages and is clearly popular among modern application developers. However, a fluid schema does not mean no schema, and those who want to develop high-performance, maintainable applications using MongoDB need to exercise the same diligence in schema modeling as they would in any other database system.