Why monorepo?
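I will try to explain why a mono-repo makes sense. First, let's look at real-world examples of the mono-repo approach:
- Google: all web services, an 86 TB code base, about 40,000 commits per workday, more than 25,000 developers (figures Google published in 2016), stored in Piper, Google's in-house version control system rather than Git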
- Facebook: uses Mercurial as source control
- Microsoft: one single code-base for the entire Windows OS in Git
- other big players: Twitter, Airbnb, Uber
Disadvantages
- worse scaling: a huge repository gets slow, which means large clones and longer build times
- no access control per subfolder (Git cannot restrict read access to a subtree; at best, hosting platforms let you define write rules per path)
The first disadvantage is valid only for colossal code bases. 4.2% of all mono-repos will need to solve this issue and put some tooling in place. The largest of them use a virtual file system (VFS) to tackle this problem. Google built its own system, Piper, to replace Perforce, and Microsoft developed the open-source GVFS for Git (since renamed VFS for Git). Facebook uses a different distributed source control system, Mercurial, which it found easier to adapt to an extremely large mono-repo.
The second point about access control doesn't have a simple solution, but the question should be: Do I need it? The most crucial benefit of a mono-repo is the visibility and accessibility of the code. Every developer should be able to see it all.
INFO
It is a particularity to accept and align in the culture and practices to avoid restricting accesses. Shared visibility being a strong point of a monorepo. -- Google
If different levels of access control are absolutely necessary, I prefer to split the single mono-repo into a small number of smaller mono-repos, one per access level. Even then, there is no reason to give each project its own repository.
A possible workaround for the lack of access control
Suppose your setup still requires limited access to a specific sub-project. For example, an external worker needs to modify one project, and you don't want to give them full read access to the entire mono-repo. In this scenario, it is essential to keep all your projects as isolated as possible. When the time comes, use Git's subtree command (git subtree split) to extract the requested sub-project, with its history, into a new repository. The separate repo can be configured in any way you like and easily shared with the external worker.
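The workaround above can be sketched as a short script. This is only an illustration under stated assumptions: the folder layout (projects/project-a, projects/project-b), the branch name project-a-only, and the temporary directories are hypothetical, and git subtree must be available in your Git installation (it ships as a contrib command and is sometimes packaged separately).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical playground: a throwaway mono-repo with two isolated projects.
workdir="$(mktemp -d)"
mono="$workdir/monorepo"
split_repo="$workdir/project-a"

git init --quiet "$mono"
cd "$mono"
git config user.email "dev@example.com"
git config user.name "Example Dev"

mkdir -p projects/project-a projects/project-b
echo "console.log('a')" > projects/project-a/index.js
echo "console.log('b')" > projects/project-b/index.js
git add .
git commit --quiet -m "Add project-a and project-b"

# Rewrite the history of one sub-folder onto its own branch.
# The folder's content becomes the root of that branch.
git subtree split --prefix=projects/project-a --branch=project-a-only

# Create a standalone repository and pull only that branch into it.
git init --quiet "$split_repo"
cd "$split_repo"
git pull --quiet "$mono" project-a-only

# The new repository contains project-a only.
ls "$split_repo"
```

In a real setup, you would push the new repository to your hosting platform and grant the external worker access to it alone. Changes made there can later be brought back into the mono-repo, for example with git subtree pull.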
Advantages
- sharing, transparency, discoverability, and visibility
- better debugging + dev testing
- simplified dependency management or no dependencies at all
- reduce code duplication and complexity
- effective code reviews
- easy refactoring (cross-project changes)
- tooling and standards
Some advantages are beneficial even for fully independent projects, as in this mono-repo. For example, tooling, standards, effective code reviews, and transparency apply to any project created under the mono-repo. It is much easier to create a folder than to create an entire repository and configure everything again.
Nothing in this beautiful world of ours is black or white. Everything can be achieved by custom tooling. All the above advantages can be transferred into a poly-repo environment. Still, starting with the setup where they will work efficiently and automatically is much easier than managing and spending resources on more complex tooling.
Example of poly-repos
Netflix is an example of the poly-repo approach. They have a lot of excellent tooling for refactoring, dependencies, and reviews, which brings the advantages of a mono-repo into a poly-repo environment. As you can see, both approaches are viable, and each has its pros and cons.
My point is that a mono-repo has many built-in advantages and simplifications that work out of the box, for the small price of following a couple of rules and good habits.
Something more to read
- Why we believe mono repos are the right choice for teams that want to ship code faster? - Pavan Belagatti, Medium
- How Google Does Monorepo - QE Unit
- 5 ways to configure a monorepo for DevSecOps efficiency - Bridgecrew Blog
- Airbnb's Monorepo Journey To Quality Engineering - QE Unit
- The largest Git repo on the planet
- Scaling Git (and some back story)
- The Hands-on Mainstream Repo Models You Need To Know - QE Unit
- Monorepos: Please don't! - Matt Klein, Medium
- Towards true continuous integration: distributed repositories and dependencies
- Working with a Monorepo
- Mono-repo or multi-repo? Why choose one, when you can have both? - Patrick Lee Scott, Medium
- What is monorepo? (and should you use it?) - Semaphore
- Scaling Mercurial at Facebook
Something to watch on YouTube
- Uber Technology Day: Monorepo to Multirepo and Back Again
- From Monorail to Monorepo: Airbnb's journey into microservices - GitHub Universe 2018
- Dependency Hell, Monorepos and beyond
- Why Google Stores Billions of Lines of Code in a Single Repository
- What Is A Monorepo And Why You Should Care - Monorepo vs. Polyrepo