BUIP044: Development Process
Proposer: Amaury SÉCHET
Submitted: 2017-01-09
Status: draft
Summary:
This BUIP aims to provide a world class development process for BU. The
current process has several shortcomings, such as the developer being
bound to become a bottleneck as the team grows, or development being
stalled when a release is prepared. The proposed process is a variation
fo what is used in many software companies and open source projects.
I propose we start moving toward this process after the next release so
the release process isn’t disturbed too much. If this BUIP is accepted,
the articles will need to be amended to reflect its content.
Definition:
A developer is anyone writing code with the intention of having it
included in BU. When using the term developer I’m not referring to the
developer of BU’s articles.
A committer is someone with commit right to the BU repository.
The release manager is the person deciding what goes into the next
release and when.
**Specification:
**
The developer as per the BU articles becomes the release manager and a
committer. Other trusted BU members can become committer as well, but
there is only one release manager.
The development happens in the branch “master”. All developers must base
their PR on this branch, the only exception is a merge conflict in the
release branch.
To be merged, a PR needs to be accepted by a committer. A developer who
is also a committer cannot accept its own PR.
PR are squashed and rebased on top of master. The only exception is when
backporting work from 3rd party client such as bitcoin core or bitcoin
classic, in which case a merge node is created and the commit structure
kept “as this”.
Release Process:
Once the release manager decides it is time, (s)he cuts a branch from
master called bu_X.Y , where X and Y are major and minor version
numbers. Any change made in master before the branch cut will be
included into the release. Any change made after the branch cut won’t,
unless a developer specifically request the release manager do merge
this specific change. The change must be already in master before such
request is made. If the change doesn’t merge cleanly into the release
branch, the developer needs to do another PR with the change made on the
release branch. The PR needs to be as close as possible as the original
one.
Once the release manager consider the release reached an appropriate
level of stability, (s)he creates a tag bu_X.Y.0 and publishes the
binaries.
Change continue to be merged into the branch bu_X.Y branch and the
release manager creates tag bu_X.Y.Z when (s)he judges a new revision
needs to be published.
The process continues in the bu_X.Y branch until the release manager
decides it is time to create the branch bu_X.(Y + 1), at which time the
process moves into that branch.
Guideline and rationale:
This section contains a bunch of guideline about how to use this process
successfully. It also contains various rationale to for above rules. The
rules mentioned above are not very numerous on purpose. Complex process
with many rules tends to get in the way by their lack of adaptability.
In our case, we want to have a set of guideline, but the only rule to
get something merged is to have a committer accept it. This way,
committers can chose to disregard some guideline when it doesn’t make
sense in this particular case. For instance, it would be undesirable
that the development work is stalled because the test infrastructure is
broken, so having a rule such as nothing can be committed unless tests
are green would be counter productive. On the other hand, we can trust
committers to enforce that tests are green, but they can make exception
when test do not run properly for some reason.
master needs to be in a state where a release branch can be cut pretty
much at all time. Any breakage happening in master must be treated with
the upmost importance. It is expected from the developer who made the
code responsible for the breakage to fix it promptly. If this isn’t
happening, either because the developer is not reachable or because the
problem is too complex to be fixed promptly, the diff will be
reverted.
PR must be kept on point. It makes code review easier and make history
easier to follow. In addition, this ensure that each diff can be
reverted cleanly with as little risk of conflict as possible. This is
important because we want to keep master in good shape at all time so we
need to be able to revert diff breaking something easily.
It is advisable to have a bot merge code rather than the committer
him/herself. The bot can rebase the code on top of master, squash it,
run the tests and commit if the process is successful, or report the
problem if it isn’t. This avoid merging code conflict with something
that was committed in between when the patch was created and when it was
accepted.
It isn’t because all tests are green that everything works properly. It
is advisable to have a build machine creating a new build on a regular
basis, run it in a monitored environment and report any abnormality.
It is a good idea to keep slow tests out of the test suite that run on
each PR. This slow down development and has diminishing return. A good
alternative is to have a machine pulling master and running the slow
tests on a regular basis. When one of these fails, the machine can
bisect to find what commit broke it and report it. This allow developer
to move faster, and, if PR are kept on point, it won’t be too hard to
revert the offending code.
Rebasing/squashing is a better alternative to merge. If a PR is made of
several commits, it either indicate that it is doing several thing and
should be split in several PRs, or it is doing one thing and each commit
present an incomplete picture of the change. Having a linear history
with focused commits makes it much easier to understand for newcomers.
It is also much better to work with when making tooling as the ones
mentioned above.
When something fails in master, while no tests found about it, it is
good to have an incident review. How did the problem happen, how come it
wasn’t detected, and coming up with actionable items to avoid or
mitigate the problem is the goal of the review. Good direction usually
are: make the problem impossible by design - for instance via
refactoring, add testing for this specific case, add monitoring in
pre-production for this kind of scenarii, etc…
It is often preferable to make failing cheap rather than prevent
failure. Technique which prevent failure tends to be expensive in other
ways such as development slowness and usually are self defeating in the
long run - for instance development slowness will cause the counter
measure to be implemented slowly, and may even cause the project to
slowly die compared to the competition. Slowing down dev is *ALWAYS* a
losing strategy.
Grilling the master and release branch will all the available tooling is
a good idea. We should try to setup instrumented clang build (ASAN,
UBSAN, TSAN) as soon as possible.
It is good to avoid time consuming tasks such as formatting. Providing a
clang-format config is a good idea. having a linter flag style problem
in PR before any reviewer waste time on it is also a good idea. The
developer gets immediate feedback and the committer do not need to spend
time on these problems.
Automate, automate, automate, automate, …
Do not accept PR that come without tests without a good reason to do so.
Try to figure out what the developer has done to test its changes.
“Shotgun diffs” (diff that touches many files) soon before the branch
cut. This will reduce the number of merge conflict into the release
branch.
Release early, release often. This keeps discrepancy between ongoing dev
and code in the wild low, and allow for real world feedback on the work
being done. Contrary to intuitive expectation, this also reduce the
number of defects. The more unique change there is in a release, the
more likely something surprising is going to happen and the harder it
will be to track it down.
Any committer can accept any PR. However, it is a good idea to let
committer with more expertise in the area review it. Making this a
guideline rather than a rule allow for these reviewers to be bypassed if
they can’t handle the load for some reason. This way, nobody becomes a
bottleneck.
The release manager should be working on the project full time.