Skip to main content

Augmenting mailing-lists with Patchwork - Another try

The mailing-list problem


Many software projects use mailing-lists, which usually means mailman, not only for discussions around that project, but also for code contributions. A lot of open source projects work that way, including the one I interact with the most, the Linux kernel. A contributor sends patches to a mailing list, these days using git send-email, and waits for feedback or for his/her patches to be picked up for inclusion if fortunate enough.

Problem is, mailing-lists are awful for code contribution.

A few of the issues at hand:
  • Dealing with patches and emails can be daunting for new contributors,
  • There's no feedback that someone will look into the patch at some point,
  • There's no tracking of which patch has been processed (eg. included into the tree). A shocking number of patches are just dropped as a direct consequence,
  • There's no way to add metadata to a submission. For instance, we can't assign a reviewer from a pool of people working on the project. As a result, review is only working thanks to the good will of people. It's not necessarily a bad thing, but it doesn't work in a corporate environment with deadlines,
  • Mailing-lists are all or nothing: one subscribes to the activity of the full project, but may only care about following the progress of a couple of patches,
  • There's no structure at all actually, it's all just emails,
  • No easy way to hook continuous integration testing,
  • The tools are really bad any time they need to interact with the mailing-list: try to send a patch as a reply to a review comment, addressing it. It starts with going to look at the headers of the review email to copy/paste its Message-ID, followed by an arcane incantation:
    $ git send-email --to=<mailing-list> --cc=<reviewer> \
                     --in-reply-to=<reviewer-mail-message-id> \
                     --reroll-count 2 -1 HEAD~2

Alternative to mailing-lists


Before mentioning Patchwork, it's worth saying that a project can merely decide to switch to using something else than a mailing-list to handle code contributions; To name a few: Gerrit, Phabricator, Github, Gitlab, Crucible.

However, there can be some friction preventing the adoption those tools. People have built their own workflow around mailing-lists for years and it's somewhat difficult to adopt anything else over night. Projects can be big with no clear way to make decisions, so sticking to mailing-lists can just be the result of inertia.

The alternatives also have problems of their own and there's no clear winner, nothing like how git took over the world.


Patchwork


So, the path of least resistance is to keep mailing-lists. Jemery Kerr had the idea to augment mailing-lists with a tool that would track the activity there and build a database of patches and their status (new, reviewed, merged, dropped, ...). Patchwork was born.

Here are some Patchwork instances in the wild:

The KMS and DRI Linux subsystems are using freedesktop.org to host their mailing-lists, which includes the i915 Intel driver, project I've been contributing to since 2012. We have an instance of Patchwork there, and, while somewhat useful, the tool fell short of what we really wanted to do with our code contribution process.

Patches are welcome!


So? it was time to do something about the situation and I started improving Patchwork to answer some of the problems outlined above. Given enough time, it's possible to help on all fronts.

The code can be found on github, along with the current list of issues and enhancements we have thought about. I also maintain freedesktop.org's instance for the graphics team at Intel, but also any freedesktop.org project that would like to give it a try.


Design, Design, Design


First things first, we improved how Patchwork looks and feels. Belén, of OpenEmbedded/Yocto fame, has very graciously spent some of her time to rethink how the interaction should behave.

Before, ...

... and after!

There is still a lot of work remaining to roll out the new design and the new interaction model on all of Patchwork. A glimpse of what that interaction looks like so far:



Series


One thing was clear from the start: I didn't want to have Patches as the main object tracked, but Series, a collection of patches. Typically, developing a  new feature requires more than one patch, especially with the kernel where it's customary to write a lot of orthogonal smaller commits rather than a big (and often all over the place) one. Single isolated commits, like a small bug fix, are treated as a series of one patch.

But that's not all. Series actually evolve over time as the developer answers review comments and the patch-set matures. Patchwork also tracks that evolution, creating several Revisions for the same series. This colour management series from Lionel shows that history tracking (beware, this is not the final design!).

I have started documenting what Patchwork can understand. Two ways can be used to trigger the creation of a new revision: sending a revised patch as a reply to the reviewer email or resending the full series with a similar cover letter subject.

There are many ambiguous cases and some others cases not really handled yet, one of them being sending a series as a reply to another series. That can be quite confusing for the patch submitter but the documented flows should work.

REST API


Next is dusting off Patchwork's XML-RPC API. I wanted to be able to use the same API from both the web pages and git-pw, a command line client.

This new API is close to complete enough to replace the XML-RPC one and already offers a few more features (eg. testing integration). I've also been carefully documenting it.

git-pw


Rob Clark had been asking for years for a better integration with git from the Patchwork's command line tool, especially sharing its configuration file. There also are a number of git "plugins" that have appeared to bridge git with various tools, like git-bz or git-phab.

Patchwork has now his own git-pw, using the REST API. There, again, more work is needed to be in an acceptable shape, but it can already be quite handy to, for instance, apply a full series in one go:

$ git pw apply -s 122
Applying series: DP refactoring v2 (rev 1)
Applying: drm/i915: Don't pass *DP around to link training functions
Applying: drm/i915: Split write of pattern to DP reg from intel_dp_set_link_train
Applying: drm/i915 Call get_adjust_train() from clock recovery and channel eq
Applying: drm/i915: Move register write into intel_dp_set_signal_levels()
Applying: drm/i915: Move generic link training code to a separate file
...

Testing Integration



This is what kept my busy the last couple of months: How to integrate patches sent to a mailing-list with Continuous Integration systems. The flow I came up with is not very complicated but a picture always helps:

Hooking tests to Patchwork


Patchwork is exposing an API so mailing-lists are completely abstracted from systems using that API. Both retrieving the series/patches to test and sending back test results is done through HTTP. That makes testing systems fairly easy to write.

Tomi Sarvela hooked our test-suite, intel-gpu-tools, to patches sent to intel-gfx and we're now gating patch acceptance to the kernel driver with the result of that testing.

Of course, it's not that easy. In our case, we've accumulated some technical debt in both the driver and the test suite, which means it will take time to beat both into be a fully reliable go/no-go signal. People have been actively looking at improving the situation though (thanks!) and I have hope we can reach that reliability sooner rather than later.

As a few words of caution about the above, I'd like to remind everyone that the devil always is in the details:
  • We've restricted the automated testing to a subset of the tests we have (Basic Acceptance Tests aka BATs) to provide a quick answer to developers, but also because some of our tests aren't well bounded,
  • We have no idea how much code coverage that subset really exercises, playing with the kernel gcov support would be interesting for sure,
  • We definitely don't deal with the variety of display sinks (panels and monitors) that are present in the wild.
This means we won't catch all the i915 regressions. Time will definitely improve things as we connect more devices to the testing system and fix our tests and driver.

Anyway, let's leave i915 specific details for another time. A last thing about this testing integration is that Patchwork can be configured to send emails back to the submitter/mailing-list with some test results. As an example, I've written a checkpatch.pl integration that will tell people to fix their patches without the need of a reviewer to do it. I know, living in the future.

For more in-depth documentation about continuous testing with Patchwork, see the testing section of the manual.

What's next?


This blog post is long enough as is, let's finish by the list of things I'd like to be in a acceptable state before I'll happily tag a first version:
  • Series support without without known bugs
  • REST API and git pw able to replace XML-RPC and pwclient
  • Series, Patches and Bundles web pages ported to the REST API and the new filter/action interaction.
  • CI integration
  • Patch and Series life cycle redesigned with more automatic state changes (ie. when someone gives a reviewed-by tag, the patch state should change to reviewed)
There are plenty of other exciting ideas captured in the github issues for when this is done.

Links




Comments

  1. Awesome! We will probably give it a try for our Osmocom.org projects. Will series/patches auto-close when they end in master? Have you considered using something like the Gerrit Change-Id to uniquely identify patches of a series?

    ReplyDelete
  2. Hey!

    Well, most patches should "auto-close" with the post-receive commit hook (there are still cases where the patch context has changed a tiny bit and the current hash-based algorithm doesn't match the patch to close), but series still don't have a status (the series status needs to be derived from the underlying status of its patches). That's totally part of what's next and I'll hopefully have some time to work on that soon-ish.

    As for Change-Id, the URL to series/patches uniquely identify those. As a matter of fact, we've started using them in i915's commits (see the Link tag in https://cgit.freedesktop.org/drm-intel/commit/?id=2abc525bf5c62fd1f2a2994e5231842221dfdddb)

    While we use the URL with the corresponding patch message-id, it's possible to just use the per-instance unique id. In the commit above, that'd be:

    Link: https://patchwork.freedesktop.org/patch/75893/

    If the patches are tagged this way, the git post-receive hook could easily auto-close the corresponding patch without any additional smart.

    ReplyDelete
  3. Nice work! This looks awesome. I'd love to have this for the libvirt project where patches seem to miss review on the regular

    But are there any plans for integrating this into upstream patchwork?

    ReplyDelete
  4. Unfortunately, what has become "upstream" decided my approach wasn't suitable and has started to rewrite what I've done:

    https://lists.ozlabs.org/pipermail/patchwork/2016-March/002431.html

    To be fair, I rewrote their testing integration because I found their approach underwhelming.

    Full disclosure then, there are 2 active maintainers and so 2 incompatible branches, in the pure Open Source tradition. As always the users suffer :(

    ReplyDelete

Post a Comment

Popular posts from this blog

Building and using coverage-instrumented programs with Go

tl;dr We can create coverage-instrumented binaries, run them and aggregate the coverage data from running both the program and the unit tests.

In the Go world, unit testing is tightly integrated with the go tool chain. Write some unit tests, run go test and tell anyone that will listen that you really hope to never have to deal with a build system for the rest of your life.

Since Go 1.2 (Dec. 2013), go test has supported test coverage analysis: with the ‑cover option it will tell you how much of the code is being exercised by the unit tests.

So far, so good.

I've been wanting to do something slightly different for some time though. Imagine you have a command line tool. I'd like to be able to run that tool with different options and inputs, check that everything is OK (using something like bats) and gather coverage data from those runs. Even better, wouldn't be neat to merge the coverage from the unit tests with the one from those program runs and have an aggregated view of …

A git pre-commit hook to check the year of copyright notices

Like every year, touching a source file means you also need to update the year of the copyright notice you should have at the top of the file. I always end up forgetting about them, this is where a git pre-commit hook would be ultra-useful, so I wrote one:# # Check if copyright statements include the current year # files=`git diff --cached --name-only` year=`date +"%Y"` for f in $files; do head -10 $f | grep -i copyright 2>&1 1>/dev/null || continue if ! grep -i -e "copyright.*$year" $f 2>&1 1>/dev/null; then missing_copyright_files="$missing_copyright_files $f" fi done if [ -n "$missing_copyright_files" ]; then echo "$year is missing in the copyright notice of the following files:" for f in $missing_copyright_files; do echo " $f" done exit 1 fiHope this helps!