Code Review: Rietveld - Google Code Review Tool

liuxue.gu@hotmail.com · November 18, 2011 · 22 views

An Open Source App: Rietveld Code Review Tool

Guido van Rossum
May 2008

Introduction

My first project as a Google engineer was an internal web app for code review. According to Wikipedia, code review is "systematic examination (often as peer review) of computer source code intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers' skills." Not an exciting topic, perhaps, but the internal web app, which I code-named Mondrian after one of my favorite Dutch painters, was an overnight success among Google engineers (who evidently value software quality and skills development :-). I even gave a public presentation about it: you can watch the video on YouTube.

I've always hoped that we could release Mondrian as open source, but so far it hasn't happened: due to its popularity inside Google, it became more and more tied to proprietary Google infrastructure like Bigtable, and it remained limited to Perforce, the commercial revision control system most used at Google. Fortunately, now that I work for the Google App Engine team, I've been able to write a new web app that incorporates many ideas (and even some code!) from Mondrian, and release it as open source.

The Python open source community has been trying out Rietveld for the past few days, and has already been using it to do code reviews for Python (as well as providing valuable feedback in the form of bug reports and feature requests). Of course, the tool is not language-specific: you can use it for code reviews for any language!

Introducing Rietveld

To stick with the naming theme, I gave this new web app the code name Rietveld, after Gerrit Rietveld, one of my favorite Dutch architects and the designer of the Zig-Zag chair. However, because most English speakers have trouble spelling his name correctly, the "live" web app is known simply as codereview.appspot.com.
The Rietveld app serves several purposes at once: it is a demo of fairly large-scale use of the popular web framework Django with App Engine, it makes some of the trickier (but portable) code we wrote for Mondrian available for reuse under the Apache 2.0 license, and it makes web-based code review available for many projects using Subversion repositories. Right now, any project hosted on Google Code can use Rietveld, as well as the Python subversion server. Support for arbitrary subversion servers is forthcoming.

While a public instance of Rietveld is running at codereview.appspot.com, organizations are of course free to run their own instance restricted and/or tailored to their own needs. That's what open source is for!

How Rietveld Manages Code Reviews

So what can you do with Rietveld? The basic workflow is:

1. Developer makes some changes in their Subversion workspace.
2. Developer uploads a patch in the form of svn diff output to Rietveld, using a small script named upload.py. This creates a new issue for them on the Rietveld website.
3. Developer goes to the issue that was just created on the Rietveld site, adds the email addresses of one or more reviewers, and causes Rietveld to send an email to the reviewer(s).
4. Reviewer navigates to the issue on the Rietveld site, browses the side-by-side diffs linked from there. A side-by-side diff shows the old and new version of the source code side by side, with deleted text on the left marked with a light red background, and inserted text on the right marked with a light green background. (Two different shades of red and green each are used, to highlight the differences at a finer-grain level than blocks of lines. This helps find one-character changes and clarifies diffs that just reflow a lot of text.)
5. Reviewer inserts inline comments directly into the side-by-side diffs, by double-clicking lines on which they want to comment. Inline comments are initially created in draft mode, which means that only the comment author can see (and edit) them.
6. Reviewer publishes comments, making them visible to everyone else, and sending an email to the developer (and to other reviewers) summarizing the inline comments with a little bit of context.

At this point, the developer can reply to inline comments directly on the Rietveld website using exactly the same mechanism as used by the reviewer. Replies simply become additional inline draft comments.

The developer can also revise their code and upload a new version of the patch. The new version is attached to the same issue, and reviewers can choose to view the diffs afresh, or view the delta between the new and the old version of the patch. The latter feature is particularly helpful for large code reviews that require several iterations to reach agreement between developer and reviewer: the reviewer doesn't have to re-review stuff that didn't change between revisions and was already approved.

Coming Developments

I'm far from done with this application. Some features found in Mondrian that would be useful in Rietveld as well have not been implemented yet due to time constraints. The first users are also already asking for features I had never dreamed of, thanks to the many different styles of development found in the open source world.

I have made the source code available as open source in this early stage in the hope to solicit outside contributions. My intention is to add outside developers to the project as soon as possible. As with Python, I am planning to remain in charge and review contributions carefully.

Links

Once more, codereview.appspot.com is the live web app, ready for your visits. In the project home directory you can find some documentation and the complete source code. For discussions, I've set up a Google Group: codereview-discuss. In the bug tracker you can submit bugs and suggest feature requests. Finally, here is the upload.py script mentioned above. Use Python 2.5.2 (or newer) to run it.

http://code.google.com/appengine/articles/rietveld.html

"Open source" literally means that the source code of each project can be accessed by project members and other users. It is a tradition in many open source projects to frequently review the source code as changes are made. This is a good practice because it helps catch software defects that might be hard to discover through testing and debugging. Code reviews also help the members of the project team stay aware of each other's changes and aligned with the goals of the project. Also, participating in code reviews can be a great way for people to improve their software development skills.

Project hosting on Google Code offers a code review feature that is integrated into source code browsing. It currently supports reviewing code after that code has been committed to the repository. Reviewing committed code naturally leads to discussions about further commits for further improvements.

Review comments

You must be logged in to enter a code review comment. By default, the members of a project are the only people who may leave review comments. However, a project owner may choose to disable all reviews, or allow non-members to do reviews as well.

Review comments are made in the context of a source code revision or an assigned review issue. If you view the list of recent source code changes and drill into the detail page for one revision or assigned review, you will see any review comments entered there by other users. If you navigate to a different revision, you will see a different set of comments, if any have been entered there. Review comments are kept as part of the history of the project: if a problem is found during review of one revision, it might be fixed in a later revision of the code, but the comment that pointed it out will always remain on the earlier revision.

Review comments have three parts, and each part is optional:

- A set of line-by-line comments that are made on individual lines of any source code file
- A general comment on the entire source code revision
- A score: positive, neutral, or negative

To leave line-by-line comments, browse any source code file and then double-click on a source code line. You will see a text field where you can enter your comment on that line. For example, if you notice a while-loop that can become an infinite loop in certain situations, you could comment on the line that has the while-loop condition. Line-by-line comments are drafts until you publish them. That allows you to revise comments as you work through the review. For example, you might notice that the infinite loop situation is prevented by some other invariant condition, so you might revise or delete your initial draft.

Line-by-line comments can also be entered on the side-by-side diff page. You can double-click on any line of the old or new file to leave a comment there. You will most often want to leave comments on the new versions of modified lines on the right-hand side, but you may also decide to leave comments on the old version, or on unchanged lines that are indirectly affected by the change. All comments made while on the diff page are part of your review of the new revision, even if you make them on the left-hand side.

Both the source file browsing page and the diff page offer a drop-down menu that allows you to navigate among the files modified in a given revision. You will normally want to look at all files in a revision before publishing your comments.

Once you have entered and revised all your line-by-line comments, click the "Publish your comments" link to go to the revision details page. There you will see all your draft comments and have the opportunity to enter a general comment and to summarize your feelings about the revision with a score. Once your comments are published, you can no longer edit them, but you can delete or undelete them if needed.

Stars and notifications

When one user publishes his or her review comments, an email notification is sent to all other users who have starred that revision or assigned review. A project owner may also configure the tool to send a notification of every review comment to one address, which would normally be a mailing list. Individual users may opt out of these notification messages by using the Settings tab of their Profile page.

Users can star a revision in three ways:

- The author of each revision automatically stars that revision
- Users may click the star icon in the source changes list or revision detail page
- When submitting a comment on a revision that is not already starred, a checkbox offers to star the revision

If you receive a code review comment notification email, and you want to respond, you can visit the web site to leave more comments. Alternatively, you can discuss it on your project mailing list. Replying to the notification email itself does not record your response in the code review tool.

Keyboard shortcuts

On the revision detail page:

Key  Action
j    Select next changed file in revision
k    Select previous changed file in revision
o    Open the diff view on the selected file

On the source code diff page or source file browsing page:

Key  Action
j    Select next changed file in revision
k    Select previous changed file in revision
n    Go to the next diff chunk or comment
p    Go to the previous diff chunk or comment
u    Go up to the revision detail page
r    Go up to the revision detail page and scroll to the review comments form

Administrative options

Project owners may use the "Administer" tab and the "Source" subtab to configure:

- The types of users who may enter code review comments
- An address where all code review notification emails should be sent

Effective code reviews

It is important to keep in mind that code reviews are supposed to help advance the project. The best comments are ones that can be acted on to improve the code.

Some things to keep in mind:

- Code reviews are about the code, not the author. Everyone makes mistakes sometimes.
- Use a written style guide to resolve matters of source code style. It's easiest if everyone just follows the style guide.
- Make a shared checklist of things to look for in all reviews, e.g., memory allocation conventions and input validation.

Requesting Reviews

Ask for 1 to 2 reviewers

While it may seem more effective to cast a wide net to find a reviewer by requesting an entire mailing list or large sub-group to review a change, it actually tends to reduce the response rate. The more people included on a review, the more everyone thinks someone else will get to that review before they do. In the end, your review request is more likely to be ignored.

If you must have multiple reviewers, make it clear what the expectations are for each reviewer. Either noting the expectation in the review request, or contacting them through some other channel can be effective.

Find a reviewer familiar with the code

Finding a good code reviewer can be a little tricky, especially if you are new to a particular project. Two ways to help narrow down good reviewer candidates are to look at the authors that have recently modified a file, using the svn log command, and to examine a file to find the last person to change related lines, using the svn blame command.
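As a rough illustration of the svn log approach, the sketch below tallies commit authors from the XML that `svn log --xml` prints. The sample log, the author names and the function name `recent_authors` are invented for this example; in practice the XML would come from running `svn log --xml` on the file you are changing.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample of `svn log --xml` output for a single file.
SAMPLE_LOG = """<log>
  <logentry revision="33">
    <author>alice</author>
    <date>2008-12-16T10:00:00.000000Z</date>
    <msg>working on easy_install</msg>
  </logentry>
  <logentry revision="31">
    <author>bob</author>
    <date>2008-12-10T09:30:00.000000Z</date>
    <msg>fix Makefile</msg>
  </logentry>
  <logentry revision="28">
    <author>alice</author>
    <date>2008-12-01T14:15:00.000000Z</date>
    <msg>initial import</msg>
  </logentry>
</log>"""


def recent_authors(log_xml):
    """Return (author, commit_count) pairs, most frequent committer first."""
    root = ET.fromstring(log_xml)
    counts = Counter(
        entry.findtext("author", default="(unknown)")
        for entry in root.iter("logentry")
    )
    return counts.most_common()


for author, commits in recent_authors(SAMPLE_LOG):
    print(author, commits)
```

The most frequent recent committers are only candidates; whether they have time for a review is a separate question.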

One other common place to look is the AUTHORS or README files within a project. These may not be quite as relevant, but can be useful in finding a point of contact.

Consider a reviewer's history

When selecting a reviewer, consider their past review history. Some reviewers respond faster, others are more thorough, and a few manage both. Finding a reviewer who cares about the particular area of the change is important and will generally lead to faster and more helpful reviews.

Warn reviewers about big changes

If you will be sending the reviewer a large change to review, warn them ahead of time and set expectations on the amount of time required and when the reviewer will have time to perform the review. A large review that arrives without notice can come across as not respecting the reviewer's time.

Factor time into reviews

If your reviewer happens to be in a different time zone, be aware of the times they are likely to perform reviews. If the reviewer is multiple time zones away, sending them a review at the end of a coding session can work well since they can complete it before you return the next day. By the same token, sending it earlier in the day to that same reviewer can result in a longer wait than sending it to someone in your own time zone.

Request smaller reviews

If a change can be easily broken up into smaller changes, sending multiple small changes is considerably easier on the reviewer than sending one large one. Don't split up a change that is one cohesive unit, even if it is large.

When creating a new project on Google Code, you need to choose between Subversion, Mercurial, and Git as a version control system (VCS). There is no right answer to picking a VCS. They are all easy to use, have a large community behind them, and are used by popular projects. They have demonstrated themselves fit for their purpose. It is therefore likely that your project will be productive regardless of your choice. Typically, most projects pick a VCS based on their team members' preference and the team's perceived workflow.

Subversion

Subversion is a well-known centralized VCS and the most widely used VCS for open source projects.

The largest benefit of using Subversion is that it's familiar to most developers, so they can start contributing to projects immediately without having to figure out a new VCS. Subversion can scale to support larger projects, but it works especially well for small/medium sized projects. Subversion also works well for teams where most software contributions are expected from other team members. Unlike distributed version control systems (DVCS), however, team members cannot make check-ins when they are offline. For more information on Subversion, you might start with the free online Subversion book.

Mercurial/Git

Mercurial and Git, like Bazaar, are Distributed Version Control Systems (DVCS) that enable developers to work offline and define more complex workflows such as peer-to-peer pushing/pulling of code.

DVCS makes it easier for outside contributors to contribute to projects, as cloning and merging of remote repositories is very easy. Large projects with multiple developers and external contributors benefit the most from DVCS because of the ease of branching and tagging. Smaller projects typically only experience the benefit of being able to work offline. For a great (and fun) tutorial on Mercurial, take a look at http://hginit.com. For Git, try http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html.

Overview

Post-Commit Web Hooks allow projects to set up web services that receive project commit notifications from Google Code. Such services could be used to integrate external tools including continuous build systems, bug trackers, project metrics, and social networks.

Details

Project owners may enable this feature by specifying a target URL in the Administer/Source tab. If the URL contains the special patterns "%p" and "%r", those will be automatically replaced for each commit with the project name and comma-separated list of revisions, respectively.
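To make the substitution concrete, here is a small sketch of how a target URL containing "%p" and "%r" would be expanded. It only mirrors the behaviour described above; the function name `expand_hook_url` and the example.com URL are hypothetical, and Google Code's actual server-side code is not shown in this document.

```python
def expand_hook_url(template, project_name, revisions):
    """Replace %p with the project name and %r with comma-separated revisions."""
    revision_list = ",".join(str(rev) for rev in revisions)
    return template.replace("%p", project_name).replace("%r", revision_list)


# Hypothetical target URL as it might be entered in the Administer/Source tab.
url = expand_hook_url(
    "http://example.com/hook?project=%p&revs=%r",
    "atlas-build-tool",
    [33, 34],
)
print(url)
```

A URL without either pattern passes through unchanged, so the patterns are optional.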

The POST request payload describes the commit using the Web Hooks model, and consists of a UTF-8-encoded JSON dictionary in the following format:

    {
      "project_name": "atlas-build-tool",
      "repository_path": "http://atlas-build-tool.googlecode.com/svn/",
      "revision_count": 1,
      "revisions": [
        {
          "revision": 33,
          "url": "http://atlas-build-tool.googlecode.com/svn-history/r33/",
          "author": "mparent61",
          "timestamp": 1229470699,
          "message": "working on easy_install",
          "path_count": 4,
          "added": ["/trunk/atlas_main.py"],
          "modified": ["/trunk/Makefile", "/trunk/constants.py"],
          "removed": ["/trunk/atlas.py"]
        }
      ]
    }

While we will make a best effort to promptly deliver all Post-Commit Web Hook notifications, messages are not guaranteed to be delivered, may arrive multiple times, and may not arrive in order of commit. All requests have a 15 second timeout. If we fail to reach the specified URL, we will retry several times over a 24 hour period. This allows your services to be down for short maintenance windows and still receive all messages.
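Because a notification may be repeated or arrive out of order, a receiving service probably wants to be idempotent. The sketch below shows one way to do that, assuming revisions can be keyed by project name plus revision identifier and ordered by their commit timestamp. The class name `CommitInbox` is hypothetical, and the in-memory dictionary stands in for whatever persistent storage a real service would use.

```python
class CommitInbox:
    """Collects revisions from hook payloads, tolerating repeats and reordering."""

    def __init__(self):
        # (project_name, revision id as text) -> revision dictionary
        self._seen = {}

    def accept(self, payload):
        """Store revisions not seen before; return how many were new."""
        new = 0
        for rev in payload["revisions"]:
            # str() lets numeric SVN and string Hg identifiers share one key type.
            key = (payload["project_name"], str(rev["revision"]))
            if key not in self._seen:
                self._seen[key] = rev
                new += 1
        return new

    def in_commit_order(self):
        """Return the stored revisions sorted by commit timestamp."""
        return sorted(self._seen.values(), key=lambda rev: rev["timestamp"])


inbox = CommitInbox()
later = {"project_name": "atlas-build-tool",
         "revisions": [{"revision": 34, "timestamp": 1229470800}]}
earlier = {"project_name": "atlas-build-tool",
           "revisions": [{"revision": 33, "timestamp": 1229470699}]}

first = inbox.accept(later)     # arrives first although committed second
second = inbox.accept(earlier)
repeat = inbox.accept(later)    # duplicate delivery is ignored
```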

Web services should respond to the POST request with a 2XX response code to indicate successful delivery. Redirects (3XX response codes) are not followed, and no further delivery attempts will be made. All other response codes, as well as request timeouts, are treated as failures and will be retried.
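The response rules above can be summarised as a small decision function. This is only a restatement of the documented behaviour from the sender's point of view, not Google Code's implementation; the function name `delivery_outcome` and the three outcome labels are made up for the example.

```python
def delivery_outcome(status_code=None, timed_out=False):
    """Classify one delivery attempt according to the rules described above."""
    if timed_out or status_code is None:
        return "retry"          # request timeouts are treated as failures
    if 200 <= status_code < 300:
        return "delivered"      # any 2XX indicates successful delivery
    if 300 <= status_code < 400:
        return "abandoned"      # redirects are not followed and not retried
    return "retry"              # every other response code is a failure


for code in (200, 302, 500):
    print(code, delivery_outcome(code))
```

One practical consequence: if the hook URL redirects (for example to add a trailing slash), the notification is dropped, so the exact final URL should be registered.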

Note: Notifications for commits via the 'svnsync' command are not yet supported.

Notification Format

The payload's JSON dictionary contains the following items:

Field            Type    Description
project_name     String  The name of the project.
repository_path  String  The project's repository URL.
revision_count   Number  Number of revisions contained in the 'revisions' list.
revisions        List    A list of dictionaries describing 1 or more repository commits (see table below).

Each revision contained in the 'revisions' list is a dictionary with the following items:

Field       Type                        Description
revision    Number (SVN) / String (Hg)  Repository identifier for this commit.
url         String                      URL to browse repository history for this revision.
author      String                      Username responsible for commit.
timestamp   Number                      Repository commit timestamp.
message     String                      Commit log message.
path_count  Number                      Total number of paths modified in this revision. Only a fixed number of paths will be included per revision, so this number is used to determine whether a partial list was sent.
added       List                        A list of String paths added by this revision.
modified    List                        A list of String paths modified by this revision.
removed     List                        A list of String paths removed by this revision.

It is important to note that a revision's list of changed paths will be truncated for large commits in order to limit message sizes. Full commit information can be obtained using standard repository tools.
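A receiver can use path_count to notice when the path lists were cut short. The sketch below assumes that path_count covers added, modified and removed paths together, which matches the sample payload (1 + 2 + 1 = 4) but is not stated explicitly. The function names are hypothetical.

```python
def listed_path_count(revision):
    """Count the paths actually present in the notification."""
    return sum(len(revision.get(key, [])) for key in ("added", "modified", "removed"))


def is_truncated(revision):
    """True when path_count reports more paths than the lists contain."""
    return revision["path_count"] > listed_path_count(revision)


# The revision from the sample payload: all four paths are listed.
sample = {
    "revision": 33,
    "path_count": 4,
    "added": ["/trunk/atlas_main.py"],
    "modified": ["/trunk/Makefile", "/trunk/constants.py"],
    "removed": ["/trunk/atlas.py"],
}

# A made-up large commit where only part of the path list was sent.
large = {"revision": 34, "path_count": 500, "added": ["/trunk/a.py"],
         "modified": [], "removed": []}

print(is_truncated(sample), is_truncated(large))
```

When a revision is truncated, the service would fall back to the repository itself for the full list of changed paths.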

Example: Processing a notification using a Python AppEngine service

    import logging

    from django.utils import simplejson
    from google.appengine.ext import webapp


    class Listener(webapp.RequestHandler):
        def post(self):
            # The request body is the JSON payload described above.
            payload = simplejson.loads(self.request.body)
            for revision in payload["revisions"]:
                logging.info("Project %s, revision %s contains %s paths",
                             payload["project_name"],
                             revision["revision"],
                             revision["path_count"])

Authentication

Post-Commit Web Hooks use HMAC-MD5 to authenticate requests. Every project has a unique post-commit 'secret key', visible to project owners in the Administer/Source tab. This key is used to seed the HMAC-MD5 algorithm. Each POST request header contains an HMAC used to authenticate the payload. This value is a 32-character hexadecimal string contained in the 'Google-Code-Project-Hosting-Hook-Hmac' header. To authenticate a request, compute the HMAC-MD5 of the request body with your project's secret key and compare the result with the HMAC value in the header.

Example: Authentication using a Python AppEngine service

    import hmac
    import logging

    from google.appengine.ext import webapp


    class Listener(webapp.RequestHandler):
        def post(self):
            project_secret_key = "0123456789abcdef"  # From Administer/Source tab
            # With no digest argument, Python 2's hmac.new() uses MD5.
            m = hmac.new(project_secret_key)
            m.update(self.request.body)
            digest = m.hexdigest()
            if digest == self.request.headers["Google-Code-Project-Hosting-Hook-Hmac"]:
                print "Authenticated"
            else:
                print "Authentication failed!"
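The example above targets the Python 2 App Engine runtime. For readers on Python 3, the same check can be sketched with only the standard library. This is not part of the original documentation: the function name `is_authentic` is hypothetical, the key and body must be bytes, and `hmac.compare_digest` is used in place of `==` as a precaution against timing differences.

```python
import hashlib
import hmac

HMAC_HEADER = "Google-Code-Project-Hosting-Hook-Hmac"


def is_authentic(secret_key, body, received_hmac):
    """Check a request body (bytes) against the HMAC-MD5 hex value from the header."""
    expected = hmac.new(secret_key, body, hashlib.md5).hexdigest()
    return hmac.compare_digest(expected, received_hmac)


# Illustrative values only; the real key comes from the Administer/Source tab.
secret = b"0123456789abcdef"
body = b'{"project_name": "atlas-build-tool", "revision_count": 1}'
header_value = hmac.new(secret, body, hashlib.md5).hexdigest()

print(is_authentic(secret, body, header_value))         # untouched body
print(is_authentic(secret, body + b" ", header_value))  # altered body
```

In a web framework, `body` would be the raw request bytes and `received_hmac` the value of the header named in `HMAC_HEADER`.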

Continuous Integration with Hudson

Hudson, a continuous integration system, can be used to automate the build for Java projects (among others such as .NET projects). Projects built with Hudson can be triggered using Google Code's Post-Commit Web Hooks.

After Hudson has been set up, builds can be triggered on every commit by using the following URL on your Hudson host as the Post-Commit URL:

    http://YOURHOST/hudson/job/PROJECTNAME/build

For more information, refer to the Hudson documentation.
