Quality Management: How Google Tests Software

laofo · 2012-07-31

HuangLi, reposted from: http://sdet.org/?p=82

Translator's Preface to "How Google Tests Software", posted on 2012-02-21 by HuangLi

There is a chapter in Wu Jun's "On Top of Tides" (浪潮之巅) that surveys the great technology companies of the decades since the 1950s. Without exception, the driving force behind these companies has been technology, and at the root of technology are people. Whether at Genentech or at Big Blue, there is a core of people doing technical work, some of whom have been in the industry for twenty years and still insist on hands-on engineering. It is these people who keep pushing the wave ever higher.

In China, most people who have just started working, especially new graduates or those who have just switched into the computer field, greatly admire technology. In a software company in particular, shipping an elegant interface or an impressive system feels like a seriously cool thing to do. But as time passes and the more senior engineers move on or get promoted, you suddenly find that you are now one of the strongest technical people on the team. Then, with some "opportunity", you start as a team lead and slowly walk down the "management" path, drifting further and further from technology. Occasionally you look back with regret and want to chase it again, but you no longer can: technology moves faster than you can imagine. It almost seems like a law, like the dynasties of the past two thousand years, replaced iteration after iteration. To this day it is hard to find engineers with more than ten years of hands-on experience at most Chinese technology companies; look at the speakers at any developer conference and most are in their twenties or thirties. So I deeply admire those who stay on the technical front line, and I believe they will be the fundamental force that pushes Chinese companies to the top of the tide. If one day you stand at the crossroads of your career path, do not hesitate: choose technology. Roughly, the reasons are:

  1. Technology is the fundamental driver of social and scientific progress.
  2. Ask yourself honestly whether you really like your current state, or were pushed into the position.
  3. Technology is the real source of insight into front-line problems; the people who can hear the gunfire are the engineers at the front.
  4. Management is mostly about managing people. Compared with working with machines and programs it is vaguer and harder to do well, and in the current domestic environment it easily gets distorted.
  5. The job market has far more technical openings than management openings; without long-term planning, short-term trouble follows.
  6. In compensation, technical people are valued more and more.

OK, let's get to the point and talk about testing. Software development, especially on the internet, has been chasing rapid iterative releases in recent years. Gmail and Sina Weibo, for example, both ran in production as Beta versions. Anyone who knows the software development process understands that a Beta label implies potential bugs and issues, still some distance from an official release, yet the product goes live carrying those risks. Why? Is it because no systematic testing was done? The real reason is risk control, and I will go into the details later. But the practice has bred some misconceptions about testing:

  1. These applications were not tested well; many features are broken in actual use.
  2. The testers' skills are limited; they failed to find latent problems in time.
  3. Testing itself involves little technology; it is basically functional verification: click around with a mouse, set up an environment and check.
  4. Anyone careful, thorough and responsible can do testing well.

Misconceptions like these keep many people who do not know better, especially new graduates, from ever considering testing in their career plans, and they give our recruiting HR a serious headache. Through this series of discussions I hope to make clear that doing testing well is neither as simple as it looks nor as superficial; it is in fact highly challenging.

Google is a well-known mecca of internet technology. It leads the industry in search and in large-scale data storage and computation, and it is currently the biggest wave riding the top of the internet tide; the engineering behind it is a topic engineers love to talk about. James Whittaker is a test director at Google. Between January 25, 2011 and May 4, 2011 he published a series of seven posts on the Google Testing Blog (unfortunately blocked in China) describing how Google tests software. During GTAC in October 2011 I spoke with him, told him of my idea of translating the series into Chinese, and he was very supportive, even offering to send me a copy of the forthcoming printed book. Unfortunately, he left Google on February 13, 2012, before the book was published (Amazon lists the publication date as April 8, 2012; given his departure, I do not know whether it will still ship on time).

In short, exploring and revealing the testing methods behind Google also satisfies, to some degree, test engineers' voyeuristic curiosity. My ability is genuinely limited, and the translation will be fairly free; if you have different suggestions, please point them out.

Another reason for translating "How Google Tests Software": I could not find a good translation of this series through any major search engine, only a translation of part three by 外刊 IT 评论 and of part one by Yeeyan (译言网), both incomplete, and frankly the quality did not seem very good. In short, nobody has translated the whole series, I have worked in software testing for nearly eight years, and I am fairly interested in the technology, so here is my attempt.

2012.02.21

How Google Tests Software, Part One (translation)

HuangLi http://sdet.org/?p=130

This is the first post in the series on how Google tests software.

Tuesday, January 25, 2011 9:08 AM

By James Whittaker. The question I get asked more than any other is how Google tests. It has been explained in bits and pieces on this blog [the Google Testing Blog], but the explanation deserves a more systematic update. Google's testing strategy has never changed, though the tactical ways we execute it keep evolving as the company evolves. Google is now a company doing search, apps, ads, mobile, operating systems and more, and each of these focus areas has to do what makes sense for its own problem domain. As we enter new areas and grow quickly in existing ones, our testing has to expand and improve in step. Most of the techniques described in this series are what we use today; some are directions we hope to take in the near future.

Let me begin with the organizational structure, and this part may surprise you. There is no actual testing department at Google. Test lives inside the product focus areas, in an organization we call "Engineering Productivity". Engineering Productivity owns any number of horizontal and vertical engineering disciplines, of which Test is the biggest. In a nutshell, Engineering Productivity consists of:

  1. A product team [a product team] that builds and maintains internal and open-source productivity tools consumed by all kinds of engineers across the company: code analyzers, IDEs, test case management systems, automated testing tools, build systems, source control systems, code review schedulers, bug databases and so on. The point of these tools is to make engineers more productive, and strategically most of them aim at preventing problems rather than detecting them.

  2. A services team [a services team] that provides expert advice to the product teams [note: product teams such as Search, Gmail and Chrome sit parallel to Engineering Productivity] on a wide array of topics: tools, documentation, testing, release management, training and so forth. The expertise covers reliability, security, internationalization and the like, as well as product-specific functional issues the product teams may face. Every other focus area has access to this guidance.

  3. Embedded engineers [Embedded engineers] who are effectively loaned out to product teams on an as-needed basis. Some of them sit with the same product team for years; others cycle through teams wherever they are needed most. Google encourages all its engineers to change product teams often, to stay fresh, engaged and objective, free of bias and politics. Testers are no different, though each can set their own cadence for changing teams. Among my reports, some testers have been on Chrome for several years, while others moved on after eighteen months. For a test manager, striking a good balance between product experience and fresh eyes is essential.

So testers report to Engineering Productivity managers, but they identify themselves as members of a product team such as Search, Gmail or Chrome. Organizationally, a tester is part of both teams. Testers sit with the product team, take part in its planning, eat lunch with it, share ship bonuses and get treated like full members of the team. The benefit of the separate reporting chain is that it gives testers a forum for sharing information: good testing ideas migrate easily within Engineering Productivity, so every product line in the company quickly gains access to the best testing techniques.

This separation of project and reporting structures also has its downsides. By far the biggest is that testers are seen as an external resource. Product teams cannot depend on testers too heavily and must keep their own quality house in order. That's right: at Google it is the product teams that own quality, not the testers. Every developer is expected to do their own testing; the tester's job is to build the test automation infrastructure and processes for the product team, so that developers become self-reliant and can complete the testing on their own.

What I like about this model is that it puts developers and testers on equal footing. It makes them true partners in quality and places the biggest quality burden where it belongs: on the developers, whose job is to get the product right. It also allows a many-to-one dev-to-test ratio: developers far outnumber testers, and the better the developers are at testing, the higher the ratio gets. Product teams take pride in a high dev-to-test ratio.

OK, so now we are all friends here, right? I am sure you have already spotted the hole in this strategy: developers can't test! Who am I to deny that? No amount of corporate kool-aid could get me to deny it, especially coming off my GTAC talk last year [GTAC 2010: Turning Quality on its Head], where I pretty much made a game of developer vs. tester (spoiler alert: the tester wins).

Google's answer is to split the role further: we set up two different testing roles to solve two very different testing problems. In the next post I will describe these roles in detail and how Google splits the testing problem into two parts.

公直 2012/3/8

English original:

http://googletesting.blogspot.com/2011/01/how-google-tests-software.html Tuesday, January 25, 2011 9:08 AM By James Whittaker

This is the first in a series of posts on this topic.

The one question I get more than any other is “How does Google test?” It’s been explained in bits and pieces on this blog but the explanation is due an update. The Google testing strategy has never changed but the tactical ways we execute it have evolved as the company has evolved. We’re now a search, apps, ads, mobile, operating system, and so on and so forth company. Each of these Focus Areas (as we call them) has to do things that make sense for their problem domain. As we add new FAs and grow the existing ones, our testing has to expand and improve. What I am documenting in this series of posts is a combination of what we are doing today and the direction we are trending toward in the foreseeable future.

Let’s begin with organizational structure and it’s one that might surprise you. There isn’t an actual testing organization at Google. Test exists within a Focus Area called Engineering Productivity. Eng Prod owns any number of horizontal and vertical engineering disciplines, Test is the biggest. In a nutshell, Eng Prod is made of:

  1. A product team that produces internal and open source productivity tools that are consumed by all walks of engineers across the company. We build and maintain code analyzers, IDEs, test case management systems, automated testing tools, build systems, source control systems, code review schedulers, bug databases… The idea is to make the tools that make engineers more productive. Tools are a very large part of the strategic goal of prevention over detection.

  2. A services team that provides expertise to Google product teams on a wide array of topics including tools, documentation, testing, release management, training and so forth. Our expertise covers reliability, security, internationalization, etc., as well as product-specific functional issues that Google product teams might face. Every other FA has access to Eng Prod expertise.

  3. Embedded engineers that are effectively loaned out to Google product teams on an as-needed basis. Some of these engineers might sit with the same product teams for years, others cycle through teams wherever they are needed most. Google encourages all its engineers to change product teams often to stay fresh, engaged and objective. Testers are no different but the cadence of changing teams is left to the individual. I have testers on Chrome that have been there for several years and others who join for 18 months and cycle off. Keeping a healthy balance between product knowledge and fresh eyes is something a test manager has to pay close attention to.

So this means that testers report to Eng Prod managers but identify themselves with a product team, like Search, Gmail or Chrome. Organizationally they are part of both teams. They sit with the product teams, participate in their planning, go to lunch with them, share in ship bonuses and get treated like full members of the team. The benefit of the separate reporting structure is that it provides a forum for testers to share information. Good testing ideas migrate easily within Eng Prod giving all testers, no matter their product ties, access to the best technology within the company.

This separation of project and reporting structures has its challenges. By far the biggest is that testers are an external resource. Product teams can’t place too big a bet on them and must keep their quality house in order. Yes, that’s right: at Google it’s the product teams that own quality, not testers. Every developer is expected to do their own testing. The job of the tester is to make sure they have the automation infrastructure and enabling processes that support this self reliance. Testers enable developers to test.

What I like about this strategy is that it puts developers and testers on equal footing. It makes us true partners in quality and puts the biggest quality burden where it belongs: on the developers who are responsible for getting the product right. Another side effect is that it allows us a many-to-one dev-to-test ratio. Developers outnumber testers. The better they are at testing the more they outnumber us. Product teams should be proud of a high ratio!

Ok, now we’re all friends here right? You see the hole in this strategy I am sure. It’s big enough to drive a bug through. Developers can’t test! Well, who am I to deny that? No amount of corporate kool-aid could get me to deny it, especially coming off my GTAC talk last year where I pretty much made a game of developer vs. tester (spoiler alert: the tester wins).

Google’s answer is to split the role. We solve this problem by having two types of testing roles at Google to solve two very different testing problems. In my next post, I’ll talk about these roles and how we split the testing problem into two parts.

How Google Tests Software, Part Two (translation), by HuangLi
http://sdet.org/?p=149

Wednesday, February 09, 2011 6:36 PM

By James Whittaker

For the motto "you build it, you break it" to be real, roles beyond the traditional developer are necessary; specifically, there have to be engineering roles that let developers do testing efficiently and effectively. At Google, we have created roles in which some engineers are responsible for making other engineers more productive. These engineers usually identify themselves as testers, but their real mission is productivity. They exist to make developers more productive, and quality is the largest part of that productivity, since product quality is the most important component of productivity. Here is a summary of those roles:

[Note: "you build it, you break it" originally means that the people in the Build Lab never fix a broken build; only the developer who broke it can fix it. Here it means developers are responsible for the code they write, and are better placed than dedicated testers to test it.]

The SWE [Software Engineer] is the traditional developer role. SWEs write functional code that ships to users. They create design documents, design data structures and the overall architecture, and spend the vast majority of their time writing and reviewing code. They also write a lot of test code, including test-driven design and unit tests, and they take part in building the small, medium and large tests covered in later posts. SWEs own the quality of everything they touch, whether they wrote it, fixed it or modified it.

The SET [Software Engineer in Test] is also a developer role, except the focus is on testability. SETs take part in design reviews, look closely at code quality and risk, and refactor code to make the system more testable; they also write the unit testing frameworks and test automation frameworks. They are partners in the SWE code base, but they care more about raising quality and test coverage than about adding new features or improving performance.

The TE [Test Engineer] is the exact reverse of the SET: a role that puts testing first and development second. Many Google TEs spend a good deal of their time writing code in the form of automation scripts, code that drives usage scenarios, and even code that mimics a user. They organize the testing work of SWEs and SETs, interpret test results and drive test execution, playing an especially important role in the late stages of a project as it pushes toward release. TEs are product experts, quality advisers and risk analysts.
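
As a rough illustration of the user-mimicking scripts a TE might write, here is a minimal sketch using only the Python standard library; the host and page flow are hypothetical, invented for this example rather than taken from the original posts:

```python
import unittest
from http.client import HTTPSConnection

class SignupScenarioTest(unittest.TestCase):
    """Drive a common user scenario end to end: open the signup page."""

    HOST = "staging.example.com"  # hypothetical staging host

    def test_new_user_can_reach_signup_page(self):
        conn = HTTPSConnection(self.HOST, timeout=10)
        conn.request("GET", "/signup")
        resp = conn.getresponse()
        # A user opening the signup page should get a successful response.
        self.assertEqual(resp.status, 200)
        body = resp.read().decode("utf-8", errors="replace")
        self.assertIn("Sign up", body)
        conn.close()

if __name__ == "__main__":
    unittest.main()
```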

From a quality standpoint, SWEs own their features and the quality of those features in isolation. They are responsible for fault-tolerant design, failure recovery, TDD and unit tests, and, with the help of SETs, for writing the test code that exercises their features.

SETs are developers who provide testing features: a framework that can isolate newly written code by simulating its dependencies with stubs, mocks and fakes, and submit queues for managing code check-ins [note: so that new code is verified before it is checked in]. In other words, SETs write code that lets SWEs test their features. Much of the actual testing is performed by the SWEs; the SET is there to ensure the features are testable, so that SWEs stay actively involved in writing test cases.
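
A minimal sketch of the dependency-isolation idea, using Python's standard unittest.mock as a stand-in for the fakes the post mentions; the service and function names are hypothetical:

```python
import unittest
from unittest import mock

class BillingService:
    """Real dependency: talks to a remote payment backend (unused in tests)."""
    def charge(self, user_id: str, cents: int) -> bool:
        raise NotImplementedError("network call, not available in a small test")

def checkout(billing: BillingService, user_id: str, cents: int) -> str:
    # Newly developed code under test, isolated from its real dependency.
    if cents <= 0:
        return "rejected"
    return "charged" if billing.charge(user_id, cents) else "declined"

class CheckoutTest(unittest.TestCase):
    def test_checkout_charges_through_billing(self):
        fake_billing = mock.Mock(spec=BillingService)
        fake_billing.charge.return_value = True
        self.assertEqual(checkout(fake_billing, "u1", 500), "charged")
        fake_billing.charge.assert_called_once_with("u1", 500)

if __name__ == "__main__":
    unittest.main()
```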

By now it should be clear that the SET is a service role: their main duty is to make testing easy for developers and to safeguard quality at the module level. The reader has probably already noticed the gap in this development-centric picture: what about the end user of the software?

At Google, user-level testing is the TE's job. Assuming the SWEs and SETs have done adequate module- and feature-level testing, the next task is to see whether this collection of features and data works together to satisfy the needs of real users. TEs act as a double-check on the developers' diligence: any obvious bug is a sign that the earlier round of developer self-testing was inadequate or sloppy. When such bugs are rare, TEs can turn to their real task of making sure the software handles common user scenarios and has no performance, security or internationalization problems. TEs do a great deal of testing and coordinate testing work among TEs, contract testers, dogfooders [note: internal users trying the team's own product], beta users and early adopters, and they communicate to all parties the risks inherent in the basic design, the feature complexity and the failure-avoidance methods. Once TEs get engaged, there is no end to their mission.

OK, now that the roles are better understood, I will dig into more detail on how the work is choreographed among them. Until next time... thanks for your interest.

公直

2012/3/11

English original:

How Google Tests Software – Part Two

Wednesday, February 09, 2011 6:36 PM

http://googletesting.blogspot.com/2011/02/how-google-tests-software-part-two.html

By James Whittaker

In order for the “you build it, you break it” motto to be real, there are roles beyond the traditional developer that are necessary. Specifically, engineering roles that enable developers to do testing efficiently and effectively have to exist. At Google we have created roles in which some engineers are responsible for making others more productive. These engineers often identify themselves as testers but their actual mission is one of productivity. They exist to make developers more productive and quality is a large part of that productivity. Here’s a summary of those roles:

The SWE or Software Engineer is the traditional developer role. SWEs write functional code that ships to users. They create design documentation, design data structures and overall architecture and spend the vast majority of their time writing and reviewing code. SWEs write a lot of test code including test driven design, unit tests and, as we explain in future posts, participate in the construction of small, medium and large tests. SWEs own quality for everything they touch whether they wrote it, fixed it or modified it.

The SET or Software Engineer in Test is also a developer role except their focus is on testability. They review designs and look closely at code quality and risk. They refactor code to make it more testable. SETs write unit testing frameworks and automation. They are a partner in the SWE code base but are more concerned with increasing quality and test coverage than adding new features or increasing performance.

The TE or Test Engineer is the exact reverse of the SET. It is a role that puts testing first and development second. Many Google TEs spend a good deal of their time writing code in the form of automation scripts and code that drives usage scenarios and even mimics a user. They also organize the testing work of SWEs and SETs, interpret test results and drive test execution, particularly in the late stages of a project as the push toward release intensifies. TEs are product experts, quality advisers and analyzers of risk.

From a quality standpoint, SWEs own features and the quality of those features in isolation. They are responsible for fault tolerant designs, failure recovery, TDD, unit tests and in working with the SET to write tests that exercise the code for their feature.

SETs are developers that provide testing features. A framework that can isolate newly developed code by simulating its dependencies with stubs, mocks and fakes and submit queues for managing code check-ins. In other words, SETs write code that allows SWEs to test their features. Much of the actual testing is performed by the SWEs, SETs are there to ensure that features are testable and that the SWEs are actively involved in writing test cases.

Clearly SETs primary focus is on the developer. Individual feature quality is the target and enabling developers to easily test the code they write is the primary focus of the SET. This development focus leaves one large hole which I am sure is already evident to the reader: what about the user?

User focused testing is the job of the Google TE. Assuming that the SWEs and SETs performed module and feature level testing adequately, the next task is to understand how well this collection of executable code and data works together to satisfy the needs of the user. TEs act as a double-check on the diligence of the developers. Any obvious bugs are an indication that early cycle developer testing was inadequate or sloppy. When such bugs are rare, TEs can turn to their primary task of ensuring that the software runs common user scenarios, is performant and secure, is internationalized and so forth. TEs perform a lot of testing and test coordination tasks among TEs, contract testers, crowd sourced testers, dog fooders, beta users, early adopters. They communicate among all parties the risks inherent in the basic design, feature complexity and failure avoidance methods. Once TEs get engaged, there is no end to their mission.

Ok, now that the roles are better understood, I’ll dig into more details on how we choreograph the work items among them. Until next time…thanks for your interest.

How Google Tests Software, Part Three (translation), by HuangLi http://sdet.org/?p=160

Wednesday, February 16, 2011 2:47 AM By James Whittaker

The first two posts drew many questions in the comments. I have not replied to each one, not because I intend to ignore them, but because I hope to answer them in detail here and in the posts that follow. Starting with this one, I am really getting into how Google tests software.

At Google, quality is not equal to test. Yes, I am sure that is true at every other company too. The cliché "quality cannot be tested in" is only too true. From automobile manufacturing to software development, if it is not built right in the first place, it will never be right. Ask any car company that has ever had to do a mass recall how expensive it is to bolt on quality after the fact.

However, "quality cannot be tested in" is neither as simple nor as accurate as it sounds. While quality cannot be tested in, it is equally evident that without testing it is impossible to develop anything of quality. If you never test, how do you know whether the product works correctly and is of high quality?

The simple way out of this conundrum is to stop treating development and testing as separate, opposing activities. Testing and development [note: the two activities, not the people] should go hand in hand: write a little code and test it immediately; the more you write, the more you test. Better yet, while coding, or even before, think through how the code will be tested. Testing is not a separate job; it is part and parcel of development itself. So quality is not equal to test: quality is what you get when you put development and testing into a blender and mix them until one is indistinguishable from the other.

That is exactly Google's idea: blend development and testing until you cannot do one without the other. Write a little code, and you must test it; write more, test more. The key question is who does the testing. Since Google's dedicated testers are so few, the only possible answer is the developers. Who better to test the code than the people who actually wrote it? Who better to find the bugs than the program's author? Who has more incentive to know whether the program works the first time it runs? The reason Google gets by with so few dedicated testers is that developers own quality. In fact, a team that leans too heavily on testing usually has something wrong with its development as well; a large tester presence is a strong signal that the blend of development and testing is out of balance, and simply adding testers will not solve anything.

This means that, for quality, preventing problems matters more than detecting them. Quality is a development issue, not a testing issue. By embedding testing into the development process, we reduce the chances for prolific bug writers to slip: we have not only prevented a lot of end-user issues, but also greatly reduced the number of testers needed to ensure the absence of recall-class bugs. At Google, the test engineer's goal is to check how well this prevention is working: TEs are constantly looking for evidence that the SWE-SET pairing of bug writers and bug preventers is out of whack, and they raise the alarm as soon as anything looks wrong.

Manifestations of this blending of development and testing are everywhere, from code review notes asking "where are your tests?" to the best-practice posters in the toilet stalls, our infamous Testing On The Toilet guides [translator's note: see the Google Testing Blog for more about "Testing On The Toilet"]. Testing is an unavoidable part of development, and quality is the product of the marriage of development and testing. SWEs are testers, SETs are testers and TEs are testers.

If your company is also doing this blending of development and testing, please share your successes and lessons with the rest of us. If not, here is a change you can help your company make: get development to equal quality. You probably know the old saying that the chicken and the pig both happily contribute to a bacon-and-egg breakfast, but the pig is fully committed. Well, it's true: go "oink" at one of your developers and see whether they oink back. If they cluck, you have a problem. [Translator's note: this was hard to render. James is referring to the English fable of the Chicken and the Pig (see http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig ): both take part in making a bacon-and-egg breakfast, but the pig gives up its flesh while the hen only lays an egg, so their levels of commitment differ. Here the tester is the chicken, the developer is the pig, and breakfast is quality: the pig contributes more. If testers go to the developers and find they are not playing the pig's part, there will be no breakfast, which means quality will suffer.]

公直 2012/3/16

English original:

How Google Tests Software – Part Three Wednesday, February 16, 2011 2:47 AM http://googletesting.blogspot.com/2011/02/how-google-tests-software-part-three.html By James Whittaker

Lots of questions in the comments to the last two posts. I am not ignoring them. Hopefully many of them will be answered here and in following posts. I am just getting started on this topic.

At Google, quality is not equal to test. Yes I am sure that is true elsewhere too. “Quality cannot be tested in” is so cliché it has to be true. From automobiles to software if it isn’t built right in the first place then it is never going to be right. Ask any car company that has ever had to do a mass recall how expensive it is to bolt on quality after-the-fact.

However, this is neither as simple nor as accurate as it sounds. While it is true that quality cannot be tested in, it is equally evident that without testing it is impossible to develop anything of quality. How does one decide if what you built is high quality without testing it?

The simple solution to this conundrum is to stop treating development and test as separate disciplines. Testing and development go hand in hand. Code a little and test what you built. Then code some more and test some more. Better yet, plan the tests while you code or even before. Test isn’t a separate practice, it’s part and parcel of the development process itself. Quality is not equal to test; it is achieved by putting development and testing into a blender and mixing them until one is indistinguishable from the other.

At Google this is exactly our goal: to merge development and testing so that you cannot do one without the other. Build a little and then test it. Build some more and test some more. The key here is who is doing the testing. Since the number of actual dedicated testers at Google is so disproportionately low, the only possible answer has to be the developer. Who better to do all that testing than the people doing the actual coding? Who better to find the bug than the person who wrote it? Who is more incentivized to avoid writing the bug in the first place? The reason Google can get by with so few dedicated testers is because developers own quality. In fact, teams that insist on having a large testing presence are generally assumed to be doing something wrong. Having too large a test team is a very strong sign that the code/test mix is out of balance. Adding more testers is not going to solve anything.

This means that quality is more an act of prevention than it is detection. Quality is a development issue, not a testing issue. To the extent that we are able to embed testing practice inside development, we have created a process that is hyper incremental where mistakes can be rolled back if any one increment turns out to be too buggy. We’ve not only prevented a lot of customer issues, we have greatly reduced the number of testers necessary to ensure the absence of recall-class bugs. At Google, testing is aimed at determining how well this prevention method is working. TEs are constantly on the lookout for evidence that the SWE-SET combination of bug writers/preventers are screwed toward the latter and TEs raise alarms when that process seems out of whack.

Manifestations of this blending of development and testing are all over the place from code review notes asking ‘where are your tests?’ to posters in the bathrooms reminding developers about best testing practices, our infamous Testing On The Toilet guides. Testing must be an unavoidable aspect of development and the marriage of development and testing is where quality is achieved. SWEs are testers, SETs are testers and TEs are testers.

If your organization is also doing this blending, please share your successes and challenges with the rest of us. If not, then here is a change you can help your organization make: get developers fully vested in the quality equation. You know the old saying that chickens are happy to contribute to a bacon and egg breakfast but the pig is fully committed? Well, it’s true…go oink at one of your developers and see if they oink back. If they start clucking, you have a problem.

How Google Tests Software, Part Four (translation), HuangLi http://sdet.org/?p=164

Wednesday, March 02, 2011 10:11 AM By James Whittaker

Crawl, walk, run.

One key reason Google achieves good results with far fewer testers than other companies is that we rarely try to ship a large set of features at once. In fact, the goal is often exactly the opposite: once the core of a product has been built, release it the moment it is useful to a reasonably large audience, then iterate on new features with the users' feedback. That is what we did with Gmail, which ran in production with a Beta tag for four years. The Beta tag warned users that Gmail was not perfect yet and might fail. Only when we reached our goal of 99.99% availability for a real user's email data was the tag removed. Clearly, quality is a work in progress.
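
For a sense of scale (my own arithmetic, not from the original post), 99.99% availability leaves a downtime budget of under an hour per year:

```python
# Downtime budget implied by an availability target (simple arithmetic).
availability = 0.9999
minutes_per_year = 365.25 * 24 * 60           # ~525,960 minutes
downtime = (1 - availability) * minutes_per_year
print(f"{downtime:.1f} minutes/year")         # ~52.6 minutes per year
```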

The process is not as cowboy as it sounds; nothing is thrown together in one go. In fact, before a product reaches what we call the beta channel, it has to pass through a series of other channels and prove its worth in each. Chrome, the team I spent my first two years at Google on, went through multiple channels in just this way; in each one, it was our confidence in the product's quality plus the feedback we were seeking from users that moved us to the next. The sequence looks roughly like this:

Canary Channel: for builds we suspect are not fit for release. Like a canary in a coal mine, if it fails to survive, we still have work to do. Only ultra-tolerant users run canary builds, as experiments; you cannot depend on such an application to get real work done.

Dev Channel: what engineers use in their day-to-day work. All engineers on a product are expected to install this build and use it for real work.

Test Channel: the build used for internal dogfood [translator's note: "dog food" generally means the team's own in-progress product, tried out by the team itself or by others inside the company]; given consistently good performance it becomes a candidate for the beta channel.

The Beta Channel or Release Channel: the first builds that external users get. A build reaches beta only after surviving a barrage of tests and real-user usage in the earlier channels.

This crawl-walk-run approach lets us run tests and experiment on our applications early, and obtain timely feedback from real users as well as from the automation we run daily in every channel.

The process has analytical benefits as well. For example, when a bug is found in the field, a tester can create a test case that reproduces it and run that test against the builds in every channel, to verify whether the fix has actually landed in each of them.
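
A rough sketch of that idea; the channel names are the real Chrome terms from this post, but the repro hook and build paths are invented for illustration:

```python
import subprocess

# Hypothetical binaries, one per release channel.
CHANNEL_BUILDS = {
    "canary": "/builds/canary/app",
    "dev":    "/builds/dev/app",
    "test":   "/builds/test/app",
    "beta":   "/builds/beta/app",
}

def reproduces_bug(binary: str) -> bool:
    """Run the repro input against one build; non-zero exit means the bug fires."""
    result = subprocess.run([binary, "--input", "repro_case.txt"], timeout=60)
    return result.returncode != 0

for channel, binary in CHANNEL_BUILDS.items():
    status = "still broken" if reproduces_bug(binary) else "fixed"
    print(f"{channel}: {status}")
```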

[Translator's note: for the names of the various Chrome and Chrome OS channels, see http://www.chromi.org/chromedownload , which covers the channels mentioned here, though not in exactly the same terms James uses; in practice even external users who like living on the edge run the canary builds.]

公直 2012/3/20

English original:

How Google Tests Software – Part Four

http://googletesting.blogspot.com/2011/03/how-google-tests-software-part-four.html Wednesday, March 02, 2011 10:11 AM By James Whittaker

Crawl, walk, run.

One of the key ways Google achieves good results with fewer testers than many companies is that we rarely attempt to ship a large set of features at once. In fact, the exact opposite is often the goal: build the core of a product and release it the moment it is useful to as large a crowd as feasible, then get their feedback and iterate. This is what we did with Gmail, a product that kept its beta tag for four years. That tag was our warning to users that it was still being perfected. We removed the beta tag only when we reached our goal of 99.99% uptime for a real user’s email data. Obviously, quality is a work in progress!

It’s not as cowboy a process as I make it out to be. In fact, in order to make it to what we call the beta channel release, a product must go through a number of other channels and prove its worth. For Chrome, a product I spent my first two years at Google working on, multiple channels were used depending on our confidence in the product’s quality and the extent of feedback we were looking for. The sequence looked something like this:

Canary Channel is used for code we suspect isn’t fit for release. Like a canary in a coalmine, if it failed to survive then we had work to do. Canary channel builds are only for the ultra tolerant user running experiments and not depending on the application to get real work done.

Dev Channel is what developers use on their day-to-day work. All engineers on a product are expected to pick this build and use it for real work.

Test Channel is the build used for internal dog food and represents a candidate beta channel build given good sustained performance.

The Beta Channel or Release Channel builds are the first ones that get external exposure. A build only gets to the release channel after spending enough time in the prior channels that it gets a chance to prove itself against a barrage of both tests and real usage.

This crawl, walk, run approach gives us the chance to run tests and experiment on our applications early and obtain feedback from real human beings in addition to all the automation we run in each of these channels every day.

There are analytical benefits to this process as well. If a bug is found in the field a tester can create a test that reproduces it and run it against builds in each channel to determine if a fix has already been implemented.

How Google Tests Software, Part Five (translation), by HuangLi
http://sdet.org/?p=170

PS: some family matters kept me busy over the past two weeks, so the translation fell behind schedule; my apologies.

Main text:

Wednesday, March 23, 2011 8:27 PM By James Whittaker

Instead of distinguishing tests by form with the common terms code testing, integration testing and system testing, Google uses the language of small, medium and large tests, emphasizing scope [translator's note: code testing usually means unit and API-level tests, typically written with frameworks such as xUnit or GTest, but Google does not use the term]. A small test covers a small amount of code, and so on up. All three engineering roles [translator's note: SWE, SET and TE; see part two of this series] execute all three kinds of tests, whether automated or manual.

Small tests are mostly (but not always) automated and exercise the code within a single function or module. They are most likely written by an SWE or an SET and usually need mocks and faked environments to run; TEs often pick up a subset of small tests to run when diagnosing a particular failure. Small tests focus on typical functional issues such as data corruption, error conditions and off-by-one errors. The question a small test tries to answer is: does this code do what it is supposed to do?
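
A minimal sketch of such a small test; the function is hypothetical, chosen to show the off-by-one and error-condition focus the post describes:

```python
import unittest

def paginate(items: list, page_size: int) -> list:
    """Split items into pages; a classic home for off-by-one mistakes."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

class PaginateSmallTest(unittest.TestCase):
    def test_exact_multiple_has_no_trailing_empty_page(self):
        self.assertEqual(paginate([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    def test_remainder_goes_on_last_page(self):
        self.assertEqual(paginate([1, 2, 3], 2), [[1, 2], [3]])

    def test_error_condition(self):
        with self.assertRaises(ValueError):
            paginate([1], 0)

if __name__ == "__main__":
    unittest.main()
```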

Medium tests can be automated or manual and involve two or more features, specifically covering the interaction between them. Many SETs describe this as "testing a function and its nearest neighbors". SETs drive the development of these tests early in the product cycle, as individual features are completed, and SWEs are the main force writing, debugging and maintaining them. If a test fails or breaks, the corresponding developer steps in and takes care of it autonomously. Later in the development cycle, TEs may run these medium tests, either manually (when a test is hard or prohibitively expensive to automate) or with automation. The question a medium test tries to answer is: do a set of neighboring features interoperate with each other the way they are supposed to?
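
A sketch of a medium test across two neighboring modules; the cart and inventory modules are hypothetical, invented to show the "function and its nearest neighbors" idea:

```python
import unittest

class Inventory:
    """Nearest neighbor: tracks stock levels."""
    def __init__(self, stock: dict):
        self.stock = dict(stock)
    def reserve(self, sku: str, qty: int) -> bool:
        if self.stock.get(sku, 0) >= qty:
            self.stock[sku] -= qty
            return True
        return False

class Cart:
    """Feature under test: uses the real Inventory rather than mocking it."""
    def __init__(self, inventory: Inventory):
        self.inventory = inventory
        self.items = []
    def add(self, sku: str, qty: int) -> bool:
        if self.inventory.reserve(sku, qty):
            self.items.append((sku, qty))
            return True
        return False

class CartInventoryMediumTest(unittest.TestCase):
    def test_add_updates_both_cart_and_inventory(self):
        inv = Inventory({"book": 2})
        cart = Cart(inv)
        self.assertTrue(cart.add("book", 2))
        self.assertFalse(cart.add("book", 1))  # stock exhausted across modules
        self.assertEqual(inv.stock["book"], 0)

if __name__ == "__main__":
    unittest.main()
```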

Large tests cover three or more (usually many more) features and represent real end-user scenarios as far as possible. There is some concern with the overall integration of the features, but at Google large tests tend to be results driven, i.e., does the software do what the user expects? All three roles take part in writing large tests, and anything from automation to exploratory testing can be the vehicle. The question a large test tries to answer is: does the product operate the way an end user would expect?

The terms small, medium and large are not themselves important; call them whatever you like. What matters is that Google testers share a common vocabulary for talking about what is being tested and how those tests are scoped. When some ambitious testers began talking about a fourth class they dubbed enormous, every other tester in the company could picture a system-wide test covering nearly every feature and running for a very long time; no further explanation was necessary.

What gets tested, and how much, is a very dynamic process that varies widely from product to product. Google prefers to release often and to iterate on feedback from outside users. If we have built a new product, or a new feature of an existing product, we want to get it out as early as possible so users can benefit from it. That requires involving users and external developers early in the process, and having a good handle on whether what we deliver meets the release bar.

Finally, between automated and manual testing, all three sizes of tests definitely favor the former. If something can be automated and the problem does not require human cleverness or intuition, it should be automated. Only problems that specifically require human judgment, such as whether a user interface is attractive or whether exposing some piece of data constitutes a privacy concern, should remain in the realm of manual testing.

Having said that, it is important to note that Google still performs a great deal of manual testing, both scripted and exploratory, though even that testing is done under the watchful eye of automation. Industry-leading recording technology converts manual tests into automated tests that are re-executed build after build, keeping regression work to a minimum and letting manual testers focus on new problems. Google has also automated the filing of bug reports and the routing of manual testing chores: for example, when an automated test breaks, the system identifies the most recent code change, the most likely culprit, emails its author and automatically files a bug to record the problem. In testing, "the last inch of the human mind" lives in test design, and the next generation of test tools Google is building aims to automate in exactly that direction.
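
A toy sketch of that breakage-triage idea; hypothetical throughout: the change log, notifier and bug filer below are stand-ins, not Google's actual systems:

```python
from dataclasses import dataclass

@dataclass
class Change:
    author: str
    commit: str

def send_email(to: str, body: str) -> None:
    print(f"email to {to}: {body}")       # stand-in for a real notifier

def file_bug(title: str, assignee: str) -> str:
    print(f"bug filed: {title} -> {assignee}")
    return "BUG-1"                        # stand-in for a real bug tracker

def triage_broken_test(test_name: str, recent_changes: list) -> str:
    """Pick the most recent change as the likely culprit, notify, file a bug."""
    culprit = recent_changes[-1]          # naive heuristic: last change in wins
    send_email(culprit.author, f"{test_name} broke at {culprit.commit}")
    return file_bug(title=f"{test_name} failing", assignee=culprit.author)

# Example: the last submitted change gets blamed for the breakage.
triage_broken_test("SearchSmokeTest", [Change("alice", "c1"), Change("bob", "c2")])
```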

Those tools will be highlighted in future posts. The next post, however, will focus on the work of the SET. I hope you keep reading.

公直

2012/4/2

English original:

How Google Tests Software – Part Five

http://googletesting.blogspot.com/2011/03/how-google-tests-software-part-five.html

Wednesday, March 23, 2011 8:27 PM By James Whittaker

Instead of distinguishing between code, integration and system testing, Google uses the language of small, medium and large tests emphasizing scope over form. Small tests cover small amounts of code and so on. Each of the three engineering roles may execute any of these types of tests and they may be performed as automated or manual tests.

Small Tests are mostly (but not always) automated and exercise the code within a single function or module. They are most likely written by a SWE or an SET and may require mocks and faked environments to run but TEs often pick these tests up when they are trying to diagnose a particular failure. For small tests the focus is on typical functional issues such as data corruption, error conditions and off by one errors. The question a small test attempts to answer is does this code do what it is supposed to do?

Medium Tests can be automated or manual and involve two or more features and specifically cover the interaction between those features. I’ve heard any number of SETs describe this as “testing a function and its nearest neighbors.” SETs drive the development of these tests early in the product cycle as individual features are completed and SWEs are heavily involved in writing, debugging and maintaining the actual tests. If a test fails or breaks, the developer takes care of it autonomously. Later in the development cycle TEs may perform medium tests either manually (in the event the test is difficult or prohibitively expensive to automate) or with automation. The question a medium test attempts to answer is does a set of near neighbor functions interoperate with each other the way they are supposed to?

Large Tests cover three or more (usually more) features and represent real user scenarios to the extent possible. There is some concern with overall integration of the features but large tests tend to be more results driven, i.e., did the software do what the user expects? All three roles are involved in writing large tests and everything from automation to exploratory testing can be the vehicle to accomplish it. The question a large test attempts to answer is does the product operate the way a user would expect?

The actual language of small, medium and large isn’t important. Call them whatever you want. The important thing is that Google testers share a common language to talk about what is getting tested and how those tests are scoped. When some enterprising testers began talking about a fourth class they dubbed enormous, every other tester in the company could imagine a system-wide test covering nearly every feature and that ran for a very long time. No additional explanation was necessary.

The primary driver of what gets tested and how much is a very dynamic process and varies wildly from product to product. Google prefers to release often and leans toward getting a product out to users so we can get feedback and iterate. The general idea is that if we have developed some product or a new feature of an existing product we want to get it out to users as early as possible so they may benefit from it. This requires that we involve users and external developers early in the process so we have a good handle on whether what we are delivering is hitting the mark.

Finally, the mix between automated and manual testing definitely favors the former for all three sizes of tests. If it can be automated and the problem doesn’t require human cleverness and intuition, then it should be automated. Only those problems, in any of the above categories, which specifically require human judgment, such as the beauty of a user interface or whether exposing some piece of data constitutes a privacy concern, should remain in the realm of manual testing.

Having said that, it is important to note that Google performs a great deal of manual testing, both scripted and exploratory, but even this testing is done under the watchful eye of automation. Industry leading recording technology converts manual tests to automated tests to be re-executed build after build to ensure minimal regressions and to keep manual testers always focusing on new issues. We also automate the submission of bug reports and the routing of manual testing tasks. For example, if an automated test breaks, the system determines the last code change that is the most likely culprit, sends email to its authors and files a bug. The ongoing effort to automate to within the “last inch of the human mind” is currently the design spec for the next generation of test engineering tools Google is building.

Those tools will be highlighted in future posts. However, my next target is going to revolve around The Life of an SET. I hope you keep reading.

How Google Tests Software, Part Six (translation), HuangLi http://sdet.org/?p=184

Monday, May 02, 2011 12:05 PM By James Whittaker

The Life of an SET

SETs [Software Engineers in Test] are software engineers who focus on building test functionality. First and foremost, the SET is a developer role, touted as a 100% coding role in our recruiting literature and internal promotion ladders. When interviewing SET candidates, the coding bar is nearly identical to the SWE role, with extra emphasis on knowing how to test the code they write. In other words, SWEs and SETs both answer coding questions, and SETs are expected to nail a set of testing questions as well.

As you might imagine, this is a difficult role to fill. The most likely reason there are so few SETs is not that Google has found some magic productivity formula [translator's note: i.e. a dev-to-test ratio formula], but that people with the SET skill set are genuinely hard to find, and our engineering practice has adapted around that reality. We optimize for this very important task and build processes around the people who can do it.

Usually, SETs are not involved in the early design phase. That is not deliberate; it is a by-product of how many Google products are born. A common scenario for a new product is that employees on existing Google products put their 20% time into something new. Gmail and Chrome OS both began as ideas that were never formally mandated by Google, and over time more and more developers and testers joined and shipped them. In such cases, early development is not about quality; it is about proving out a concept, and issues of scale and performance must be solved before quality can even become the concern. If you are building a web service that cannot scale, testing is not your biggest problem!

Once it is clear that a product can and will be built and shipped, that is when the development team seeks out test involvement.

You can imagine the process: someone has an idea, thinks it over, writes some experimental code, seeks opinions from others, improves the code, persuades more people to join, writes more and more code, realizes they are onto something important, writes still more code to mold the idea into something they can show others for feedback... somewhere in all this, an entry is created in Google's project database and the project becomes real. Testers do not get involved until it becomes real.

Do all real projects get dedicated testers? Not by default. Small projects, and those meant for a limited audience, are usually tested exclusively by the people who build them. Projects that carry more risk to individual users or the enterprise get testing attention.

When a development team seeks the test team's participation and help, the onus is on them to convince the testers that their project is exciting and full of potential. Dev directors explain their project, progress and ship schedule to test directors; together they discuss how the testing work will be divided, agree on the level of unit testing expected of developers and on how much developers will take part in testing, and spell out the respective responsibilities of development and test in the release process. SETs may not be involved at a project's inception, but once the project becomes real, testers have vast influence over how the software is developed.

And when I say "testing" I do not just mean exercising code paths. Testers may not be involved from the beginning, but testing is, from start to finish. In fact, an SET's influence may already be felt at the moment a developer tries to check code in [translator's note: SETs set requirements on how much developers must test; a developer checking in code has to ask whether those requirements are met, and that is the testers' influence]. Stay tuned to understand what I am talking about.
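
As a hedged sketch of what "influence before check-in" can look like, here is a toy presubmit check that holds back a change touching source files without touching tests. It is entirely hypothetical; the post does not describe Google's actual submit-queue tooling:

```python
def presubmit_ok(changed_files: list) -> bool:
    """Reject a change that edits source code but brings no test changes."""
    touches_src = any(f.endswith(".py") and not f.endswith("_test.py")
                      for f in changed_files)
    touches_tests = any(f.endswith("_test.py") for f in changed_files)
    return touches_tests or not touches_src

# A code-only change is held back until tests accompany it.
print(presubmit_ok(["search/ranker.py"]))                            # False
print(presubmit_ok(["search/ranker.py", "search/ranker_test.py"]))  # True
```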

公直

2012/6/28

English original:

How Google Tests Software – Part Six

http://googletesting.blogspot.com/2011/05/how-google-tests-software-part-six.html

Monday, May 02, 2011 12:05 PM By James Whittaker

The Life of an SET

SETs are Software Engineers in Test. They are software engineers who happen to write testing functionality. First and foremost, SETs are developers and the role is touted as a 100% coding role in our recruiting literature and internal job promotion ladders. When SET candidates are interviewed, the “coding bar” is nearly identical to the SWE role with more emphasis that SETs know how to test the code they create. In other words, both SWEs and SETs answer coding questions. SETs are expected to nail a set of testing questions as well.

As you might imagine, it is a difficult role to fill and it is entirely possible that the low numbers of SETs isn’t because Google has created a magic formula for productivity but more of a result of adapting our engineering practice around the reality that the SET skill set is really hard to find. We optimize on this very important task and build processes around the people who do it.

It is usually the case that SETs are not involved early in the design phase. Their exclusion is not so much purposeful as it is a by-product of how a lot of Google projects are born. A common scenario for new project creation is that some informal 20% effort takes a life of its own as an actual Google branded product. Gmail and Chrome OS are both projects that started out as ideas that were not formally mandated by Google but over time grew into shipping products with teams of developers and testers working on them. In such cases early development is not about quality, it is about proving out a concept and working on things like scale and performance that must be right before quality could even be an issue. If you can’t build a web service that scales, testing is not your biggest problem!

Once it is clear that a product can and will be built and shipped, that’s when the development team seeks out test involvement.

You can imagine a process like this: someone has an idea, they think about it, write experimental code, seek out opinions of others, write some more code, get others involved, write even more code, realize they are onto something important, write more code to mold the idea into something that they can present to others to get feedback … somewhere in all this an actual project is created in Google’s project database and the project becomes real. Testers don’t get involved until it becomes real.

Do all real projects get testers? Not by default. Smaller projects and those meant for limited users often get tested exclusively by the people who build it. Others that are riskier to our users or the enterprise (much more about risk later) get testing attention.

The onus is on the development teams to solicit help from testers and convince them that their project is exciting and full of potential. Dev Directors explain their project, progress and ship schedule to Test Directors who then discuss how the testing burden is to be shared and agree on things like SWE involvement in testing, expected unit testing levels and how the duties of the release process are going to be shared. SETs may not be involved at project inception, but once the project becomes real we have vast influence over how it is to be executed.

And when I say “testing” I don’t just mean exercising code paths. Testers might not be involved from the beginning … but testing is. In fact, an SET’s impact is felt even before a developer manages to check code into the build. Stay tuned to understand what I am talking about.
