Faker: Test Data Generation
Testing is an iterative part of the development process that we carry out to ensure the quality of our code. A large portion of this entails writing test cases and testing each unit of our application using random test data.
Actual data for our application comes in when we release it to production, but during the development process we need fake data similar to real data for testing purposes. The popular open source library Faker provides us with the ability to generate different data suitable for a wide range of scenarios.
Here we’ll focus on using Faker to generate random data for our test cases.
Faker comes with a set of built-in data providers which can be easily accessed to generate test data. Additionally, we can define our own test data types making it highly extensible. But first, let’s look at a basic example that shows how Faker works:
    <?php
    require "vendor/autoload.php";

    $faker = Faker\Factory::create();

    // generate data by accessing properties
    for ($i = 0; $i < 10; $i++) {
        echo "<p>" . $faker->name . "</p>";
        echo "<p>" . $faker->address . "</p>";
    }

The example assumes Faker was installed using Composer and uses the Composer autoloader to make the class definitions available. If you’re not using Composer, you can also clone Faker from its GitHub repository and use its included autoloader.
To use Faker, we first need to obtain an instance from Faker\Factory. All of the default data providers are loaded automatically into the $faker object. We then generate random data simply by accessing a formatter by name. The output of the above code lists ten random person names and addresses drawn from the available data sources.
Providers are classes that hold the data and the formatter methods needed to generate it. Formatters are methods inside provider classes that generate test data directly from a source or by combining other formatters. Faker comes with the following built-in providers: Person, Address, PhoneNumber, Company, Lorem, Internet, DateTime, Miscellaneous, and UserAgent.
Let’s take a look at the Person class to get a better understanding of what the structure of a Faker provider looks like.
    <?php
    namespace Faker\Provider;

    class Person extends \Faker\Provider\Base
    {
        // name formats, written as combinations of other formatters
        protected static $formats = array(
            "{{firstName}} {{lastName}}",
        );

        protected static $firstName = array("John", "Jane");
        protected static $lastName = array("Doe");

        // picks a format and lets the generator fill in each {{placeholder}}
        public function name()
        {
            $format = static::randomElement(static::$formats);
            return $this->generator->parse($format);
        }

        public static function firstName()
        {
            return static::randomElement(static::$firstName);
        }

        // needed so the {{lastName}} placeholder can be resolved
        public static function lastName()
        {
            return static::randomElement(static::$lastName);
        }
    }

Person acts as the provider, extending the base provider class Faker\Provider\Base. firstName() is a formatter which retrieves a random element directly from the internal firstName data array. Formatters may also combine other formatters and return the data in a specific format, which is what name() does. All of the providers and formatters follow this structure.
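To make the placeholder mechanism more concrete, here is a minimal sketch of how a "{{formatter}}" string could be turned into method calls. TinyPerson and its parse() method are hypothetical stand-ins written for this illustration; they are not Faker's real classes, and Faker's own parsing may differ in detail.

```php
<?php
// Hypothetical stand-in, not Faker's real classes: it only illustrates
// how a "{{formatter}}" placeholder could be resolved to a method call.
class TinyPerson
{
    protected static $firstName = array("John", "Jane");
    protected static $lastName = array("Doe");

    public static function firstName()
    {
        return static::$firstName[array_rand(static::$firstName)];
    }

    public static function lastName()
    {
        return static::$lastName[array_rand(static::$lastName)];
    }

    // Replace every {{word}} with the result of the same-named formatter.
    public function parse($format)
    {
        return preg_replace_callback(
            '/\{\{\s?(\w+)\s?\}\}/',
            function ($matches) {
                return call_user_func(array($this, $matches[1]));
            },
            $format
        );
    }

    // A formatter built from other formatters.
    public function name()
    {
        return $this->parse("{{firstName}} {{lastName}}");
    }
}

$person = new TinyPerson();
echo $person->name(), PHP_EOL; // e.g. "Jane Doe"
```

The point of the sketch is that a format string is just a template: each placeholder names another formatter, so new formats can be added without writing new generation logic.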
The built-in providers contain basic formatters with very limited data. If you are using Faker to automate the process of generating test data, you may need to create your own data sets and formatter implementations by extending the base providers.
    <?php
    namespace Faker\Provider;

    class Student extends \Faker\Provider\Person
    {
        protected static $formats = array(
            "{{lastName}} {{firstName}}",
            "{{firstName}} {{lastName}}"
        );

        protected static $firstName = array("Mark", "Adam");
        protected static $lastName = array("Clark", "Stewart");
        private static $prefix = array("Mr.", "Mrs.", "Ms.", "Miss", "Dr.");

        public static function prefix()
        {
            return static::randomElement(static::$prefix);
        }

        // overrides Person::firstName() to prepend a random prefix
        public static function firstName()
        {
            return static::prefix() . " " . static::randomElement(static::$firstName);
        }
    }

Since Student is not a default provider, we have to add it to the Faker generator manually. If the same method is defined on more than one provider, the most recently added provider takes precedence over the others.
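The precedence rule can be illustrated with a small stand-in generator. TinyGenerator, PlainPerson, and StudentPerson below are hypothetical names invented for this sketch; it assumes a lookup that searches providers from newest to oldest, which matches the behaviour described above but is not copied from Faker's source.

```php
<?php
// Hypothetical stand-in for the generator; class and method names here are
// illustrative only and are not Faker's actual implementation.
class TinyGenerator
{
    private $providers = array();

    // Newest provider goes to the front so it is searched first.
    public function addProvider($provider)
    {
        array_unshift($this->providers, $provider);
    }

    // Resolve a formatter name to the first provider that defines it.
    public function format($formatter)
    {
        foreach ($this->providers as $provider) {
            if (method_exists($provider, $formatter)) {
                return call_user_func(array($provider, $formatter));
            }
        }
        throw new InvalidArgumentException("Unknown formatter: $formatter");
    }

    // Lets callers write $generator->firstName instead of format("firstName").
    public function __get($name)
    {
        return $this->format($name);
    }
}

class PlainPerson
{
    public function firstName() { return "John"; }
    public function lastName() { return "Doe"; }
}

class StudentPerson
{
    public function firstName() { return "Mr. Mark"; }
}

$generator = new TinyGenerator();
$generator->addProvider(new PlainPerson());
$generator->addProvider(new StudentPerson());

echo $generator->firstName, PHP_EOL; // "Mr. Mark": StudentPerson was added last
echo $generator->lastName, PHP_EOL;  // "Doe": only PlainPerson defines it
```

Formatters that the newer provider does not define still fall through to older providers, so a custom provider only needs to override what it wants to change.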
    <?php
    require "vendor/autoload.php";
    // the Student class must also be loaded (autoloaded or required)

    $faker = new Faker\Generator();
    $faker->addProvider(new Faker\Provider\Student($faker));

    echo $faker->firstName; // invokes Student::firstName()

The built-in providers contain basic data types for testing, but real world use cases often require more complexity. In such situations we need to create our own data providers and custom data sets to automate the testing procedure. Let’s build a Faker provider from scratch for a real world scenario.
Assume we’re developing an email marketing service which sends thousands of emails containing various kinds of advertisements from clients. What data fields will we need for testing? Basically, we need a recipient ("to") email address, a subject, a name, and content to test an email.
Let’s also assume there are three types of email templates:
- advertisements with text/HTML based content
- advertisements with a single full-size image
- advertisements containing links to other sites

The content field will be one of these templates, so we’ll also need test fields for text content, image, and links.
Having understood the main requirements, we can create the provider as follows:
    <?php
    namespace Faker\Provider;

    class EmailTemplate extends \Faker\Provider\Base
    {
        // one format per email template type
        protected static $formats = array(
            '<p>Hello {{name}} </p> <p>{{text}}</p> <p>Newsletter by Example</p>',
            '<p>{{adImage}}</p> <p>Newsletter by Example</p>',
            '<p>Hello {{name}} </p> <p>{{link}}</p> <p>{{link}}</p> <p>{{link}}</p> <p>Newsletter by Example</p>'
        );

        protected static $toEmail = array(
            "test@example.com",
            "test1@example.com"
        );

        protected static $name = array("Mark", "Adam");
        protected static $subject = array("Subject 1", "Subject 2");
        protected static $adImage = array("img1.png", "img2.jpg");
        protected static $link = array("link1", "link2");
        protected static $text = array("text1", "text2");

        public static function toEmail()
        {
            return static::randomElement(static::$toEmail);
        }

        public static function name()
        {
            return static::randomElement(static::$name);
        }

        // subject(), adImage(), link(), and text() follow the same pattern

        public function template()
        {
            $format = static::randomElement(static::$formats);
            return $this->generator->parse($format);
        }
    }

We have defined three formats to match the three different templates, and created a data set for each of the fields used in the test data generation process. Every field needs a formatter method similar to toEmail() and name() in the above code. The template() method picks one of the formats at random and fills in the necessary data using the formatters.
We can get the test data using the code below and pass it to our email application.
    <?php
    require "vendor/autoload.php";
    // the EmailTemplate class must also be loaded (autoloaded or required)

    $faker = new Faker\Generator();
    $faker->addProvider(new Faker\Provider\EmailTemplate($faker));

    $email = $faker->toEmail;
    $subject = $faker->subject;
    $template = $faker->template;

The advantage of the above technique is that we can test all three formats randomly using a single provider and direct formatter calls. But what if one of these formats is broken, or we have a scenario where we need to test only one of the formats continuously? Commenting out or removing the formats manually isn’t an appealing option.
In this case I would recommend creating separate implementations for each format. We can define a base EmailTemplate class with one format and all of the formatter methods, and then create three different child implementations by extending it. Each child class contains only its own format; the formatters are inherited from the parent class. We can then use each email template independently by loading it separately into the Faker generator.
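The parent and child layout described above might look roughly like the following sketch. To keep it runnable without Faker installed, it uses a small hypothetical TinyProvider base class in place of Faker's base provider; ImageEmailTemplate and LinkEmailTemplate are likewise illustrative names. With Faker, EmailTemplate would extend Faker's base provider and call the generator's parser instead.

```php
<?php
// Hypothetical stand-in base class so the sketch runs without Faker installed.
// With Faker, EmailTemplate would extend Faker's base provider instead.
abstract class TinyProvider
{
    protected static function randomElement(array $items)
    {
        return $items[array_rand($items)];
    }

    // Replace each {{word}} with the result of the same-named formatter.
    protected function parse($format)
    {
        return preg_replace_callback('/\{\{(\w+)\}\}/', function ($m) {
            return call_user_func(array($this, $m[1]));
        }, $format);
    }
}

// Parent: one default format plus every formatter method.
class EmailTemplate extends TinyProvider
{
    protected static $formats = array('<p>Hello {{name}}</p> <p>{{text}}</p>');
    protected static $name = array("Mark", "Adam");
    protected static $text = array("text1", "text2");
    protected static $adImage = array("img1.png", "img2.jpg");
    protected static $link = array("link1", "link2");

    public static function name() { return static::randomElement(static::$name); }
    public static function text() { return static::randomElement(static::$text); }
    public static function adImage() { return static::randomElement(static::$adImage); }
    public static function link() { return static::randomElement(static::$link); }

    public function template()
    {
        // static:: picks up the $formats of whichever child class is in use.
        return $this->parse(static::randomElement(static::$formats));
    }
}

// Children: only the format differs; formatters are inherited.
class ImageEmailTemplate extends EmailTemplate
{
    protected static $formats = array('<p>{{adImage}}</p>');
}

class LinkEmailTemplate extends EmailTemplate
{
    protected static $formats = array('<p>Hello {{name}}</p> <p>{{link}}</p>');
}

$provider = new ImageEmailTemplate();
echo $provider->template(), PHP_EOL; // e.g. "<p>img1.png</p>"
```

The design relies on PHP's late static binding: because template() reads static::$formats rather than self::$formats, each child only has to redeclare the $formats array to change which template is generated.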
Generally we’ll run tests many times and record the data and results. When an error is encountered, we check the database or log files to figure out what the respective data was. Once we’ve fixed the error, it is important to rerun the test cases with the same data that caused it. Faker supports seeding, so we can reproduce the previous data by seeding its random number generator.
Consider the following code:
    <?php
    require "vendor/autoload.php";

    $faker = Faker\Factory::create();
    $faker->seed(1000);

    echo $faker->name;

We’ve assigned a seed value of 1000. Now, no matter how many times we execute the above script, it produces the same sequence of random values on every run.
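The idea behind seeding can be shown with PHP's own random number functions. The sketch below does not call Faker; pickNames() is a hypothetical helper that uses mt_srand() and mt_rand() directly to demonstrate that the same seed yields the same sequence of picks.

```php
<?php
// Assumption: this uses PHP's own mt_srand()/mt_rand() to show the idea of
// seeding; it does not call Faker.
function pickNames($seed, $count)
{
    $names = array("Mark", "Adam", "John", "Jane", "Clark", "Stewart");
    mt_srand($seed); // same seed, same pseudo-random sequence
    $picked = array();
    for ($i = 0; $i < $count; $i++) {
        $picked[] = $names[mt_rand(0, count($names) - 1)];
    }
    return $picked;
}

$first = pickNames(1000, 5);
$second = pickNames(1000, 5);
var_dump($first === $second); // bool(true)
```

Changing the seed changes the sequence, while reusing it replays exactly the same picks, which is what makes a failing data set reproducible.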
In application testing you should assign a seed to each test case and record it in your logs. Once the errors are fixed, you can look up the seeds of the test cases which caused them and rerun those cases with the same seeds, so that they receive exactly the same data.
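One possible shape for this workflow is sketched below. The helper generateRecipient(), the case names, and the in-memory $log array are all hypothetical; a real test suite would write the seed to its log file or database, and would seed Faker rather than PHP's generator directly.

```php
<?php
// Hypothetical helper names; the log is a plain array for brevity.
function generateRecipient($seed)
{
    $emails = array("test@example.com", "test1@example.com");
    mt_srand($seed); // seed before generating so the data is reproducible
    return $emails[mt_rand(0, count($emails) - 1)];
}

$log = array();
foreach (array("case_a", "case_b") as $case) {
    $seed = random_int(1, 1000000); // fresh seed per test case
    $log[$case] = array("seed" => $seed, "data" => generateRecipient($seed));
}

// Later: replay a failing case with the seed taken from the log.
$replayed = generateRecipient($log["case_a"]["seed"]);
var_dump($replayed === $log["case_a"]["data"]); // bool(true)
```

Recording only the seed keeps the log small: the full data set can be regenerated on demand instead of being stored with every run.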
Generating test data is something you should automate to prevent wasting time unnecessarily. Faker is a simple and powerful solution for generating random test data. The real power of Faker comes with its ability to extend default functionalities to suit more complex implementations.
So what is your test data generation strategy? Do you like to use Faker to automate test data generation? Let me know through the comments section.
Translated from: https://www.sitepoint.com/simplifying-test-data-generation-with-faker/