Rather recently, Microsoft released an app using AI to detect a dog’s breed. When I tested it on my beagle, though…
Hmm, not quite, app. Not quite.
In my non-SitePoint time, I also work for Diffbot – the startup you may have heard of over the past few weeks – who also dabble in AI. To test how they compare, in this tutorial we’ll recreate Microsoft’s application using Diffbot’s technology to see if it does a better job at recognizing the adorable beasts we throw at it!
We’ll build a very primitive single-file “app” for uploading images and outputting the information about the breed under the form.
If you’d like to follow along, please register for a free 14-day token at Diffbot.com, if you don’t have an account there yet.
To install the client, we use the following composer.json file:

    {
        "require": {
            "swader/diffbot-php-client": "^2",
            "php-http/guzzle6-adapter": "^1.0"
        },
        "minimum-stability": "dev",
        "prefer-stable": true,
        "require-dev": {
            "symfony/var-dumper": "^3.0"
        }
    }

Then, we run composer install.
The minimum stability flag is there because a part of the Puli package is still in beta, and it’s a dependency of the PHP HTTP project now. The prefer stable directive is there to make sure the highest stable version of a package is used if available. We also need an HTTP client, and in this case I opted for Guzzle6, though the Diffbot PHP client supports any modern HTTP client via Httplug, so feel free to use your own favorite.
Once these items have been installed, we can create an index.php file, which will contain all of our application’s logic. But first, bootstrapping:

    <?php

    require 'vendor/autoload.php';

    $token = 'my_token';

Let’s build a primitive upload form above the PHP content of our index.php file.

    <form action="/" method="post" enctype="multipart/form-data">
        <h2>Please either paste in a link to the image, or upload the image directly.</h2>
        <h3>URL</h3>
        <input type="text" name="url" id="url" placeholder="Image URL">
        <h3>Upload</h3>
        <input type="file" name="file" id="file">
        <input type="submit" value="Analyze">
    </form>

    <?php
    ...

We’re focusing on the PHP side only here, so we’ll leave out the CSS. I apologize to your eyes.
We’ll be using Imgur to host the images, so that we don’t have to host the application in order to make the calls to Diffbot (the images will be public even if our app isn’t, saving us hosting costs). Let’s first register an application on Imgur via this link:
This will produce a client ID and a secret, though we’ll only be using the client ID (anonymous uploads), so we should add it to our file:

    $token = 'my_token';
    $imgur_client = 'client';

So, how will the analysis happen, anyway?
As described in the docs, Diffbot’s Image API accepts a URL and scans the page for images. Every image it finds is then analyzed, and some data is returned about each one.
The data we need are the tags Diffbot attaches to the image entries. tags is an array of JSON objects, each of which contains a tag label, and a link to http://dbpedia.org for the related resource. We won’t be needing these links in this tutorial, but we will be looking into them in a later piece. The tags array takes a form similar to this:

    "tags": [
        {
            "id": 4368,
            "label": "Beagle",
            "uri": "http://dbpedia.org/resource/Beagle"
        },
        {
            "id": 2370241,
            "label": "Treeing Walker Coonhound",
            "uri": "http://dbpedia.org/resource/Treeing_Walker_Coonhound"
        }
    ]

As you can see, each tag has the aforementioned values. If there’s only one tag, only one object will be present. By default, Diffbot returns up to 5 tags per entry – so each image can have up to 5 tags, and they don’t have to be directly related (e.g. submitting an image of a running shoe might return both the tag Nike and the tag shoe).
It is these tag labels we’ll be using as suggested guesses of dog breeds. Once the request goes through and returns the tags in the response, we’ll print the suggested labels below the image.
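
To make the shape of that data concrete, here is a minimal sketch that pulls the labels out of a tags array like the one above. It works on a hard-coded JSON string rather than a live Diffbot response, and `tagLabels()` is a hypothetical helper name used only for illustration; it is not part of the Diffbot PHP client.

```php
<?php

// Sample fragment copied from the tags example in this article,
// wrapped in braces so that it is valid standalone JSON.
$json = '{"tags": [
    {"id": 4368, "label": "Beagle", "uri": "http://dbpedia.org/resource/Beagle"},
    {"id": 2370241, "label": "Treeing Walker Coonhound", "uri": "http://dbpedia.org/resource/Treeing_Walker_Coonhound"}
]}';

/**
 * Hypothetical helper: returns the label of every tag, in order.
 * array_column() silently skips entries that have no "label" key.
 */
function tagLabels(array $tags)
{
    return array_column($tags, 'label');
}

$data = json_decode($json, true);
$tags = isset($data['tags']) ? $data['tags'] : [];
$labels = tagLabels($tags);

echo implode(', ', $labels), PHP_EOL; // prints "Beagle, Treeing Walker Coonhound"
```

An empty tags array simply yields an empty list of labels, which corresponds to the "couldn't figure out the breed" case.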
To process the form, we’ll add some basic logic below the token declaration:

    if ($_SERVER['REQUEST_METHOD'] == 'POST') {

        // An uploaded file takes precedence over a pasted URL.
        if (!empty($_FILES['file']['tmp_name'])) {
            $filename = $_FILES['file']['tmp_name'];

            // Fully qualified name, since the file has no "use" import for Guzzle.
            $c = new \GuzzleHttp\Client();
            $response = $c->request('POST', 'https://api.imgur.com/3/image.json', [
                'headers' => [
                    'authorization' => 'Client-ID ' . $imgur_client
                ],
                'form_params' => [
                    'image' => base64_encode(file_get_contents($filename))
                ]
            ]);

            $body = json_decode($response->getBody()->getContents(), true);
            $url = isset($body['data']['link']) ? $body['data']['link'] : null;

            if (empty($url)) {
                echo "<h2>Upload failed</h2>";
                die($body['data']['error']);
            }
        }

        // Fall back to the URL field if nothing was uploaded.
        if (!isset($url) && isset($_POST['url'])) {
            $url = $_POST['url'];
        }

        if (!isset($url) || empty($url)) {
            die("That's not gonna work.");
        }

        $d = new Swader\Diffbot\Diffbot($token);

        /** @var Image $imageDetails */
        $imageDetails = $d->createImageAPI($url)->call();
        $tags = $imageDetails->getTags();

        // Escape the URL before echoing it, since it may come straight from the form.
        $safeUrl = htmlspecialchars($url, ENT_QUOTES);
        echo "<img width='500' src='{$safeUrl}'>";

        switch (count($tags)) {
            case 0:
                echo "<h4>We couldn't figure out the breed :(</h4>";
                break;
            case 1:
                echo "<h4>The breed is probably " . labelSearchLink($tags[0]['label']) . "</h4>";
                echo iframeSearch($tags[0]['label']);
                break;
            default:
                echo "<h4>The breed could be any of the following:</h4>";
                echo "<ul>";
                foreach ($tags as $tag) {
                    echo "<li>" . labelSearchLink($tag['label']) . "</li>";
                }
                echo "</ul>";
                echo iframeSearch($tags[0]['label']);
                break;
        }
    }

We first check if a file was selected for upload. If so, it takes precedence over a link-based submission. The image is uploaded to Imgur, and the URL Imgur returns is then passed to Diffbot. If only a URL was provided, it’s used directly.
We use Guzzle directly for the Imgur upload because it’s already installed: the Diffbot PHP client uses it to make its API calls.
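
The precedence rule described above (an uploaded file wins over a pasted URL) can also be isolated into a small function, which makes it easy to exercise without a real request. The sketch below is only one possible way to structure it: `resolveSource()` is a hypothetical helper, and it takes `$_FILES`-like and `$_POST`-like arrays as arguments instead of reading the superglobals. It additionally checks PHP's `UPLOAD_ERR_OK` upload status, which the tutorial code does not do.

```php
<?php

/**
 * Hypothetical helper: decides which input the app should analyze.
 *
 * @param array $files Same shape as $_FILES
 * @param array $post  Same shape as $_POST
 * @return array|null  ['type' => 'file'|'url', 'value' => string], or null if nothing usable was sent
 */
function resolveSource(array $files, array $post)
{
    // An uploaded file wins if PHP reports a successful upload and a temp path.
    if (isset($files['file']['error'], $files['file']['tmp_name'])
        && $files['file']['error'] === UPLOAD_ERR_OK
        && $files['file']['tmp_name'] !== ''
    ) {
        return ['type' => 'file', 'value' => $files['file']['tmp_name']];
    }

    // Otherwise fall back to a non-empty URL field.
    $url = isset($post['url']) ? trim($post['url']) : '';
    if ($url !== '') {
        return ['type' => 'url', 'value' => $url];
    }

    return null;
}

// Both a file and a URL were submitted: the file takes precedence.
$source = resolveSource(
    ['file' => ['tmp_name' => '/tmp/phpA1b2', 'error' => UPLOAD_ERR_OK]],
    ['url' => 'http://example.com/dog.jpg']
);
echo $source['type'], PHP_EOL; // prints "file"
```

In a real request handler one would call `resolveSource($_FILES, $_POST)` and only upload to Imgur when the returned type is `file`.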
After the image data is returned, we grab the tags from the Image object and output them on the screen, along with a link to Bing search results for that breed, and an iframe displaying those results right then and there.
The functions building the search-link and iframe HTML element are below:

    function labelSearchLink($label) {
        // Escape the label, since it is printed as HTML text.
        return '<a href="http://www.bing.com/images/search?q=' . urlencode($label) . '&qs=AS&pq=treein&sc=8-6&sp=1&cvid=92698E3A769C4AFE8C6CA1B1F80FC66D&FORM=QBLH" target="_blank">' . htmlspecialchars($label) . '</a>';
    }

    function iframeSearch($label) {
        // An iframe is not a void element, so it needs an explicit closing tag.
        return '<iframe width="100%" height="400" src="http://www.bing.com/images/search?q=' . urlencode($label) . '&qs=AS&pq=treein&sc=8-6&sp=1&cvid=92698E3A769C4AFE8C6CA1B1F80FC66D&FORM=QBLH"></iframe>';
    }

Again, please excuse the design of both the code and the web page – as this is just a quick prototype, CSS and frameworks would have been distracting.
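
If you would rather not hand-assemble the query string, the link helper could be sketched with `http_build_query()` and `htmlspecialchars()` instead. This variant assumes Bing's image search works with only the `q` parameter, dropping the extra parameters from the URL above; that is an assumption, not something verified here. The function names `bingImageSearchUrl()` and `labelSearchLinkSafe()` are made up for this example.

```php
<?php

/**
 * Hypothetical helper: builds a Bing image search URL for a label.
 * Assumes "q" alone is enough; the other query parameters are omitted.
 */
function bingImageSearchUrl($label)
{
    // http_build_query() URL-encodes the value (spaces become "+").
    return 'http://www.bing.com/images/search?' . http_build_query(['q' => $label]);
}

/**
 * Hypothetical helper: renders the link with both the href and the
 * visible text escaped for HTML output.
 */
function labelSearchLinkSafe($label)
{
    $href = htmlspecialchars(bingImageSearchUrl($label), ENT_QUOTES, 'UTF-8');
    $text = htmlspecialchars($label, ENT_QUOTES, 'UTF-8');

    return '<a href="' . $href . '" target="_blank">' . $text . '</a>';
}

echo labelSearchLinkSafe('Treeing Walker Coonhound'), PHP_EOL;
```

Escaping matters little for breed names, but the labels come from a remote API, so treating them as untrusted text is the safer default.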
As we can see from the image above, Diffbot has misidentified the hound as well – but not as grossly as Microsoft. In this case, my beagle really does look more like a treeing walker coonhound than a typical beagle.
Let’s see some more examples.
Ah, curses! Microsoft wins this round – Diffbot thought it had a better chance of guessing between a basset hound and a treeing walker coonhound, but missed on both. How about another?
Bingo! Both are spot on, though Diffbot is playing it safe by, again, suggesting the walker as an alternative. Okay, that one was a bit too obvious – how about a hard one?
Hilariously, this derpy image seems to remind both AIs of a Welsh corgi!
What if there’s more than one dog in the image, though?
Adorable, Diffbot, but no cigar – well done Microsoft!
Okay, last one.
Excellent work on both fronts. Obviously, the “dog detecting AI” is maxed out! Granted, Diffbot does have a small advantage in that it is also able to detect faces, text, brands, other animal types and more in images, but their “dog recognition” is toe to toe.
In this tutorial, we saw how easy it is to harness the power of modern AI to identify dog breeds at least somewhat accurately. While both engines have much room to improve, the more content we feed them, the better they’ll become.
This was a demonstration of the ease of use of powerful remote machine learning algorithms, and an introduction into a more complex topic we’ll be exploring soon – stay tuned!
Source: https://www.sitepoint.com/building-microsofts-what-dog-ai-in-under-100-lines-of-code/