One of the most important aspects of Website management is traffic analysis. If you don’t know where your visitors are coming from — and in what numbers — you can’t effectively promote your site, or gauge the effects of any current promotion efforts.

Checking the stats for your site(s) should be a daily activity, and if you’re not doing it already, now’s the time to start!


Traffic Jargon

There is some confusion about the different terms used to describe Website traffic. Misuse of these terms often causes miscommunication, so it’s important that you know the correct words and concepts. The most common terms you’ll find include:



Hit


An HTTP request made to your server. "Hit" is often used to describe an impression, and that’s incorrect. A request is made to your server not only for every HTML file, but also for every image, every movie, and every included JavaScript or CSS file. If you use frames, then one actual page view can result in multiple hits, as multiple files comprise that one page. Upon each request your server records another entry in its log files, so when log analysis programs read these files, they’ll report total hits. People often mistake this figure for total page views and get excited unnecessarily. Don’t fall into the same trap.



Impression


A page view. An impression occurs when someone views one of your HTML pages. If you use frames, you should only count impressions on your main content pages, not those on the pages you use for your menu or header frames. Another way to look at this is to only count impressions on pages that display advertising.

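To make the difference between hits and impressions concrete, here is a minimal Python sketch. It assumes log lines in the common Apache style and treats only directory URLs and files ending in `.html`, `.htm`, or `.php` as pages. Both the function name and the extension list are illustrative assumptions, not part of any particular log analyzer.

```python
import re
from urllib.parse import urlsplit

# Assumed request field, as in Apache-style logs: "GET /index.html HTTP/1.0"
REQUEST = re.compile(r'"(?:GET|POST|HEAD) (?P<path>\S+)')

# Assumption: pages are directory URLs or files with these extensions.
PAGE_EXTENSIONS = (".html", ".htm", ".php")


def count_hits_and_impressions(log_lines):
    """Return (hits, impressions) for an iterable of log lines."""
    hits = 0
    impressions = 0
    for line in log_lines:
        match = REQUEST.search(line)
        if match is None:
            continue  # not a recognizable request line
        hits += 1  # every request is a hit
        path = urlsplit(match.group("path")).path.lower()
        if path.endswith("/") or path.endswith(PAGE_EXTENSIONS):
            impressions += 1  # only page requests count as impressions
    return hits, impressions


sample = [
    '192.0.2.1 - - [01/Jan/2004:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 5120',
    '192.0.2.1 - - [01/Jan/2004:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 200 2048',
    '192.0.2.1 - - [01/Jan/2004:10:00:01 +0000] "GET /style.css HTTP/1.0" 200 512',
    '192.0.2.1 - - [01/Jan/2004:10:00:02 +0000] "GET /menu.js HTTP/1.0" 200 256',
]
print(count_hits_and_impressions(sample))  # (4, 1)
```

One page view produced four hits here, which is exactly why a raw hit total overstates your traffic.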



Unique


A page view by a unique person within a 24-hour period. Uniques are usually measured by identifying the IP address of each visitor to your site. However, some services, notably AOL, send all their members through proxy servers, so thousands or millions of people can share the same IP address. This usually means that if you count uniques by looking at impressions from unique IP addresses, your actual number will be slightly higher than what is reported in the logs. A better way to measure uniques would be a composite unique value composed of IP address, browser or user agent, and operating system.

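The composite value can be sketched in a few lines of Python. This example assumes you already have one day of visits with the IP address, browser, and operating system extracted into a dictionary. The field names and the function name are hypothetical.

```python
def count_uniques(visits):
    """Return (uniques by IP only, uniques by IP + browser + OS)."""
    by_ip = {visit["ip"] for visit in visits}
    by_composite = {
        (visit["ip"], visit["browser"], visit["os"]) for visit in visits
    }
    return len(by_ip), len(by_composite)


# One day of visits; the first three share a proxy IP address.
visits = [
    {"ip": "192.0.2.10", "browser": "Internet Explorer", "os": "Windows"},
    {"ip": "192.0.2.10", "browser": "Internet Explorer", "os": "Windows"},
    {"ip": "192.0.2.10", "browser": "Netscape", "os": "Mac"},
    {"ip": "198.51.100.7", "browser": "Internet Explorer", "os": "Windows"},
]
print(count_uniques(visits))  # (2, 3)
```

The composite key separates two people behind the same proxy when their software differs. It still cannot tell apart two proxy users with identical setups, so treat it as a better estimate rather than an exact count.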



Referrer


A page that links to your site. This doesn’t have to be an actual page: it could, for instance, be the result set of a search engine. Looking at your referrers will tell you who’s linked to your site.

User Agent


This refers to the software used to access your site. Sometimes known as a "browser" or "client", the term user agent can describe a PHP script, a browser like Internet Explorer, or a search engine spider like GoogleBot. If you can identify what software is being used to access your site, you’ll be able to tell if users are abusing it, and when the search engines last crawled your pages.

Counters and Trackers

Early in the life of the Web, counters were fairly popular. A counter is a simple script that records the number of visitors to a site in a text file or database and then displays the total, either textually or graphically, on the Website. You still find them on some amateur pages, but for the most part, their use has died out — primarily because site owners wanted more complex information about their traffic, but also because these counters have come to be seen as unprofessional.

Now most professional or commercial sites use tracking software. Tracking software tells you more than just the number of visitors — it can break visitor statistics down by date, time, browser, page viewed, referrer, and countless other values. Trackers are so named because they can more or less detail for you the path a visitor takes through your Website, so they do more than just count your traffic: they track it. You can choose from three main types of tracking software — let’s look at your options.

The Three Flavors of Tracking Software

1. Remote Tracking Services


The easiest type of tracking to install, and therefore the most popular, is remote tracking. These tracking services house all the traffic recording scripts and reports on their own servers, which you can log into to check your stats. The recording itself is accomplished through JavaScript that is placed on your page(s).

Despite its ease of use, this type of service is the worst, for a variety of reasons. It is often inaccurate: because the traffic recording relies upon a connection to a remote server (a server that’s likely bogged down), many of your visitors may not be recorded because requests simply time out. Additionally, the services’ reliance upon JavaScript means that they fail to record visits by users who don’t have JavaScript enabled. This is a big issue: search engine spiders don’t use JavaScript, so these services miss one of the key benefits of analyzing your traffic (knowing when you’ve been spidered). Also, remote trackers often require you to place a button or graphic on your site in exchange for the free use of their service, which is not ideal for most site owners.

So try to avoid these services unless you lack the ability or expertise to run tracking scripts of any kind on your own server.

2. Logging Programs


This is my preferred method of traffic analysis. Logging programs are scripts that you install on your server, which then generate both log files (either in flat files or a database), and reports. I prefer this type of program over a log analysis system (discussed below) because logging programs afford the site owner more control — you decide what is logged and what isn’t, and only track those pages you want to track.

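To give a feel for what a logging program does at its core, here is a minimal Python sketch that appends one tab-separated entry per tracked page view to a flat file. The tracked path prefixes, the field order, and the `log_visit` name are all assumptions made for illustration. A real logging program would also generate reports from this file.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone

# Hypothetical choice: only pages under these paths are worth tracking.
TRACKED_PREFIXES = ("/articles/", "/reviews/")


def log_visit(log_path, page, ip, agent, referrer):
    """Append one entry for a tracked page; return False for anything else."""
    if not page.startswith(TRACKED_PREFIXES):
        return False
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        timestamp = datetime.now(timezone.utc).isoformat()
        writer.writerow([timestamp, ip, page, referrer, agent])
    return True


# Usage: write to a throwaway file so the sketch can be run safely.
with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, "traffic.log")
    log_visit(log_path, "/articles/intro.html", "192.0.2.1", "Mozilla/4.0", "-")
    log_visit(log_path, "/images/logo.gif", "192.0.2.1", "Mozilla/4.0", "-")
    with open(log_path, encoding="utf-8") as handle:
        print(len(handle.readlines()))  # 1: only the tracked page was logged
```

The prefix check is where the extra control comes from: you decide what gets recorded, so images and menu frames never reach the log in the first place.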

The downside to doing this is that you must maintain your log files, and if your site is popular, they can grow rather large. On one of my sites (which logs over a million impressions a month) the log file grows by about 15 MB a day, so I usually rotate it every 3 days. If you use a log analysis program you’ll still be battling large log files, but those are your server’s log files, and thus they are automatically rotated and maintained for you.

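At roughly 15 MB a day, a three-day rotation keeps each file at around 45 MB. If your logging script happens to be written in Python, one way to automate the rotation is the standard library’s `TimedRotatingFileHandler`. This is only a sketch under that assumption: the logger name, the number of old files to keep, and the function name are hypothetical.

```python
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler


def build_traffic_logger(log_path, days_per_file=3, files_to_keep=10):
    """Create a logger that starts a new file every `days_per_file` days."""
    handler = TimedRotatingFileHandler(
        log_path,
        when="D",                   # interval is measured in days
        interval=days_per_file,
        backupCount=files_to_keep,  # older files beyond this are deleted
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s\t%(message)s"))
    logger = logging.getLogger("traffic")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep traffic entries out of other logs
    return logger, handler


# Usage: log one entry to a throwaway directory, then clean up.
with tempfile.TemporaryDirectory() as folder:
    logger, handler = build_traffic_logger(os.path.join(folder, "traffic.log"))
    logger.info("192.0.2.1\t/articles/intro.html")
    print(handler.interval)  # rotation interval in seconds
    logger.removeHandler(handler)
    handler.close()
```

Rotating by time matches the habit described above. If disk space is the real constraint, rotating by file size may suit you better.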

Another feature of this type of program is that you can sometimes use it to track links from your site as well, so you can identify exactly how much traffic you send away in a link exchange.

3. Log Analysis Programs


These are programs that analyze your server logs and then create traffic reports accordingly. Some may include advanced filters, which allow you to specify what exactly you want reported, but most will simply report everything in the log files — usually covering total hits, impressions, and uniques. Of course, the quality of the reports generated will depend on what software you actually use.

Some log analyzers are free and come preinstalled on many hosting accounts, while others can cost a good deal of money.

What to Do Every Day

Once you have your tracking software set up you can start using it, but what should you actually look for? Here are the things you should check every day:



Referrers


The first thing you should check daily is your referrers. I know from personal experience that if you have a popular site, your referrers can number in the thousands, so reading through that list every day can be a chore, but it’s a must!

When you’re looking at your referrers, look for two things:


where in the search engines visitors find your listings, and

on what other Web pages visitors have located links to your site.


Specifically, this will help you check whether you’re maintaining your search engine position, and it will also help you identify new sites that link to you. When I find that a new site has linked to one of mine, I submit that site to Google so that it can spider the site, see the link, and increase my link popularity rating. Some people would advise against doing this on the grounds that it is unethical to submit another person’s site to a search engine, but I disagree.

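When the referrer list runs to thousands of entries, a small script can do the first pass for you. The Python sketch below splits referrer URLs into search engine results (with the search terms) and ordinary linking pages. The engine table, including the query parameter each engine uses, is an illustrative assumption that you would need to adjust for the engines you care about.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumption: host fragment -> query parameter that carries the search terms.
SEARCH_ENGINES = {"google.": "q", "search.yahoo.": "p"}


def summarize_referrers(referrers, own_host):
    """Return (search term counts, linking page counts) for referrer URLs."""
    searches = Counter()
    linking_pages = Counter()
    for url in referrers:
        parts = urlsplit(url)
        host = parts.netloc.lower()
        if not host or host == own_host:
            continue  # skip empty referrers ("-") and internal navigation
        for fragment, param in SEARCH_ENGINES.items():
            if fragment in host:
                terms = parse_qs(parts.query).get(param, [""])[0]
                searches[(host, terms)] += 1
                break
        else:
            linking_pages[url] += 1  # not a known engine: a page linking to you
    return searches, linking_pages


sample = [
    "http://www.google.com/search?q=traffic+analysis",
    "http://www.google.com/search?q=traffic+analysis",
    "http://example.org/links.html",
    "http://www.example.com/index.html",
    "-",
]
searches, linking_pages = summarize_referrers(sample, "www.example.com")
print(searches.most_common())
print(linking_pages.most_common())
```

Comparing the linking pages against yesterday’s list is then a quick way to spot new sites that link to you.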

In the past, some search engines would ban sites that were oversubmitted; however, Google has never done this, and still does not. If a page has already been submitted, your request will simply be ignored. As nothing bad will come of submitting someone else’s page, I don’t see this practice as unethical, especially since many of the people who own these pages may not know how to submit their site. Of course, you should make your own decision on this issue.

IP Addresses and User Agents


The second thing you need to check is the IP addresses and user agents of your visitors. This information will tell you two things:

When a search engine spiders your site.

If someone is abusing your site.


The first point is important because, unless you know when your site was spidered, you cannot effectively troubleshoot your search engine listings (for instance, if they appear outdated, or fail to appear at all). Many people will remember when they submitted to the search engines, but if you ask them when they were spidered, they don’t have a clue. Knowing when a search engine spiders your site, and when it updates its index, will allow you to predict when your listings will change.

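Finding the last crawl date is easy to automate if your logs record the user agent. The following Python sketch assumes the Apache "combined" log format and looks for two spider names in the user agent string. The token list is an illustrative assumption, so add whatever spiders matter to you.

```python
import re
from datetime import datetime

# Assumed Apache "combined" format: ip - - [time] "request" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: substrings that identify the spiders you want to watch.
SPIDER_TOKENS = ("Googlebot", "Slurp")


def last_crawl_times(log_lines):
    """Return {spider token: datetime of its most recent request}."""
    latest = {}
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        agent = match.group("agent").lower()
        for token in SPIDER_TOKENS:
            if token.lower() in agent:
                seen = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
                if token not in latest or seen > latest[token]:
                    latest[token] = seen
    return latest


sample = [
    '192.0.2.10 - - [02/Jan/2004:03:15:00 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [05/Jan/2004:11:00:00 +0000] "GET /about.html HTTP/1.0" 200 2048 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [06/Jan/2004:09:30:00 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]
for spider, seen in last_crawl_times(sample).items():
    print(spider, seen.isoformat())
```

Keep in mind that a user agent string can be faked, so this tells you who claimed to be a spider rather than proving it.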

The second point is important because there are a lot of people out there with little to do, and there are many ways they can abuse a Website. One way is to write a script that rips content off a Website to display on their own.

For instance, there are scripts that rip news headlines off sites like CNN.com. Then the site owner displays the headlines on their own site, along with a link back to CNN. While technically it is wrong to copy their headlines, it is easily forgiven by bigger players, as the site owners are using the headlines to link to them (effectively driving traffic back to their site).

However, it is just as easy to write a script that steals articles from a site and displays them on another. If you are the victim of either of these malpractices, you can usually tell through your logs. There will usually be a large number of requests from the offender’s IP address (which should resolve to a Web server), as well as excessive hit counts from a user agent called "PHP," "Perl," or the name of another scripting language. Sometimes people will download your entire site and then republish it on their server; however, they sometimes forget to recode some links, resulting in hits from their version of your site to your original site. One SitePoint Forum advisor recently discovered this exact thing happening by close monitoring of his referrers.

On the topic of downloading an entire site, there are also site rippers out there. Often benignly named "offline browsers," much in the way some Trojans are named "remote administration tools," these are programs that can be used to download your entire site. This not only steals your site (design, content, etc.) but can also crash or severely slow down your server. Depending on the size of your site, these programs can be detected by looking at IP addresses: if you see hundreds or thousands of impressions from one address, chances are it’s one of these programs. You can also look for their user agents. Some of the more popular ones are Wget, Teleport, HTTrack, and Web Reaper. I should mention that Wget is a valid program used on Unix servers to download files, such as patches or drivers. However, unless you provide such downloads on your site, anyone using this agent on your site is probably stealing.

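Both warning signs, too many requests from one address and a telltale user agent, can be checked with a short script. This Python sketch assumes you have already parsed your log into (IP address, user agent) pairs. The threshold and the token list are assumptions to tune for your own site, and a substring match like this will occasionally flag innocent visitors.

```python
from collections import Counter

# Assumption: lowercase substrings of user agents worth a closer look.
SUSPECT_AGENT_TOKENS = ("php", "perl", "wget", "teleport", "httrack", "web reaper")


def find_suspects(requests, max_requests_per_ip=500):
    """Return (IPs over the request limit, counts of flagged user agents)."""
    per_ip = Counter(ip for ip, _agent in requests)
    heavy_ips = {ip for ip, count in per_ip.items() if count > max_requests_per_ip}
    flagged_agents = Counter()
    for _ip, agent in requests:
        lowered = agent.lower()
        if any(token in lowered for token in SUSPECT_AGENT_TOKENS):
            flagged_agents[agent] += 1
    return heavy_ips, flagged_agents


requests = [("203.0.113.5", "Wget/1.8.2")] * 3 + [
    ("192.0.2.1", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
]
heavy_ips, flagged_agents = find_suspects(requests, max_requests_per_ip=2)
print(heavy_ips)             # {'203.0.113.5'}
print(dict(flagged_agents))  # {'Wget/1.8.2': 3}
```

Treat the output as a list of candidates to investigate, not as proof of abuse. A large proxy or a search engine spider can exceed the request limit legitimately.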

Yet another form of site abuse is to harvest email addresses from a site. This is especially important if you run a community site, where users often post their email addresses. As with site rippers, you can often identify email harvesters via their user agent.

The final method of site abuse is to block a site’s advertisements. Some consider this a right of the surfer, but I feel that it is stealing. A Webmaster places advertisements expecting that users will view them in conjunction with the content they view for free. If visitors block the advertisements, then ethically I don’t think they should visit the site at all. Some Webmasters will redirect people using ad blocking programs to a page that asks them to pay for site access, and that approach reflects how many Webmasters feel: you either pay with your wallet, or with your eyeballs. Like the aforementioned examples, this can be detected by monitoring user agents.

Once you identify the IP addresses or user agents of those abusing your site you can ban them (using .htaccess if you run Apache), but a full explanation of this is obviously beyond the scope of this article.


Other Statistical Information


There is much information you can gather from your statistics in addition to that which has been mentioned so far. This information is usually useful when you attempt to sell advertising, or reassess your promotional efforts.



Your server stats can provide limited demographic information that’s helpful for both designing your site, and attracting advertisers. For instance, by researching the stats on operating systems or user agents, you can tell whether your visitors use a PC or a Mac, Internet Explorer or Netscape. Some software can also give you geographic statistics by resolving the IP address of your visitors. While these statistics are not the most accurate (it isn’t always possible to accurately identify a user’s country of origin), this information can still be valuable in the presentation of packages to potential advertisers, or even when you’re deciding whether to make regional changes to your site — add content in a second language, for example.

Search Engine Statistics


In addition to glancing over your referrers to ensure that you’re maintaining your search engine positions, you can occasionally do a more detailed analysis, to compare the amount of traffic you get from various search engines. This can help you identify whether there’s a particular engine that’s performing poorly for you. You can then identify which referrers you need to work on — to increase the amount of traffic they send you (though you should keep in mind that perceived ‘lower traffic levels’ could be the result of a search engine being less popular than the others you track).

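For the more detailed comparison, it helps to turn raw referrer counts into each engine’s share of your search traffic. Here is a minimal Python sketch. The engine table is an illustrative assumption, and the function only considers referrers that match one of the listed engines.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: host fragment -> display name for the engines you track.
ENGINE_HOSTS = {"google": "Google", "yahoo": "Yahoo", "altavista": "AltaVista"}


def engine_share(referrers):
    """Return {engine name: percentage of search-engine referrals}."""
    counts = Counter()
    for url in referrers:
        host = urlsplit(url).netloc.lower()
        for fragment, name in ENGINE_HOSTS.items():
            if fragment in host:
                counts[name] += 1
                break
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * count / total, 1) for name, count in counts.items()}


sample = [
    "http://www.google.com/search?q=traffic+analysis",
    "http://www.google.com/search?q=log+files",
    "http://www.google.com/search?q=site+stats",
    "http://search.yahoo.com/search?p=traffic+analysis",
    "http://example.org/links.html",
]
print(engine_share(sample))  # {'Google': 75.0, 'Yahoo': 25.0}
```

A low share is only a starting point for investigation, since it may simply reflect an engine with fewer users.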

Visitor Behavior


You can occasionally analyze visitor behavior as well. For instance, a quick review of the stats may indicate the pages visitors use to enter and leave your site, which, in turn, can tell you which portions of your site are the most popular, and which sections need work.

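If your tracking software doesn’t report entry and exit pages, they are simple to derive yourself. The Python sketch below assumes each page view is available as a (visitor, timestamp, page) tuple, where the visitor value is whatever identifier you use for uniques. The tuple layout and function name are hypothetical.

```python
from collections import Counter, defaultdict


def entry_and_exit_pages(page_views):
    """Return (entry page counts, exit page counts) across all visitors."""
    pages_by_visitor = defaultdict(list)
    # Sort by timestamp so each visitor's pages end up in viewing order.
    for visitor, _timestamp, page in sorted(page_views, key=lambda view: view[1]):
        pages_by_visitor[visitor].append(page)
    entries = Counter(pages[0] for pages in pages_by_visitor.values())
    exits = Counter(pages[-1] for pages in pages_by_visitor.values())
    return entries, exits


# Timestamps are plain seconds here to keep the example short.
page_views = [
    ("visitor-a", 1, "/"),
    ("visitor-a", 2, "/reviews/mac.html"),
    ("visitor-b", 1, "/reviews/mac.html"),
    ("visitor-b", 5, "/contact.html"),
    ("visitor-c", 3, "/"),
]
entries, exits = entry_and_exit_pages(page_views)
print(entries.most_common(1))  # [('/', 2)]
print(dict(exits))
```

This treats everything a visitor does in the data as a single visit. Splitting visits on long gaps of inactivity would be a natural refinement.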

If your site spans multiple topics, this analysis might also help you identify the topics that interest your users the most. For instance, if you review Mac and PC hardware and most of your visitors read the Mac reviews, then you might consider focusing more on the Mac section and phasing out the PC information (or developing it into a separate site). But this information’s not only handy for your reference — it’s also useful in dealings with potential advertisers. Furthermore, if you run a community-based site, these details can indicate how many lurkers you may have, and if you run an article-based site, the stats can indicate which articles or authors are the most popular.

Traffic Patterns Over Time


Other good statistics to keep an eye on are those that measure traffic patterns over time. These can indicate not just the times at which your site receives the most traffic, but can also provide real insights into your audience: a clear picture of your visitors’ usage over time can suggest the reasons why they visit your site.

For instance, I noticed that traffic to my educational site is closely tied to the school year, so on weekends, holidays, and in the summer, my traffic levels drop. This information suggested that my key users were students, which has allowed me to target my advertising accordingly. Another key benefit of knowing when your site receives the most traffic is that you can then schedule downtime (for upgrades and maintenance) around the hours when usage is at its lowest.

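Picking a maintenance window comes down to counting impressions per hour and choosing the smallest bucket. Here is a minimal Python sketch, assuming you have the timestamps of your impressions as `datetime` objects in your server’s local time. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def impressions_per_hour(timestamps):
    """Return a Counter of impressions for every hour of the day (0-23)."""
    per_hour = Counter({hour: 0 for hour in range(24)})
    per_hour.update(moment.hour for moment in timestamps)
    return per_hour


def quietest_hour(timestamps):
    """Return the hour with the fewest impressions (earliest hour wins ties)."""
    per_hour = impressions_per_hour(timestamps)
    return min(range(24), key=lambda hour: (per_hour[hour], hour))


# Sample data: ten impressions every hour, except only two at 4 a.m.
timestamps = [
    datetime(2004, 1, 5, hour, 0)
    for hour in range(24)
    for _ in range(2 if hour == 4 else 10)
]
print(quietest_hour(timestamps))  # 4
```

Run this over several weeks of data rather than a single day, so that one unusual day doesn’t decide your maintenance schedule.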

Wrap Up

This article was intended as a primer for Web traffic analysis, but for more information on some of the topics that were mentioned here, visit the links below:


Translated from: https://www.sitepoint.com/success-traffic-analysis/

