php 开发 谷歌扩展程序
You’ve built a website. It was fun, and it feels rewarding to see all those visitors pour in. The traffic increases slowly, until one day, someone posts a link to your app to Reddit and Hacker News, the planets somehow align, GitHub is down or something and for some reason, people notice the post and storm in, breaking all barriers of reason and logic.
您已经建立了一个网站。 这很有趣,看到所有这些访客涌入感到很高兴。流量缓慢增加,直到有一天,有人将指向您应用程序的链接发布到Reddit和Hacker News,行星以某种方式对齐,GitHub宕机等等。出于某种原因,人们注意到了职位空前的风暴,打破了理性和逻辑的所有障碍。
Your server chokes, and everything dies. Instead of getting new customers or regular visitors during this epic peak (epeak?), you’re now left with a blank page, scrambling about as you try to get it up and running again, to no avail – even after restarting, the server can do nothing differently to survive the load. You lost traffic – for good.
您的服务器阻塞,一切消亡。 现在,您不再需要在此史诗般的高峰期间吸引新客户或常规访客(说话?),而是空白页面,在尝试重新启动并再次运行时无所事事-即使重新启动服务器也无济于事不能做任何其他事情来承受负载。 您失去了流量-永远。
No one can anticipate these traffic spikes. Very few of us plan so far in advance, unless we’re setting up to build a highly funded project that’s expected to do very well in a fixed time frame. How, then, does one avoid these problems? Two aspects need to be considered: optimization and scaling.
没有人能预料到这些流量高峰。 到目前为止,只有极少数人会提前计划,除非我们要建立一个资金雄厚的项目,而且该项目预计会在固定的时间内完成。 那么,如何避免这些问题呢? 需要考虑两个方面: 优化和缩放 。
We’ve written about this before, and a more advanced article is coming up next week, but the usual advice applies – upgrade to the latest version of PHP (currently 5.5, has a built in OpCache), index your database, cache your static content (seldom changed pages like About, FAQ and similar), etc.
我们之前已经写过有关此内容的文章,下周将发表更高级的文章,但通常的建议适用–升级到最新版本PHP(当前为5.5,具有内置的OpCache),索引数据库,缓存静态内容(很少更改的页面,例如“关于”,“常见问题”和类似内容)等。
One particular aspect of optimization that can be done is not only caching static resources, but also serving anything static through a non-Apache server like Nginx, optimized for serving static content. You put a layer of Nginx in front of your Apache, tell it to intercept the requests for static resources (i.e. *.jpg, *.png, *.mp4, *.html…) and serve them directly instead of letting the request move on to the Apache application layer. Such a setup is called a reverse proxy (also sometimes identified with a software load balancer – see below), and you can find out more about implementing it here.
优化的一个特定方面不仅可以缓存静态资源,还可以通过非Apache服务器(如Nginx)为静态内容提供任何静态服务,而Nginx为静态内容进行了优化。 您在Apache的前面放置了一层Nginx,告诉它拦截对静态资源的请求(即*.jpg , *.png , *.mp4 , *.html …)并直接提供服务,而不是让请求移动到Apache应用程序层。 这种设置称为反向代理(有时也由软件负载均衡器标识-参见下文),您可以在此处找到有关实现它的更多信息。
That said, there’s nothing like scaling.
就是说,没有什么比缩放更合适了。
There are two types of scaling – horizontal and vertical.
缩放有两种类型:水平缩放和垂直缩放。
We say that a website is scalable when it can manage increases in traffic without needing software changes.
我们说,网站可以在无需软件更改的情况下管理流量的增长,从而具有可扩展性。
Imagine having a server serving the web app in question. The server has 4GB of RAM, an i5 CPU, and a 1TB HDD. It performs its function well, but to better tolerate a higher influx of traffic, you decide to replace the 4GB of RAM with 16GB, you put in an i7 CPU, and you add a PCIe SSD/HDD Hybrid drive. The server is now much more powerful and can handle a higher load. This is known as vertical scaling, or “scaling up” – you improved the machine to make it more powerful. In other words, this happens:
想象一下,有一个服务器正在处理该Web应用程序。 该服务器具有4GB的RAM,i5 CPU和1TB的HDD。 它可以很好地执行其功能,但是为了更好地承受更大的流量,您决定将4GB的RAM替换为16GB,安装i7 CPU,然后添加PCIe SSD / HDD混合驱动器。 服务器现在功能更强大,可以处理更高的负载。 这称为垂直缩放或“放大” –您对机器进行了改进以使其更强大。 换句话说,发生这种情况:
On the other side of the spectrum, we have horizontal scaling. In the example above, the upgrade itself will likely cost as much as, if not more than, the starting machine on its own. This is costly, and often doesn’t produce the benefits we need – most of the scaling problems are related to concurrency, and if there aren’t enough cores to perform the logic fast enough, no matter how strong the CPU, the server will grind to a halt and force some visitors to wait.
在频谱的另一端,我们有水平缩放。 在上面的示例中,升级本身的成本可能会高达启动计算机本身的费用,甚至更多。 这是昂贵的,并且通常不能产生我们需要的好处–大多数扩展问题与并发性有关,如果没有足够的内核来足够快地执行逻辑,那么无论CPU多么强大,服务器都会停下来,迫使一些访客等待。
Horizontal scaling is when you build a cluster of (often weaker) machines linked together to serve the website. In this case, a load balancer is used – a machine or program whose only role is determining to which machine it should send the request it intercepted. The machines in the cluster then automatically divide the workflow among themselves without even being aware of one another, and your site’s traffic capacity increases immeasurably. This is also known as “scaling out”.
水平缩放是指您建立由(通常较弱的)机器链接在一起以为网站提供服务的集群。 在这种情况下,将使用负载均衡器–唯一的角色是确定将拦截的请求发送到哪台机器或程序的机器或程序。 然后,群集中的机器会自动将工作流分配到彼此之间,甚至彼此之间不会被察觉,并且站点的通信量将无可估量地增加。 这也称为“向外扩展”。
There are two main types of load balancers – hardware and software. Software load balancers are installed on a regular machine and accept all traffic, routing it to the appropriate handler. Nginx can be one such load balancer in the case above under “Optimization” – it intercepts requests for static files, and serves them on its own, without burdening Apache with them. Another popular software load balancer is Squid, one I’ve personally used in my company extensively and one which provides truly deep control of all aspects via a user friendly interface.
负载均衡器主要有两种类型:硬件和软件。 软件负载平衡器安装在常规计算机上,并接受所有流量,并将其路由到适当的处理程序。 在上述“优化”下的情况下,Nginx可以是这样的负载均衡器之一–它拦截对静态文件的请求,并单独提供请求,而不会给Apache造成负担。 另一个流行的软件负载平衡器是Squid ,我在公司中广泛使用过它,并且它通过用户友好的界面提供对各个方面的真正深度控制。
Hardware balancers are dedicated machines with the sole purpose of being load balancers – no other software is usually installed on them. Some of the most popular ones designed for handling immense amounts of traffic can be read about in this list.
硬件平衡器是专用机器,其唯一目的是成为负载平衡器–通常不安装其他软件。 在此列表中,可以了解一些设计用于处理大量流量的最受欢迎的方法。
In horizontal scaling, this happens:
在水平缩放中,会发生以下情况:
Note that the two are not mutually exclusive – you can scale up a machine (also called a node) in a scaled out system, too. In this article, we’ll be focusing on HZ scaling due to it generally being the better (both cheaper and more efficient) choice, albeit more difficult to implement.
请注意,两者不是互斥的-您也可以在横向扩展系统中纵向扩展计算机(也称为节点)。 在本文中,我们将重点关注HZ缩放,因为它通常是更好的选择(既便宜又有效),尽管更难以实现。
There are several tricky issues to overcome when scaling PHP applications. One such issue is database bottlenecking (something we’ll cover in Part 2) and another is managing session data – surely if you log in on one machine, you’ll be logged out if the load balancer redirects you to another machine in your next request, right? There’s a way around this – you can either share the local data between machines, or you can use a persistent load balancer.
扩展PHP应用程序时,有几个棘手的问题需要克服。 其中一个问题是数据库瓶颈(第二部分中将介绍),另一个是管理会话数据–当然,如果您登录到一台计算机上,那么如果负载平衡器将您重定向到下一台计算机上,您将被注销。请求,对不对? 有一种解决方法–您可以在计算机之间共享本地数据,也可以使用持久性负载平衡器。
A persistent load balancer remembers where it previously redirected the client, and does the same thing on his next request. So if I visit SitePoint and log in, the load balancer redirects me to, say, Server1, remembers me, and my next click after logging in will also be redirected to Server1. Naturally, this all happens transparently. What if Server1 goes down, though? Yes, all session data is lost – I’m logged out, and I need to start over on another server. This is a needless interruption of user experience. What’s more, the load balancer has so much to do now (not only redirecting hundreds of thousands of people to various servers but also remembering where it sent each of them), it has become the bottleneck and might benefit from some scaling out on its own. But if one LB then crashes, is the remembered data about clients and the servers they were sent to lost as well? WHO WATCHES THE WATCHMEN? The situation reeks of an obvious catch 22.
持久负载平衡器会记住它先前将客户端重定向到的位置,并在其下一个请求中执行相同的操作。 因此,如果我访问SitePoint并登录,负载平衡器会将我重定向到Server1,记住我,并且登录后的下一次单击也将重定向到Server1。 自然,这一切都是透明发生的。 但是,如果Server1出现故障怎么办? 是的,所有会话数据都丢失了–我已注销,并且需要在另一台服务器上重新开始。 这是不必要的用户体验中断。 而且,负载均衡器现在还有很多工作要做(不仅将成千上万的人重定向到各种服务器,而且还记住了将它们发送到各个服务器的位置),它已经成为瓶颈,并且可能会受益于一些自行扩展。 但是,如果一个LB随后崩溃了,关于客户端和发送给它们的服务器的记住数据也会丢失吗? 谁看守看门人? 情况明显,渔获22。
Sharing the session data across the entire cluster definitely seems like the go-to approach, and while it does require some architecture changes in your app, it’s well worth it – there is no bottleneck, and the entire cluster is fault tolerant – one server’s demise is completely irrelevant to the rest and isn’t even noticed (by the machines, of course – the humans in charge of them hastily replace the hardware as soon as the fault occurs).
在整个集群上共享会话数据绝对是一种可行的方法,尽管它确实需要在您的应用程序中进行一些架构更改,但这是值得的–没有瓶颈,并且整个集群都具有容错能力–一台服务器的灭亡与其余部分完全无关,甚至都没有注意到(当然,机器会出问题,负责机器的人员会尽快更换硬件)。
Now, we know session data is stored in the $_SESSION superglobal in PHP, and we know that the $_SESSION superglobal reads and writes from and to a file on disk. If said disk is in one server, though, it’s obvious that other servers have no access to it. How, then, do we make it available across several machines?
现在,我们知道会话数据存储在PHP的$_SESSION超级全局变量中,并且我们知道$_SESSION超级全局变量在磁盘上的文件中进行读写操作。 但是,如果所说的磁盘在一个服务器中,则显然其他服务器无法访问它。 那么,如何在多台计算机上使用它呢?
First, note that session handlers can be overridden in PHP – you can define your own class/function to handle session management. For more information on how that’s done, please see the docs.
首先,请注意,可以在PHP中重写会话处理程序-您可以定义自己的类/函数来处理会话管理。 有关此操作的更多信息,请参阅docs 。
Using a custom session handler, we can make sure the session data is always stored in a database. The database should be on a separate server (or cluster of its own!), so the servers being load balanced from the original story are serving just the business logic. While this approach often works well, on truly high traffic incidents the database becomes not only the single point of failure (lose that, and you lost everything), it also causes a significant connection overhead due to having to connect to the various servers writing session data to it at all times. It becomes the new bottleneck, and could use some scaling out, which is another problem when using traditional databases like MySQL, Postgre and similar (covered in Part 2).
使用自定义会话处理程序,我们可以确保会话数据始终存储在数据库中。 数据库应位于单独的服务器(或其自己的群集!)上,因此与原始故事进行负载平衡的服务器仅用于业务逻辑。 尽管这种方法通常可以很好地起作用,但是在真正的高流量事件中,数据库不仅成为单点故障(丢失了点,而且您丢失了所有东西),而且由于必须连接到编写会话的各个服务器,因此还会导致大量连接开销随时向其发送数据。 它成为新的瓶颈,并且可能需要一些扩展,这在使用MySQL,Postgre等类似的传统数据库(在第2部分中介绍了)时是另一个问题。
You might be tempted to set up a network file system to which all servers can write their session data. Don’t. This is the absolute worst approach, prone to corruption and data drops and is extremely slow. It’s also a single point of failure, much like the database aspect above. Activating it is as simple as changing the session.save_path value in php.ini, but it’s highly recommended you use a different approach. If you really insist on using a shared file system, it’s much better to use a solution like GlusterFS.
您可能很想建立一个网络文件系统,所有服务器都可以向其写入会话数据。 别。 这是绝对最糟糕的方法,容易损坏和丢失数据,并且非常慢。 这也是单点故障,非常类似于上面的数据库方面。 激活它就像更改php.ini的session.save_path值一样简单,但是强烈建议您使用其他方法。 如果您真的坚持使用共享文件系统,那么最好使用GlusterFS之类的解决方案。
You can use memcached to store session data in RAM. This is arguably unsafe because data in memcached gets overwritten as space runs out, and there’s no persistence – remembering someone’s login will only last for as long as the memcached server is running or has room to remember it. You might be wondering – but isn’t RAM separate on each machine? How does that apply to a cluster? Memcached has the ability to virtually pool the available RAM from multiple machines into one large whole.
您可以使用memcached将会话数据存储在RAM中。 可以说这是不安全的,因为随着空间用完,memcached中的数据将被覆盖,并且没有持久性–记住某人的登录仅会在memcached服务器正在运行或有记忆空间的情况下持续。 您可能想知道–但是不是每台机器上的RAM都是分开的吗? 这如何适用于集群? Memcached能够虚拟地将多台计算机中的可用RAM池化为一个大的整体。
The more machines you have, the bigger the pool gets as you dedicate extra RAM to it. You don’t have to give a machine’s RAM to the pool, but you can, and you can give arbitrary amounts from each. So a good chunk of RAM remains on the machine for regular use, while the rest is donated to cache, helping you not only store session data cluster-wide, but also allowing you to cache other content as you see fit – as long as there’s room. Memcached is a great solution, and this approach has widespread adoption.
您拥有的计算机越多,则由于要为其分配额外的RAM,因此池越大。 你不必给一台机器的内存池,但你可以,你可以给每个任意金额。 因此,计算机上会保留大量内存供常规使用,而其余的则用于缓存,这不仅可以帮助您在整个群集范围内存储会话数据,还可以让您缓存您认为合适的其他内容-只要有房间。 Memcached是一个很好的解决方案,这种方法已被广泛采用。
Usage in PHP apps is as simple as changing some php.ini values:
在PHP应用程序中的用法就像更改某些php.ini值一样简单:
session.save_handler = memcache session.save_path = "tcp://path.to.memcached.server:port"Redis is an in-memory NoSQL data store, much like Memcached, but supports persistence and more complex data types than just string based key => value pairs. It doesn’t have cluster support yet, though, so implementing it in HZ scaling solution is not as straightforward as one might think, but it’s getting there. In fact, an alpha version of their clustering solution is already out and can be used: http://redis.io/topics/cluster-tutorial. For a more in depth comparison between Memcached and Redis, see this StackOverflow answer. Compared to a typical caching solution like Memcached, Redis is more like a Memcached-turned-proper-database.
Redis是一个内存中NoSQL数据存储,非常类似于Memcached,但它支持持久性和更复杂的数据类型,而不仅仅是基于字符串的key => value对。 但是,它还没有群集支持,因此在HZ扩展解决方案中实现它并不像人们想象的那么简单,但是它已经实现了。 实际上,他们的集群解决方案的alpha版本已经发布并可以使用: http : //redis.io/topics/cluster-tutorial 。 有关Memcached和Redis之间更深入的比较,请参阅此StackOverflow答案 。 与典型的缓存解决方案(如Memcached)相比,Redis更像是由Memcached转换为专有数据库的数据库。
ZSCM from Zend is an alternative, but requires Zend Server on every node in the cluster.
Zend的ZSCM是替代方案,但在群集中的每个节点上都需要Zend Server。
Other NoSQL stores and caching systems would work – try solutions like Scache, Cassandra or Couchbase, all blazingly fast and reliable.
其他NoSQL存储和缓存系统也可以使用-尝试使用诸如Scache , Cassandra或Couchbase之类的解决方案,这些解决方案都非常快捷,可靠。
As you can see, horizontally scaling PHP web apps is no picnic. There are various hurdles to overcome, and the solutions are not easily interchangeable – more often than not it’s all about picking one and sticking with it for better or worse, because by the time the traffic rolls in, it’s too late to make a smooth transition to something else.
如您所见,水平扩展PHP Web应用程序并不是一件容易的事。 有许多障碍需要克服,解决方案不易互换-通常情况下,选择一个方案并坚持或多或少地要解决所有问题,因为当流量增加时,进行平滑过渡已经为时已晚去别的东西。
I hope this short guide helped you decide on the best approach for your company, and if you’ve got alternative solutions or suggestions, we’d love to hear about them in the comments below. In Part 2, we’ll cover database scaling.
我希望这份简短的指南可以帮助您确定适合您公司的最佳方法,如果您有其他解决方案或建议,我们很乐意在下面的评论中听到。 在第2部分中,我们将介绍数据库扩展。
翻译自: https://www.sitepoint.com/horizontal-scaling-php-apps/
php 开发 谷歌扩展程序