As WordPress developers, we often encounter projects that need to include previously collected data, whether that be from simple text files, CSV files, or even an old database. Data migration is something any back-end developer will encounter. A few months back, we had a project that needed nearly 1,000 posts to be generated from a plethora of CSV files. Usually this wouldn’t be that hard, but this data also needed to be under its own post type, and that custom post type had a few custom fields, including a media attachment for an MP3 file.
I won’t bore you with the code for creating custom post types and custom fields, because there’s already a ton of articles floating around the web on that subject. I’ll just mention that I am using Custom Post Type UI and Advanced Custom Fields for each respective task. As the title suggests, what we’re going to be covering here is programmatically taking data from a bunch of CSV files (some containing multiple posts), and then turning that data into WordPress posts for a custom post type. We’ll even go over attaching a simple text file to each post.
In order to get all the data we need from the CSV files, we’ll be making use of a few nifty PHP functions: glob(), which returns an array of the filenames in a directory that match a pattern; fopen(), which opens a file so that we can read its contents; and finally fgetcsv(), which reads one line from an open CSV file and returns its fields as an indexed array. Pairing each row with the header row gives us a nice associative array housing all our data.
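To make that division of labour concrete, here is a minimal, self-contained sketch. The helper name `read_csv_rows` and the throwaway temp file are made up for this example and are not part of the article’s code. It shows that fgetcsv() hands back one indexed row at a time, and that the associative keys come from combining each row with the header row ourselves.

```php
<?php
// Hypothetical helper (an assumption for this sketch): read one CSV file
// into an array of associative arrays keyed by the header row.
function read_csv_rows( $path ) {
    $rows   = array();
    $handle = fopen( $path, "r" );
    if ( $handle === false ) {
        return $rows;
    }

    // First row holds the column names, e.g. array( "title", "content" ).
    // Separator, enclosure and escape are passed explicitly rather than
    // relying on the defaults.
    $header = fgetcsv( $handle, 0, ",", '"', "\\" );
    if ( $header === false ) {
        fclose( $handle );
        return $rows;
    }

    // fgetcsv() returns false at end of file, which ends the loop
    while ( ( $row = fgetcsv( $handle, 0, ",", '"', "\\" ) ) !== false ) {
        // Skip blank lines and rows whose column count doesn't match the header
        if ( count( $row ) !== count( $header ) ) {
            continue;
        }
        $rows[] = array_combine( $header, $row );
    }

    fclose( $handle );
    return $rows;
}

// Demo data written to a temporary file, then read back
$path = tempnam( sys_get_temp_dir(), "csv" );
file_put_contents( $path, "title,content\nsome title,some content for the post\n" );

$rows = read_csv_rows( $path );
unlink( $path );
```

After this runs, `$rows` holds one associative array per data row, which is the same shape the import code later in the article works with.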
In reality, most of the data we’ll be using for this article would probably live inside a single CSV, as opposed to how we’re doing it today, where the data is scattered throughout multiple files. This is done so that the techniques used here can be applied to other types of data, such as JSON, YAML, or even plain text files. The idea for this whole article came from the severe lack of tutorials and articles concerning this subject, especially when you’re using custom post types and custom fields.
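As a rough illustration of that portability, here is how the same rows might be loaded from JSON instead. The JSON layout and the helper name `read_json_rows` are assumptions made for this sketch. The point is only that the loader returns the same array-of-associative-arrays shape, so the rest of the import would not need to change.

```php
<?php
// Hypothetical helper (an assumption for this sketch): decode a JSON document
// holding a list of post objects into an array of associative arrays.
function read_json_rows( $json ) {
    // Passing true as the second argument yields associative arrays, not objects
    $decoded = json_decode( $json, true );

    // json_decode() returns null on invalid input; treat that as "no rows"
    if ( ! is_array( $decoded ) ) {
        return array();
    }

    return $decoded;
}

// Made-up sample data mirroring the CSV columns used in this article
$json = '[{"title":"some title","content":"some content for the post","attachment":"attachment1.txt"}]';
$rows = read_json_rows( $json );
```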
If you want to follow along, you can grab the needed CSV files (and all of the code used in this article, too) from this repo. Alrighty, first things first, let’s take a look at the CSV data we’re going to be dealing with (please note the ‘File’ column is there to show you that I am spreading all of this data across multiple CSV files).
| File | Title | Content | Attachment |
| --- | --- | --- | --- |
| dummy.csv | some title | some content for the post | attachment1.txt |
| dummy2.csv | some title 2 | some content for post 2 | attachment2.txt |
| dummy3.csv | some title for post 3 | some content for the third post | attachment3.txt |
| dummy3.csv | some title 4 | some content for post 4 | attachment4.txt |

Pretty simple, huh? Next, we’ll take a look at the custom post type we’ll be using. I created it using Custom Post Type UI, so you can use the same settings if you’re using the plugin, or do it yourself with WordPress’ many functions. Here’s a quick screenshot of the options we’ll be using (I am highlighting slugs and other fields that we’ll be using throughout this article, so keep that in mind):
Lastly, let’s take a look at the custom field we’ll be using. It’s created with the lovely Advanced Custom Fields. Here’s another quick screenshot of the settings we’ll be using.
Please note, the ID for your custom field will likely be different from the one used in this article, so be sure to update your $sitepoint array with the correct ID. This can either be the actual hash key for the field, or simply the name of the field. I’m just going to stick to the name, for the sake of clarity.
It’s worth mentioning that the code used in this article requires at least PHP 5.3. We’ll be making use of anonymous functions and the __DIR__ constant, both of which were introduced in 5.3 (fgetcsv() itself has been around much longer), so before you go off and use this on an old rickety production server (please, don’t do that), you might want to upgrade.
Another thing to mention is that I’m not going to get into PHP’s max_execution_time, which can cause some issues when inserting a large number of posts in one go. The setting varies so much from server to server that it’s not feasible to discuss it in this article. If you’d like to learn more, there’s a ton of information on Stack Overflow, as well as in the official PHP docs, on how to go about increasing your max execution time.
To start this off, let’s create a simple button in the back end of our site that triggers our script. This keeps the trigger inside the admin area, so it’s only available to logged-in users with access to the dashboard. To do that, we’ll just make use of WordPress’ admin_notices hook. Basically, all the button does is link back to the current page with an insert_sitepoint_posts query parameter (which PHP exposes through $_GET), and we’ll use that parameter to determine whether or not we should insert the posts into the database.
```php
/**
 * Show 'insert posts' button on backend
 */
add_action( "admin_notices", function() {
    // Build the link with add_query_arg() so it works whether or not the
    // current URL already has a query string, and escape it for output
    $url = esc_url( add_query_arg( "insert_sitepoint_posts", "1" ) );

    echo "<div class='updated'>";
    echo "<p>";
    echo "To insert the posts into the database, click the button to the right.";
    echo "<a class='button button-primary' style='margin:0.25em 1em' href='{$url}'>Insert Posts</a>";
    echo "</p>";
    echo "</div>";
});
```

I mentioned earlier that we would be using anonymous functions (I’ll refer to them as closures, for simplicity) throughout this article. The reason is that it’s not really worth polluting the global namespace with a bunch of functions that are essentially throw-away. Closures are great, and if you aren’t familiar with them, I’d highly suggest reading up on them. If you come from a JavaScript or Ruby background, you’ll feel right at home.
If you want to put all of this code into your functions.php file, that’s fine, though it’s also fine if you want to create a separate page template, a hidden page, or whatever. In the end, it really doesn’t matter. To start out, let’s use another WordPress hook, admin_init. We’ll also include the $wpdb global, so that we can do a custom database query later on.
```php
/**
 * Create and insert posts from CSV files
 */
add_action( "admin_init", function() {
    global $wpdb;

    // ... code will go here
});
```

Alright, so what next? Let’s start out by checking whether or not our insert_sitepoint_posts parameter is present, and if it isn’t, we can exit the function. No use in wasting memory on nothing. To check whether the parameter is present, we’ll use the $_GET variable. If you’re not familiar with these types of variables, you can read up on them here. In addition to the above check, we’ll also define the $sitepoint array that I mentioned earlier. It will contain your custom post type and custom field IDs.
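A bare query parameter is a fairly loose gate, since any request to an admin page that carries it would start the import. One way it might be tightened is sketched below. The helper name `sitepoint_import_allowed` and the nonce action name are assumptions for this example, while current_user_can() and wp_verify_nonce() are standard WordPress functions.

```php
<?php
// Hypothetical helper (an assumption for this sketch): decide whether the
// import should run. $query is the request's query parameters, normally $_GET.
function sitepoint_import_allowed( array $query ) {
    // The trigger parameter and a nonce must both be present
    if ( ! isset( $query["insert_sitepoint_posts"], $query["_wpnonce"] ) ) {
        return false;
    }

    // Only users who can manage site options (administrators by default)
    if ( ! current_user_can( "manage_options" ) ) {
        return false;
    }

    // The nonce must match the action name used when the button URL was built
    return (bool) wp_verify_nonce( $query["_wpnonce"], "insert_sitepoint_posts" );
}
```

Inside the admin_init closure, this would be called as `if ( ! sitepoint_import_allowed( $_GET ) ) { return; }`. For the nonce to be present, the button’s URL would also need to be wrapped in wp_nonce_url() with the same action name.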
It’s worth noting that anytime I use // ... within the code of this article, it marks a continuation of the last code block we covered. Most of the code in this article is within the closure for the admin_init action we just created above. At the end of the article, I’ll supply you with the full code, so don’t worry if you get a little lost.
```php
// ...
global $wpdb;

// I'd recommend replacing this with your own code to make sure
// the post creation _only_ happens when you want it to.
if ( ! isset( $_GET["insert_sitepoint_posts"] ) ) {
    return;
}

// Change these to whatever you set
$sitepoint = array(
    "custom-field" => "sitepoint_post_attachment",
    "custom-post-type" => "sitepoint_posts"
);
// ...
```

Next, let’s create a closure that will fetch our CSV data and create a nice associative array of all of the data. Now, it would be good to note that depending on what type of data you’re using (whether that be CSV, JSON, YAML, etc.), this closure will vary. So, I would suggest that you adjust this to fit your data. I’ve commented the code below so that you can better follow what is actually going on.
A few additional notes:

* The $array[] = "value" syntax is short for array_push, which pushes the assigned value onto the end of the array.
* I’m storing my CSV data within my theme, inside of a data/ directory. You can store it wherever you want, but just remember to adjust the glob() path to whatever you choose.
```php
// ...
// Get the data from all those CSVs!
$posts = function() {
    $data = array();
    $errors = array();

    // Get array of CSV files
    $files = glob( __DIR__ . "/data/*.csv" );

    foreach ( $files as $file ) {

        // Attempt to change permissions if not readable
        if ( ! is_readable( $file ) ) {
            chmod( $file, 0744 );
        }

        // Check if file is readable, then open it in 'read only' mode
        if ( is_readable( $file ) && $_file = fopen( $file, "r" ) ) {

            // To sum this part up, all it really does is go row by
            // row, column by column, saving all the data
            $post = array();

            // Get first row in CSV, which is of course the headers
            $header = fgetcsv( $_file );

            while ( $row = fgetcsv( $_file ) ) {

                foreach ( $header as $i => $key ) {
                    $post[$key] = $row[$i];
                }

                $data[] = $post;
            }

            fclose( $_file );

        } else {
            $errors[] = "File '$file' could not be opened. Check the file's permissions to make sure it's readable by your server.";
        }
    }

    if ( ! empty( $errors ) ) {
        // ... do stuff with the errors
    }

    return $data;
};
// ...
```

If you’re more of a visual person (I know I am), the data that is returned when that closure is executed will be something along the lines of the array below. As you can tell from the code above, there’s also a simple template for some error handling already in place, just in case you want to do something a little crazy.
```php
$data = array(
    0 => array(
        "title" => "some title",
        "content" => "some content for the post",
        "attachment" => "attachment1.txt"
    ),
    1 => array(
        "title" => "some title 2",
        "content" => "some content for post 2",
        "attachment" => "attachment2.txt"
    ),
    // ...
);
```

It might not seem like a lot, but it’s enough to get the job done. Next, we need a function that can check whether or not our post is already in the database. Nothing is worse than executing a script that inserts hundreds of posts, only to realize it inserted everything twice. This nifty little closure will query the database, and make sure that doesn’t happen. In this closure, we’re going to be using the use keyword, which lets a closure capture variables from the scope in which it is defined.
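If the use keyword is new to you, this small standalone sketch (separate from the import code, with made-up variable names) shows the behaviour that matters here: the closure receives a copy of the variable’s value at the moment the closure is defined.

```php
<?php
// A variable defined outside the closure
$post_type = "sitepoint_posts";

// Without use, $post_type would be undefined inside the function body.
// use ( $post_type ) copies its current value into the closure.
$describe = function( $title ) use ( $post_type ) {
    return "{$title} ({$post_type})";
};

// Changing the outer variable afterwards does not affect the captured copy
$post_type = "something_else";

$result = $describe( "some title" );
```

Here `$result` ends up as "some title (sitepoint_posts)", because the value was captured before the outer variable changed.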
```php
// ...
// Simple check to see if the current post exists within the
// database. This isn't very efficient, but it works.
$post_exists = function( $title ) use ( $wpdb, $sitepoint ) {

    // Get an array of all posts within our custom post type
    $posts = $wpdb->get_col( "SELECT post_title FROM {$wpdb->posts} WHERE post_type = '{$sitepoint["custom-post-type"]}'" );

    // Check if the passed title exists in array
    return in_array( $title, $posts );
};
// ...
```

You’re probably wondering when we’re actually going to insert all of this data as actual posts, huh? Well, as you can tell, a lot of work has to be put into making sure that all of this data is organized cleanly, and that we have the functions set up to do the checks we need. To get this going, we’ll execute our $posts() closure, so that we can loop over the data that gets returned. Next, we’ll execute our $post_exists() closure to see if the current post title exists.
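One aside on the existence check before moving on. Since it loads every title into PHP on each call, a possible alternative is to ask the database for a single matching row. The function name `sitepoint_post_exists` is an assumption for this sketch, while prepare() and get_var() are standard $wpdb methods. Note that it compares against titles exactly as they are stored, so treat it as a starting point rather than a drop-in replacement.

```php
<?php
// Hypothetical alternative to the $post_exists closure (an assumption for
// this sketch). $wpdb is expected to be WordPress' global database object.
function sitepoint_post_exists( $wpdb, $title, $post_type ) {
    // prepare() quotes and escapes the values substituted for %s
    $sql = $wpdb->prepare(
        "SELECT ID FROM {$wpdb->posts} WHERE post_title = %s AND post_type = %s LIMIT 1",
        $title,
        $post_type
    );

    // get_var() returns null when no row matches
    return $wpdb->get_var( $sql ) !== null;
}
```

Inside the import loop it could be called as `sitepoint_post_exists( $wpdb, $post["title"], $sitepoint["custom-post-type"] )`.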
So, within the code below, there are a lot of arrays and data being passed around. I went ahead and commented the code so that you can better understand everything. Basically, we’re inserting the post into the database with wp_insert_post, and saving the returned post ID for use later on. Then, we grab the uploads directory and create the needed attachment meta data by building the path to the uploaded file (which is in uploads/sitepoint-attachments), and finally grabbing the file’s name and extension, which we’ll use to insert the attachment into our newly created post.
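The filename handling described above can be tried in isolation. In this sketch the base URL is a made-up example value, and pathinfo() stands in for WordPress’ wp_check_filetype() so that the snippet runs without WordPress loaded.

```php
<?php
// Made-up example values; in the article these come from wp_upload_dir()
// and from the "attachment" column of the CSV
$base_url = "https://example.com/wp-content/uploads";
$filename = "attachment1.txt";

$path = "{$base_url}/sitepoint-attachments/{$filename}";

// pathinfo() is used here in place of wp_check_filetype() to find the extension
$ext = pathinfo( $path, PATHINFO_EXTENSION );

// basename() with a suffix argument strips that suffix from the result
$name = basename( $path, ".{$ext}" );
```

With these inputs, `$ext` is "txt" and `$name` is "attachment1", which is the kind of value that ends up as the attachment’s post_title.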
```php
// ...
foreach ( $posts() as $post ) {

    // If the post exists, skip this post and go to the next one
    if ( $post_exists( $post["title"] ) ) {
        continue;
    }

    // Insert the post into the database
    $post["id"] = wp_insert_post( array(
        "post_title" => $post["title"],
        "post_content" => $post["content"],
        "post_type" => $sitepoint["custom-post-type"],
        "post_status" => "publish"
    ));

    // Get uploads dir
    $uploads_dir = wp_upload_dir();

    // Set attachment meta
    $attachment = array();
    $attachment["path"] = "{$uploads_dir["baseurl"]}/sitepoint-attachments/{$post["attachment"]}";
    $attachment["file"] = wp_check_filetype( $attachment["path"] );
    $attachment["name"] = basename( $attachment["path"], ".{$attachment["file"]["ext"]}" );

    // Replace post attachment data
    $post["attachment"] = $attachment;

    // Insert attachment into media library
    $post["attachment"]["id"] = wp_insert_attachment( array(
        "guid" => $post["attachment"]["path"],
        "post_mime_type" => $post["attachment"]["file"]["type"],
        "post_title" => $post["attachment"]["name"],
        "post_content" => "",
        "post_status" => "inherit"
    ));

    // Update post's custom field with attachment
    update_field( $sitepoint["custom-field"], $post["attachment"]["id"], $post["id"] );
}
// ...
```

So, what’s next? To put it as simply as I can: we push the button. All of our hard work is about to pay off (hopefully!). When we push the button, our code should check for the insert_sitepoint_posts query parameter, then it’ll run through our script and insert our posts. Nice and easy. Here’s a screenshot for all of us visual people:
And that’s it! Like I promised earlier, here’s all of the code used within this article:
```php
/**
 * Show 'insert posts' button on backend
 */
add_action( "admin_notices", function() {
    // Build the link with add_query_arg() so it works whether or not the
    // current URL already has a query string, and escape it for output
    $url = esc_url( add_query_arg( "insert_sitepoint_posts", "1" ) );

    echo "<div class='updated'>";
    echo "<p>";
    echo "To insert the posts into the database, click the button to the right.";
    echo "<a class='button button-primary' style='margin:0.25em 1em' href='{$url}'>Insert Posts</a>";
    echo "</p>";
    echo "</div>";
});

/**
 * Create and insert posts from CSV files
 */
add_action( "admin_init", function() {
    global $wpdb;

    // I'd recommend replacing this with your own code to make sure
    // the post creation _only_ happens when you want it to.
    if ( ! isset( $_GET["insert_sitepoint_posts"] ) ) {
        return;
    }

    // Change these to whatever you set
    $sitepoint = array(
        "custom-field" => "sitepoint_post_attachment",
        "custom-post-type" => "sitepoint_posts"
    );

    // Get the data from all those CSVs!
    $posts = function() {
        $data = array();
        $errors = array();

        // Get array of CSV files
        $files = glob( __DIR__ . "/data/*.csv" );

        foreach ( $files as $file ) {

            // Attempt to change permissions if not readable
            if ( ! is_readable( $file ) ) {
                chmod( $file, 0744 );
            }

            // Check if file is readable, then open it in 'read only' mode
            if ( is_readable( $file ) && $_file = fopen( $file, "r" ) ) {

                // To sum this part up, all it really does is go row by
                // row, column by column, saving all the data
                $post = array();

                // Get first row in CSV, which is of course the headers
                $header = fgetcsv( $_file );

                while ( $row = fgetcsv( $_file ) ) {

                    foreach ( $header as $i => $key ) {
                        $post[$key] = $row[$i];
                    }

                    $data[] = $post;
                }

                fclose( $_file );

            } else {
                $errors[] = "File '$file' could not be opened. Check the file's permissions to make sure it's readable by your server.";
            }
        }

        if ( ! empty( $errors ) ) {
            // ... do stuff with the errors
        }

        return $data;
    };

    // Simple check to see if the current post exists within the
    // database. This isn't very efficient, but it works.
    $post_exists = function( $title ) use ( $wpdb, $sitepoint ) {

        // Get an array of all posts within our custom post type
        $posts = $wpdb->get_col( "SELECT post_title FROM {$wpdb->posts} WHERE post_type = '{$sitepoint["custom-post-type"]}'" );

        // Check if the passed title exists in array
        return in_array( $title, $posts );
    };

    foreach ( $posts() as $post ) {

        // If the post exists, skip this post and go to the next one
        if ( $post_exists( $post["title"] ) ) {
            continue;
        }

        // Insert the post into the database
        $post["id"] = wp_insert_post( array(
            "post_title" => $post["title"],
            "post_content" => $post["content"],
            "post_type" => $sitepoint["custom-post-type"],
            "post_status" => "publish"
        ));

        // Get uploads dir
        $uploads_dir = wp_upload_dir();

        // Set attachment meta
        $attachment = array();
        $attachment["path"] = "{$uploads_dir["baseurl"]}/sitepoint-attachments/{$post["attachment"]}";
        $attachment["file"] = wp_check_filetype( $attachment["path"] );
        $attachment["name"] = basename( $attachment["path"], ".{$attachment["file"]["ext"]}" );

        // Replace post attachment data
        $post["attachment"] = $attachment;

        // Insert attachment into media library
        $post["attachment"]["id"] = wp_insert_attachment( array(
            "guid" => $post["attachment"]["path"],
            "post_mime_type" => $post["attachment"]["file"]["type"],
            "post_title" => $post["attachment"]["name"],
            "post_content" => "",
            "post_status" => "inherit"
        ));

        // Update post's custom field with attachment
        update_field( $sitepoint["custom-field"], $post["attachment"]["id"], $post["id"] );
    }
});
```

Programmatically inserting WordPress posts from CSV data isn’t as hard as we initially think. Hopefully, this can act as a resource for a lot of people when they need to migrate data that uses both custom post types and custom fields. As I noted earlier, parts of the code, such as our back-end button that triggers the import through a $_GET parameter, shouldn’t be used on a production site as-is. The code used in this article should be used as a starting point, rather than a plug-and-play solution.
I hope you enjoyed the article. If you have any questions or comments, feel free to leave them below and I’ll try my best to answer them and troubleshoot any issues that you run into. Happy coding!
Source: https://www.sitepoint.com/programmatically-creating-wordpress-posts-from-csv-data/