{"id":2442,"date":"2014-11-25T16:39:57","date_gmt":"2014-11-25T12:39:57","guid":{"rendered":"http:\/\/nayarweb.com\/blog\/?p=2442"},"modified":"2017-04-16T21:59:08","modified_gmt":"2017-04-16T17:59:08","slug":"mass-importing-into-sql-database-from-csf","status":"publish","type":"post","link":"https:\/\/nayarweb.com\/blog\/2014\/mass-importing-into-sql-database-from-csf\/","title":{"rendered":"Mass Importing into SQL Database from .csv File"},"content":{"rendered":"<p>Suppose you have to import, say, Country Name and Country Code into a database from a CSV file like this: (<a href=\"http:\/\/blog.plsoucy.com\/wp-content\/uploads\/2012\/04\/countries-20140629.csv\">source<\/a>)<\/p>\n<p><code>Code,English Name,French Name<br \/>\nAD,Andorra,Andorre<br \/>\nAE,United Arab Emirates,\u00c9mirats arabes unis<br \/>\nAF,Afghanistan,Afghanistan<br \/>\nAG,Antigua and Barbuda,Antigua-et-Barbuda<\/code><\/p>\n<p>The database schema is as follows:<\/p>\n<blockquote><p>country_code {<u>country_code<\/u>,country_name}<\/p><\/blockquote>\n<p>In PHP, you could do this:<\/p>\n<h3>(Method 1)<\/h3>\n<p><code><\/p>\n<pre>\r\n$file_handle = fopen(dirname(__FILE__) .\r\n               \"\/countries.csv\", \"r\");\r\nwhile (!feof($file_handle)) {\r\n\t$line = fgetcsv($file_handle, 1024);\r\n\tif ($line === false) { \/\/ skip blank lines\r\n\t\tcontinue;\r\n\t}\r\n\t$line[1] = mysql_real_escape_string($line[1]);\r\n\t\/\/ one INSERT, i.e. one round trip, per record\r\n\t$sql = \"INSERT INTO country_code \r\n\t(country_code,country_name) \r\n\tVALUES ('{$line[0]}','{$line[1]}');\";\r\n\t$db->query($sql);\r\n}\r\nfclose($file_handle);\r\n<\/pre>\n<p><\/code><br \/>\nThis works, but it takes about 3 seconds to import the CSV above. With a file of, let&#8217;s say, 50,000 records, it might take hours to run. It is frustrating to own a Core i7 machine with plenty of RAM and still watch a process run for hours while using only 0&#8211;1% of the CPU. 
<\/p>\n<p>Hard drive I\/O limits the speed tremendously: each time a record is read, the HDD is accessed, then MySQL accesses the HDD again to save it, and so on. Even a Solid State Drive (SSD) doesn&#8217;t solve the problem (I tried; it still takes hours).<\/p>\n<p>I had to find a way to buffer a reasonably big chunk of the file in memory and execute a single multi-row INSERT for every 1,000 records. I came up with the following code:<\/p>\n<h3>(Method 2)<\/h3>\n<p><code><\/p>\n<pre>\r\n$file_handle = fopen(dirname(__FILE__) \r\n\t. \"\/countries.csv\", \"r\");\r\n$init_sql = \"INSERT INTO country_code \r\n\t(country_code,country_name) VALUES \";\r\n$sql = $init_sql;\r\n$count = 0;\r\nwhile (!feof($file_handle)) {\r\n\t$line = fgetcsv($file_handle, 1024);\r\n\tif ($line === false) { \/\/ skip blank lines\r\n\t\tcontinue;\r\n\t}\r\n\t$line[1] = mysql_real_escape_string($line[1]);\r\n\t$sql .= \"('{$line[0]}','{$line[1]}'),\";\r\n\t\/\/ flush a batch of 1000 rows in one query\r\n\tif (++$count >= 1000) {\r\n\t\t$count = 0;\r\n\t\t$sql = rtrim($sql, ',') . ';';\r\n\t\t$db->query($sql);\r\n\t\t$sql = $init_sql;\r\n\t}\r\n}\r\n\/\/ flush the remaining rows, if any\r\nif ($sql !== $init_sql) {\r\n\t$sql = rtrim($sql, ',') . ';';\r\n\t$db->query($sql);\r\n}\r\nfclose($file_handle);\r\n<\/pre>\n<p><\/code><\/p>\n<blockquote><p>Method 1 exec time: 3.0624158382416 s<br \/>\nMethod 2 exec time: 0.010763883590698 s<\/p>\n<p>Method 2 is <strong>284&times;<\/strong> faster<\/p><\/blockquote>\n<p>The import time is considerably improved: according to my benchmarks, Method 2 is 270&#8211;285&times; faster.<\/p>\n<p>I once imported US ZIP codes into a database; the file contained about 43,000 records. 
The second method allows importing it in less than 5 seconds \ud83d\ude09<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Suppose you have to import, say, Country Name and Country Code into a database from a CSV file like this: (source) Code,English Name,French Name AD,Andorra,Andorre AE,United Arab Emirates,\u00c9mirats arabes unis AF,Afghanistan,Afghanistan AG,Antigua and Barbuda,Antigua-et-Barbuda The database schema is as follows: country_code {country_code,country_name} In PHP, you could do this: (Method 1) $file_handle = &hellip; <a href=\"https:\/\/nayarweb.com\/blog\/2014\/mass-importing-into-sql-database-from-csf\/\" class=\"continue-reading\">Continue reading <span class=\"screen-reader-text\">Mass Importing into SQL Database from .csv File<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[210,1],"tags":[76,98],"class_list":["post-2442","post","type-post","status-publish","format-standard","hentry","category-technology","category-uncategorized","tag-mysql","tag-sql"],"_links":{"self":[{"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/posts\/2442","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/comments?post=2442"}],"version-history":[{"count":13,"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/posts\/2442\/revisions"}],"predecessor-version":[{"id":2455,"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/posts\/2442\/revisions\/2455"}],"wp:attachment":[{"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/media?parent=2442"}],"wp:term":[{"taxonomy":
"category","embeddable":true,"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/categories?post=2442"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nayarweb.com\/blog\/wp-json\/wp\/v2\/tags?post=2442"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
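A note on the post's code: the `mysql_*` functions it relies on were deprecated in PHP 5.5 and removed in PHP 7, so a present-day version of the same batching trick would use PDO prepared statements instead. Below is a minimal, hedged sketch of that idea; `buildBatchInsert()` is a hypothetical helper written for this example (not a library function), and the table and column names are taken from the post's schema.

```php
<?php
// Sketch: build one multi-row INSERT with "?" placeholders, assuming the
// post's schema country_code(country_code, country_name).
// buildBatchInsert() is an illustrative helper, not part of any library.
function buildBatchInsert(array $rows): array
{
    // one "(?,?)" placeholder group per row
    $placeholders = implode(',', array_fill(0, count($rows), '(?,?)'));
    $sql = 'INSERT INTO country_code (country_code,country_name) VALUES '
         . $placeholders;

    // flatten the rows into a single flat parameter list
    $params = [];
    foreach ($rows as $row) {
        $params[] = $row[0];
        $params[] = $row[1];
    }
    return [$sql, $params];
}

// Usage, assuming $pdo is an already-connected PDO instance:
//   [$sql, $params] = buildBatchInsert($batchOf1000Rows);
//   $pdo->prepare($sql)->execute($params);
```

Prepared statements make the manual `mysql_real_escape_string()` call unnecessary, since the driver handles quoting. For very large files, MySQL's own `LOAD DATA INFILE` is typically faster still than any row-building loop in PHP.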