A big mistake when dealing with large files in PHP is attempting to open the file and load it all straight into memory, or to loop through every line and build up a buffer to iterate over afterwards. Example:
Bad Example
$rows = [];
$handle = fopen($filename, "r");
if ($handle) {
    while (!feof($handle)) {
        $rows[] = fgets($handle, 1000);
    }
    fclose($handle);
}
$count = 0;
foreach ($rows as $csv) {
    $count++;
    // Skip the header row
    if ($count == 1) {
        continue;
    }
    $row = explode(',', $csv);
    // Do something
}
There are still a ton of bad tutorials and sites from older PHP versions suggesting the approach above. Instead, we can use either pure PHP or Laravel's LazyCollection to process very large datasets without hitting our memory limit.
Processing a CSV with a while loop (PHP)
$row = 0;
$import = fopen($filename, 'r');
while (($data = fgetcsv($import)) !== false) {
    $row++;
    // Skip the header row
    if ($row == 1) {
        continue;
    }
    // Process the CSV row
}
fclose($import);
Processing a CSV with LazyCollection (Laravel)
use Illuminate\Support\LazyCollection;

LazyCollection::make(function () use ($filename) {
    $file = fopen($filename, 'r');
    while (($data = fgetcsv($file)) !== false) {
        yield $data;
    }
    fclose($file);
})->skip(1)->each(function ($data) {
    // Process the CSV row
});
With both of the above code examples, we've solved the memory limit issue. But we're still going to run into a different problem eventually: the script's maximum execution time. When using pure PHP, I would simply set the script's max execution time to unlimited with set_time_limit(0); at the start of the file. In Laravel, this processing should be moved into a queued job.
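A minimal sketch of what that might look like, assuming a hypothetical ProcessCsvImport job class generated with Laravel's standard queue scaffolding (php artisan make:job); the class name and the row-processing logic are placeholders:

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\LazyCollection;

class ProcessCsvImport implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $filename;

    public function __construct($filename)
    {
        $this->filename = $filename;
    }

    public function handle()
    {
        // Same lazy approach as above, now running on a queue worker
        // instead of inside a web request
        LazyCollection::make(function () {
            $file = fopen($this->filename, 'r');
            while (($data = fgetcsv($file)) !== false) {
                yield $data;
            }
            fclose($file);
        })->skip(1)->each(function ($data) {
            // Process the CSV row
        });
    }
}

The job can then be dispatched from a controller or console command with ProcessCsvImport::dispatch($filename), and a queue worker picks it up outside the web request's time limit.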
Thanks for reading! Please let me know if you found this helpful or have any questions over on Twitter: twitter.com/digiguydev.