Old Mac OS (up to ~2001) and old Office for Mac (up to 2007, I think) use a carriage return ("\r") for newlines,
Microsoft Windows uses carriage return + line feed ("\r\n") for newlines,
Unix (Linux and modern macOS) uses a line feed ("\n"),
Some systems add a BOM (byte order mark) just to say they use UTF-8; I've even encountered one BOM per CSV row!
For a CSV parser handling all the above cases, I wrote:
<?php
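function parse_csv(string $csv, string $separator = ","): array
{
    // Strip every UTF-8 BOM and normalize all newline styles to "\n".
    // strtr() tries the longest key first, so "\r\n" is replaced before a lone "\r".
    $csv = strtr(
        $csv,
        [
            "\xEF\xBB\xBF" => "",
            "\r\n" => "\n",
            "\r" => "\n",
        ]
    );
    $lines = explode("\n", $csv);
    // The first line holds the column names.
    $keys = str_getcsv(array_shift($lines), $separator);
    $ret = array();
    foreach ($lines as $index => $line) {
        // Skip empty lines, such as the one after a trailing newline.
        if (strlen($line) < 1) {
            continue;
        }
        // $lines is 0-based and the header was line 1, so this is the real line number.
        $lineno = $index + 2;
        $parsed = str_getcsv($line, $separator);
        if (count($parsed) !== count($keys)) {
            throw new \RuntimeException("error on csv line #{$lineno}: count mismatch: " . count($parsed) . ' !== ' . count($keys) . ": " . var_export([
                'error' => 'count mismatch',
                'keys' => $keys,
                'parsed' => $parsed,
                'line' => $line
            ], true));
        }
        // Map each column name to the value in the same position.
        $ret[] = array_combine($keys, $parsed);
    }
    return $ret;
}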
?>
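One limitation of splitting on "\n" before parsing: a quoted field that itself contains a line break is cut in two and triggers the count mismatch error. Below is a minimal sketch of one way around that, assuming it is acceptable to load the whole string into a php://memory stream and let fgetcsv() find the record boundaries. The name parse_csv_stream is made up for this example, and line breaks inside quoted fields come out normalized to "\n" as a side effect.
<?php
// Hypothetical variant of parse_csv() (the name is an assumption for illustration).
function parse_csv_stream(string $csv, string $separator = ","): array
{
    // Same cleanup as above: drop every UTF-8 BOM, normalize newlines to "\n".
    $csv = strtr($csv, ["\xEF\xBB\xBF" => "", "\r\n" => "\n", "\r" => "\n"]);

    // Put the string into an in-memory stream so fgetcsv() can read it.
    $stream = fopen('php://memory', 'r+');
    fwrite($stream, $csv);
    rewind($stream);

    // First record holds the column names; false means the input was empty.
    $keys = fgetcsv($stream, 0, $separator, '"', '\\');
    if ($keys === false) {
        fclose($stream);
        return [];
    }

    $rows = [];
    while (($fields = fgetcsv($stream, 0, $separator, '"', '\\')) !== false) {
        // fgetcsv() returns [null] for a blank line; skip those.
        if ($fields === [null]) {
            continue;
        }
        if (count($fields) !== count($keys)) {
            fclose($stream);
            throw new \RuntimeException(
                'count mismatch: ' . count($fields) . ' !== ' . count($keys)
            );
        }
        $rows[] = array_combine($keys, $fields);
    }
    fclose($stream);
    return $rows;
}

// Mixed "\r\n", "\r" and "\n" line endings, a leading BOM,
// and a quoted field that spans two lines.
$rows = parse_csv_stream("\xEF\xBB\xBFid,note\r\n1,\"two\r\nlines\"\r2,plain\n");
// $rows[0]['note'] holds both lines of the quoted field as one value.
?>
The trade-off is that this version cannot report the line number of a bad row as easily, since one record may span several lines.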