Excessive readFilter->readCell() increase runtime notably when parsing XLSX files #772

Closed
DennisBirkholz opened this issue Nov 14, 2018 · 0 comments

DennisBirkholz commented Nov 14, 2018

This is:

- [x] a bug report

What is the expected behavior?

Parsing with styles enabled or disabled should not take considerably different amounts of time.

What is the current behavior?

A rather large file with ~6000 rows and 10 columns takes about 2 seconds to parse with $reader->setReadDataOnly(true) but about 30 seconds with $reader->setReadDataOnly(false), which is the default.

What are the steps to reproduce?

<?php

require __DIR__ . '/vendor/autoload.php';

// Takes 30 seconds
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReaderForFile('large_spreadsheet.xlsx');
$document = $reader->load('large_spreadsheet.xlsx');

// Takes 2 seconds
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReaderForFile('large_spreadsheet.xlsx');
$reader->setReadDataOnly(true);
$document = $reader->load('large_spreadsheet.xlsx');
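
To put numbers on the two cases, each load can be wrapped in a small timing helper. This is only a sketch: `timeTask()` is a hypothetical helper, not part of PhpSpreadsheet, and the workload below is a stand-in so the snippet runs without the library or the spreadsheet file.

```php
<?php

/**
 * Hypothetical helper (not part of PhpSpreadsheet): runs $task once and
 * returns a two-element array of [result, elapsed wall-clock seconds].
 */
function timeTask(callable $task)
{
    $start = microtime(true);
    $result = $task();
    $elapsed = microtime(true) - $start;

    return [$result, $elapsed];
}

// Stand-in workload so the sketch runs on its own.
list($sum, $seconds) = timeTask(function () {
    return array_sum(range(1, 1000));
});

printf("sum=%d, elapsed=%.6f s\n", $sum, $seconds);
```

For the reproduction above, the closure would instead return `$reader->load('large_spreadsheet.xlsx')`, once with and once without `setReadDataOnly(true)`.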

Which versions of PhpSpreadsheet and PHP are affected?

PHP 5.6 and PhpSpreadsheet 1.5.0

@DennisBirkholz DennisBirkholz changed the title from "Excessive readFilter->readCell() increase runtime notably" to "Excessive readFilter->readCell() increase runtime notably when parsing XLSX files" Nov 14, 2018
DennisBirkholz added a commit to PubGrade/PhpSpreadsheet that referenced this issue Nov 14, 2018
…lter

For large XLSX files `Reader/Xlsx::readColumnsAndRowsAttributes()` performs
a lot of calls to `$this->getReadFilter()` and `$this->getReadFilter()->readCell()`
as `readCell()` is called twice for each (possibly filled) cell.

By skipping calls to the DefaultReadFilter implementation (which always returns true),
using no custom read filter will not incur any runtime penalty.

The runtime penalty when using a custom read filter is reduced by a third by
caching the read filter in a variable instead of using the getter method.

Fixes issue PHPOffice#772.
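
The two ideas in the commit message (bypass the always-true default filter, and fetch the filter once instead of through a getter on every cell) can be illustrated with a minimal, self-contained sketch. This is not the actual patch: `CellFilter`, `AcceptAllFilter`, `EvenRowFilter` and `countAcceptedCells()` are hypothetical stand-ins, with only the `readCell($column, $row, $worksheetName)` shape modelled on the library's read-filter interface.

```php
<?php

// Hypothetical stand-in for the library's read-filter interface.
interface CellFilter
{
    public function readCell($column, $row, $worksheetName = '');
}

// Hypothetical stand-in for the default filter: accepts every cell.
class AcceptAllFilter implements CellFilter
{
    public $calls = 0;

    public function readCell($column, $row, $worksheetName = '')
    {
        ++$this->calls;

        return true;
    }
}

// Hypothetical custom filter: accepts only even-numbered rows.
class EvenRowFilter implements CellFilter
{
    public $calls = 0;

    public function readCell($column, $row, $worksheetName = '')
    {
        ++$this->calls;

        return $row % 2 === 0;
    }
}

function countAcceptedCells(CellFilter $filter, array $columns, $rowCount)
{
    // Resolve the filter once, outside the loops. An accept-all filter is
    // replaced by null so that it is never consulted per cell.
    $readFilter = ($filter instanceof AcceptAllFilter) ? null : $filter;

    $accepted = 0;
    foreach ($columns as $column) {
        for ($row = 1; $row <= $rowCount; ++$row) {
            if ($readFilter === null || $readFilter->readCell($column, $row)) {
                ++$accepted;
            }
        }
    }

    return $accepted;
}

$default = new AcceptAllFilter();
$custom = new EvenRowFilter();

$acceptedDefault = countAcceptedCells($default, ['A', 'B', 'C'], 4);
$acceptedCustom = countAcceptedCells($custom, ['A', 'B', 'C'], 4);

printf("default: %d accepted, %d readCell() calls\n", $acceptedDefault, $default->calls);
printf("custom: %d accepted, %d readCell() calls\n", $acceptedCustom, $custom->calls);
```

With the accept-all filter the loop makes no `readCell()` calls at all, while a custom filter is still consulted once per cell through the cached local variable.
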
MarkBaker pushed a commit that referenced this issue Nov 29, 2018
…lter (#773)

For large XLSX files `Reader/Xlsx::readColumnsAndRowsAttributes()` performs
a lot of calls to `$this->getReadFilter()` and `$this->getReadFilter()->readCell()`
as `readCell()` is called twice for each (possibly filled) cell.

By skipping calls to the DefaultReadFilter implementation (which always returns true),
using no custom read filter will not incur any runtime penalty.

The runtime penalty when using a custom read filter is reduced by a third by
caching the read filter in a variable instead of using the getter method.

Fixes issue #772.
guillaume-ro-fr pushed a commit to guillaume-ro-fr/PhpSpreadsheet that referenced this issue Jun 12, 2019
…lter (PHPOffice#773)

For large XLSX files `Reader/Xlsx::readColumnsAndRowsAttributes()` performs
a lot of calls to `$this->getReadFilter()` and `$this->getReadFilter()->readCell()`
as `readCell()` is called twice for each (possibly filled) cell.

By skipping calls to the DefaultReadFilter implementation (which always returns true),
using no custom read filter will not incur any runtime penalty.

The runtime penalty when using a custom read filter is reduced by a third by
caching the read filter in a variable instead of using the getter method.

Fixes issue PHPOffice#772.