Skip to content

fs: improve fsPromises readFile performance#37608

Closed
Linkgoron wants to merge 2 commits into
nodejs:masterfrom
Linkgoron:fs-promises-improve-read-performance
Closed

fs: improve fsPromises readFile performance#37608
Linkgoron wants to merge 2 commits into
nodejs:masterfrom
Linkgoron:fs-promises-improve-read-performance

Conversation

@Linkgoron

@Linkgoron Linkgoron commented Mar 5, 2021

Copy link
Copy Markdown
Contributor

Improve the fsPromises readFile performance by allocating only one buffer, when size is known. I also increased the size of the readbuffer chunks (which are now the same as in read_file_context), and also removed the extra read if size bytes have been read.

Most of the changes here are essentially shamelessly copied from read_file_context, I've also removed a call to the inner read and just called binding.read directly.

I also had to fix a few abort-controller tests, as they were dependant on the extra read at the end of the file.

This improves the promises.readFile performance by 20% on my machine. However there is still a 15-20% difference between the cb version and this version on my machine, so I didn't mark this as resolving #37583

edit:
second commit contains fs.promises.readFile benchmark

@nodejs-github-bot nodejs-github-bot added fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Mar 5, 2021
@Linkgoron Linkgoron marked this pull request as ready for review March 5, 2021 11:39
@benjamingr

Copy link
Copy Markdown
Member

@nodejs/fs @jasnell

@benjamingr benjamingr left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went all the way back to https://github.com/nodejs/node/pull/18297/files to see if there is a compelling reason that buffer size was chosen - looks like it was tried initially and not really iterated on - so this change looks fine :)

@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

@Linkgoron

Linkgoron commented Mar 5, 2021

Copy link
Copy Markdown
Contributor Author

I went all the way back to https://github.com/nodejs/node/pull/18297/files to see if there is a compelling reason that buffer size was chosen - looks like it was tried initially and not really iterated on - so this change looks fine :)

@benjamingr I believe that the number was changed in read_file_context here #27063, but probably missed changing it as an oversight in the promisifed version.

@benjamingr

Copy link
Copy Markdown
Member

cc @BridgeAR

@benjamingr benjamingr added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed needs-ci PRs that need a full CI run. labels Mar 5, 2021
@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

readFile,
writeFile,
truncate
truncate,

@RaisinTen RaisinTen Mar 5, 2021

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
truncate,
truncate

nit: Unrelated change

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'm in favour of not spending mental energy on stylistic changes that are not enforced as part of linting rules.

@Linkgoron

Linkgoron commented Mar 5, 2021

Copy link
Copy Markdown
Contributor Author

I've copy-pasted the readfile benchmark and converted it to a promisified version and these are the results of the comparison tool:

                                                             confidence improvement accuracy (*)   (**)  (***)
fs/readfile-promises.jsconcurrent=1 len=1024 duration=5             ***     22.24 %       ±0.85% ±1.14% ±1.49%
fs/readfile-promises.jsconcurrent=1 len=16777216 duration=5         ***      9.93 %       ±0.82% ±1.09% ±1.43%
fs/readfile-promises.jsconcurrent=10 len=1024 duration=5            ***     25.43 %       ±0.86% ±1.14% ±1.49%
fs/readfile-promises.jsconcurrent=10 len=16777216 duration=5        ***     28.60 %       ±1.19% ±1.58% ±2.06%

@benjamingr benjamingr added the needs-ci PRs that need a full CI run. label Mar 5, 2021
Improve the fsPromises readFile performance
by allocating only one buffer, when size is known,
increase the size of the readbuffer chunks,
and dont read more data if size bytes have been read

refs: nodejs#37583
@Linkgoron Linkgoron force-pushed the fs-promises-improve-read-performance branch from ea69fbc to 1bbc2bb Compare March 5, 2021 19:28
@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

add a benchmark for fs.promises.readFile
@Linkgoron Linkgoron force-pushed the fs-promises-improve-read-performance branch from 1bbc2bb to e31ccd9 Compare March 5, 2021 19:31
@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

buffer = fullBuffer;
offset = totalRead;
length = MathMin(size - totalRead, kReadFileBufferLength);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is a behavioral change for files that are being appended to while they are being read. I think that should be okay, because ultimately there are no guarantees for the relative timing of the two operations, but it might be something to keep in mind.

@Linkgoron Linkgoron Mar 5, 2021

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're correct and I should've mentioned it in the PR but forgot. This is aligning the behaviour with the cb version, so I thought that it would be OK.

length = MathMin(kReadFileBufferLength, this.size - this.pos);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that’s okay 👍

@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

const buf = Buffer.alloc(isFirstChunk ? firstChunkSize : chunkSize);
const { bytesRead, buffer } =
await read(filehandle, buf, 0, buf.length, -1);
endOfFile = bytesRead === 0;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it looks like it should be enough to change this line to:

    totalRead += bytesRead;
    endOfFile = bytesRead === 0 || totalRead === size;

in order to get the same performance improvement as in this PR (at least that's what I saw on my machine). Another useful optimization here is Buffer.allocUnsafeSlow. So, maybe it's worth considering keeping only these 4 lines of changes, but with the same result.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that some of the changes are not strictly necessary for the improvement, but another feature (IMO) of my changes it that now the logic is essentially the same as the sync readFile.

@benjamingr benjamingr added the notable-change PRs with changes that should be highlighted in changelogs. label Mar 10, 2021
@benjamingr

Copy link
Copy Markdown
Member

Landed in e2f5bb7...b9fd4eb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author ready PRs that have at least one approval, no pending requests for changes, and a CI started. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. notable-change PRs with changes that should be highlighted in changelogs.

Projects

None yet

Development

Successfully merging this pull request may close these issues.