Perf: Optimize pages loading (Filecache path like approach) #2549

Open
Koc wants to merge 1 commit into main from feature/optimize-pages-loading-v2

@Koc Koc commented May 31, 2026

Contributor

📝 Summary

This is an alternative approach to #2390 that fixes the same performance issue (closes #2380).

Benefits compared to the previous implementation:

  • no extra columns
  • no migration to re-process already existing pages
  • no listeners
  • much simpler implementation

So we simply load all necessary pages with a single query:

SELECT * FROM filecache WHERE storage = <storageId> AND path LIKE 'appdata_<instanceId>/collectives/<collectiveId>/%'
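A minimal sketch of the idea, written in Python with an in-memory SQLite database rather than the PR's PHP and Nextcloud query builder. The table is reduced to a few columns, the `storage` column name is taken from the query builder snippet quoted in the review below, and the sample rows and the helper names `load_collective_rows` and `group_by_parent` are hypothetical. It illustrates why the query count stays constant: one prefix query returns all descendants, and the tree is assembled in memory from the `parent` column.

```python
import sqlite3
from collections import defaultdict


def load_collective_rows(conn, storage_id, prefix):
    """Fetch every row below `prefix` with one query, whatever the nesting depth."""
    cur = conn.execute(
        "SELECT fileid, parent, path, name FROM filecache "
        "WHERE storage = ? AND path LIKE ? ORDER BY path",
        (storage_id, prefix + "/%"),
    )
    return cur.fetchall()


def group_by_parent(rows):
    """Index rows by parent id so the page tree can be built without further queries."""
    children = defaultdict(list)
    for fileid, parent, _path, name in rows:
        children[parent].append((fileid, name))
    return children


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE filecache ("
    "fileid INTEGER PRIMARY KEY, storage INTEGER, parent INTEGER, path TEXT, name TEXT)"
)
# Hypothetical sample data: collective 7 has a nested folder, collective 8 is unrelated.
conn.executemany(
    "INSERT INTO filecache VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 0, "appdata_abc/collectives/7", "7"),
        (2, 1, 1, "appdata_abc/collectives/7/Readme.md", "Readme.md"),
        (3, 1, 1, "appdata_abc/collectives/7/Guide", "Guide"),
        (4, 1, 3, "appdata_abc/collectives/7/Guide/Readme.md", "Readme.md"),
        (5, 1, 0, "appdata_abc/collectives/8", "8"),
        (6, 1, 5, "appdata_abc/collectives/8/Readme.md", "Readme.md"),
        (7, 2, 0, "appdata_abc/collectives/7/Other.md", "Other.md"),  # other storage
    ],
)

rows = load_collective_rows(conn, 1, "appdata_abc/collectives/7")
children = group_by_parent(rows)
```

Here `children[1]` holds the direct children of the collective root and `children[3]` those of the nested folder, all obtained from the single query.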

🖼️ Screenshots

Collective with 390 pages at various nesting levels:

  • 🏚️ Before: 747 queries (depends on page count and nesting level)
  • 🏡 After: 50 queries (more or less constant)

🚧 TODO

  • As a future improvement, we could consider adding an index on the storage and path columns of the filecache table (this would require a separate PR to nextcloud/server)

🏁 Checklist

  • Code is properly formatted (npm run lint / npm run stylelint / composer run cs:check)
  • Sign-off message is added to all commits
  • Tests (unit, integration and/or end-to-end) passing and the changes are covered with tests
  • Documentation (README or documentation) has been updated or is not required

🤖 AI (if applicable)

  • The content of this PR was partly or fully generated using AI tools
  • The AI-generated content was reviewed, comprehended and tested by a human

@Koc Koc force-pushed the feature/optimize-pages-loading-v2 branch 4 times, most recently from 5d10baa to 5e6b2e9 Compare May 31, 2026 13:31

@mejo- mejo- left a comment

Member

Thanks a lot @Koc, this looks really promising 🤩

I have some comments, and I'm genuinely curious what you think about them.

If you make further changes to the PR, could you add them as separate fixup commits (and not force-push over the existing commit for now)? That would make your changes easier to review.

* @throws NotFoundException
* @throws NotPermittedException
*/
public function getPagesFromFolderV2(int $collectiveId, Folder $folder, string $userId, bool $recurse = false, bool $forceIndex = false): array {

Member

Is there a reason to introduce this as a new function instead of replacing getPagesFromFolder() altogether? There's still one call left to the old function (lib/Service/PageService.php:984), was this left out on purpose?

I'd prefer to replace the old function and update the unit tests along with it, as it would be great not to lose the existing unit test coverage of getPagesFromFolder() for the new function.

Contributor Author

I would say this is a temporary POC method that makes it possible to quickly switch between the old and new implementations; it is simply easier to add a V2 suffix to the call.

Either way, I guess the old implementation should be preserved so that background jobs/console commands can re-create the tree.

Comment thread lib/Model/FileInfo.php Outdated
Comment thread lib/Mount/CollectiveFolderManager.php
$qb->select('fileid', 'storage', 'path', 'parent', 'name', 'mimetype', 'mimepart',
        'size', 'mtime', 'storage_mtime', 'encrypted', 'etag', 'permissions')
    ->from('filecache')
    ->where($qb->expr()->eq('storage', $qb->createNamedParameter($storageId, IQueryBuilder::PARAM_INT)))

Member

Any reason not to filter further on mimetype here and only fetch folders and Markdown files? If I understand the code further down correctly, we only process folders and Markdown files anyway, right?

Contributor Author

My goal was to introduce a new index that covers the columns used in the filter. So in that case we would have to extend the index with mimetype as well.

Comment thread lib/Service/PageService.php
Comment thread lib/Service/PageService.php Outdated
* @throws NotFoundException
* @throws \OCP\DB\Exception
*/
public function getFileCacheForCollective(int $collectiveId): array {

Member

Wouldn't it be an option to pass the folder name here as well and further narrow down the query to only contain files in this subfolder? This would make the query less heavy when getPagesFromFolderV2() is called with a subfolder and not for the whole collective folder.

Member

This would be particularly helpful for building the templateFolder page tree, as that one doesn't need to process all the files from the collective root folder.
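One possible shape of the suggestion above, sketched in Python with SQLite rather than in the PR's PHP. It assumes a simplified filecache table with only `storage` and `path` columns; the helpers `escape_like` and `load_descendants` and the sample paths are hypothetical. The sketch narrows the LIKE prefix to a subfolder and escapes LIKE wildcards, which matters because the `_` in `appdata_<instanceId>` is itself a single-character wildcard. Whether such a query is fast on a real instance depends on the available indexes, which this thread leaves open.

```python
import sqlite3


def escape_like(value, escape_char="\\"):
    """Escape LIKE wildcards so `value` is matched literally."""
    return (
        value.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def load_descendants(conn, storage_id, folder_path):
    """Return the paths of all descendants of `folder_path` (any depth), one query."""
    pattern = escape_like(folder_path.rstrip("/")) + "/%"
    cur = conn.execute(
        "SELECT path FROM filecache "
        "WHERE storage = ? AND path LIKE ? ESCAPE '\\' ORDER BY path",
        (storage_id, pattern),
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filecache (storage INTEGER, path TEXT)")
# Hypothetical sample data for one collective with a template subfolder.
conn.executemany(
    "INSERT INTO filecache VALUES (?, ?)",
    [
        (1, "appdata_abc/collectives/7/Readme.md"),
        (1, "appdata_abc/collectives/7/Templates"),
        (1, "appdata_abc/collectives/7/Templates/Meeting.md"),
        (1, "appdata_abc/collectives/7/Templates/Sub"),
        (1, "appdata_abc/collectives/7/Templates/Sub/Deep.md"),
        # Would match an unescaped "_" wildcard, must not match the escaped pattern.
        (1, "appdataXabc/collectives/7/Templates/Fake.md"),
    ],
)

descendants = load_descendants(conn, 1, "appdata_abc/collectives/7/Templates")
```

The result contains both direct children and deeper descendants of the subfolder, and excludes the rest of the collective.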

Contributor Author

Yeah, but as far as I understand, we need to load all possible descendants (not only direct children). Filecache is not a nested tree/closure tree structure, so we can't efficiently load all descendants.

Comment thread lib/Service/PageService.php
Comment thread lib/Model/PageInfo.php
$this->setTimestamp($fileInfo->mtime);
$this->setSize($fileInfo->size);
$this->setFileName($fileInfo->name);
if ($collectivePath !== null) {

Member

I guess these checks can be omitted as the default is null anyway, so setting them to null again doesn't hurt.

Contributor Author

Yeah, but in that case we would have to update the setter to accept null, and do the same for many other setters. I'd prefer to keep the same code style as in the PageInfo::fromFile() method.

Comment thread lib/Model/PageInfo.php
/**
* Build the page info from a lightweight filecache entry (see PageService::getPagesFromFolderV2).
*/
public function fromFileInfo(

Member

This copies much of the logic of fromFile(). Maybe worth refactoring to have less code duplication?

Contributor Author

I attempted that, but with this approach we would end up with one large method taking all possible parameters, so we would have even more lines of code than we have now.

@Koc Koc self-assigned this Jun 9, 2026
Signed-off-by: Kostiantyn Miakshyn <molodchick@gmail.com>
@Koc Koc force-pushed the feature/optimize-pages-loading-v2 branch from 5e6b2e9 to 2901fd7 Compare June 10, 2026 17:38
Development

Successfully merging this pull request may close these issues.

Optimize performance of the pages loading by stopping using filesystem always

2 participants