-
-
Notifications
You must be signed in to change notification settings - Fork 74
Disk caching #683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disk caching #683
Conversation
Keeping the file open allows better prefetching/caching on both the application layer (ifstream) and OS level (vfs layer) as the access pattern is sequential and predictable.
While common C++ documentation seems to repeatedly state how seekg/tellg are for setting/getting the read position in the stream, and seekp/tellp doing the same for the write position, I had to ask FSKING ChatGPT why the f*** the output files keep getting corrupted and it then telling me that on some platforms, namely Linux/POSIX, they are the same pointer since they just wrap native POSIX I/O, which only knows one shared offset for reading and writing. Good job not mentioning this anywhere important, like idk, maybe cppreference.com or such. >:(
- Don't set FD_CLOEXEC, but close manually before fork: Since we don't use pipe2 to set the flag atomically, we're prone to races with parallel fork+execs either way. As ChildProcess uses a mutex here anyways, as long as there is no other part in lms that would fork+exec, we're not any more or less safe now, but the code is shorter and we cannot fail the fcntl. - Do not close stdin and stderr. Again ffmpeg seems to handle this fine, but if a process tries to write to stderr and fd 2 is not open, it might just bail out. Even worse, the process might have opened some file it wants to work with, and that file got assigned fd 2 as that was the next free fd - the program would corrupt whatever file it opened there whenever it tries to write to stderr. Try to open /dev/null instead for stdin and stderr, and if that fails, keep whatever lms inherited. - exec does not return on success, no need to check return code, any return is a failure - wrong error message in error path when assigning fd to boost stream
Hello! You are right, sharing now is better, so that we can discuss general design. Can you also rebase on develop? |
The modifications on FileResource seem to be unrelated? I think we need a transcoding service, that would be used from subsonic code or from the ui code. So a first step would be to move existing code to a new service, that does not even do caching, and then add the caching feature. |
Thanks for getting back this quickly! Yeah I could see that coming. Worst case at least the concept survives. ;)
Agreed, I'll make several smaller PRs. Regarding FileResource: The changes in ChildProcess in particular were preparations for the TODO I left: Which should probably also be another PR after the basic version looks good. :)
Sure, only realized this after opening the PR
So this should be the first step now I guess before doing any more work. Should I just try to figure out how that should look like, or would you prefer doing the refactoring of the existing code and then I add the rest?
Unrelated but just my 2ct, I actually prefer the current design. It allows users to quickly swap out the binary for an older/newer version for testing, and avoids dealing with a large API surface, resource handling, and in case there is for example a corrupted file triggering a crash bug, only the ffmpeg process will die, and not the entire lms instance. |
Fair enough! Not sure about sourceGood() though, would have just throw/catch an exception in case of file access error.
Please don't waste time on this, I prefer you to focus on more useful parts of the code :)
Yes, I admit there are some drawbacks indeed. Last time I tried (many years ago), the ffmpeg code path to transcode had some annoying hacks, but maybe that's better now. That's the advantage of isolating this in a low level library, that should be a pure implementation detail and should have no impact on the rest of the code. |
Ok I put what I had in mind on develop. it is largely untested, but I think it should be good enough to start from |
New version soon... |
My attempt at implementing disk caching.
Introduces a new config variable
transcode-cache-size
. Set to <> 0 to enable.It's supposed to be the maximum size of the cache later on, but there is no cleanup worker yet. In fact there's still quite some TODOs in the code, and a bit too much debug spam.
But I thought I'd rather share this sooner than later for feedback, so I tried adding comments in places I thought might benefit from explaining the reasoning behind it. Code is probably also quite ugly in places due to focus on getting things to work and lack of knowledge regarding modern C++ and boost. :)
(Please ignore the first 2 commits entirely, a proper rebase will be done once this is in acceptable shape.)