Minor fixes to ChildProcess #685
Conversation
Commit from the disk caching draft isolated into a single PR.

- Don't set `FD_CLOEXEC`, but close manually before fork: since we don't use `pipe2()` to set the flag atomically, we're prone to races with concurrent fork+execs either way. As ChildProcess uses a mutex here anyway, as long as no other part of lms forks+execs, we're not any more or less safe now, but the code is shorter and the `fcntl()` cannot fail. (See the sketch after this list.)
- Don't make the write end of the pipe non-blocking. Programs usually don't expect this, i.e. they either never check the return code of a write to stdout (potential data loss), or, if they do, they just bail out on any error and don't handle `EAGAIN`. ffmpeg seemed to handle this fine, though (or we were just lucky and always read the data faster than ffmpeg could produce it).
- Do not close stdin and stderr. Again, ffmpeg seems to handle this, at least regarding stdin thanks to `-nostdin`, but in general, if a process tries to write to stderr and fd 2 is not open, it might just bail out. Even worse, the process might have opened some file it wants to work with, and that file got assigned fd 2 (as that was the next free fd); the program would then corrupt whatever file it opened there whenever it tries to write to stderr. Try to open `/dev/null` instead for stdin and stderr, and if that fails, keep whatever lms inherited open, which should be safer than relying on the child process to handle this case properly.
- `exec()` does not return on success, so there is no need to check its return code; any return from it is a failure.
- Fix a wrong error message in the error path when assigning the fd to the boost stream.
- `F_SETPIPE_SZ` requires an `int`, not a `size_t`. This worked on little-endian systems since the value passed was < 2^32, but on big-endian systems you'd effectively pass 0 when using a 64-bit type.
- Address a few clang-tidy complaints (constness).
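Not part of the diff itself, but to make the list above concrete, here is a minimal sketch of the child-side setup these points describe, assuming a pipe created with `pipe()`; the names (`resizePipe`, `childAfterFork`, `pipeReadFd`, `pipeWriteFd`, `args`) are hypothetical, not the actual ChildProcess code:

```cpp
#include <fcntl.h>   // open, fcntl; F_SETPIPE_SZ is Linux-specific (_GNU_SOURCE)
#include <unistd.h>  // dup2, close, execv, _exit
#include <cstdlib>

// Parent side: F_SETPIPE_SZ takes an int; passing a 64-bit size_t through
// the varargs parameter effectively reads as 0 on big-endian systems.
void resizePipe(int pipeFd)
{
    const int pipeSize{ 65536 };
    ::fcntl(pipeFd, F_SETPIPE_SZ, pipeSize);
}

// Child side, right after fork().
[[noreturn]] void childAfterFork(int pipeReadFd, int pipeWriteFd, char* const args[])
{
    // Point stdin/stderr at /dev/null; on failure, keep the inherited fds
    // open rather than leaving fd 0/2 free for the next open() to grab.
    const int devNull{ ::open("/dev/null", O_RDWR) };
    if (devNull >= 0)
    {
        ::dup2(devNull, STDIN_FILENO);
        ::dup2(devNull, STDERR_FILENO);
        ::close(devNull);
    }

    // stdout becomes the write end of the pipe, left in blocking mode.
    ::dup2(pipeWriteFd, STDOUT_FILENO);

    // Close the pipe fds manually instead of relying on FD_CLOEXEC.
    ::close(pipeReadFd);
    ::close(pipeWriteFd);

    // exec() only returns on failure, so any return is an error.
    ::execv(args[0], args);
    ::_exit(EXIT_FAILURE);
}
```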
IIRC I closed stderr because ffmpeg got stuck if we didn't pull bytes from its stderr when many errors were being written.
That should be addressed by opening `/dev/null` for stderr instead.
To test this, transcoding large files should work fine: ffmpeg just hangs while writing its output as long as the UI does not dequeue the next chunk of data. Have you checked whether it's still working correctly? Edit: actually, there are already some ongoing bugs that broke this (like connections being closed despite keep-alive messages).
```diff
@@ -63,32 +63,23 @@ namespace lms::core
 {
     // make sure only one thread is executing this part of code
     static std::mutex mutex;
-    std::unique_lock<std::mutex> lock{ mutex };
+    const std::lock_guard<std::mutex> lock{ mutex };
```
(`const std::scoped_lock lock{ mutex }` should be enough)
Ah, I wasn't sure about that. I researched all the lock types and their differences, and found that `unique_lock` generally has more overhead, while `lock_guard` and `scoped_lock` are almost identical. The intriguing argument for `lock_guard` was that it helps you avoid accidentally not locking anything: `std::scoped_lock lock;` compiles, while `std::lock_guard<std::mutex> lock;` doesn't, i.e. you cannot accidentally forget to pass the actual mutex.

But I just checked and saw that the rest of LMS uses `scoped_lock`, so I'll switch to that for consistency.
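A minimal sketch of that pitfall, with hypothetical names (`mutex`, `criticalSection`), just to illustrate the compile-time difference:

```cpp
#include <mutex>

std::mutex mutex;

void criticalSection()
{
    // std::scoped_lock<> with an empty mutex pack is valid, so forgetting
    // the mutex still compiles:
    //     std::scoped_lock lock;            // compiles, guards nothing!
    // std::lock_guard has no default constructor, so the same mistake
    // fails to build:
    //     std::lock_guard<std::mutex> lock; // error
    const std::scoped_lock lock{ mutex };    // what the code ended up using
}
```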
If ffmpeg already stops/slows down when LMS does not fetch the next chunk of data from the pipe fast enough, then this is a very strong indication that ffmpeg either sets the fd back to blocking mode on its own to be safe, or gracefully handles EAGAIN.
Let's assume some_command is really fast and grep really slow. If the pipe between them fills up, you want blocking mode in some_command: its writes simply block until grep catches up, instead of failing with EAGAIN. And since LMS still sets the read end of the pipe to non-blocking, any read from the pipe when there's no data available will still return immediately with EAGAIN.
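To make the read-side behaviour concrete, here is a minimal sketch, assuming `readFd` is the read end of the pipe; the names are illustrative, not the actual lms code:

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Only the read end is switched to non-blocking; the child keeps a
// regular blocking stdout on the write end.
void makeReadEndNonBlocking(int readFd)
{
    const int flags{ ::fcntl(readFd, F_GETFL) };
    ::fcntl(readFd, F_SETFL, flags | O_NONBLOCK);
}

// With O_NONBLOCK set, a read on an empty pipe returns -1 with errno set
// to EAGAIN instead of blocking, so the caller can return to its event loop.
ssize_t tryRead(int readFd, char* buf, std::size_t size)
{
    const ssize_t res{ ::read(readFd, buf, size) };
    if (res < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0; // no data available yet; try again later
    return res;
}
```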
I'm using Tempo most of the time, but so far the webUI seemed to behave exactly as it should. I can use it some more at work to make sure. FWIW I've been running the changes for almost a week now, and nothing blew up yet. ;)
I've run into this in the past with other web apps, where nginx has a default timeout for a connection to a backend that one needs to raise. I think with long transcode sessions you can run into the same problem. But it seems wt already gives you a proper notification about an aborted request, as during testing, I've seen a lot of aborted transcode sessions in the log, where the ffmpeg child gets killed properly.
Indeed - I thought about setting an upper limit on the length of transcoded tracks, and not using the cache if it's exceeded, but ironically, the longer the track, the more helpful the cache would be, as chances increase that the transfer will be interrupted if you have bad wifi/mobile. And without the cache, that would mean streaming the song from the start again if your client does not use the timeOffset parameter.
I've not addressed this yet, and in the first implementation I just skip the cache if timeOffset is != 0 for that reason. As you said, this would just spam the cache with dozens of chunks from the same song that will likely never be used again, because chances are close to zero that you'll seek to exactly the same point again.
I think this is the way to go, if you decide to make use of the cache for the WebUI too. Transcode once, then use the cached result.
Yes, I think both the UI and the subsonic backend should use the same transcoding interface. Anyway, this PR is merged. Thanks again for your work!
I just noticed that in `main()`, stdin also gets closed. For similar reasons as above, this might be risky: if any part of lms, or any library it depends on, ever tries to read from stdin, it will cause trouble. Checking my local lms instance, fd 0 (stdin) is now an event fd, used for signalling within the process. If anything reads from this fd, it will snatch away an event notification from something within lms, which could result in very subtle and confusing bugs, like something seemingly deadlocking or hanging for unspecified amounts of time.
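For reference, a small stand-alone program (hypothetical, not lms code) that shows what fd 0 currently refers to, which is how this can be checked on Linux:

```cpp
#include <cstdio>
#include <unistd.h>

int main()
{
    char target[256]{};
    // /proc/self/fd/0 is a symlink to whatever stdin refers to; after
    // stdin was closed and an eventfd grabbed fd 0, this prints
    // something like "anon_inode:[eventfd]".
    const ssize_t len{ ::readlink("/proc/self/fd/0", target, sizeof(target) - 1) };
    if (len > 0)
        std::printf("fd 0 -> %s\n", target);
    return 0;
}
```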
Instead, I'd recommend either also opening `/dev/null` and `dup2()`ing it onto stdin, or leaving stdin as is; when started via a service manager like systemd, stdin is `/dev/null` anyway, so everything would be fine either way.
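A minimal sketch of the first option, with a hypothetical helper name (`neutralizeStdin`):

```cpp
#include <fcntl.h>
#include <unistd.h>

// Point fd 0 at /dev/null instead of closing it, so nothing can later
// read from an unrelated fd that happened to get assigned number 0.
void neutralizeStdin()
{
    const int devNull{ ::open("/dev/null", O_RDONLY) };
    if (devNull >= 0)
    {
        ::dup2(devNull, STDIN_FILENO); // fd 0 now just reads EOF
        if (devNull != STDIN_FILENO)
            ::close(devNull);
    }
    // On failure, leave stdin alone; under systemd it is /dev/null anyway.
}
```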