Make reboot_on_exit=false work? #2
A description of what leads you to think it's a crash would be nice ;). (i.e., what exactly happens, when, what's on screen or not; that kind of thing ;)). Also, which version/build of KOReader are you running? That said, I haven't had time to check how everything behaves on FW 4.x in general, so, here be dragons ;). |
Thanks, well after quitting KOReader, I can see Nickel's progress bar starting to fill, but the home screen is never reached: the progress bar stays frozen at 3 filled bars out of 5. I'm using KOReader version v2015.11-896-g4316284. I witnessed the same behavior when I tried with the latest 3.x firmware. |
I vaguely recall fixing at least one apparent cause of this (IIRC, making sure Wi-Fi was disabled before restarting Nickel), but apparently that wasn't enough ;). I'll hopefully have more free time available in a few weeks to try and dig into that again, but, as you said, debugging those issues is kind of annoyingly hard... |
Yes, it happens to me too. I mentioned it elsewhere, but it seems related to https://github.com/koreader/koreader/blob/master/platform/kobo/koreader.sh#L48:L57. @NiLuJe I think siphoning nickel's full environment should be safe on the fmon side, but it produces the same crash in fmon when returning to nickel. @baskerville: you can try deleting those lines https://github.com/koreader/koreader/blob/master/platform/kobo/koreader.sh#L49:L53; the nickel environment should be handled by nickel.sh in https://github.com/koreader/koreader/blob/master/platform/kobo/nickel.sh#L10:L27 and https://github.com/koreader/koreader/blob/master/platform/kobo/nickel.sh#L62:L70 |
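For readers unfamiliar with the term, "siphoning" an environment on Linux usually means reading another process's `/proc/<pid>/environ`, which holds NUL-separated `KEY=VALUE` records. The sketch below is only an illustration of the full-versus-minimal distinction discussed here, written in Python rather than the shell the linked scripts use; the function names and the variable names in the usage note are hypothetical and are not taken from koreader.sh or nickel.sh.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE records found in /proc/<pid>/environ."""
    env = {}
    for record in raw.split(b"\0"):
        # Skip the empty trailing record and anything that is not KEY=VALUE.
        if not record or b"=" not in record:
            continue
        key, _, value = record.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def minimal_env(full: dict, keep: tuple) -> dict:
    """Keep only the allowlisted variables instead of the whole environment."""
    return {k: v for k, v in full.items() if k in keep}


def read_environ(pid: int) -> dict:
    """Read another process's environment (Linux only; needs permission)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())
```

Something like `minimal_env(read_environ(pid), ("PATH", "EXAMPLE_HOME"))` would then give the "minimum environment" variant, where `EXAMPLE_HOME` is a made-up name standing in for whatever the launcher really needs.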
@pazos I tried: I thought nickel was caught in an infinite loop because the black squares kept being filled and emptied, but after about 15 seconds the home screen finally appeared. It was displayed in a tiny region of the screen, though, and there was a critical error dialog box with only one button: sign off! |
Yep, that (the broken fb output) sounds familiar, and is one of the reasons I switched to a full
siphon... Somehow, something's missing somewhere, or something causes
nickel to initialize differently, and doing shit on both ends initially appeared to do the trick on my end...
@pazos: To be clear, are you confirming that, right now, the exact same crash happens when exiting KOReader when launched from fmon instead of kfmon?
|
I found something that sounds related https://www.mobileread.com/forums/showpost.php?p=3403689&postcount=266. |
We do kill Wi-Fi when starting nickel, because I noticed that problem well before FW 4.x.
|
@baskerville: sorry to hear that! @NiLuJe: yes, siphoning the whole environment makes it impossible to return to nickel (it gets stuck on the third square, like baskerville's). I'm not sure what causes this, but fmon works really nicely when loading the minimum environment. Returning from KOReader to nickel takes less than 3 seconds on my KA1 |
@pazos Can you provide a link to the fmon archive? |
Here is the updated one I proposed for the new instructions. I'm waiting for the fix to this bug before uploading kfmon instructions to the KOReader wiki too. Look at the latest comment in |
@NiLuJe: I do understand why you launch kfmon via a udev hook, but I think I'm going to try to start it from inittab and see what happens. As you pointed out, maybe the environment is different and could cause issues. kfmon is a good improvement over fmon, with documentation, readable sources and a lot of checks. The missing feature is returning without a reboot, which I've been doing since 2012 with fmon. @baskerville: on returning to nickel: the first sync fails (udhcpcd isn't able to get a lease). Nickel itself seems to handle it in less than a minute, so the next sync will work. Steps to reproduce: |
Cool!!!! Please create install instructions and we're done :) I think it's better to move further fmon development to your repo and leave this issue open here. |
Okay, term & finals are nearly done, I'll be able to take a good look at this Real Soon now! :). |
@NiLuJe That is awesome!!! |
@NiLuJe How are things going now? Are you still working on this? |
I just pushed updated binaries to the MR thread, with basically a shot in the dark at playing with the env on kfmon's side. This should hopefully more closely match fmon? Or not, I still don't have time to investigate all that properly... On the plus side, that also means the OS X fix finally made it to a binary release... Ooops. ^^. |
Oookay, that crap was always bugging me, and I needed an excuse to procrastinate, so I took another stab at this... My current assumption is that all our woes are indeed caused by being started via udev. So, after fixing the same kind of mess on my end (i.e., a crappy env, and fucked up std* fds), it looks sane-ish: When started from udev, everything points to /dev/null (because of udev, c.f., https://github.com/gentoo/eudev/blob/master/src/udev/udev-event.c#L409), and when started otherwise, everything looks as it should (AFAICT). Started from udev:
Restarted from an ssh shell:
(fds 3/4/5 are what will end up as 0/1/2 in a child, i.e., stdin/out/err). FWIW, nickel:
So, tomorrow: actually switching to the animator patch approach, and see how that fares... |
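The fd listings from that comment did not survive in this copy of the thread. For anyone wanting to repeat the check, here is a minimal sketch of the same kind of inspection, assuming a Linux `/proc` filesystem; it is written in Python for illustration and is not what kfmon itself does. The helper names are hypothetical.

```python
import os


def describe_std_fds(pid="self"):
    """Return {fd: target} for fds 0, 1 and 2 of a process, via /proc/<pid>/fd."""
    targets = {}
    for fd in (0, 1, 2):
        path = f"/proc/{pid}/fd/{fd}"
        try:
            # Each entry is a symlink to whatever the descriptor refers to.
            targets[fd] = os.readlink(path)
        except OSError as err:
            targets[fd] = f"<unavailable: {err.strerror}>"
    return targets


def all_null(targets):
    """True when every inspected descriptor points at /dev/null."""
    return all(target == "/dev/null" for target in targets.values())


if __name__ == "__main__":
    fds = describe_std_fds()
    for fd, target in sorted(fds.items()):
        print(fd, "->", target)
    print("looks udev-like (all /dev/null):", all_null(fds))
```

Passing a PID instead of `"self"` inspects another process, given sufficient permissions.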
And... success?! First initial test w/ Plato... Thing is, it takes so much time that I'm wondering if that actually wasn't already working yesterday... Probably not, because I managed to hit the frontlight timeout yesterday without anything happening :D. Looking at it over SSH, it seems to be blocking on a bunch of pselect() calls to some pipes until they time out a couple of times... So, quick necromancy: pinging @baskerville @pazos & @xpirad : Does that sound remotely familiar with your experience w/ @baskerville's fmon & Plato? FWIW, I'm on a H2O on FW 4.7. |
I can also get a long delay, from time to time, with fmon. But, most of the time, Nickel restarts in less than 2 seconds. |
I'm wondering if my SIGCHLD handler isn't potentially what's confusing nickel, since those pipes & pselect calls relate to clone()'ed processes... Going to rip it out and see what happens \o/ EDIT: Going to try something first, I may have fucked up the blocking... I'm learning (and failing to understand :D) way too much about the unix process lifecycle... Also, signals make my brain hurt >_<". |
Yup, works as it should without the SIGCHLD handler -_-". That means I had to abandon a few neat things I quite liked (in theory) about having one, but that, truthfully, weren't terribly useful in practice. I guess that's the price of not actually having any background in C (or programming in general). You have an idea, you get excited because stuff actually seems to be doable, and then a few hours and LOCs down the line, you realize that it's either a huge mess to design it that way, a terrible idea, has an unknown amount of weird corner-cases, or it just plain doesn't work. :D. |
Now to polish things up a bit, and simplify the KOReader scripts I had to mangle in the first place :D. |
And I managed to spend another night trying something which works neeeeeaaaarly just as I want it, but still manages to mysteriously fail in some circumstances for unknown reasons. Still, it behaves properly with a bit of a workaround plugged in, and I like it, so, meh. |
That said, one thing I'm still reliably reproducing on my H2O is the IR grid not waking up after a suspend (from a timeout, and also from the power button if you wait a bit before waking it) once we restart nickel. FWIW, that happens as soon as we kill nickel, because I initially had to work it around in KOReader... Keep in mind that this is not new, and sometimes (although somewhat rarely) also happens with only nickel, (i.e., a freshly booted nickel, and never having run anything else besides it during that boot session). I think at one point I tweaked my KOReader startup script to just SIGSTOP/SIGCONT nickel, because this was horribly annoying when debugging stuff on device... |
Eureka! Take three appears to behave properly... (On the reaping child process front, I mean). If all else fails, add threads?! sigh. |
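The thread does not show how "take three" actually reaps children. As a rough illustration of the general idea hinted at by "add threads" (waiting for a child in a dedicated thread rather than from a SIGCHLD handler), here is a minimal Python sketch; `spawn_and_reap` and `on_exit` are hypothetical names, and this is an assumption about the approach, not kfmon's code.

```python
import subprocess
import threading


def spawn_and_reap(argv, on_exit):
    """Start argv and reap it from a dedicated thread; no SIGCHLD handler is installed."""
    proc = subprocess.Popen(argv)

    def reaper():
        # wait() blocks only this thread until the child exits, then reaps it.
        status = proc.wait()
        on_exit(status)

    thread = threading.Thread(target=reaper, daemon=True)
    thread.start()
    return proc, thread
```

The appeal of this shape is that the process's signal dispositions are left alone, so nothing signal-related is changed for whatever gets started afterwards.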
It's alive! \o/ I'm sorry it took me so long to fix this, and I'm sorry I managed to miss @baskerville's fmon fork for so long (despite you mentioning it in this very thread, don't know how I managed that o_O. I only ended up looking it up purely randomly, because I happened to be checking out the MR KoboDev forum on a whim, and even then, only in KSM's credits, not even Plato's thread :D). Anywaaaaay. Will roll a release and close this. Please shout at me if I broke something else ^^. |
Done \o/. |
Sneaked in a minor (but long-standing) fd leak fix w/ 3881d72 |
I've tried
reboot_on_exit=false
on my Glo HD (with firmware 4.3.8945) and it seems to produce a crash while Nickel is relaunching, but I'm not sure how to debug it.