forked from chaos/diod
-
Notifications
You must be signed in to change notification settings - Fork 0
[pull] master from chaos:master #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
pull
wants to merge
149
commits into
bergwolf:master
Choose a base branch
from
chaos:master
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Use Automake declarations to install SCRIPTS and DATA files.
Build fixes
Signed-off-by: Roy Li <rongqing.li@windriver.com>
auto.diod.in: remove bashisms
Problem: Mergifyio is a nice bot to auto rebase and merge PRs that are approved and have passed all tests. Add initial mergify support. Initial reviewers that can approve pull requests are on team @chaos/diod. PRs must be tagged with the merge-when-passing label.
Problem: the ubuntu 18.04 runner was deprecated in 2023. Drop this runner from the test matrix.
add mergify support
The members of the timeval struct are both signed (defined by POSIX) and typically both 64 bits on a system where time_t is 64 bits. This is possible also on 32 bit systems where time_t is larger to handle the 2038 problem. It's practically impossible to find a type and printf format that works even on all glibc systems. Play it safe and always use printf with intmax_t and explict int64_t variables for scanf.
Handle various time_t sizes in printf and scanf
Some grep implementations print obsoletion warnings for egrep, which will make test output not match expected results.
Use cd in subshell instead of less predictable pushd/popd. Fixes #104
PATH_EXPDIR is a directory for all tests except t16.
Some tests leave large files behind otherwise.
t35 requires 500MB and t40 requires 100MB.
test fixes
Once refcount reaches zero the conn data could be deallocated and then the refcond is not valid anymore. So always signal on this under the lock to avoid this race.
Do the cleanup of the connection thread in a callback function so we can cancel the thread using pthread_cancel().
The connection thread is detached so we cannot join it to wait for its compleation on shutdown. Use the conncount in server to keep track of when all connection threads have actually finished. To keep this safe we must wait to update this value until the thread has done all of its cleanup.
The conection threads are using the resources that are deallocated in the main diod thread. If they are not shut down cleanly before this deallocation we could get craches at shutdown due to this.
Shutdown cleanup
Problem: the tests in tests/kern hang on systemd managed systems Systemd sets the root file system propagation to shared, so the private mounts created by the "kconjoin" test utilty are not really private, and exiting the process that created them does not cause an implicit unmount and diod does not exit. Follow the lead of the unshare(1) command, which as of util-linux-2.27, now sets / propgation recursively to private after unsharing it. Fixes #61
Problem: tests/kern/t32 (dbench) was marked XFAIL to get a clean test run on Yocto (kernel 6.6.45), but it passes on my deskotp with Ubuntu 22.04's (kernel 6.9.3) and in CI with Ubuntu 22.04 (kernel 6.5.0).. Drop the XFAIL to get this running in CI. Apologies to Ola Nilsson - we will need to track down the source of the failure on Yocto.
Problem: dbench is vendored in the diod source tree, but it's a lot, and is available as an ubuntu package. Drop the dbench source code. Modify the test script to skip the dbench test if unavailable. Add the dbench package to github workflow in anticipation of running the kernel tests later.
Problem: tests/kern/t46 and t47 require that setfattr is installed, but the test runner doesn't check. Skip the test if setfattr is unavailable. Add the attr package to the CI runner in anticipation of running the kernel tests in CI.
Problem: some packages that are no longer required are installed in CI. Drop libpopt and tcp wrappers.
Problem: the RDMA code is not built in CI so it is subject to rot. Add a build-only check.
Problem: when built with --enable-rdma, diod fails to initialize with the following error: # diod -E diod: waiting on rdma connection diod: rdma_get_cm_event: Unknown error -1 and dmesg contains [3890201.976178] ucma_write: process 2106336 (diod) changed security contexts after opening file descriptor, this is not allowed Initializng rdma later, after user transitions, seems to avoid this particular problem at initialization. Fixes #107
tidy up experimental RDMA feature
Problem: some TAP unit tests emit debug output prefixed with "#: ", which is legal TAP but looks funny. Change diod_log_init() to not append ": " to the log prefix if it already ends in a space, and change the test server to do that. Also skip the log prefix if diod_log_init() is called with a NULL argument, anticipating the next commit.
Problem: the "diod: " log prefix is not helpful because server logging is rarely mixed with output from other programs. Or if it is, it's been processed through the systemd journal or syslog which also adds its own prefix. Initialize server logging with no prefix.
Problem: it may be useful, especially when debugging, to know what ports and interfaces diod is listening on. Log this when diod starts up. Update tests/misc/t15 for this change and clean up its Makefile.am.
Problem: diod supports runtime configuration reloading but the systemd unit file doesn't. Arrange to send SIGHUP to diod when an admin runs systemctl reload diod.
enable systemctl reload diod + minor logging improvements
Problem: diod caches the result of geteuid (), which requires a reader of the code to mentally track whether the euid changed since it was cached. There is likely no benefit in avoiding repeated calls to geteuid (). Just do that.
Problem: when diod starts, it may not be clear whether it is running as root or not. Also the code for making the transition is a over-complicated. Log a message after the transition. To simplify - use getpwnam() and getpwuid() instead of thread-safe versions - short-circuit no-op transition from root to root - drop an unused parameter from the internal function. - improve inline comments.
Problem: when diod drops root and loads the credentials of another user, it calls SYS_setgroups (per-thread) instead of setgroups() (per-process) on Linux, but this call should be for the whole process. Use setgroups().
Problem: when diod starts, it may not be clear which access policy has been enacted. Log it.
Problem: the --enable-impersonation=TYPE arguments and platform defaults are a little complex for this niche case. Replace with two binary options: --disable-multiuser to disable multi-user support --with-ganesha-kmod to use FreeBSD nfs-ganesha-kmod for multi-user support At this point FreeBSD users making a server-only multi-user build would use configure --disable-diodmount --with-ganesha-kmod
Problem: diod's three distinct modes of operation (allsquash, runasuser, and multiuser) are hard to discern in diod's startup logic. Refactor this code so that initialization occurs in distinct sections.
Problem: if multi-user mode is selected and diod is configured with --disable-multiuser or if ganesha initialization fails, the server still starts, which could lead to privilege escalation for clients. Make those errors fatal.
Problem: If diod is configured to require authentication, but was built without munge support, it still runs. Make that a fatal error. Log a message if users will be allowed to connect without authentication.
Problem: if munge is not found by configure, diod is silently built without authentication support. Make that a hard error unless configured with --disable-auth. Also, use pkg-config to find munge. Drop a GPL_LICENSED declaration in front of the munge.h include directive. The munge license has changed to LGPL and this is no longer necessary.
Problem: if libcap is not found by configure but multi-user is enabled, diod runs without it, leading to test failures noted in #103. Make that a hard error unless configured with --disable-multiuser. Also, use pkg-config to find libcap. Don't build the libnpfs/capability unit test if multi-user is disabled.
tighten up diod configuration and initialization
Commit 3dc9363 ("diod: always run in the foreground") did what it says on the tin and removed the functionality to daemonize. The old --foreground option that disabled daemonization was also removed. The breaks existing setups that were using foreground execution all along via passing --foreground: The option no longer exists and instead the full usage text is printed. As not to break existing setups, let's keep the --foreground option around for now and just turn it a no-op, given that it became the default behavior.
diod: keep --foreground option for compatibility
The check here is skipped if diod receives Tread or Twrite with a count greater than UINT32_MAX - IOHDRSZ (4294967271). Eventually this trips an assert in np.c. Let's just cast the incoming size to 64-bits before doing any math that may lead to an overflow.
diod: Fix uint32 overflow when Tread/Twrite count > UINT32_MAX-IOHDRSZ
np_deserialize will deserialize any valid 9P message type, but the code path here only handles 'T' type messages from a client (i.e. Tmkdir). This assert will trip if we receive an R-message from a client that can be decoded by np_deserialize. To fix this, we'll reject invalid message types before they are passed to a worker thread.
libnpfs: Fix assert failure when receiving R-type messages
Although most 9P clients handle the ENOTSUPP case for these calls, they're still handy to have available. Tunlinkat has semantics that are more favorable than Tremove in some cases. For example, 9P clients won't need to handle the Tremove failure case where the call fails, but still clunks the fid.
Implement Trenameat and Tunlinkat
Recent releases of autoconf-archive have introduced a macro for LuaJIT support which was meant to be optional but turned out to break autogen if it wasn't used. The fix is simply to use it as intended. See linked issue for references to detailed bug reports to autoconf maintainers.
Fix non-optional LUAJIT support macro in autoconf-archive > 2023.02.20
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )