-
Notifications
You must be signed in to change notification settings - Fork 456
Add Mac specific FileBasedLock implementation #213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Mac specific FileBasedLock implementation #213
Conversation
|
||
if (this.lockFileDescriptor != InvalidFileDescriptor) | ||
{ | ||
if (Close(this.lockFileDescriptor) != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sanoursa note that with the current changes the lock file is not removed from the file system. This allows us to avoid the race conditions described here:
https://stackoverflow.com/questions/17708885/flock-removing-locked-file-without-race-condition
https://stackoverflow.com/questions/33166460/how-to-delete-a-locked-flock-file-without-race-condition-before-or-after-rele
It appears that there are some options for removing it if we play some tricks with stat
, but unless you see a major issue with this approach I'm thinking we stick with this simpler approach and avoid the complexity\races of attempting to delete this file.
Note that if we were not cross platform we could flock
the file we're interested in itself (e.g. mapping.dat
) but to keep the cross-platform code simple and consistent it's easiest if we use a separate file on Mac like we do on Windows.
|
||
if (this.lockFileDescriptor == InvalidFileDescriptor) | ||
{ | ||
this.lockFileDescriptor = Creat(this.LockPath, Convert.ToInt32("644", 8)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Use
open
instead ofcreat
to make intention clearer - Don't truncate file on open as at that point we have not acquired the
flock
yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, and it sounds like this has resolved the test issues.
GVFS/GVFS.Common/FileBasedLock.cs
Outdated
private const long InvalidFileLength = -1; | ||
private const string EtwArea = nameof(FileBasedLock); | ||
private static readonly Encoding UTF8NoBOM = new UTF8Encoding(false, true); // Default encoding used by StreamWriter | ||
protected readonly PhysicalFileSystem FileSystem; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally a good idea to avoid protected member variables. These are fine since they are readonly, but I would personally make them properties with no setter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll make the change
|
||
public string Message { get; } | ||
} | ||
public abstract void Dispose(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this method definition here. The interface should already be enough to require subclasses to implement this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this might be leftover from when I was moving things around, let me try removing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It ends up the abstract Dispose
is required, without it we get the error:
FileBasedLock.cs(7,43): error CS0535: 'FileBasedLock' does not implement interface member 'IDisposable.Dispose()' [/Users/wilbaker/Repos/VFSForGit/src/GVFS/GVFS.Common/GVFS.Common.csproj]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea you're right. I must be spending too much time in C++ these days :-)
|
||
if (this.lockFileDescriptor == InvalidFileDescriptor) | ||
{ | ||
this.lockFileDescriptor = Creat(this.LockPath, Convert.ToInt32("644", 8)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 644 always the right mode? What if someone has left this file behind with a different mode on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code will be clearer with the switch to open
(creat
is the same as open
with the flags we're interested in here).
The mode
parameter is only used when creating a new file, it's ignored if the file already exists:
The mode argument specifies the file mode bits be applied when
a new file is created. This argument must be supplied when
O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT
nor O_TMPFILE is specified, then mode is ignored. The
effective mode is modified by the process's umask in the usual
way: in the absence of a default ACL, the mode of the created
file is (mode & ~umask). Note that this mode applies only to
future accesses of the newly created file;
private static extern int Close(int fd); | ||
|
||
[DllImport("libc", EntryPoint = "flock", SetLastError = true)] | ||
private static extern int Flock(int fd, int operation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I would capitalize this as FLock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
{ | ||
int errno = Marshal.GetLastWin32Error(); | ||
|
||
// Log error if not EWOULDBLOCK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment accurate? I don't see an if
in here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still WIP, for now I'm logging everything just to make sure things are working as expected
|
||
this.Tracer.RelatedInfo($"Lock acquired for for '{this.LockPath}'"); | ||
|
||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be setting this.lockAcquired = true
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we should, good catch :)
{ | ||
if (Flock(this.lockFileDescriptor, LockUn) != 0) | ||
{ | ||
this.Tracer.RelatedWarning($"Failed to release lock for: '{this.LockPath}'"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then maybe IDisposable
isn't the proper interface for this class. Do we need a TryUnlock
instead so we can retry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And should we avoid clearing lockAcquired
when this happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading through the comments for flock
the lock will be released automatically when the file descriptor is closed, and so strictly speaking I don't think we need this call in the first place:
Locks created by flock() are associated with an open file description
(see open(2)). This means that duplicate file descriptors (created
by, for example, fork(2) or dup(2)) refer to the same lock, and this
lock may be modified or released using any of these file descriptors.
Furthermore, the lock is released either by an explicit LOCK_UN
operation on any of these duplicate file descriptors, or when all
such file descriptors have been closed.
http://man7.org/linux/man-pages/man2/flock.2.html
(Note also that all the error values described for flock
only apply when acquiring the lock and not releasing it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also that if close
fails it should not be retried, as the fd
has been successfully released (any errors are just for diagnostic purposes):
A careful programmer will check the return value of close(), since it
is quite possible that errors on a previous write(2) operation are
reported only on the final close() that releases the open file
description. Failing to check the return value when closing a file
may lead to silent loss of data. This can especially be observed
with NFS and with disk quota.Note, however, that a failure return should be used only for
diagnostic purposes (i.e., a warning to the application that there
may still be I/O pending or there may have been failed I/O) or
remedial purposes (e.g., writing the file once more or creating a
backup).Retrying the close() after a failure return is the wrong thing to do,
since this may cause a reused file descriptor from another thread to
be closed. This can occur because the Linux kernel always releases
the file descriptor early in the close operation, freeing it for
reuse; the steps that may return an error, such as flushing data to
the filesystem or device, occur only later in the close operation.
{ | ||
[TestFixture] | ||
public class FileBasedLockTests | ||
public class WindowsFileBasedLockTests | ||
{ | ||
[TestCase] | ||
public void CreateLockWhenDirectoryMissing() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if this scenario works on Mac
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Feel free to ignore the nit picky stuff, but I am curious about failing TryAcquireLock
if writing to the file fails.
public class MacFileBasedLock : FileBasedLock | ||
{ | ||
// #define O_WRONLY 0x0001 /* open for writing only */ | ||
private const int OpenWriteOnly = 0x0001; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: since you created a NativeMethods
subclass (which I like), I would also move all of these constants inside there since that is the only context in which the constants are needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats a good idea, I'll move these
} | ||
|
||
[Flags] | ||
private enum FLockOperations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a nit: you're currently using constants for some and an enum for others. I would keep them consistent, and the constants seem to make the usage less cluttered (no need for a cast)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make them all contants
|
||
if (bytesWritten == -1) | ||
{ | ||
int errno = Marshal.GetLastWin32Error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we still returning true if we hit this, or the other error right below it? Should we fail here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we still returning true if we hit this
We're returning true because FLock
successfully acquired the lock
or the other error right below it? Should we fail here?
I don't think that we need to. Currently the signature is only to help with diagnostics, it need not be written for the locking to work correctly. (I also considered not writing the signature at all, it would speed things up and eliminate a class of possible failures, thoughts on that option?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea that's a good point. The reason we added the signature in the first place is that we were also creating the index.lock from the mount process. It was useful to know, for diagnostics, if an abandoned index.lock had been created by git or by us. Now that we no longer create the index.lock... is the signature adding any value at all? Even on Windows? We could just drop it if it's not serving a purpose any longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just drop it. Everything that uses FileBasedLock will always clean up an orphan lock (without checking the signature), and on Windows the file is always opened with DELETE_ON_CLOSE and so unless there is a power-loss or BSOD event there won't be any stale locks left on disk,
{ | ||
if (NativeMethods.Close(this.lockFileDescriptor) != 0) | ||
{ | ||
int errno = Marshal.GetLastWin32Error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest leaving a comment here that an error from close is advisory only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
namespace GVFS.Platform.Windows | ||
{ | ||
public class WindowsFileBasedLock : FileBasedLock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing there are no meaningful changes in this code compared to the old FileBasedLock
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were a lot of changes to this file to remove dead code and clean up the interface.
If you'd like to see the changes you can check out the first commit:
I did not perform the file rename in that commit to make it easier to see the diffs in this file (see GVFS/GVFS.Common/FileBasedLock.cs in the above commit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that these changes pre-dated the creation of the abstract FileBasedLock, and so the private members were not yet moved out)
ProjFS.Mac/PrjFSLib/PrjFSLib.cpp
Outdated
@@ -331,6 +331,7 @@ PrjFS_Result PrjFS_WritePlaceholderFile( | |||
goto CleanupAndFail; | |||
} | |||
|
|||
// TODO(Mac): Only call chmod is fileMode is different than the default file mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see I made a typo in this comment, I'll fix that up
{ | ||
if (this.deleteOnCloseStream != null) | ||
{ | ||
throw new InvalidOperationException("Lock has already been acquired"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sanoursa let me know what you think of this change (to throw an exception) in my latest commit.
I made this change (rather than returning true
) to help enforce the rule that TryAcquireLock
should not be called on a FileBasedLock
after it has successfully acquired the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throwing InvalidOperationException
makes perfect sense here. A re-entrant usage of this lock violates its contract and could result in other bugs if we don't make that obvious.
The FetchStepCleansUpStaleFetchLock() functional test currently only succeeds on Windows (and is therefore marked with the MacTODO.TestNeedsToLockFile category) because it checks whether the FetchStep command has released its lock file by testing if the lock file has been deleted. This works with our WindowsFileBasedLock implementation, which opens the lock file with FileOptions.DeleteOnClose, so the file is removed when the lock is released. However, the MacFileBasedLock version leaves the file in place and uses flock(2) and LOCK_EX to acquire and release the lock atomically; there is no simple way to delete the file while still holding the open file descriptor needed for flock(). To make this test succeed on Mac/Linux while still being meaningful, we need to be robust against the lock file existing both at the start of the test and afterwards, while still ensuring that the lock has been released by the FetchStep command. We therefore simply attempt to open the lock file after the fetch command completes, using the FileShare.None attribute, which will fail on all platforms if the lock file has not been released. On Windows this is because WindowsFileBasedLock uses FileShare.Read to open the lock file, so if the lock is still held our attempt with FileShare.None will fail. On macOS (and Linux) this is because our use of flock() in MacFileBasedLock will conflict with the flock() with LOCK_EX which the .NET Core Unix implementation of FileShare.None also uses. See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432, as well as the .NET Core library implementation in: https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28 aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L8 8 We also rename the test to FetchStepReleasesFetchLockFile() to more accurately reflect what it is now testing.
The FetchStepCleansUpStaleFetchLock() functional test currently only succeeds on Windows (and is therefore marked with the MacTODO.TestNeedsToLockFile category) because it checks whether the FetchStep command has released its lock file by testing if the lock file has been deleted. This works with our WindowsFileBasedLock implementation, which opens the lock file with FileOptions.DeleteOnClose, so the file is removed when the lock is released. However, the MacFileBasedLock version leaves the file in place and uses flock(2) and LOCK_EX to acquire and release the lock atomically; there is no simple way to delete the file while still holding the open file descriptor needed for flock(). To make this test succeed on Mac/Linux while still being meaningful, we need to be robust against the lock file existing both at the start of the test and afterwards, while still ensuring that the lock has been released by the FetchStep command. We therefore simply attempt to open the lock file after the fetch command completes, using the FileShare.None attribute, which will fail on all platforms if the lock file has not been released. On Windows this is because WindowsFileBasedLock uses FileShare.Read to open the lock file, so if the lock is still held our attempt with FileShare.None will fail. On macOS (and Linux) this is because our use of flock() in MacFileBasedLock will conflict with the flock() with LOCK_EX which the .NET Core Unix implementation of FileShare.None also uses. See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432, as well as the .NET Core library implementation in: https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28 aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L8 8 We also rename the test to FetchStepReleasesFetchLockFile() to more accurately reflect what it is now testing.
The FetchStepCleansUpStaleFetchLock() functional test currently only succeeds on Windows (and is therefore marked with the MacTODO.TestNeedsToLockFile category) because it checks whether the FetchStep command has released its lock file by testing if the lock file has been deleted. This works with our WindowsFileBasedLock implementation, which opens the lock file with FileOptions.DeleteOnClose, so the file is removed when the lock is released. However, the MacFileBasedLock version leaves the file in place and uses flock(2) and LOCK_EX to acquire and release the lock atomically; there is no simple way to delete the file while still holding the open file descriptor needed for flock(). To make this test succeed on Mac/Linux while still being meaningful, we need to be robust against the lock file existing both at the start of the test and afterwards, while still ensuring that the lock has been released by the FetchStep command. We therefore simply attempt to open the lock file after the fetch command completes, using the FileShare.None attribute, which will fail on all platforms if the lock file has not been released. On Windows this is because WindowsFileBasedLock uses FileShare.Read to open the lock file, so if the lock is still held our attempt with FileShare.None will fail. On macOS (and Linux) this is because our use of flock() in MacFileBasedLock will conflict with the flock() with LOCK_EX which the .NET Core Unix implementation of FileShare.None also uses. See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432, as well as the .NET Core library implementation in: https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28 aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L8 8 We also rename the test to FetchStepReleasesFetchLockFile() to more accurately reflect what it is now testing.
The FetchStepCleansUpStaleFetchLock() functional test currently only succeeds on Windows (and is therefore marked with the MacTODO.TestNeedsToLockFile category) because it checks whether the FetchStep command has released its lock file by testing if the lock file has been deleted. This works with our WindowsFileBasedLock implementation, which opens the lock file with FileOptions.DeleteOnClose, so the file is removed when the lock is released. However, the MacFileBasedLock version leaves the file in place and uses flock(2) and LOCK_EX to acquire and release the lock atomically; there is no simple way to delete the file while still holding the open file descriptor needed for flock(). To make this test succeed on Mac/Linux while still being meaningful, we need to be robust against the lock file existing both at the start of the test and afterwards, while still ensuring that the lock has been released by the FetchStep command. We therefore simply attempt to open the lock file after the fetch command completes, using the FileShare.None attribute, which will fail on all platforms if the lock file has not been released. On Windows this is because WindowsFileBasedLock uses FileShare.Read to open the lock file, so if the lock is still held our attempt with FileShare.None will fail. On macOS (and Linux) this is because our use of flock() in MacFileBasedLock will conflict with the flock() with LOCK_EX which the .NET Core Unix implementation of FileShare.None also uses. See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432, as well as the .NET Core library implementation in: https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L88 We also rename the test to FetchStepReleasesFetchLockFile() to more accurately reflect what it is now testing.
Fixes #211
Issue
The previous implementation of
FileBasedLock
is that it depended on Windows specific file sharing rules\behavior.Changes
FileBasedLock
to the Windows platform project and rename it toWindowsFileBasedLock
MacFileBasedLock
that usesflock
for locking on MacFileBasedLock
that is implemented by the two platform specific classes mentioned above.CreateFileBasedLock
to theGVFSPlatform
Note that with this change it's the
flock
that ensures\indicates exclusive access (unlike on Windows where the lock file being present indicates ownership of the lock)