8000 Add Mac specific FileBasedLock implementation by wilbaker · Pull Request #213 · microsoft/VFSForGit · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add Mac specific FileBasedLock implementation #213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 24, 2018

Conversation

wilbaker
Copy link
Member
@wilbaker wilbaker commented Aug 23, 2018

Fixes #211

Issue

The previous implementation of FileBasedLock is that it depended on Windows specific file sharing rules\behavior.

Changes

  • Move FileBasedLock to the Windows platform project and rename it to WindowsFileBasedLock
  • Create a new MacFileBasedLock that uses flock for locking on Mac
  • Add a new abstract FileBasedLock that is implemented by the two platform specific classes mentioned above.
  • Add CreateFileBasedLock to the GVFSPlatform

Note that with this change it's the flock that ensures\indicates exclusive access (unlike on Windows where the lock file being present indicates ownership of the lock)


if (this.lockFileDescriptor != InvalidFileDescriptor)
{
if (Close(this.lockFileDescriptor) != 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

@sanoursa note that with the current changes the lock file is not removed from the file system. This allows us to avoid the race conditions described here:

https://stackoverflow.com/questions/17708885/flock-removing-locked-file-without-race-condition
https://stackoverflow.com/questions/33166460/how-to-delete-a-locked-flock-file-without-race-condition-before-or-after-rele

It appears that there are some options for removing it if we play some tricks with stat, but unless you see a major issue with this approach I'm thinking we stick with this simpler approach and avoid the complexity\races of attempting to delete this file.

Note that if we were not cross platform we could flock the file we're interested in itself (e.g. mapping.dat) but to keep the cross-platform code simple and consistent it's easiest if we use a separate file on Mac like we do on Windows.


if (this.lockFileDescriptor == InvalidFileDescriptor)
{
this.lockFileDescriptor = Creat(this.LockPath, Convert.ToInt32("644", 8));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Use open instead of creat to make intention clearer
  • Don't truncate file on open as at that point we have not acquired the flock yet

Copy link
Contributor
@sanoursa sanoursa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, and it sounds like this has resolved the test issues.

private const long InvalidFileLength = -1;
private const string EtwArea = nameof(FileBasedLock);
private static readonly Encoding UTF8NoBOM = new UTF8Encoding(false, true); // Default encoding used by StreamWriter
protected readonly PhysicalFileSystem FileSystem;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's generally a good idea to avoid protected member variables. These are fine since they are readonly, but I would personally make them properties with no setter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I'll make the change


public string Message { get; }
}
public abstract void Dispose();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this method definition here. The interface should already be enough to require subclasses to implement this right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this might be leftover from when I was moving things around, let me try removing it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ends up the abstract Dispose is required, without it we get the error:

FileBasedLock.cs(7,43): error CS0535: 'FileBasedLock' does not implement interface member 'IDisposable.Dispose()' [/Users/wilbaker/Repos/VFSForGit/src/GVFS/GVFS.Common/GVFS.Common.csproj]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea you're right. I must be spending too much time in C++ these days :-)


if (this.lockFileDescriptor == InvalidFileDescriptor)
{
this.lockFileDescriptor = Creat(this.LockPath, Convert.ToInt32("644", 8));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 644 always the right mode? What if someone has left this file behind with a different mode on it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code will be clearer with the switch to open (creat is the same as open with the flags we're interested in here).

The mode parameter is only used when creating a new file, it's ignored if the file already exists:

The mode argument specifies the file mode bits be applied when
a new file is created. This argument must be supplied when
O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT
nor O_TMPFILE is specified, then mode is ignored. The
effective mode is modified by the process's umask in the usual
way: in the absence of a default ACL, the mode of the created
file is (mode & ~umask). Note that this mode applies only to
future accesses of the newly created file;

http://man7.org/linux/man-pages/man2/open.2.html

private static extern int Close(int fd);

[DllImport("libc", EntryPoint = "flock", SetLastError = true)]
private static extern int Flock(int fd, int operation);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I would capitalize this as FLock

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

{
int errno = Marshal.GetLastWin32Error();

// Log error if not EWOULDBLOCK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment accurate? I don't see an if in here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still WIP, for now I'm logging everything just to make sure things are working as expected


this.Tracer.RelatedInfo($"Lock acquired for for '{this.LockPath}'");

return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be setting this.lockAcquired = true here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should, good catch :)

{
if (Flock(this.lockFileDescriptor, LockUn) != 0)
{
this.Tracer.RelatedWarning($"Failed to release lock for: '{this.LockPath}'");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe IDisposable isn't the proper interface for this class. Do we need a TryUnlock instead so we can retry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And should we avoid clearing lockAcquired when this happens?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through the comments for flock the lock will be released automatically when the file descriptor is closed, and so strictly speaking I don't think we need this call in the first place:

Locks created by flock() are associated with an open file description
(see open(2)). This means that duplicate file descriptors (created
by, for example, fork(2) or dup(2)) refer to the same lock, and this
lock may be modified or released using any of these file descriptors.
Furthermore, the lock is released either by an explicit LOCK_UN
operation on any of these duplicate file descriptors, or when all
such file descriptors have been closed.

http://man7.org/linux/man-pages/man2/flock.2.html

(Note also that all the error values described for flock only apply when acquiring the lock and not releasing it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that if close fails it should not be retried, as the fd has been successfully released (any errors are just for diagnostic purposes):

A careful programmer will check the return value of close(), since it
is quite possible that errors on a previous write(2) operation are
reported only on the final close() that releases the open file
description. Failing to check the return value when closing a file
may lead to silent loss of data. This can especially be observed
with NFS and with disk quota.

Note, however, that a failure return should be used only for
diagnostic purposes (i.e., a warning to the application that there
may still be I/O pending or there may have been failed I/O) or
remedial purposes (e.g., writing the file once more or creating a
backup).

Retrying the close() after a failure return is the wrong thing to do,
since this may cause a reused file descriptor from another thread to
be closed. This can occur because the Linux kernel always releases
the file descriptor early in the close operation, freeing it for
reuse; the steps that may return an error, such as flushing data to
the filesystem or device, occur only later in the close operation.

http://man7.org/linux/man-pages/man2/close.2.html

@wilbaker wilbaker changed the title [WIP] Add Mac specific FileBasedLock implementation Add Mac specific FileBasedLock implementation Aug 24, 2018
{
[TestFixture]
public class FileBasedLockTests
public class WindowsFileBasedLockTests
{
[TestCase]
public void CreateLockWhenDirectoryMissing()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if this scenario works on Mac

Copy link
Contributor
@sanoursa sanoursa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Feel free to ignore the nit picky stuff, but I am curious about failing TryAcquireLock if writing to the file fails.

public class MacFileBasedLock : FileBasedLock
{
// #define O_WRONLY 0x0001 /* open for writing only */
private const int OpenWriteOnly = 0x0001;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since you created a NativeMethods subclass (which I like), I would also move all of these constants inside there since that is the only context in which the constants are needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a good idea, I'll move these

}

[Flags]
private enum FLockOperations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a nit: you're currently using constants for some and an enum for others. I would keep them consistent, and the constants seem to make the usage less cluttered (no need for a cast)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make them all contants


if (bytesWritten == -1)
{
int errno = Marshal.GetLastWin32Error();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we still returning true if we hit this, or the other error right below it? Should we fail here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we still returning true if we hit this

We're returning true because FLock successfully acquired the lock

or the other error right below it? Should we fail here?

I don't think that we need to. Currently the signature is only to help with diagnostics, it need not be written for the locking to work correctly. (I also considered not writing the signature at all, it would speed things up and eliminate a class of possible failures, thoughts on that option?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that's a good point. The reason we added the signature in the first place is that we were also creating the index.lock from the mount process. It was useful to know, for diagnostics, if an abandoned index.lock had been created by git or by us. Now that we no longer create the index.lock... is the signature adding any value at all? Even on Windows? We could just drop it if it's not serving a purpose any longer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just drop it. Everything that uses FileBasedLock will always clean up an orphan lock (without checking the signature), and on Windows the file is always opened with DELETE_ON_CLOSE and so unless there is a power-loss or BSOD event there won't be any stale locks left on disk,

{
if (NativeMethods.Close(this.lockFileDescriptor) != 0)
{
int errno = Marshal.GetLastWin32Error();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest leaving a comment here that an error from close is advisory only

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


namespace GVFS.Platform.Windows
{
public class WindowsFileBasedLock : FileBasedLock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing there are no meaningful changes in this code compared to the old FileBasedLock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a lot of changes to this file to remove dead code and clean up the interface.

If you'd like to see the changes you can check out the first commit:

2ab1b2b

I did not perform the file rename in that commit to make it easier to see the diffs in this file (see GVFS/GVFS.Common/FileBasedLock.cs in the above commit)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that these changes pre-dated the creation of the abstract FileBasedLock, and so the private members were not yet moved out)

@@ -331,6 +331,7 @@ PrjFS_Result PrjFS_WritePlaceholderFile(
goto CleanupAndFail;
}

// TODO(Mac): Only call chmod is fileMode is different than the default file mode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see I made a typo in this comment, I'll fix that up

{
if (this.deleteOnCloseStream != null)
{
throw new InvalidOperationException("Lock has already been acquired");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanoursa let me know what you think of this change (to throw an exception) in my latest commit.

I made this change (rather than returning true) to help enforce the rule that TryAcquireLock should not be called on a FileBasedLock after it has successfully acquired the lock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing InvalidOperationException makes perfect sense here. A re-entrant usage of this lock violates its contract and could result in other bugs if we don't make that obvious.

@wilbaker wilbaker merged commit 607a4e2 into microsoft:master Aug 24, 2018
chrisd8088 added a commit to chrisd8088/scalar that referenced this pull request Sep 5, 2020
The FetchStepCleansUpStaleFetchLock() functional test currently
only succeeds on Windows (and is therefore marked with the
MacTODO.TestNeedsToLockFile category) because it checks whether
the FetchStep command has released its lock file by testing if
the lock file has been deleted.

This works with our WindowsFileBasedLock implementation, which
opens the lock file with FileOptions.DeleteOnClose, so the file is
removed when the lock is released.  However, the MacFileBasedLock
version leaves the file in place and uses flock(2) and LOCK_EX
to acquire and release the lock atomically; there is no simple
way to delete the file while still holding the open file
descriptor needed for flock().

To make this test succeed on Mac/Linux while still being meaningful,
we need to be robust against the lock file existing both at the
start of the test and afterwards, while still ensuring that the
lock has been released by the FetchStep command.

We therefore simply attempt to open the lock file after the fetch
command completes, using the FileShare.None attribute, which will
fail on all platforms if the lock file has not been released.

On Windows this is because WindowsFileBasedLock uses FileShare.Read
to open the lock file, so if the lock is still held our attempt
with FileShare.None will fail.

On macOS (and Linux) this is because our use of flock() in
MacFileBasedLock will conflict with the flock() with LOCK_EX which
the .NET Core Unix implementation of FileShare.None also uses.

See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432,
as well as the .NET Core library implementation in:

https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28
aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L8
8

We also rename the test to FetchStepReleasesFetchLockFile() to
more accurately reflect what it is now testing.
chrisd8088 added a commit to chrisd8088/scalar that referenced this pull request Sep 5, 2020
The FetchStepCleansUpStaleFetchLock() functional test currently
only succeeds on Windows (and is therefore marked with the
MacTODO.TestNeedsToLockFile category) because it checks whether
the FetchStep command has released its lock file by testing if
the lock file has been deleted.

This works with our WindowsFileBasedLock implementation, which
opens the lock file with FileOptions.DeleteOnClose, so the file is
removed when the lock is released.  However, the MacFileBasedLock
version leaves the file in place and uses flock(2) and LOCK_EX
to acquire and release the lock atomically; there is no simple
way to delete the file while still holding the open file
descriptor needed for flock().

To make this test succeed on Mac/Linux while still being meaningful,
we need to be robust against the lock file existing both at the
start of the test and afterwards, while still ensuring that the
lock has been released by the FetchStep command.

We therefore simply attempt to open the lock file after the fetch
command completes, using the FileShare.None attribute, which will
fail on all platforms if the lock file has not been released.

On Windows this is because WindowsFileBasedLock uses FileShare.Read
to open the lock file, so if the lock is still held our attempt
with FileShare.None will fail.

On macOS (and Linux) this is because our use of flock() in
MacFileBasedLock will conflict with the flock() with LOCK_EX which
the .NET Core Unix implementation of FileShare.None also uses.

See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432,
as well as the .NET Core library implementation in:

https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28
aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L8
8

We also rename the test to FetchStepReleasesFetchLockFile() to
more accurately reflect what it is now testing.
chrisd8088 added a commit to chrisd8088/scalar that referenced this pull request Sep 5, 2020
The FetchStepCleansUpStaleFetchLock() functional test currently
only succeeds on Windows (and is therefore marked with the
MacTODO.TestNeedsToLockFile category) because it checks whether
the FetchStep command has released its lock file by testing if
the lock file has been deleted.

This works with our WindowsFileBasedLock implementation, which
opens the lock file with FileOptions.DeleteOnClose, so the file is
removed when the lock is released.  However, the MacFileBasedLock
version leaves the file in place and uses flock(2) and LOCK_EX
to acquire and release the lock atomically; there is no simple
way to delete the file while still holding the open file
descriptor needed for flock().

To make this test succeed on Mac/Linux while still being meaningful,
we need to be robust against the lock file existing both at the
start of the test and afterwards, while still ensuring that the
lock has been released by the FetchStep command.

We therefore simply attempt to open the lock file after the fetch
command completes, using the FileShare.None attribute, which will
fail on all platforms if the lock file has not been released.

On Windows this is because WindowsFileBasedLock uses FileShare.Read
to open the lock file, so if the lock is still held our attempt
with FileShare.None will fail.

On macOS (and Linux) this is because our use of flock() in
MacFileBasedLock will conflict with the flock() with LOCK_EX which
the .NET Core Unix implementation of FileShare.None also uses.

See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432,
as well as the .NET Core library implementation in:

https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28
aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L8
8

We also rename the test to FetchStepReleasesFetchLockFile() to
more accurately reflect what it is now testing.
chrisd8088 added a commit to chrisd8088/scalar that referenced this pull request Sep 5, 2020
The FetchStepCleansUpStaleFetchLock() functional test currently
only succeeds on Windows (and is therefore marked with the
MacTODO.TestNeedsToLockFile category) because it checks whether
the FetchStep command has released its lock file by testing if
the lock file has been deleted.

This works with our WindowsFileBasedLock implementation, which
opens the lock file with FileOptions.DeleteOnClose, so the file is
removed when the lock is released.  However, the MacFileBasedLock
version leaves the file in place and uses flock(2) and LOCK_EX
to acquire and release the lock atomically; there is no simple
way to delete the file while still holding the open file
descriptor needed for flock().

To make this test succeed on Mac/Linux while still being meaningful,
we need to be robust against the lock file existing both at the
start of the test and afterwards, while still ensuring that the
lock has been released by the FetchStep command.

We therefore simply attempt to open the lock file after the fetch
command completes, using the FileShare.None attribute, which will
fail on all platforms if the lock file has not been released.

On Windows this is because WindowsFileBasedLock uses FileShare.Read
to open the lock file, so if the lock is still held our attempt
with FileShare.None will fail.

On macOS (and Linux) this is because our use of flock() in
MacFileBasedLock will conflict with the flock() with LOCK_EX which
the .NET Core Unix implementation of FileShare.None also uses.

See the discussions in microsoft/VFSForGit#213, dotnet/runtime#24432,
as well as the .NET Core library implementation in:

https://github.com/dotnet/runtime/blob/341b8b22280e297daa7d96b4b8a9fa1d40d28aa9/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.Unix.cs#L73-L88

We also rename the test to FetchStepReleasesFetchLockFile() to
more accurately reflect what it is now testing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0