Sharing Data - Giving Read/Write Access to Others

Don't Make Data World-Readable!

Users often need to share data with collaborators. It's tempting to simply do chmod +rx /data/my-data (or similar) to make the data 'world-readable' so that anyone on the cluster can read it. In linux terms, this means changing the 'other' permissions to read. But this generally isn't safe and should definitely not be done with HIPAA-protected data.

Even worse is to make your data world-writable, meaning anyone on the cluster can delete it or change it.

There are good alternatives to making data and data directories world-readable and world-writable. Keep reading…
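Before sharing anything, it can help to check whether files are already open to everyone. The following is a minimal sketch, not a site-provided tool: it builds a throwaway directory tree (all names are made up for illustration), uses find to list entries whose 'other' read or write bit is set, and then removes all 'other' permissions.

```shell
#!/bin/sh
# Sketch: list entries under a directory that grant 'other' (world) access.
# The demo tree lives in a temporary directory so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/private" "$demo/exposed"
touch "$demo/private/scan.nii" "$demo/exposed/notes.txt"
chmod 770 "$demo/private"
chmod 660 "$demo/private/scan.nii"
chmod 755 "$demo/exposed"
chmod 666 "$demo/exposed/notes.txt"
chmod 700 "$demo"

# -perm -0004 matches entries whose 'other' read bit is set;
# -perm -0002 matches entries whose 'other' write bit is set.
world_readable=$(find "$demo" -perm -0004 | sort)
world_writable=$(find "$demo" -perm -0002 | sort)

echo "World-readable:"
echo "$world_readable"
echo "World-writable:"
echo "$world_writable"

# Remove every 'other' permission, then count what is still exposed.
chmod -R o-rwx "$demo"
remaining=$(find "$demo" \( -perm -0004 -o -perm -0002 -o -perm -0001 \) | wc -l | tr -d ' ')
echo "Entries still open to 'other': $remaining"
rm -rf "$demo"
```

On real data you would point find at your own data directory instead of the temporary one, and review the list before changing anything.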

Quickly Share Some Data within the Cluster

There's an easy script for you to use - search for cfn-share-data below.

Linux Permissions - Important to Understand

Make sure you're comfortable with the basics of linux file permissions and ownership. If not, read this webpage carefully and play around a little on your files:

https://www.digitalocean.com/community/tutorials/an-introduction-to-linux-permissions

'Other' and 'World' Permissions

The term 'world-permission', as used above, is commonly used to mean the 'other' permissions assigned to a file or directory. It refers to all users other than the file's (or directory's) owner, and other than users who are in the group assigned to the file (or directory).
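To make the three permission sets concrete, here is a small sketch that splits a permission string into its owner, group and 'other' parts. It works on a temporary file and assumes a stat that supports -c (GNU coreutils, as found on typical linux systems).

```shell
#!/bin/sh
# Sketch: split a permission string into owner, group and other triplets.
demo_file=$(mktemp)
chmod 664 "$demo_file"

perms=$(stat -c '%A' "$demo_file")        # ten characters, e.g. -rw-rw-r--
owner_bits=$(printf '%s' "$perms" | cut -c2-4)
group_bits=$(printf '%s' "$perms" | cut -c5-7)
other_bits=$(printf '%s' "$perms" | cut -c8-10)
echo "owner=$owner_bits group=$group_bits other=$other_bits"

# Take away every 'other' permission and look again.
chmod o-rwx "$demo_file"
other_after=$(stat -c '%A' "$demo_file" | cut -c8-10)
echo "other after chmod o-rwx: $other_after"
rm -f "$demo_file"
```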

How to Share Data

There are three main ways to share data, described in detail in the sections below:

  • Shared Directories - generally for longer-term sharing with an established group and collaborators.
    • More details below
    • PROS:
      • easy to see ownership and permissions from regular linux ls -l command
      • onetime setup, then forget about it
    • CONS:
      • requires admins to do the initial setup and add new collaborators to the shared group
      • risk of large data loss from accidental user actions, since the tendency is to end up with a single large shared directory with rwx permissions on everything for everyone in the group
  • ACL (Access Control List) permissions - user-controlled permissions for sharing files and directories with existing users and linux groups
    • PROS:
      • no setup needed from CfN sysadmins
      • there's an easy script for you to use - see cfn-share-data below
      • good for quick sharing
      • easy to use for fine-tuned permissions settings (e.g. give one user read-only permission, another write permissions for the same directory)
      • fine-tuned permissions can make it easier to protect against large amounts of data being accidentally deleted or changed by one user with permissions to a group directory
    • CONS:
      • NOTE you can't directly use getfacl and setfacl from chead or from compute nodes on the /data/ directories. This is because we use NFSv4, which is incompatible with these commands. See the cfn-share-data script below, which uses ssh calls.
      • viewing permissions requires getfacl command instead of just ls -l
      • easier to forget that something is shared, or how, because the details don't show up in ls -l
  • PUBLIC directory and other shared groups
    • We've created a PUBLIC directory in everyone's data directory that we suggest you use for sharing files or data that are safely shared with everyone on the cluster. You can also create symlinks here to other files that are world-readable.

Details below…


Shared Group Directory

Research groups can request a group data directory that allows all members of the group (lab members and collaborators) to read and write files and sub-directories under the group data directory. By 'group', we mean a linux group to which a research group's cluster user accounts have been assigned by the admins.

The way we set these up:

  • the directory is owned by the shared linux group
  • all new dirs and files created in the directory will be owned by the directory's group
  • all new dirs and files created in the directory will have group rwx permissions, without each user having to:
    • a) have that group as their login group, or
    • b) set their umask accordingly.

For consistent results, each user's primary/login group must be the same as their username, see below.

This is very convenient for sharing data among lab members. In this configuration all new files and sub-directories are owned by the group.

WARNING

With this level of access, any member of the group can accidentally erase any data stored in the group data directory.

The setgid bit of a directory (often loosely called the group 'sticky bit'; the true sticky bit is a different flag, shown as t) controls the inheritance of group ownership for new files and sub-directories. In the Shared Group Directory configuration the setgid bit is set so that all new files are owned by the group. Anyone who belongs to the group has permission to read, write or delete files in the group directory.

The group permissions are set to rwx on all new files and sub-directories via the ACL (access control list) settings on the group directory. See below for ACL discussion.

Example:

drwxrws---+ 20 emikkelsen cnds 4.0K 2016-03-01 10:38 /data/jag/cnds/
  • The directory /data/jag/cnds is owned by user emikkelsen and group cnds.
  • The first group of permissions rwx means the owner can read/write/execute in the directory.
  • The second group of rws is the group permissions, showing the group members have read/write/execute permissions, and the s shows that the setgid bit (the group 'sticky bit' described above) is set. A little s means both the setgid bit and execute permission are set. If there were a big S instead, it would mean only the setgid bit is set and execute permission is not.
  • The final group --- means all other users cannot read, write or execute.
  • The + at the end of the permissions means that ACL (access control list) permissions have been set. See below.
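The difference between s and S in the group permissions can be reproduced on a throwaway directory. This is only a sketch under the assumption that stat supports -c; the bit that chmod g+s (or the leading 2 in an octal mode) sets is formally called the setgid bit.

```shell
#!/bin/sh
# Sketch: how the setgid bit shows up in the group triplet of a directory.
shared=$(mktemp -d)

chmod 2770 "$shared"                      # rwx for owner and group, plus setgid
with_exec=$(stat -c '%A' "$shared")
echo "$with_exec"                         # group triplet ends in lowercase s

chmod g-x "$shared"                       # drop group execute, keep setgid
without_exec=$(stat -c '%A' "$shared")
echo "$without_exec"                      # group triplet ends in uppercase S

# With setgid on the directory, new files take the directory's group.
chmod g+x "$shared"
touch "$shared/newfile"
dir_group=$(stat -c '%G' "$shared")
file_group=$(stat -c '%G' "$shared/newfile")
echo "directory group: $dir_group, new file group: $file_group"
rm -rf "$shared"
```

In this demo the directory's group is simply your own login group, so the inheritance is not very visible; on a real shared directory the group would be the lab's shared group.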

Login Group Issue

For consistency with all programs (I'm looking at you, FSL!), each user should have their “login” group be the same as their username. To check, run id followed by the username; the name shown for “gid” should match the username. For example:

id mgstauff
uid=2198(mgstauff) gid=2198(mgstauff) groups=1006(cnds),1001(cfn)
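The same check can be scripted for several users at once. The function name check_login_group below is made up for this sketch; it relies only on the standard id command.

```shell
#!/bin/sh
# Sketch: report whether a user's primary (login) group matches the username.
check_login_group() {
    user=$1
    primary_group=$(id -gn "$user") || return 2
    if [ "$primary_group" = "$user" ]; then
        echo "$user: OK (login group is $primary_group)"
    else
        echo "$user: MISMATCH (login group is $primary_group)"
        return 1
    fi
}

# Check the current user by default, or every username given as an argument.
[ "$#" -gt 0 ] || set -- "$(id -un)"
for name in "$@"; do
    check_login_group "$name" || true
done
```

If a user is reported as a mismatch, that is something for the admins to fix rather than the user.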

Access Control Lists

ACL's (Access Control Lists) provide a more flexible permission mechanism for file systems, designed to supplement the standard UNIX file permissions. ACL's allow you to give permissions on any file or directory to any user or group, or to several individual users at once.

Sharing Your Data Using ACL's

The Quick Way to Use ACLs

We've created a script on the cluster (chead and nodes) for easily sharing data and viewing ACLs:

cfn-share-data
  • This script should be in your search path. If not, look in /share/admin/
  • Run the script with no parameters to print instructions and get a few examples.
  • This script will prompt you for your password - this is normal. It's using ssh under the hood to get/set ACLs.

To clear all ACL's from a file or directory (and its sub-directories), run this command:

cfn-share-data -c <directory-or-file-to-clear>

More detailed info on ACLs

This site has more information on access control lists: https://wiki.archlinux.org/index.php/Access_Control_Lists (you can skip the 'Configuration' section).

Recognizing when ACL's exist

When a file or directory has ACL rules applied to it, you'll see a + at the end of the permissions block when you list it using ls -l:

-rw-rw----+ 1 mgstauff mgstauff 15 Mar  7 13:57 file1

Parent Directories Must Allow Access

A complication of using ACL's is that all parent directories of a file or directory you wish to share must have at least --x permissions (execute) for the user or group you're sharing with. There's a script we've created to help with this. See the examples below.

Examples

Share a single file (or directory) with another user

At first my file called file1 has no ACL settings:

[mgstauff@chead acltest]$ ll
-rw-rw---- 1 mgstauff mgstauff 15 Mar  7 13:57 file1

Use setfacl to allow rzorger to read and write it:

[mgstauff@chead acltest]$ setfacl -m u:rzorger:rw- file1
[mgstauff@chead acltest]$ ll
-rw-rw----+ 1 mgstauff mgstauff 15 Mar  7 13:57 file1

But we must also give execute permission to rzorger on all parent directories. Execute permission does not allow any reading, writing or deleting of files or directories, just access to files or directories for which a user does have explicit permissions set.

[mgstauff@chead acltest]$ setfacl -m u:rzorger:x $(cfnparents file1)

'cfnparents' is a script in your search path that creates a list of all the parent directories of the path you pass into it, excluding the top level directories like /data/jet, /data/jag, etc.

Now rzorger can read and write the file.
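To see which directories need the --x permission, the chain of parent directories can be listed with a few lines of shell. The function list_parents below is a hypothetical stand-in written for this page, not the actual cfnparents script, whose behavior may differ. It prints every parent directory of a path, stopping before a given top-level directory.

```shell
#!/bin/sh
# Sketch: print each parent directory of a path, nearest first,
# stopping before the given top-level directory (default: /).
list_parents() {
    path=$1
    stop=${2:-/}
    dir=$(dirname "$path")
    while [ "$dir" != "$stop" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        printf '%s\n' "$dir"
        dir=$(dirname "$dir")
    done
}

# Example with an illustrative path; /data/jet itself is excluded.
list_parents /data/jet/mgstauff/acltest/file1 /data/jet
```

For the illustrative path above, this prints /data/jet/mgstauff/acltest followed by /data/jet/mgstauff.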

Check ACL permissions

Check what ACL permissions are on a file:

[mgstauff@chead acltest]$ getfacl file1
# file: file1
# owner: mgstauff
# group: mgstauff
user::rw-                  <- this is the 'regular' linux permissions for owner of the file
user:rzorger:rw-           <- user rzorger has rw permissions
group::rw-                 <- regular linux group permissions
mask::rwx                  <- upper limit on what named users, named groups and the group entry actually get - google if you're interested
other::---                 <- regular linux other permissions
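Since it's easy to lose track of who a file is shared with, the named entries can be pulled out of getfacl output with awk. This sketch runs on sample text based on the listing above (with a second named user added for illustration) so that it works without getfacl; on a real file you would pipe in the output of getfacl instead. The function name list_named_entries is made up for this example.

```shell
#!/bin/sh
# Sketch: extract the named-user and named-group entries from getfacl output.
sample_acl='# file: file1
# owner: mgstauff
# group: mgstauff
user::rw-
user:rzorger:rw-
user:pcook:r--
group::rw-
mask::rwx
other::---'

list_named_entries() {
    # Fields are type:name:permissions; the base entries have an empty name.
    awk -F: '!/^#/ && ($1 == "user" || $1 == "group") && $2 != "" {
        printf "%s %s %s\n", $1, $2, substr($3, 1, 3)
    }'
}

shared_with=$(printf '%s\n' "$sample_acl" | list_named_entries)
echo "$shared_with"
```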

Share with a second user with different permissions

Here's how I add permissions to another user, but only to read the file this time (sorry Phil!):

setfacl -m u:pcook:r-- file1
setfacl -m u:pcook:--x $(cfnparents file1)

If I run getfacl on the file, I'll see this new line:

user:pcook:r--              <- user pcook can only read this file

Sharing with a group

The procedure is the same for sharing with a group. In the setfacl command, replace the u with g, and the username with the group name:

[mgstauff@chead acltest]$ setfacl -m g:rzorgergroup:r-x filename
[mgstauff@chead acltest]$ setfacl -m g:rzorgergroup:--x $(cfnparents filename)

Change all files and sub-directories

To recursively change permissions for all files and sub-directories in a directory, add the -R option:

[mgstauff@chead acltest]$ setfacl -R -m u:rzorger:r-x /home/mgstauff/acltest/
[mgstauff@chead acltest]$ setfacl -m u:rzorger:--x $(cfnparents /home/mgstauff/acltest/)

**NOTE** The second command, which sets --x on parent directories, does NOT use the -R option.

Removing ACL's

Use the -x option to remove a particular ACL:

setfacl -x u:rzorger file1

All ACL's for user rzorger on file1 have now been cleared. You can add the -R option to clear recursively.

To clear all ACL's (i.e. for all users and groups who have an ACL set for a file or directory):

setfacl -b file1

Test first what will happen

Add the --test option to your command to see what will happen without actually making the change.
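Here is a small sketch of a dry run on a temporary file. It uses your own username as a stand-in for a collaborator's, skips the preview if setfacl isn't installed, and compares the permissions column of ls -l before and after to confirm that no ACL was actually added (a + would appear if one had been).

```shell
#!/bin/sh
# Sketch: preview an ACL change with --test and confirm nothing was modified.
target=$(mktemp)
chmod 660 "$target"
me=$(id -un 2>/dev/null || echo root)     # stand-in for a collaborator's username
before=$(ls -ld "$target" | awk '{print $1}')

if command -v setfacl >/dev/null 2>&1; then
    # --test shows the ACL that would result but does not apply it.
    setfacl --test -m "u:$me:r--" "$target" || echo "setfacl could not preview the change here"
else
    echo "setfacl is not installed; skipping the preview"
fi

after=$(ls -ld "$target" | awk '{print $1}')
echo "before: $before"
echo "after:  $after"
rm -f "$target"
```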


Change an Existing Directory to a Shared Directory

To change an existing data directory to a shared group directory:

NOTE If a new group is needed to do this, you must ask one of the admins to create the group for you first.

1. Make sure top dir is owned by desired group and has group rwx permissions. For example:

chown mgstauff:stauffgroup /data/jet/mgstauff
chmod g+rwx /data/jet/mgstauff

2. Set the setgid bit (the group 'sticky bit') on the top dir - all new files created here will be owned by group stauffgroup.

chmod g+s /data/jet/mgstauff

or, to set the setgid bit on the top dir and all existing subdirs:

find /data/jet/mgstauff -type d -exec chmod g+s {} \;

3. Set acl's for group permissions - sets default group permissions to rwx (read-write-execute)

setfacl -m d:g:<other_username_or_groupname>:rwx /data/…

Where <other_username_or_groupname> is the name of the group you wish to grant access (to grant access to a single user instead, use d:u:<username>:rwx) and /data/… is the full path to the directory to be modified.

The -m flag stands for “Modify” the existing ACL. The little d in front of g makes it a “Default” ACL, so that in future if any file/dir gets created under this directory, <other_username_or_groupname> will have rwx permission on them too.

Similarly to check facl settings one would use:

getfacl /data/…

This site has more information on access control lists: https://wiki.archlinux.org/index.php/Access_Control_Lists

4. All users whom you want to grant access this way need to belong to the group used above. If they don't already, ask the admin to add them.

5. For consistency with all programs (I'm looking at you, FSL!), each user should have their “login” group be the same as their username. To check, run id followed by the username; the name shown for “gid” should match the username. If this is not the case, contact the admins. For example:

id mgstauff
uid=2198(mgstauff) gid=2198(mgstauff) groups=1006(cnds),1001(cfn)
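The permission changes in steps 1 and 2 above can be tried out safely on a throwaway directory tree first. The sketch below leaves out the chown and the ACL step because they depend on your group and on the cluster setup; the directory names are made up. The bit set by chmod g+s is the setgid bit, which find can test for as octal 2000.

```shell
#!/bin/sh
# Sketch: steps 1 and 2 applied to a throwaway directory tree.
top=$(mktemp -d)
mkdir -p "$top/subject01/session1" "$top/subject02"
touch "$top/subject01/session1/scan.nii"

# Step 1 (permissions part only): the group gets rwx on the top directory.
chmod g+rwx "$top"

# Step 2: set the bit on the top directory and on every existing sub-directory.
find "$top" -type d -exec chmod g+s {} \;

# Verify: count the directories that still lack the bit (octal 2000).
missing=$(find "$top" -type d ! -perm -2000 | wc -l | tr -d ' ')
total=$(find "$top" -type d | wc -l | tr -d ' ')
echo "directories without setgid: $missing of $total"
rm -rf "$top"
```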

Alternative Methods of Sharing

As an alternative to Group directories there are other levels of permission that can be configured. Other options available include:

  • Changing the group ownership to a different group that you belong to with chown. For example to change the owner of /data/… and sub-files to “root”, and its group to “staff”:

chown -hR root:staff /data/…

  • Or by placing data you want to share in your /data/<drive_name>/<user/group_name>/PUBLIC directory. This directory and its contents are world readable. You may also create symlinks in your PUBLIC directory so people can find data easily. Note though that the files to which the links point must have permissions set for users to read (these could be ACL's as well). To create a symbolic link while in your /data/<drive_name>/<group_name>/PUBLIC directory:
ln -s <TARGET> <LINK_NAME>

The <TARGET> is the existing file or directory to which the new file LINK_NAME will point.
  • Additionally if you want to allow other users to copy files into your disk space you can create a sub-directory of your group’s /PUBLIC directory and configure its permissions to be world-writable:
mkdir DROPBOX
chmod 777 DROPBOX

Just be sure to move this data to a safe spot after the transfer.
**NOTE** For sensitive data, use ACL's to create a dir with write permissions for whoever is transferring data to you.
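The symlink and drop-box steps above can be rehearsed in a temporary directory. Everything in this sketch (the results and incoming directories, the file names) is made up for illustration; it also shows one way to close the drop box once the transfer is done.

```shell
#!/bin/sh
# Sketch: a PUBLIC directory with a symlink and a temporary drop box.
base=$(mktemp -d)
mkdir -p "$base/PUBLIC" "$base/results" "$base/incoming"
echo "group atlas v1" > "$base/results/atlas.txt"

# ln -s <TARGET> <LINK_NAME>; a relative target is resolved from the link's directory.
ln -s ../results/atlas.txt "$base/PUBLIC/atlas.txt"
link_target=$(readlink "$base/PUBLIC/atlas.txt")
link_content=$(cat "$base/PUBLIC/atlas.txt")
echo "link points to: $link_target"

# Open a drop box, then pretend a collaborator delivered a file.
mkdir "$base/PUBLIC/DROPBOX"
chmod 777 "$base/PUBLIC/DROPBOX"
open_mode=$(stat -c '%a' "$base/PUBLIC/DROPBOX")
touch "$base/PUBLIC/DROPBOX/delivered.dat"

# After the transfer: move the data somewhere safe and close the drop box.
mv "$base/PUBLIC/DROPBOX/delivered.dat" "$base/incoming/"
chmod 700 "$base/PUBLIC/DROPBOX"
closed_mode=$(stat -c '%a' "$base/PUBLIC/DROPBOX")
echo "drop box mode while open: $open_mode, after closing: $closed_mode"
rm -rf "$base"
```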
groupdatadirs.txt · Last modified: 2017/05/30 18:34 by mgstauff