====== Sharing Data - Giving Read/Write Access to Others======

===== Don't Make Data World-Readable! =====

Users often need to share data with collaborators. It's tempting to simply run ''chmod +rx /data/my-data'' (or similar) to make the data 'world-readable' so that anyone on the cluster can read it. In Linux terms, this means granting read access in the 'other' permissions. **But this generally isn't safe and should definitely not be done with HIPAA-protected data.**

Even **worse** is to make your data world-writable, meaning anyone on the cluster can delete it or change it.

There are good alternatives to making data and data directories world-readable and world-writable. Keep reading...

===== Quickly Share Some Data within the Cluster =====

There's an easy script for you to use - search for ''cfn-share-data'' below.

===== Linux Permissions - Important to Understand =====

Make sure you're comfortable with the basics of Linux file permissions and ownership. If not, read this webpage carefully and play around a little on your files:

https://www.digitalocean.com/community/tutorials/an-introduction-to-linux-permissions

==== 'Other' and 'World' Permissions ====

The term 'world permissions', as used above, commonly means the 'other' permissions assigned to a file or directory. These apply to all users other than the file's (or directory's) owner and the members of the group assigned to the file (or directory).

===== How to Share Data =====

There are three main ways to share data, described in detail in the sections below:

  * Shared Directories - generally for longer-term sharing with an established group and collaborators.
    * More details below
    * PROS:
      * easy to see ownership and permissions from the regular Linux ''ls -l'' command
      * one-time setup, then forget about it
    * CONS:
      * requires admins to do the initial setup and add new collaborators to the shared group
      * risk of large data loss from accidental user actions, since the tendency is to end up with a single large shared directory with ''rwx'' permissions on everything for everyone in the group
  * ACL (Access Control List) permissions - user-controlled permissions for sharing files and directories with existing users and Linux groups
    * PROS:
      * no setup needed from CfN sysadmins
      * **there's an easy script for you to use - see ''cfn-share-data'' below**
      * good for quick sharing
      * easy to use for fine-tuned permissions settings (e.g. give one user read-only permission and another write permission for the same directory)
      * fine-tuned permissions can make it easier to protect against large amounts of data being accidentally deleted or changed by one user with permissions to a group directory
    * CONS:
      * **NOTE** you can't directly use ''getfacl'' and ''setfacl'' from ''chead'' or from compute nodes on the ''/data/'' directories. This is because we use NFSv4, which is incompatible with these commands. See the ''cfn-share-data'' script below, which uses ssh calls.
      * viewing permissions requires the ''getfacl'' command instead of just ''ls -l''
      * easier to forget that, or how, something is shared, because the details don't show up in ''ls -l''
  * PUBLIC directory and other shared groups
    * We've created a PUBLIC directory in everyone's data directory that we suggest you use for sharing files or data that are safe to share with everyone on the cluster. You can also create symlinks here to other files that are world-readable. **Details below...**

----

===== Shared Group Directory =====

Research groups can request a group data directory that allows all members of the group (lab members and collaborators) to read and write files and sub-directories under the group data directory.
By 'group', we mean a Linux group to which a research group's cluster user accounts have been assigned by the admins.

The way we set these up:

  * the directory is owned by the shared Linux group
  * all new dirs and files created in the directory will be owned by the directory's group
  * all new dirs and files created in the directory will have group ''rwx'' permissions, **without** each user having to:
    * a) have that group as their login group, or
    * b) set their umask accordingly.

For consistent results, each user's primary/login group must be the same as their username, see below.

This is very convenient for sharing data among lab members. In this configuration all new files and sub-directories are owned by the group.

=== WARNING ===

**With this level of access, any member of the group can accidentally erase any data stored in the group data directory.**

The setgid bit of a directory (sometimes loosely called the 'group sticky bit') controls the inheritance of group ownership for new files and sub-directories. In the Shared Group Directory configuration the setgid bit is set so that all new files are owned by the group. Anyone who belongs to the group has permission to read, write or delete files in the group directory.

The group permissions are set to ''rwx'' on all new files and sub-directories via the ACL (access control list) settings on the group directory. See below for ACL discussion.

Example:

  drwxrws---+ 20 emikkelsen cnds 4.0K 2016-03-01 10:38 /data/jag/cnds/

  * The directory ''/data/jag/cnds'' is owned by user ''emikkelsen'' and group ''cnds''.
  * The first group of permissions, ''rwx'', means the owner can read/write/execute in the directory.
  * The second group, ''rws'', is the group permissions. It shows that group members have read/write/execute permissions, and the ''s'' shows that the setgid bit is set (see above). A lowercase ''s'' means both the setgid bit and execute permission are set.
    * A capital ''S'' in that position would instead mean that the setgid bit (the group ''s'' flag described above) is set but group execute permission is not.
  * The final group, ''---'', means all other users cannot read, write or execute.
  * The ''+'' at the end of the permissions means that ACL (access control list) permissions have been set. See below.

=== Login Group Issue ===

For consistency with all programs (I'm looking at you, FSL!), each user should have their “login” group be the same as their username. Run ''id username'' with the user's username; the group name shown in the “gid” field should be the same as the username. For example:

  id mgstauff
  uid=2198(mgstauff) gid=2198(mgstauff) groups=1006(cnds),1001(cfn)

----

=====Access Control Lists=====

ACLs (Access Control Lists) provide a more flexible permission mechanism for file systems, extending the standard UNIX file permissions. ACLs let you grant permissions on any file or directory to any user or group, or to several individual users at once.

====Sharing Your Data Using ACL's====

===The Quick Way to Use ACLs ===

We've created a script on the cluster (chead and nodes) for easily sharing data and viewing ACLs:

  cfn-share-data

  * This script should be in your search path. If not, look in ''/share/admin/''
  * Run the script with no parameters to print instructions and get a few examples.
  * __This script will prompt you for your password__ - this is normal. It's using ''ssh'' under the hood to get/set ACLs.

__To clear all ACLs__ from a file or directory (and its sub-directories), run this command:

  cfn-share-data -c

====More detailed info on ACLs====

This site has more information and discussion on access control lists: https://wiki.archlinux.org/index.php/Access_Control_Lists (you can skip the 'Configuration' section).
=== Recognizing when ACL's exist ===

When a file or directory has ACL rules applied to it, you'll see a ''+'' at the end of the permissions block when you list it using ''ls -l'':

  -rw-rw----+ 1 mgstauff mgstauff 15 Mar 7 13:57 file1

=== Parent Directories Must Allow Access ===

A complication of using ACLs is that every parent directory of a file or directory you wish to share must have at least ''--x'' (execute) permission for the user or group you're sharing with. There's a script we've created to help with this. See the examples below.

=== Examples ===

**Share a single file (or directory) with another user**

At first my file called ''file1'' has no ACL settings:

  [mgstauff@chead acltest]$ ll
  -rw-rw---- 1 mgstauff mgstauff 15 Mar 7 13:57 file1

Use ''setfacl'' to allow rzorger to read and write it:

  [mgstauff@chead acltest]$ setfacl -m u:rzorger:rw- file1
  [mgstauff@chead acltest]$ ll
  -rw-rw----+ 1 mgstauff mgstauff 15 Mar 7 13:57 file1

But we must also give execute permission to rzorger on all parent directories. Execute permission on a directory does not allow listing its contents or creating or deleting files in it; it only lets the user pass through the directory to reach files or directories for which they do have explicit permissions set.

  [mgstauff@chead acltest]$ setfacl -m u:rzorger:x $(cfnparents file1)

''cfnparents'' is a script in your search path that creates a list of all the parent directories of the path you pass into it, excluding the top level directories like /data/jet, /data/jag, etc.

Now rzorger can read and write the file.
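
To illustrate what a parent-listing helper does, here is a minimal sketch. It is **not** the actual ''cfnparents'' script: the function name ''list_parents'' and its optional second argument (a top-level directory at which to stop) are assumptions made for this example, and it expects an absolute path.

```shell
#!/bin/sh
# Hypothetical sketch, not the real cfnparents script.
# Prints every parent directory of an absolute path, nearest first,
# stopping before the given top-level directory (default: /).
list_parents() {
    path="$1"
    stop="${2:-/}"
    dir="$(dirname "$path")"
    # Walk upwards until we reach the stop directory or the filesystem root.
    # A relative path with no directory part yields "." and prints nothing.
    while [ "$dir" != "$stop" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        printf '%s\n' "$dir"
        dir="$(dirname "$dir")"
    done
}

# Example path modelled on the directories used on this page.
list_parents /data/jet/mgstauff/acltest/file1 /data/jet
# prints:
#   /data/jet/mgstauff/acltest
#   /data/jet/mgstauff
```

The resulting list is what gets passed to ''setfacl'' in the ''$(cfnparents file1)'' command above; the top-level directory itself is left out on purpose.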
**Check ACL permissions**

Check what ACL permissions are on a file:

  [mgstauff@chead acltest]$ getfacl file1
  # file: file1
  # owner: mgstauff
  # group: mgstauff
  user::rw-          <- these are the 'regular' linux permissions for the owner of the file
  user:rzorger:rw-   <- user rzorger has rw permissions
  group::rw-         <- regular linux group permissions
  mask::rw-          <- upper limit on what named users, named groups and the owning group can actually get
  other::---         <- regular linux other permissions

**Share with a second user with different permissions**

Here's how I add permissions for another user, **but** only to read the file this time (sorry Phil!):

  setfacl -m u:pcook:r-- file1
  setfacl -m u:pcook:--x $(cfnparents file1)

If I run ''getfacl'' on the file, I'll see this new line:

  user:pcook:r--     <- user pcook can only read this file

**Sharing with a group**

The procedure is the same for sharing with a group. In the ''setfacl'' command, replace the ''u'' with ''g'', and the username with the group name:

  [mgstauff@chead acltest]$ setfacl -m g:rzorgergroup:r-x filename
  [mgstauff@chead acltest]$ setfacl -m g:rzorgergroup:--x $(cfnparents filename)

**Change all files and sub-directories**

To recursively change permissions for all files and sub-directories in a directory, add the ''-R'' option:

  [mgstauff@chead acltest]$ setfacl -R -m u:rzorger:r-x /home/mgstauff/acltest/
  [mgstauff@chead acltest]$ setfacl -m u:rzorger:--x $(cfnparents /home/mgstauff/acltest/)

**NOTE** that the second command, which sets ''--x'' on the parent directories, does NOT use the ''-R'' option.

**Removing ACLs**

Use the ''-x'' option to remove a particular ACL:

  setfacl -x u:rzorger file1

All ACLs for user rzorger on ''file1'' have now been cleared. You can add the ''-R'' option to clear recursively.

To clear all ACLs (i.e. for all users and groups who have an ACL set for a file or directory):

  setfacl -b file1

**Test first what will happen**

Add the ''--test'' option to your command to see what will happen without actually making the change.
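
To review at a glance who a file has been shared with, the ''getfacl'' output shown above can be filtered down to its named user and group entries. This is only a sketch: the function name ''list_acl_grants'' is made up for this example, and it reads ''getfacl''-style text from standard input rather than calling ''getfacl'' itself. The sample input mirrors the example above, with the read-only entry for pcook added.

```shell
#!/bin/sh
# Hypothetical helper: list the named users and groups found in
# getfacl-style text on standard input, one "type name perms" per line.
list_acl_grants() {
    awk -F: '
        # Skip header comments such as "# file: file1".
        /^#/ { next }
        # Named entries have a non-empty second field, e.g. user:rzorger:rw-
        ($1 == "user" || $1 == "group") && $2 != "" {
            # Keep only the permission bits, dropping any trailing annotation.
            split($3, perms, /[ \t]/)
            printf "%s %s %s\n", $1, $2, perms[1]
        }
    '
}

# Sample input in the same format as the getfacl example above.
list_acl_grants <<EOF
# file: file1
# owner: mgstauff
# group: mgstauff
user::rw-
user:rzorger:rw-
user:pcook:r--
group::rw-
mask::rw-
other::---
EOF
# prints:
#   user rzorger rw-
#   user pcook r--
```

On a machine where ''getfacl'' works directly, the same function could be used as ''getfacl file1 | list_acl_grants''. Entries for the owner, the owning group, the mask and 'other' are skipped because they are not shares with a specific user or group.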
----

===== Change an Existing Directory to a Shared Directory =====

To change an existing data directory to a shared group directory:

**NOTE** If a new group is needed to do this, you must ask one of the admins to create the group for you first.

1. Make sure the top dir is owned by the desired group and has group ''rwx'' permissions. For example:

  chown mgstauff:stauffgroup /data/jet/mgstauff
  chmod g+rwx /data/jet/mgstauff

2. Set the setgid bit (the 'group sticky bit') on the top dir - all new files created here will be owned by group stauffgroup.

  chmod g+s /data/jet/mgstauff

or, to set the setgid bit on all existing subdirs:

  find /data/jet/mgstauff -type d -exec chmod g+s {} \;

3. Set ACLs for group permissions - this sets the default group permissions to ''rwx'' (read-write-execute).

  setfacl -m d:g::rwx /data/…

Here /data/… is the full path to the file or sub-directory to be modified. Leaving the name between the two colons empty, as in ''g::'', applies the entry to the directory's owning group; to grant access to a different group or user, put its name there (with ''u'' in place of ''g'' for a user). The ''-m'' flag stands for "Modify" the existing ACL. The ''d'' in front of ''g'' makes it a "Default" ACL, so that any file or dir created under this directory in future will have ''rwx'' group permission too. Similarly, to check the ACL settings, use:

  getfacl /data/…

This site has more information on access control lists: https://wiki.archlinux.org/index.php/Access_Control_Lists

4. All users to whom you want to grant access this way need to belong to the group used above. If they don't already, ask the admins to add them.

5. For consistency with all programs (I'm looking at you, FSL!), each user should have their “login” group be the same as their username. Run ''id username'' with the user's username; the group name shown in the “gid” field should be the same as the username. If this is not the case, contact the admins. For example:

  id mgstauff
  uid=2198(mgstauff) gid=2198(mgstauff) groups=1006(cnds),1001(cfn)

----

=====Alternative Methods of Sharing=====

As an alternative to Group directories there are other levels of permission that can be configured.
Other options available include:

  * Changing the group ownership to a different group that you belong to, with ''chown''. For example, to change the group of /data/... and everything under it to "staff":\\ ''chown -hR :staff /data/…''
  * Or by placing data you want to share in your /data///PUBLIC directory. This directory and its contents are world-readable. You may also create symlinks in your PUBLIC directory so people can find data easily. Note though that the file to which a link points must have permissions set for users to read it (these could be ACLs as well). To create a symbolic link while in your /data///PUBLIC directory:\\ ''ln -s TARGET LINK_NAME''\\ TARGET is the existing file or directory to which the new file LINK_NAME will point.
  * Additionally, if you want to allow other users to copy files into your disk space, you can create a sub-directory of your group’s /PUBLIC directory and configure its permissions to be world-writable:\\ ''mkdir DROPBOX''\\ ''chmod 777 DROPBOX''\\ Just be sure to move this data to a safe spot after the transfer.

**NOTE** For sensitive data, use ACLs to create a dir with write permissions for whomever is transferring data to you.
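
The caveat above, that a symlink in PUBLIC is only useful if its target can be read, can be checked before the link is created. The following is a minimal sketch, not an existing cluster tool: the function name ''link_if_world_readable'' is hypothetical, and it only looks at the regular 'other' read bit, so a target that is shared purely through ACLs would be refused.

```shell
#!/bin/sh
# Hypothetical helper: symlink TARGET into LINK_DIR only when TARGET has
# the 'other' read bit set. ACL-only grants are not detected by this check.
link_if_world_readable() {
    target="$1"
    link_dir="$2"
    # Character 8 of the ls -ld mode string (e.g. -rw-r--r--) is the
    # 'other' read bit. A missing target leaves this empty.
    other_read="$(ls -ld "$target" 2>/dev/null | cut -c8)"
    if [ "$other_read" = "r" ]; then
        ln -s "$target" "$link_dir/$(basename "$target")"
    else
        echo "not world-readable, not linking: $target" >&2
        return 1
    fi
}
```

It would be called with the full path to the target and the path to your PUBLIC directory. Using an absolute path for the target keeps the link valid regardless of where it is read from.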