          _                 __    _
  _____  (_)___ ___________/ /_  (_)   _____  _____
 / ___/ / / __ `/ ___/ ___/ __ \/ / | / / _ \/ ___/
/ /__  / / /_/ / /  / /__/ / / / /| |/ /  __/ /
\___/_/ /\__,_/_/   \___/_/ /_/_/ |___/\___/_/
   /___/

DESCRIPTION

cjarchiver is a Python script that compresses a directory, including all
its files and subdirectories.

-----------------------------------------------------------

PREREQUISITES

To use cjarchiver on the sciCORE clusters, first load the cjarchiver module:

    ml cjarchiver

This loads the default version. If you need a specific version, you can
search for it with:

    ml spider cjarchiver

To archive a directory, the current working directory should be at the same
level as the target directory (i.e. the target directory should be a direct
subdirectory of the current working directory, as in the EXAMPLES below).

Additionally, the target directory must contain a metadata file in JSON
format. The user should create this file following this format:

    {
      "name": "NAME OF INVESTIGATOR",
      "email": "EMAIL OF INVESTIGATOR",
      "pi_name": "NAME OF PI",
      "pi_email": "EMAIL OF PI",
      "project": "INSERT PROJECT NAME HERE",
      "project_start_date": "YYYY-MM-DD",
      "project_end_date": "YYYY-MM-DD",
      "description": "INSERT PROJECT DESCRIPTION HERE MULTILINE IS NOT OK",
      "collaborators":[
        {
          "name": "COLLABORATOR NAME",
          "email": "COLLABORATOR EMAIL"
        },
        {
          "name": "COLLABORATOR NAME",
          "email": "COLLABORATOR EMAIL"
        }
      ],
      "comments": "ADDITIONAL COMMENTS (E.G. LEGAL REQUIREMENTS REGARDING DURATION OF DATA PRESERVATION, ETC...)"
    }

-----------------------------------------------------------

USAGE

To execute cjarchiver:

    cjarchiver <target_directory> [options]

If successful, this generates five files with the name format

    <username>_YYYYMMDDThhmmss_<targetfoldername>

and the following extensions:

    .log      - the output of the script.
    .json     - a copy of the ARCHIVE_METADATA.json file.
    .manifest - the full list of archived files, including permissions,
                ownership, size, date, and path.
    .md5sum   - the full list of archived files with their path and
                MD5 checksum.
    .tar.bz2  - the compressed archive of the target directory.

The manifest and md5sum files are also automatically copied into the target
directory and are therefore included in the .tar.bz2 file.

After creating these files, cjarchiver renames the target directory to

    <targetdirectory>.toberemoved/

As its name indicates, <targetdirectory>.toberemoved/ can be deleted. Before
deleting it, we strongly recommend checking that the .tar.bz2 file has been
created.

-----------------------------------------------------------

OPTIONS

    -h, --help:
        Shows a help message and exits.

    -x <subdirectory>, --exclude <subdirectory>:
        Excludes a subdirectory from archiving. Only the names of
        first-level subdirectories are accepted, not full paths. The option
        can be repeated to exclude additional subdirectories.

-----------------------------------------------------------

EXAMPLES

Archive directory "old_data":

    cjarchiver old_data

Archive directory "old_data" but exclude "old_data/bad_exp":

    cjarchiver old_data -x bad_exp

Archive directory "old_data" but exclude "old_data/bad_exp" and
"old_data/bad_data":

    cjarchiver old_data -x bad_exp -x bad_data

-----------------------------------------------------------

KNOWN ISSUES

cjarchiver uses the find command to create the manifest and md5sum files.
find is known to fail occasionally when it accesses remote directories over
NFS. We recommend running cjarchiver locally (i.e. directly on the machine
where the target data is located).
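
-----------------------------------------------------------

EXAMPLE: CHECKING THE METADATA FILE (SKETCH)

The metadata format shown under PREREQUISITES can be checked before running
cjarchiver. The following is a minimal sketch and is not part of cjarchiver.
The function name check_metadata is hypothetical. It also assumes that every
key in the template is required, which this README does not state, and we
have not verified how cjarchiver itself validates the file.

```python
import json
from datetime import datetime

# Keys taken from the template under PREREQUISITES.
# Assumption: all of them are mandatory.
REQUIRED_KEYS = (
    "name", "email", "pi_name", "pi_email", "project",
    "project_start_date", "project_end_date",
    "description", "collaborators", "comments",
)


def check_metadata(text):
    """Return a list of problems found in the metadata JSON text."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]

    # Dates are expected as YYYY-MM-DD.
    for key in ("project_start_date", "project_end_date"):
        if key in data:
            try:
                datetime.strptime(data[key], "%Y-%m-%d")
            except (TypeError, ValueError):
                problems.append(f"{key} is not a YYYY-MM-DD date")

    # The template says a multiline description is not OK.
    description = data.get("description")
    if isinstance(description, str) and "\n" in description:
        problems.append("description must be a single line")

    # Each collaborator entry carries a name and an email.
    collaborators = data.get("collaborators", [])
    if not isinstance(collaborators, list):
        problems.append("collaborators must be a list")
    else:
        for index, entry in enumerate(collaborators):
            if not isinstance(entry, dict) or not {"name", "email"} <= set(entry):
                problems.append(f"collaborator {index} needs name and email")

    return problems
```

To use it, read the metadata file from the target directory (USAGE refers to
it as ARCHIVE_METADATA.json), pass its contents to check_metadata(), and
print the returned list. An empty list means no problem was found by these
checks.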
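
-----------------------------------------------------------

EXAMPLE: CHECKING THE ARCHIVE BEFORE DELETING (SKETCH)

USAGE recommends checking the .tar.bz2 file before deleting
<targetdirectory>.toberemoved/. One possible check is to recompute the MD5
checksum of every file inside the archive and compare it with the .md5sum
file. The following is a minimal sketch and is not part of cjarchiver. The
function names are hypothetical. It assumes that the .md5sum file uses the
layout written by the md5sum command ("<checksum>  <path>" per line) and
that the listed paths match the paths stored in the archive apart from a
leading "./". Neither assumption is confirmed by this README, so adjust
normalize() if the paths differ.

```python
import hashlib
import tarfile


def normalize(path):
    """Drop a leading './' so that listing and archive paths are comparable."""
    return path[2:] if path.startswith("./") else path


def parse_md5sum(text):
    """Parse md5sum-style lines into a {path: checksum} dictionary."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, path = line.split(None, 1)
        path = path.strip()
        if path.startswith("*"):  # md5sum marks binary mode with '*'
            path = path[1:]
        expected[normalize(path)] = digest.lower()
    return expected


def archive_md5s(archive_path):
    """Compute the MD5 checksum of every regular file in a .tar.bz2 archive."""
    digests = {}
    with tarfile.open(archive_path, "r:bz2") as tar:
        for member in tar:
            if not member.isfile():
                continue
            md5 = hashlib.md5()
            with tar.extractfile(member) as handle:
                # Read in 1 MiB chunks to keep memory use low.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    md5.update(chunk)
            digests[normalize(member.name)] = md5.hexdigest()
    return digests


def find_mismatches(archive_path, md5sum_text):
    """Return the sorted listed paths that are missing or differ in the archive."""
    actual = archive_md5s(archive_path)
    expected = parse_md5sum(md5sum_text)
    return sorted(path for path, digest in expected.items()
                  if actual.get(path) != digest)
```

Call find_mismatches() with the path of the .tar.bz2 file and the contents
of the .md5sum file. An empty list means that every listed file was found in
the archive with the expected checksum.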