_ __ _
_____ (_)___ ___________/ /_ (_) _____ _____
/ ___/ / / __ `/ ___/ ___/ __ \/ / | / / _ \/ ___/
/ /__ / / /_/ / / / /__/ / / / /| |/ / __/ /
\___/_/ /\__,_/_/ \___/_/ /_/_/ |___/\___/_/
/___/
DESCRIPTION
cjarchiver is a Python script that can be used to compress
a directory including all its files and subdirectories.
-----------------------------------------------------------
PREREQUISITES
To use cjarchiver on the sciCORE clusters, first load the
cjarchiver module:
ml cjarchiver
This will load the default version. If you need a specific
version you can search for it with:
ml spider cjarchiver
To archive a directory, the current working directory must be
the directory that contains the target directory (its parent).
In addition, the target directory must contain a metadata file
in JSON format (ARCHIVE_METADATA.json). The user should create
this file following this template:
{
"name": "NAME OF INVESTIGATOR",
"email": "EMAIL OF INVESTIGATOR",
"pi_name": "NAME OF PI",
"pi_email": "EMAIL OF PI",
"project": "INSERT PROJECT NAME HERE",
"project_start_date": "YYYY-MM-DD",
"project_end_date": "YYYY-MM-DD",
"description": "INSERT PROJECT DESCRIPTION HERE MULTILINE IS NOT OK",
"collaborators":[
{ "name": "COLLABORATOR NAME",
"email": "COLLABORATOR EMAIL"
},
{ "name": "COLLABORATOR NAME",
"email": "COLLABORATOR EMAIL"
}
],
"comments": "ADDITIONAL COMMENTS (E.G. LEGAL REQUIREMENTS REGARDING DURATION OF DATA PRESERVATION, ETC...)"
}
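The template above can be checked before archiving. The
following is a minimal sketch, not part of cjarchiver: the
function name check_metadata is hypothetical, and whether
cjarchiver itself enforces every key or the date format is
an assumption, not something this document states.

```python
import json
from datetime import datetime
from pathlib import Path

# Keys copied from the template above.
REQUIRED_KEYS = (
    "name", "email", "pi_name", "pi_email", "project",
    "project_start_date", "project_end_date",
    "description", "collaborators", "comments",
)


def check_metadata(path):
    """Return a list of problems found in the metadata file (empty if none)."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read {path}: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value must be an object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]

    # Dates are expected as year-month-day, as in the template.
    for key in ("project_start_date", "project_end_date"):
        value = data.get(key)
        if value is not None:
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except (TypeError, ValueError):
                problems.append(f"{key} is not YYYY-MM-DD: {value!r}")

    # The template asks for a single-line description.
    if "\n" in str(data.get("description", "")):
        problems.append("description must be a single line")

    collaborators = data.get("collaborators", [])
    if not isinstance(collaborators, list):
        problems.append("collaborators must be a list")
    else:
        for index, person in enumerate(collaborators):
            if not isinstance(person, dict) or not {"name", "email"} <= set(person):
                problems.append(f"collaborator {index} needs name and email")
    return problems
```

For example, check_metadata("old_data/ARCHIVE_METADATA.json")
returns an empty list when the file matches the template.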
-----------------------------------------------------------
USAGE
To execute cjarchiver:
cjarchiver <target_directory> [options]
If successful, this will generate five files with the name
format <username>_YYYYMMDDThhmmss_<targetfoldername> and the
following extensions:
.log - with the outputs of the script.
.json - copy of the ARCHIVE_METADATA.json file.
.manifest - with the full list of archived files including
permissions, ownership, size, date, and path.
.md5sum - with the full list of archived files and their
corresponding path and MD5 checksum.
.tar.bz2 - compressed archive of the target directory.
The manifest and md5sum files are also automatically copied
inside of the target directory and, therefore, included in
the .tar.bz2 file.
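The md5sum file can be used to re-check files later, for
example after extracting an archive. The sketch below is not
part of cjarchiver. It assumes that each line of the .md5sum
file follows the usual md5sum layout (checksum, two spaces,
path) and that the paths are relative to a base directory you
supply; neither detail is confirmed by this document. The
function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_checksum_mismatches(listing_path, base_dir):
    """Return the listed paths that are missing or whose MD5 differs.

    Assumes md5sum-style lines: "<checksum>  <path>" (assumption).
    """
    mismatches = []
    for line in Path(listing_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, _, rest = line.partition(" ")
        # md5sum puts one mode character (space or '*') before the path.
        relative = rest[1:]
        target = Path(base_dir) / relative
        if not target.is_file() or md5_of_file(target) != expected.lower():
            mismatches.append(relative)
    return mismatches
```

An empty result means that every listed file was found and
matched its recorded checksum.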
After creating these files, cjarchiver renames the target
directory to <targetdirectory>.toberemoved/
As its name indicates, <targetdirectory>.toberemoved/ can be
deleted, but before doing so we strongly recommend checking
that the .tar.bz2 file has been created.
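One possible way to check the archive before deleting the
renamed directory is sketched below with the tarfile module of
the Python standard library. This is not part of cjarchiver,
and the function name archive_looks_complete is hypothetical.
It only confirms that the file exists, is not empty, and can
be read through; it does not compare contents with the
original directory.

```python
import tarfile
from pathlib import Path


def archive_looks_complete(archive_path):
    """Return True if the .tar.bz2 exists, is non-empty, and reads without error."""
    path = Path(archive_path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    try:
        with tarfile.open(path, mode="r:bz2") as archive:
            # Walking all members makes tarfile read through the
            # compressed stream, so truncation or corruption
            # usually surfaces here as an exception.
            for _member in archive:
                pass
    except (tarfile.TarError, OSError, EOFError):
        return False
    return True
```

Delete <targetdirectory>.toberemoved/ only if the check
returns True for the corresponding .tar.bz2 file.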
-----------------------------------------------------------
OPTIONS
-h, --help: Shows a help message and exits.
-x <subdirectory>, --exclude <subdirectory>: Excludes a
subdirectory from archiving. Give only the name of a
first-level subdirectory, not a full path. The option can be
repeated to exclude additional subdirectories.
-----------------------------------------------------------
EXAMPLES
Archive directory "old_data":
cjarchiver old_data
Archive directory "old_data" but exclude "old_data/bad_exp":
cjarchiver old_data -x bad_exp
Archive directory "old_data" but exclude "old_data/bad_exp"
and "old_data/bad_data":
cjarchiver old_data -x bad_exp -x bad_data
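When the command is launched from a script, the argument list
can be assembled as in the sketch below. The helper
build_cjarchiver_command is hypothetical and not part of
cjarchiver; it only mirrors the usage line and the rule that
-x takes first-level subdirectory names rather than paths.

```python
from pathlib import Path


def build_cjarchiver_command(target, excludes=()):
    """Build the argument list for: cjarchiver <target_directory> [-x NAME]..."""
    command = ["cjarchiver", str(target)]
    for name in excludes:
        # -x expects a bare first-level subdirectory name, not a path.
        if name in ("", ".", "..") or Path(name).name != name:
            raise ValueError(f"not a first-level subdirectory name: {name!r}")
        command += ["-x", name]
    return command
```

The resulting list can be passed to subprocess.run with cwd
set to the directory that contains the target directory.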
-----------------------------------------------------------
KNOWN ISSUES
cjarchiver uses the find command to create the manifest and
the md5sum files. find is known to sometimes fail when it
accesses remote directories over NFS. We recommend running
cjarchiver locally (i.e. directly where the target data is
located).