Copy Tools
Paul Nilsson edited this page Apr 25, 2022
The Pilot can use externally defined copy tools or an available backend API for file transfers.
The logic that implements the file transfer operation for the stage-in and stage-out modes, using the corresponding library or external tool, is isolated in a dedicated Pilot copytool module.
Each copytool module relies on the following settings to configure and customize the top-level staging workflow (implemented in the Data API) for file transfer operations:
| parameter | type | default value | description |
|---|---|---|---|
| `allowed_schemas` | list | any enabled for PandaQueue | A prioritized list of schemas that the given copytool supports for transfers |
| `require_replicas` | boolean | False | Indicates whether the given copytool requires input replicas to be resolved from Rucio before stage-in |
| `require_input_protocols` | boolean | False | Indicates whether the given copytool requires input protocols and manual generation of input replicas for stage-in |
| `require_protocols` | boolean | True | Indicates whether the given copytool requires protocols to be resolved before stage-out |
| `check_availablespace` | boolean | True | Indicates whether a space check should be applied before stage-in transfers with the given copytool |
| `resolve_surl` | handler | `StagingClient.resolve_surl` | Get the final destination SURL for the file to be transferred. Can be customized at the level of a specific copytool |
| `resolve_replica` | handler | `StageInClient.resolve_replica` | Resolve the input replica (matched by `domain`) first according to `primary_schemas`; if none is found, look it up within `allowed_schemas`. Can be customized at the level of a specific copytool |
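
As a rough illustration of how these settings and their defaults fit together, the sketch below reads the five non-handler settings from a copytool object and falls back to the defaults listed in the table. The helper `read_copytool_settings`, the `DEFAULT_SETTINGS` dictionary, and the use of `None` to stand for "any enabled for PandaQueue" are assumptions made for this example; they are not taken from the Pilot code.

```python
from types import SimpleNamespace

# Defaults copied from the settings table above.
# None stands in for "any enabled for PandaQueue" (an assumption of this sketch).
DEFAULT_SETTINGS = {
    "allowed_schemas": None,
    "require_replicas": False,
    "require_input_protocols": False,
    "require_protocols": True,
    "check_availablespace": True,
}


def read_copytool_settings(copytool):
    """Return the effective settings of a copytool, using the documented defaults
    for every setting the copytool does not define itself (hypothetical helper)."""
    return {
        name: getattr(copytool, name, default)
        for name, default in DEFAULT_SETTINGS.items()
    }


# Hypothetical stand-in for a copytool module that overrides two settings.
# The values mirror the xrdcp row of the copy tool overview on this page.
xrdcp_like = SimpleNamespace(require_replicas=True, allowed_schemas=["root"])

settings = read_copytool_settings(xrdcp_like)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Settings that the stand-in object does not define (`require_input_protocols`, `require_protocols`, `check_availablespace`) keep their default values.
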
In addition to these settings, each copytool module must implement the following interface functions:
| function signature | arguments | description |
|---|---|---|
| `is_valid_for_copy_in(files)` | `files`: list of input FileSpec entries | Check if passed files list is valid (allowed) for stage-in operation. Typically returns True |
| `is_valid_for_copy_out(files)` | `files`: list of output FileSpec entries | Check if passed files list is valid (allowed) for stage-out operation. Typically returns True |
| `copy_in(files, **kwargs)` | | Download (stage-in) given files using copytool related implementation. Copytool should update corresponding state fields of FileSpec object (`status`, `status_code`) |
| `copy_out(files, **kwargs)` | | Upload (stage-out) given files using copytool related implementation. Copytool should update corresponding state fields of FileSpec object (`status`, `status_code`) |
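
The four interface functions can be sketched as a minimal module that copies files on the local filesystem. This is an illustration only, not an existing Pilot copytool: `FileSpecStub` is a hypothetical stand-in for the real FileSpec class, and its `lfn` and `source` fields, the `workdir` and `destination` keyword arguments, the status strings and the status codes are all assumptions. Only the function names and the `status` and `status_code` fields come from the table above.

```python
import os
import shutil
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileSpecStub:
    """Hypothetical stand-in for Pilot's FileSpec (only status/status_code are documented)."""
    lfn: str                           # assumed: file name used at the destination
    source: str                        # assumed: current location of the file
    status: Optional[str] = None
    status_code: Optional[int] = None


def is_valid_for_copy_in(files):
    """Accept any list of input files, as the table says is typical."""
    return True


def is_valid_for_copy_out(files):
    """Accept any list of output files, as the table says is typical."""
    return True


def _copy(files, destination_dir):
    """Copy each file and record the outcome in its state fields."""
    os.makedirs(destination_dir, exist_ok=True)
    for fspec in files:
        try:
            shutil.copy(fspec.source, os.path.join(destination_dir, fspec.lfn))
        except OSError:
            # Status values and codes are placeholders chosen for this sketch.
            fspec.status, fspec.status_code = "failed", 1
        else:
            fspec.status, fspec.status_code = "transferred", 0
    return files


def copy_in(files, **kwargs):
    """Stage-in: copy the files into the (assumed) 'workdir' keyword argument."""
    return _copy(files, kwargs.get("workdir", "."))


def copy_out(files, **kwargs):
    """Stage-out: copy the files into the (assumed) 'destination' keyword argument."""
    return _copy(files, kwargs.get("destination", "."))
```

A caller would pass a list of file entries, for example `copy_in(files, workdir="/some/dir")`, and then inspect `status` and `status_code` on each entry to see which transfers succeeded.
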
The current range of supported copy tools is described below.
| Copy tool | Require replicas (stage-in) | Require input protocols (stage-in) | Require protocols (stage-out) | Check space (stage-in) | Allowed schemas | description |
|---|---|---|---|---|---|---|
| `gfal` or `gfal-copy` | ✔️ | ❌ | ✔️ | ✔️ | `['srm', 'gsiftp', 'https', 'davs', 'root']` | GFAL2 tool (`gfal-copy` command) |
| `gs` | ❌ | ✔️ | ✔️ | ✔️ | `['gs', 'srm', 'gsiftp', 'https', 'davs', 'root']` | Google Cloud Storage (`google.cloud` API) |
| `lsm` | ❌ | ❌ | ✔️ | ✔️ | `['srm', 'gsiftp', 'root']` | Local site mover (`lsm-get`, `lsm-put` commands) |
| `mv` | ❌ | ❌ | ✔️ | ❌ | any | Move file using filesystem commands (`ln -s` for stage-in, `mv` for stage-out) |
| `objectstore` | ❌ | ✔️ | ✔️ | ✔️ | `['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio']` | Transfer files to OS storage using Rucio CLI (`rucio download` for stage-in, `rucio upload` for stage-out) |
| `rucio` | ✔️ | ❌ | ❌ | ✔️ | any | Transfer files to RSE using Rucio python API (`rucio.client.downloadclient`, `rucio.client.uploadclient`) |
| `s3` | ❌ | ✔️ | ✔️ | ✔️ | `['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio']` | Transfer files to Amazon Cloud Object Storage (S3 bucket) using boto3 python AWS API |
| `xrdcp` | ✔️ | ❌ | ✔️ | ✔️ | `['root']` | Transfer files using `xrdcp` command |
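
To make the "Allowed schemas" column easier to query, the sketch below encodes it as a dictionary and lists the copy tools that accept a given schema. The `ALLOWED_SCHEMAS` dictionary and the `copytools_for_schema` helper are hypothetical names introduced here, and `None` is used to represent "any". This is only a lookup over the table and does not reproduce how the Pilot itself selects a copytool.

```python
ANY = None  # stands in for "any" in the Allowed schemas column

# Allowed schemas per copy tool, copied from the table above.
ALLOWED_SCHEMAS = {
    "gfal": ["srm", "gsiftp", "https", "davs", "root"],
    "gs": ["gs", "srm", "gsiftp", "https", "davs", "root"],
    "lsm": ["srm", "gsiftp", "root"],
    "mv": ANY,
    "objectstore": ["srm", "gsiftp", "https", "davs", "root", "s3", "s3+rucio"],
    "rucio": ANY,
    "s3": ["srm", "gsiftp", "https", "davs", "root", "s3", "s3+rucio"],
    "xrdcp": ["root"],
}


def copytools_for_schema(schema):
    """Return the copy tools whose allowed schemas include the given schema."""
    return [
        name
        for name, schemas in ALLOWED_SCHEMAS.items()
        if schemas is ANY or schema in schemas
    ]


print(copytools_for_schema("s3"))   # ['mv', 'objectstore', 'rucio', 's3']
print(copytools_for_schema("gs"))   # ['gs', 'mv', 'rucio']
```
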