
Copy Tools

Paul Nilsson edited this page Apr 25, 2022 · 1 revision

The Pilot can use externally defined copy tools or an available backend API for file transfers.

The logic that implements file transfers for the stage-in and stage-out modes, using the corresponding library or external tool, is isolated in a dedicated Pilot copytool module.

Each copytool module relies on the following settings to configure and customize the top-level staging workflow (implemented in the Data API) for file transfer operations:

| parameter | type | default value | description |
| --- | --- | --- | --- |
| allowed_schemas | list | any enabled for PandaQueue | a prioritized list of supported schemas for transfers by given copytool |
| require_replicas | boolean | False | indicates if given copytool requires input replicas to be resolved first from Rucio before stage-in |
| require_input_protocols | boolean | False | indicates if given copytool requires input protocols and manual generation of input replicas for stage-in |
| require_protocols | boolean | True | indicates if given copytool requires protocols to be resolved first for stage-out |
| check_availablespace | boolean | True | indicates whether space check should be applied before stage-in transfers using given copytool |
| resolve_surl | handler | StagingClient.resolve_surl | Get final destination SURL for file to be transferred. Can be customized at the level of specific copytool |
| resolve_replica | handler | StageInClient.resolve_replica | Resolve input replica (matched by domain) first according to primary_schemas, if not found then look up within allowed_schemas. Can be customized at the level of specific copytool |
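
As a rough illustration, a copytool module might declare these settings as module-level attributes, and the lookup order described for resolve_replica (primary_schemas first, then allowed_schemas) could be walked as shown below. This is a sketch under assumptions, not the Pilot's actual code: `pick_replica`, the plain-URL replica list and the example hostnames are hypothetical, the setting values are illustrative, and domain matching is left out.

```python
# Hypothetical copytool settings, named after the parameters in the table above.
require_replicas = True          # resolve input replicas from Rucio before stage-in
require_input_protocols = False  # no manual generation of input replicas
require_protocols = True         # resolve protocols before stage-out
check_availablespace = True      # apply the space check before stage-in
allowed_schemas = ['root', 'https', 'davs']  # prioritized list; illustrative values


def pick_replica(replicas, primary_schemas=None, allowed=None):
    """Return the first replica URL matching a schema, or None.

    Schemas in primary_schemas are tried first; if none matches, the
    allowed schemas are tried. Both lists are treated as prioritized.
    replicas is a list of URL strings (a hypothetical layout).
    """
    allowed = allowed_schemas if allowed is None else allowed
    for schemas in (primary_schemas or [], allowed):
        for schema in schemas:
            for url in replicas:
                if url.split('://', 1)[0] == schema:
                    return url
    return None


replicas = [
    'davs://se.example.org/path/file',
    'root://xrd.example.org//path/file',
]
# 'srm' has no match, so the lookup falls back to allowed_schemas ('root' first).
chosen = pick_replica(replicas, primary_schemas=['srm'])
```

Here `chosen` ends up as the `root://` replica, because no `srm` replica exists and `root` has the highest priority in the illustrative `allowed_schemas` list.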

In addition to these settings, each copytool module must implement the following interface functions:

| function signature | arguments | description |
| --- | --- | --- |
| is_valid_for_copy_in(files) | files: list of input FileSpec entries | Check if passed files list is valid (allowed) for stage-in operation. Typically returns True |
| is_valid_for_copy_out(files) | files: list of output FileSpec entries | Check if passed files list is valid (allowed) for stage-out operation. Typically returns True |
| copy_in(files, **kwargs) | files: list of FileSpec entries; kwargs: extra arguments passed by top workflow | Download (stage-in) given files using copytool related implementation. Copytool should update corresponding state fields of FileSpec object (status, status_code) |
| copy_out(files, **kwargs) | files: list of FileSpec entries; kwargs: extra arguments passed by top workflow | Upload (stage-out) given files using copytool related implementation. Copytool should update corresponding state fields of FileSpec object (status, status_code) |
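
The interface above can be sketched as a minimal module that copies files on the local filesystem. Everything beyond the four function names and the `status` / `status_code` fields is an assumption made for illustration: the `FileSpec` dataclass is a stand-in for the Pilot's own class, the `source` and `destination` fields and the `_copy` helper are hypothetical, and the status values are placeholders.

```python
import shutil
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileSpec:
    """Minimal stand-in for the Pilot's FileSpec (fields are assumptions)."""
    lfn: str
    source: str = ''                   # hypothetical: path to copy from
    destination: str = ''              # hypothetical: path to copy to
    status: Optional[str] = None       # updated by the copytool
    status_code: Optional[int] = None  # updated by the copytool


def is_valid_for_copy_in(files):
    """Accept any list of input files for stage-in."""
    return True


def is_valid_for_copy_out(files):
    """Accept any list of output files for stage-out."""
    return True


def _copy(files):
    """Copy each file and record the outcome on its FileSpec (hypothetical helper)."""
    for fspec in files:
        try:
            shutil.copy(fspec.source, fspec.destination)
            fspec.status, fspec.status_code = 'transferred', 0
        except OSError:
            # Placeholder failure values; a real copytool would set its own codes.
            fspec.status, fspec.status_code = 'failed', 1
    return files


def copy_in(files, **kwargs):
    """Stage-in: download the given files; kwargs come from the top workflow."""
    return _copy(files)


def copy_out(files, **kwargs):
    """Stage-out: upload the given files; kwargs come from the top workflow."""
    return _copy(files)
```

A caller would first check `is_valid_for_copy_in(files)` and then inspect `status` and `status_code` on each entry returned by `copy_in(files)`.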

The current range of supported copy tools is described below.

Copy tool | Require replicas (stage-in) | Require input protocols (stage-in) | Require protocols (stage-out) | Check space (stage-in) | Allowed schemas | Description
gfal or gfal-copy ✔️ ✔️ ✔️ ['srm', 'gsiftp', 'https', 'davs', 'root'] GFAL2 tool (gfal-copy command)
gs ✔️ ✔️ ✔️ ['gs', 'srm', 'gsiftp', 'https', 'davs', 'root'] Google Cloud Storage (google.cloud API)
lsm ✔️ ✔️ ['srm', 'gsiftp', 'root'] Local site mover (lsm-get, lsm-put commands)
mv ✔️ any Move file using filesystem commands (ln -s for stage-in, mv for stage-out)
objectstore ✔️ ✔️ ✔️ ['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio'] Transfer files to OS storage using Rucio CLI (rucio download for stage-in, rucio upload for stage-out)
rucio ✔️ ✔️ any Transfer files to RSE using Rucio python API (rucio.client.downloadclient, rucio.client.uploadclient)
s3 ✔️ ✔️ ✔️ ['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio'] Transfer files to Amazon Cloud Object Storage (S3 bucket) using boto3 python AWS API
xrdcp ✔️ ✔️ ✔️ ['root'] Transfer files using xrdcp command
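
To make the "Allowed schemas" column easier to query, the values from the table above can be written as a small lookup. This is only a sketch over the listed data: `ALLOWED_SCHEMAS` and `copytools_for` are hypothetical names, "any" is represented as `None`, and the Pilot's real copytool selection also depends on queue configuration that is not modelled here.

```python
# Allowed schemas per copy tool, transcribed from the table above.
# None stands for "any" schema.
ALLOWED_SCHEMAS = {
    'gfal': ['srm', 'gsiftp', 'https', 'davs', 'root'],
    'gs': ['gs', 'srm', 'gsiftp', 'https', 'davs', 'root'],
    'lsm': ['srm', 'gsiftp', 'root'],
    'mv': None,
    'objectstore': ['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio'],
    'rucio': None,
    's3': ['srm', 'gsiftp', 'https', 'davs', 'root', 's3', 's3+rucio'],
    'xrdcp': ['root'],
}


def copytools_for(url):
    """List the copy tools whose allowed schemas cover the URL's schema."""
    schema = url.split('://', 1)[0]
    return [
        name
        for name, schemas in ALLOWED_SCHEMAS.items()
        if schemas is None or schema in schemas
    ]


print(copytools_for('s3://bucket/key'))  # ['mv', 'objectstore', 'rucio', 's3']
```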