# Data transfer between Atlas and UL HPC Clusters
A recommended storage pattern is to have the master copy of data on Atlas (project folder) and to store data on the UL HPC Clusters only temporarily, for the duration of the computational analysis. All results of analyses performed on the UL HPC Clusters must therefore be transferred back to Atlas regularly. This How-to Card describes the different methods to transfer data between Atlas and the UL HPC Clusters. The three recommended methods are:
1. [Via laptop with ```scp``` or ```rsync```]()
2. [Via dedicated Virtual Machine (VM)]()
3. [Via Large File Transfer (LFT)]()
Please visit the dedicated knowledge-base articles to see how to [connect to the UL HPC Clusters](https://hpc-docs.uni.lu/connect/access/) and how to [mount Atlas](https://service.uni.lu/sp?id=kb_article_view&sysparm_article=KB0010233).
<img src="img/data-transfer-flow.png">
## 1. Via laptop using scp or rsync
The most common commands for data transfers over SSH are ```scp``` and ```rsync```.
When using a UL laptop to transfer data between the UL HPC Clusters and Atlas, you must [mount Atlas via SMB on the laptop](https://service.uni.lu/sp?id=kb_article_view&sysparm_article=KB0010233) before using ```scp``` or ```rsync``` for the transfer.

* ```scp```: transfers all files and directories.
* ```rsync```: transfers only the files which differ between the source and the destination.

While both commands ensure a secure transfer of data between the UL HPC Clusters and Atlas, ```rsync``` is in general preferred over ```scp``` due to its selective, incremental transfer of files, which makes it faster than ```scp```.
Please visit the [UL HPC documentation](https://hpc-docs.uni.lu/data/transfer/#data-transfer-tofromwithin-ul-hpc-clusters) to see how to use ```rsync``` and ```scp```; a typical transfer is sketched below.
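The following is a minimal sketch of what such a transfer could look like from a terminal on the laptop, assuming Atlas is already mounted (here under ```/mnt/atlas```) and SSH access to a cluster is configured under an alias (here ```iris-cluster```). The mount point, the alias and the project paths are placeholders, not prescribed values; use the settings from the knowledge-base articles linked above.

```bash
# Minimal sketch, assuming Atlas is mounted on the laptop (here /mnt/atlas)
# and the cluster is reachable via an SSH alias (here iris-cluster).
# All paths and names below are placeholders for your own project.

# Copy raw data from the Atlas project folder to the cluster for analysis.
rsync -avz --progress /mnt/atlas/myproject/rawdata/ \
    iris-cluster:/scratch/users/<username>/myproject/rawdata/

# Copy results back from the cluster to Atlas (the master copy).
rsync -avz --progress iris-cluster:/scratch/users/<username>/myproject/results/ \
    /mnt/atlas/myproject/results/

# scp works as well, but re-copies everything on every run.
scp -r /mnt/atlas/myproject/rawdata iris-cluster:/scratch/users/<username>/myproject/
```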
## 2. Via dedicated Virtual Machine (VM) using rsync
Data can be transferred via a dedicated VM, which can be requested through [ServiceNow](https://service.uni.lu/sp?id=sc_cat_item&table=sc_cat_item&sys_id=49956812db3fa010ca53454039961978).
Instead of transferring data between Atlas and the UL HPC Clusters through the laptop as described above, the transfer goes through the dedicated VM. Once you are connected to the VM and Atlas is mounted, the ```rsync``` command can be used in the same way as described in the [UL HPC documentation](https://hpc-docs.uni.lu/data/transfer/#data-transfer-tofromwithin-ul-hpc-clusters). This method is recommended for **recurrent transfers of big data**.
> Note: For larger transfers between Atlas and the UL HPC Clusters, you may want to run the transfer in the background using ```screen``` or ```tmux```, as sketched below.
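For example, a long-running ```rsync``` on the VM could be wrapped in a ```screen``` session so that it survives a dropped SSH connection. The session name, mount point and paths in this sketch are placeholders.

```bash
# Minimal sketch, run on the dedicated VM (Atlas assumed mounted under /mnt/atlas).
screen -S atlas-transfer        # start a named screen session
rsync -avz --progress /mnt/atlas/myproject/rawdata/ \
    iris-cluster:/scratch/users/<username>/myproject/rawdata/
# Detach with Ctrl-a d; the transfer keeps running on the VM.
screen -r atlas-transfer        # reattach later to check progress
```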
## 3. Via Large File Transfer (LFT)
An alternative solution for transferring data between Atlas and the UL HPC Clusters is LFT, which can handle high data volumes of several terabytes. However, LFT is only worth using if the data is already on LFT (e.g. received from external collaborators). In that case, you can make a copy of the data and upload it directly to the UL HPC Clusters for computational analysis. A master copy of the data must then be uploaded manually to Atlas for internal archival.
Please visit the dedicated [How-to Card on LFT]({{'/?exchange-channels:lft' | relative_url }}) for further information.
> NOTE: LFT is only recommended for the use case mentioned above, where the raw data were already received via LFT, as this method requires the installation of dedicated software and involves more steps than the first two methods. In all other cases, we recommend method 1 or 2.