Grid computing systems such as the Open Science Grid and the NSF TeraGrid give users easy access to hundreds or thousands of CPUs at once. However, accessing one’s data from within a computing grid is not always easy. Traditional filesystems such as NFS and AFS are not usable in most grid computing systems, because installing and using them requires privileged access on both the client and the server. A user of grid computing rarely has such access.
To remedy this problem, we have designed and implemented a variety of filesystems for grid computing, all based on the Parrot and Chirp software. These user-level tools can be deployed without special privileges into existing grids, and used to access data wherever it may be located. We work directly with users in bioinformatics and high energy physics to design and deploy production filesystem services. You can download and use our software from this page.
Related Publications
Wharf: Sharing Docker Images in a Distributed File System
Chao Zheng, Lukas Rupprecht, Vasily Tarasov, Douglas Thain, Mohamed Mohamed, Dimitrios Skourtis, Amit S. Warke, and Dean Hildebrand
@inproceedings{wharf-socc-2018,author={Zheng, Chao and Rupprecht, Lukas and Tarasov, Vasily and Thain, Douglas and Mohamed, Mohamed and Skourtis, Dimitrios and Warke, Amit S. and Hildebrand, Dean},title={{Wharf: Sharing Docker Images in a Distributed File System}},booktitle={{ACM Symposium on Cloud Computing}},pages={12},year={2018},note={{doi: 10.1145/3267809.3267836}},cclpaperid={957},keywords={filesystems, career, gridfs},}
Taming Metadata Storms in Parallel Filesystems with MetaFS
Tim Shaffer and Douglas Thain
In Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, 2017
@inproceedings{metafs-pdsw-2017,author={Shaffer, Tim and Thain, Douglas},title={{Taming Metadata Storms in Parallel Filesystems with MetaFS}},booktitle={{Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems}},pages={25-30},year={2017},note={{doi: 10.1145/3149393.3149401}},cclpaperid={948},keywords={filesystems, career, gridfs},}
The Evolution of Global Scale Filesystems for Scientific Software Distribution
Jakob Blomer, Predrag Buncic, Rene Meusel, Gerardo Ganis, Igor Sfiligoi, and Douglas Thain
IEEE/AIP Computing in Science and Engineering, 2015
@article{globalfs-cise-2015,author={Blomer, Jakob and Buncic, Predrag and Meusel, Rene and Ganis, Gerardo and Sfiligoi, Igor and Thain, Douglas},title={{The Evolution of Global Scale Filesystems for Scientific Software Distribution}},journal={{IEEE/AIP Computing in Science and Engineering}},volume={17},number={6},pages={61-71},year={2015},note={{doi: 10.1109/MCSE.2015.111}},cclpaperid={926},keywords={parrot, filesystems, career, gridfs},}
Fine-Grained Access Control in the Chirp Distributed File System
Patrick Donnelly and Douglas Thain
In IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, 2012
@inproceedings{chirp-tickets-ccgrid12,author={Donnelly, Patrick and Thain, Douglas},title={{Fine-Grained Access Control in the Chirp Distributed File System}},booktitle={{IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing}},year={2012},note={{doi: 10.1109/CCGrid.2012.128}},cclpaperid={101},keywords={parrot, chirp, filesystems, career, gridfs},}
Data Intensive Computing with Clustered Chirp Servers
Douglas Thain, Michael Albrecht, Hoang Bui, Peter Bui, Rory Carmichael, Scott Emrich, and Patrick Flynn
@incollection{chirp-didc-chapter,author={Thain, Douglas and Albrecht, Michael and Bui, Hoang and Bui, Peter and Carmichael, Rory and Emrich, Scott and Flynn, Patrick},title={{Data Intensive Computing with Clustered Chirp Servers}},editor={Kosar, Tevfik},booktitle={{Data Intensive Distributed Computing: Challenges and Solutions for Large Scale Information Management}},pages={140-154},publisher={IGI},year={2012},note={{isbn: 9781615209712}},doi={10.4018/978-1-61520-971-2.ch009},cclpaperid={99},keywords={chirp, filesystems, career, gridfs}}
Attaching Cloud Storage to a Campus Grid Using Parrot, Chirp, and Hadoop
Patrick Donnelly, Peter Bui, and Douglas Thain
In IEEE International Conference on Cloud Computing Technology and Science, 2010
@inproceedings{chirp+parrot+hdfs,author={Donnelly, Patrick and Bui, Peter and Thain, Douglas},title={{Attaching Cloud Storage to a Campus Grid Using Parrot, Chirp, and Hadoop}},booktitle={{IEEE International Conference on Cloud Computing Technology and Science}},pages={488-495},year={2010},note={{doi: 10.1109/CloudCom.2010.74}},cclpaperid={90},keywords={parrot, chirp, filesystems, career, gridfs},}
ROARS: A Scalable Repository for Data Intensive Scientific Computing
Hoang Bui, Peter Bui, Patrick Flynn, and Douglas Thain
In The Third International Workshop on Data Intensive Distributed Computing at ACM HPDC 2010, 2010
@inproceedings{roars-didc10,author={Bui, Hoang and Bui, Peter and Flynn, Patrick and Thain, Douglas},title={{ROARS: A Scalable Repository for Data Intensive Scientific Computing}},booktitle={{The Third International Workshop on Data Intensive Distributed Computing at ACM HPDC 2010}},year={2010},note={{doi: 10.1145/1851476.1851587}},cclpaperid={85},keywords={chirp, filesystems, career, hecura, gridfs},}
CDF Software Distribution on the Grid using Parrot
Gabriele Compostella, Simone Pagan Griso, Donatella Lucchesi, Igor Sfiligoi, and Douglas Thain
@inproceedings{parrot-chep09,author={Compostella, Gabriele and Pagan Griso, Simone and Lucchesi, Donatella and Sfiligoi, Igor and Thain, Douglas},title={{CDF Software Distribution on the Grid using Parrot}},booktitle={{Computing in High Energy Physics}},year={2009},note={{doi: 10.1088/1742-6596/219/6/062009}},cclpaperid={22},keywords={parrot, filesystems, career, gridfs},}
Experience with BXGrid: A Data Repository and Computing Grid for Biometrics Research
Hoang Bui, Michael Kelly, Christopher Lyon, Mark Pasquier, Deborah Thomas, Patrick Flynn, and Douglas Thain
@article{bxgrid-jcc,author={Bui, Hoang and Kelly, Michael and Lyon, Christopher and Pasquier, Mark and Thomas, Deborah and Flynn, Patrick and Thain, Douglas},title={{Experience with BXGrid: A Data Repository and Computing Grid for Biometrics Research}},journal={{Journal of Cluster Computing}},volume={12},number={4},pages={373},year={2009},note={{doi: 10.1007/s10586-009-0098-7}},cclpaperid={1},keywords={chirp, filesystems, career, gridfs},}
Chirp: A Practical Global Filesystem for Cluster and Grid Computing
Douglas Thain, Christopher Moretti, and Jeffrey Hemmes
@article{chirp-jgc,author={Thain, Douglas and Moretti, Christopher and Hemmes, Jeffrey},title={{Chirp: A Practical Global Filesystem for Cluster and Grid Computing}},journal={{Journal of Grid Computing}},volume={7},number={1},pages={51-72},year={2009},note={{doi: 10.1007/s10723-008-9100-5}},cclpaperid={14},keywords={parrot, chirp, filesystems, career, gridfs},}
Efficient Access to Many Small Files in a Filesystem for Grid Computing
@inproceedings{small-grid07,author={Thain, Douglas and Moretti, Christopher},title={{Efficient Access to Many Small Files in a Filesystem for Grid Computing}},booktitle={{IEEE Grid Computing}},pages={243-250},year={2007},note={{doi: 10.1109/GRID.2007.4354139}},cclpaperid={31},keywords={parrot, chirp, filesystems, career, gridfs},}
Flexible Object Based Filesystems for Scientific Computing
@mastersthesis{moretti-ms-thesis,author={Moretti, Christopher},title={{Flexible Object Based Filesystems for Scientific Computing}},school={{University of Notre Dame}},year={2007},cclpaperid={65},keywords={chirp, filesystems, career, gridfs},}
Grid Deployment of Legacy Bioinformatics Applications with Transparent Data Access
Christophe Blanchet, Remi Mollon, Douglas Thain, and Gilbert Deleage
@inproceedings{bio-grid06,author={Blanchet, Christophe and Mollon, Remi and Thain, Douglas and Deleage, Gilbert},title={{Grid Deployment of Legacy Bioinformatics Applications with Transparent Data Access}},booktitle={{IEEE Grid Computing}},pages={120-127},year={2006},note={{doi: 10.1109/ICGRID.2006.311006}},cclpaperid={37},keywords={parrot, filesystems, career, gridfs},}
Operating System Support for Space Allocation in Grid Storage Systems
@inproceedings{alloc-grid06,author={Thain, Douglas},title={{Operating System Support for Space Allocation in Grid Storage Systems}},booktitle={{IEEE Grid Computing}},pages={104-111},year={2006},note={{doi: 10.1109/ICGRID.2006.311004}},cclpaperid={41},keywords={chirp, allocfs, filesystems, career, gridfs},}
Cacheable Decentralized Groups for Grid Resource Access Control
Transparent Access to Grid Resources for User Software
Sander Klous, Jamie Frey, Se-Chang Son, Douglas Thain, Alain Roy, Miron Livny, and Jo van den Brand
@article{transparent-ccpe,author={Klous, Sander and Frey, Jamie and Son, Se-Chang and Thain, Douglas and Roy, Alain and Livny, Miron and van den Brand, Jo},title={{Transparent Access to Grid Resources for User Software}},journal={{Concurrency and Computation: Practice and Experience}},volume={18},number={7},pages={787-801},year={2006},note={{doi: 10.1002/cpe.961}},cclpaperid={17},keywords={parrot, filesystems, career, gridfs},}
Using Condor Glide-Ins and Parrot to Move from Dedicated Resources to the Grid
Stefano Belforte, Matthew Norman, Subir Sarkar, Igor Sfiligoi, Douglas Thain, and Frank Wuerthwein
@article{parrot-lni06,author={Belforte, Stefano and Norman, Matthew and Sarkar, Subir and Sfiligoi, Igor and Thain, Douglas and Wuerthwein, Frank},title={{Using Condor Glide-Ins and Parrot to Move from Dedicated Resources to the Grid}},journal={{Lecture Notes in Informatics}},volume={81},pages={285-292},year={2006},cclpaperid={49},keywords={parrot, filesystems, career, gridfs},}
Transparently Distributing CDF Software with Parrot
Douglas Thain, Christopher Moretti, and Igor Sfiligoi
@inproceedings{parrot-chep06,author={Thain, Douglas and Moretti, Christopher and Sfiligoi, Igor},title={{Transparently Distributing CDF Software with Parrot}},booktitle={{Computing in High Energy Physics}},pages={1-4},year={2006},cclpaperid={50},keywords={parrot, filesystems, career, gridfs},}
The Consequences of Decentralized Security in a Cooperative Storage System
Douglas Thain, Christopher Moretti, Paul Madrid, Phil Snowberger, and Jeff Hemmes
In Workshop on Security in Storage at IEEE FAST, 2005
@inproceedings{cons-sisw05,author={Thain, Douglas and Moretti, Christopher and Madrid, Paul and Snowberger, Phil and Hemmes, Jeff},title={{The Consequences of Decentralized Security in a Cooperative Storage System}},booktitle={{Workshop on Security in Storage at IEEE FAST}},pages={82-94},year={2005},note={{doi: 10.1109/SISW.2005.11}},cclpaperid={51},keywords={parrot, chirp, filesystems, career, gridfs},}
Separating Abstractions from Resources in a Tactical Storage System
Douglas Thain, Sander Klous, Justin Wozniak, Paul Brenner, Aaron Striegel, and Jesus Izaguirre
@inproceedings{tactical-sc05,author={Thain, Douglas and Klous, Sander and Wozniak, Justin and Brenner, Paul and Striegel, Aaron and Izaguirre, Jesus},title={{Separating Abstractions from Resources in a Tactical Storage System}},booktitle={{IEEE/ACM Supercomputing}},pages={55-67},year={2005},note={{doi: 10.1109/SC.2005.64}},cclpaperid={52},keywords={parrot, chirp, allocfs, filesystems, career, hecura, gridfs},}
Parrot: An Application Environment for Data-Intensive Computing
@article{parrot-scpe05,author={Thain, Douglas and Livny, Miron},title={{Parrot: An Application Environment for Data-Intensive Computing}},journal={{Scalable Computing: Practice and Experience}},volume={6},number={3},pages={9-18},year={2005},cclpaperid={18},keywords={parrot, filesystems, career, gridfs},}
Explicit Control in a Batch Aware Distributed File System
John Bent, Douglas Thain, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, and Miron Livny
In USENIX Networked Systems Design and Implementation (NSDI), 2004
@inproceedings{badfs-nsdi-04,author={Bent, John and Thain, Douglas and Arpaci-Dusseau, Andrea and Arpaci-Dusseau, Remzi and Livny, Miron},title={{Explicit Control in a Batch Aware Distributed File System}},booktitle={{USENIX Networked Systems Design and Implementation (NSDI)}},pages={365-378},year={2004},cclpaperid={58},keywords={filesystems, career, hecura, gridfs},}
Parrot: Transparent User-Level Middleware for Data Intensive Computing
Douglas Thain and Miron Livny
In Workshop on Adaptive Grid Middleware at PACT, 2003
@inproceedings{parrot-agm2003,author={Thain, Douglas and Livny, Miron},title={{Parrot: Transparent User-Level Middleware for Data Intensive Computing}},booktitle={{Workshop on Adaptive Grid Middleware at PACT}},year={2003},cclpaperid={67},keywords={parrot, filesystems, career, gridfs},}
The Kangaroo Approach to Data Movement on the Grid
Douglas Thain, Jim Basney, Se-Chang Son, and Miron Livny
In IEEE High Performance Distributed Computing, 2001
@inproceedings{kangaroo-hpdc01,author={Thain, Douglas and Basney, Jim and Son, Se-Chang and Livny, Miron},title={{The Kangaroo Approach to Data Movement on the Grid}},booktitle={{IEEE High Performance Distributed Computing}},pages={325-333},year={2001},note={{doi: 10.1109/HPDC.2001.945200}},cclpaperid={73},keywords={filesystems, career, gridfs},}