Scalable Data Analysis Applications for High Energy Physics
PIs: Douglas Thain and Kevin Lannon
For more than 10 years, we have collaborated with Prof. Kevin Lannon and the CMS physics group at Notre Dame to design and build large-scale data analysis applications that interpret data produced by the Compact Muon Solenoid (CMS) detector at CERN. These applications are both interesting and challenging from a computer science perspective: they must consume large quantities of data (terabytes to petabytes) and scale up to thousands of nodes in clusters, yet remain reliable and responsive to the end user. Our latest work uses the TaskVine framework, together with software such as Dask and Coffea, as the foundation for a variety of custom applications, including Lobster, TopEFT, DV4, and RS-Triphoton. We continue to innovate at the interface between computer science and physical science.
Related Publications
Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine
Barry Sly-Delgado, Ben Tovar, Jin Zhou, and Douglas Thain
In ACM/IEEE Supercomputing, 2024
@inproceedings{reshaping-sc-2024,author={Sly-Delgado, Barry and Tovar, Ben and Zhou, Jin and Thain, Douglas},title={{Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine}},booktitle={{ACM/IEEE Supercomputing}},pages={1-11},year={2024},cclpaperid={996},keywords={taskvine, hep},doi={10.1109/SC41406.2024.00068}}
Shepherd: Seamless Integration of Service Workflows into Task-Based Workflows through Log Monitoring
Saiful Islam and Douglas Thain
In Workshop on Workflows at ACM Supercomputing, 2024
@inproceedings{shepherd-works-2024,author={Islam, Saiful and Thain, Douglas},title={{Shepherd: Seamless Integration of Service Workflows into Task-Based Workflows through Log Monitoring}},booktitle={{Workshop on Workflows at ACM Supercomputing}},pages={1-8},year={2024},cclpaperid={997},keywords={shepherd},}
Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics
Ben Tovar, Ben Lyons, Kelci Mohrman, Barry Sly-Delgado, Kevin Lannon, and Douglas Thain
In IEEE International Parallel and Distributed Processing Symposium, 2022
@inproceedings{topeft-ipdps-2022,author={Tovar, Ben and Lyons, Ben and Mohrman, Kelci and Sly-Delgado, Barry and Lannon, Kevin and Thain, Douglas},title={{Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics}},booktitle={{IEEE International Parallel and Distributed Processing Symposium}},year={2022},cclpaperid={979},keywords={workqueue, hep},doi={10.1109/IPDPS53621.2022.00041}}
Analysis Cyberinfrastructure: Challenges and Opportunities
Kevin Lannon, Paul Brenner, Michael Hildreth, Kenyi Hurtado Anampa, Alan Malta Rodrigues, Kelci Mohrman, Douglas Thain, and Ben Tovar
@inproceedings{analysis-snowmass-2022,author={Lannon, Kevin and Brenner, Paul and Hildreth, Michael and Anampa, Kenyi Hurtado and Rodrigues, Alan Malta and Mohrman, Kelci and Thain, Douglas and Tovar, Ben},title={{Analysis Cyberinfrastructure: Challenges and Opportunities}},booktitle={{Snowmass}},pages={1-14},year={2022},cclpaperid={986},keywords={hep},}
Scaling Data Intensive Physics Applications to 10k Cores on Non-Dedicated Clusters with Lobster
Anna Woodard, Matthias Wolf, Charles Mueller, Nil Valls, Ben Tovar, Patrick Donnelly, Peter Ivie, Kenyi Hurtado Anampa, Paul Brenner, Douglas Thain, Kevin Lannon, and Michael Hildreth
In IEEE Conference on Cluster Computing, 2015
@inproceedings{lobster-cluster-2015,author={Woodard, Anna and Wolf, Matthias and Mueller, Charles and Valls, Nil and Tovar, Ben and Donnelly, Patrick and Ivie, Peter and Anampa, Kenyi Hurtado and Brenner, Paul and Thain, Douglas and Lannon, Kevin and Hildreth, Michael},title={{Scaling Data Intensive Physics Applications to 10k Cores on Non-Dedicated Clusters with Lobster}},booktitle={{IEEE Conference on Cluster Computing}},year={2015},cclpaperid={915},keywords={workqueue, hep},}