{"publisher":"Association for Computing Machinery","language":[{"iso":"eng"}],"keyword":["electron repulsion integrals","quantum chemistry","atomistic simulation","overlay architecture","fpga acceleration"],"place":"New York, NY, USA","type":"conference","year":"2026","publication":"Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA '26)","date_created":"2026-02-06T06:43:22Z","abstract":[{"lang":"eng","text":"The computation of highly contracted electron repulsion integrals (ERIs) is essential to achieve quantum accuracy in atomistic simulations based on quantum mechanics. Its growing computational demands make energy efficiency a critical concern. Recent studies demonstrate FPGAs’ superior performance and energy efficiency for computing primitive ERIs, but the computation of highly contracted ERIs introduces significant algorithmic complexity and new design challenges for FPGA acceleration. In this work, we present SORCERI, the first streaming overlay acceleration for highly contracted ERI computations on FPGAs. SORCERI introduces a novel streaming Rys computing unit to calculate roots and weights of Rys polynomials on-chip, and a streaming contraction unit for the contraction of primitive ERIs. This shifts the design bottleneck from limited CPU-FPGA communication bandwidth to available FPGA computation resources. To address practical deployment challenges for a large number of quartet classes, we design three streaming overlays, together with an efficient memory transpose optimization, to cover the 21 most commonly used quartet classes in realistic atomistic simulations. 
To address the new computation constraints, we use flexible calculation stages with a free-running streaming architecture to achieve high DSP utilization and good timing closure. Experiments demonstrate that SORCERI achieves an average 5.96x, 1.99x, and 1.16x better performance per watt than libint on a 64-core AMD EPYC 7713 CPU, libintx on an Nvidia A40 GPU, and SERI, the prior best-performing FPGA design for primitive ERIs. Furthermore, SORCERI reaches a peak throughput of 44.11 GERIS (10^9 ERIs per second) that is 1.52x, 1.13x, and 1.93x greater than libint, libintx and SERI, respectively. SORCERI will be released soon at https://github.com/SFU-HiAccel/SORCERI."}],"project":[{"_id":"52","name":"Computing Resources Provided by the Paderborn Center for Parallel Computing"}],"status":"public","main_file_link":[{"url":"https://dl.acm.org/doi/10.1145/3748173.3779198"}],"title":"SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry","author":[{"last_name":"Stachura","first_name":"Philip","full_name":"Stachura, Philip"},{"id":"77439","full_name":"Wu, Xin","first_name":"Xin","last_name":"Wu"},{"id":"16153","last_name":"Plessl","first_name":"Christian","full_name":"Plessl, Christian","orcid":"0000-0001-5728-9982"},{"full_name":"Fang, Zhenman","last_name":"Fang","first_name":"Zhenman"}],"citation":{"bibtex":"@inproceedings{Stachura_Wu_Plessl_Fang_2026, place={New York, NY, USA}, title={SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry}, DOI={10.1145/3748173.3779198}, booktitle={Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26)}, publisher={Association for Computing Machinery}, author={Stachura, Philip and Wu, Xin and Plessl, Christian and Fang, Zhenman}, year={2026}, pages={224–234} }","chicago":"Stachura, Philip, Xin Wu, Christian Plessl, and Zhenman Fang. 
“SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry.” In Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26), 224–34. New York, NY, USA: Association for Computing Machinery, 2026. https://doi.org/10.1145/3748173.3779198.","short":"P. Stachura, X. Wu, C. Plessl, Z. Fang, in: Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26), Association for Computing Machinery, New York, NY, USA, 2026, pp. 224–234.","apa":"Stachura, P., Wu, X., Plessl, C., & Fang, Z. (2026). SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry. Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26), 224–234. https://doi.org/10.1145/3748173.3779198","ama":"Stachura P, Wu X, Plessl C, Fang Z. SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry. In: Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26). Association for Computing Machinery; 2026:224-234. doi:10.1145/3748173.3779198","mla":"Stachura, Philip, et al. “SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry.” Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26), Association for Computing Machinery, 2026, pp. 224–34, doi:10.1145/3748173.3779198.","ieee":"P. Stachura, X. Wu, C. Plessl, and Z. Fang, “SORCERI: Streaming Overlay Acceleration for Highly Contracted Electron Repulsion Integral Computations in Quantum Chemistry,” in Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’26), 2026, pp. 
224–234, doi: 10.1145/3748173.3779198."},"date_updated":"2026-02-09T09:16:32Z","page":"224-234","department":[{"_id":"27"},{"_id":"518"}],"publication_status":"published","publication_identifier":{"isbn":["9798400720796"]},"user_id":"77439","doi":"10.1145/3748173.3779198","_id":"63890"}