---
_id: '20125'
abstract:
- lang: eng
  text: Datacenter applications have different resource requirements from network and developing flow scheduling heuristics for every workload is practically infeasible. In this paper, we show that deep reinforcement learning (RL) can be used to efficiently learn flow scheduling policies for different workloads without manual feature engineering. Specifically, we present LFS, which learns to optimize a high-level performance objective, e.g., maximize the number of flow admissions while meeting the deadlines. The LFS scheduler is trained through deep RL to learn a scheduling policy on continuous online flow arrivals. The evaluation results show that the trained LFS scheduler admits 1.05x more flows than the greedy flow scheduling heuristics under varying network load.
author:
- first_name: Asif
  full_name: Hasnain, Asif
  id: '63288'
  last_name: Hasnain
- first_name: Holger
  full_name: Karl, Holger
  id: '126'
  last_name: Karl
citation:
  ama: 'Hasnain A, Karl H. Learning Flow Scheduling. In: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC). IEEE Computer Society. doi:https://doi.org/10.1109/CCNC49032.2021.9369514'
  apa: 'Hasnain, A., & Karl, H. (n.d.). Learning Flow Scheduling. In 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC). Las Vegas, USA: IEEE Computer Society. https://doi.org/10.1109/CCNC49032.2021.9369514'
  bibtex: '@inproceedings{Hasnain_Karl, title={Learning Flow Scheduling}, DOI={https://doi.org/10.1109/CCNC49032.2021.9369514}, booktitle={2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC)}, publisher={IEEE Computer Society}, author={Hasnain, Asif and Karl, Holger} }'
  chicago: Hasnain, Asif, and Holger Karl. “Learning Flow Scheduling.” In 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC). IEEE Computer Society, n.d. https://doi.org/10.1109/CCNC49032.2021.9369514.
  ieee: A. Hasnain and H. Karl, “Learning Flow Scheduling,” in 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, USA.
  mla: Hasnain, Asif, and Holger Karl. “Learning Flow Scheduling.” 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), IEEE Computer Society, doi:https://doi.org/10.1109/CCNC49032.2021.9369514.
  short: 'A. Hasnain, H. Karl, in: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), IEEE Computer Society, n.d.'
conference:
  end_date: 2021-01-12
  location: Las Vegas, USA
  name: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC)
  start_date: 2021-01-09
date_created: 2020-10-19T14:27:17Z
date_updated: 2022-01-06T06:54:20Z
ddc:
- '000'
department:
- _id: '75'
doi: https://doi.org/10.1109/CCNC49032.2021.9369514
keyword:
- Flow scheduling
- Deadlines
- Reinforcement learning
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9369514
project:
- _id: '4'
  name: SFB 901 - Project Area C
- _id: '16'
  name: SFB 901 - Subproject C4
- _id: '1'
  name: SFB 901
publication: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC)
publication_status: accepted
publisher: IEEE Computer Society
status: public
title: Learning Flow Scheduling
type: conference
user_id: '63288'
year: '2021'
...
---
_id: '21005'
abstract:
- lang: eng
  text: Data-parallel applications are developed using different data programming models, e.g., MapReduce, partition/aggregate. These models represent diverse resource requirements of application in a datacenter network, which can be represented by the coflow abstraction. The conventional method of creating hand-crafted coflow heuristics for admission or scheduling for different workloads is practically infeasible. In this paper, we propose a deep reinforcement learning (DRL)-based coflow admission scheme -- LCS -- that can learn an admission policy for a higher-level performance objective, i.e., maximize successful coflow admissions, without manual feature engineering. LCS is trained on a production trace, which has online coflow arrivals. The evaluation results show that LCS is able to learn a reasonable admission policy that admits more coflows than state-of-the-art Varys heuristic while meeting their deadlines.
author:
- first_name: Asif
  full_name: Hasnain, Asif
  id: '63288'
  last_name: Hasnain
- first_name: Holger
  full_name: Karl, Holger
  id: '126'
  last_name: Karl
citation:
  ama: 'Hasnain A, Karl H. Learning Coflow Admissions. In: IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE Communications Society. doi:10.1109/INFOCOMWKSHPS51825.2021.9484599'
  apa: 'Hasnain, A., & Karl, H. (n.d.). Learning Coflow Admissions. In IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). Vancouver BC Canada: IEEE Communications Society. https://doi.org/10.1109/INFOCOMWKSHPS51825.2021.9484599'
  bibtex: '@inproceedings{Hasnain_Karl, title={Learning Coflow Admissions}, DOI={10.1109/INFOCOMWKSHPS51825.2021.9484599}, booktitle={IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)}, publisher={IEEE Communications Society}, author={Hasnain, Asif and Karl, Holger} }'
  chicago: Hasnain, Asif, and Holger Karl. “Learning Coflow Admissions.” In IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE Communications Society, n.d. https://doi.org/10.1109/INFOCOMWKSHPS51825.2021.9484599.
  ieee: A. Hasnain and H. Karl, “Learning Coflow Admissions,” in IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Vancouver BC Canada.
  mla: Hasnain, Asif, and Holger Karl. “Learning Coflow Admissions.” IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), IEEE Communications Society, doi:10.1109/INFOCOMWKSHPS51825.2021.9484599.
  short: 'A. Hasnain, H. Karl, in: IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), IEEE Communications Society, n.d.'
conference:
  end_date: 2021-05-13
  location: Vancouver BC Canada
  name: IEEE INFOCOM 2021 - IEEE Conference on Computer Communications
  start_date: 2021-05-10
date_created: 2021-01-16T18:24:19Z
date_updated: 2022-01-06T06:54:42Z
ddc:
- '000'
department:
- _id: '75'
doi: 10.1109/INFOCOMWKSHPS51825.2021.9484599
keyword:
- Coflow scheduling
- Reinforcement learning
- Deadlines
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9484599
project:
- _id: '16'
  name: SFB 901 - Subproject C4
- _id: '4'
  name: SFB 901 - Project Area C
- _id: '1'
  name: SFB 901
publication: IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)
publication_status: accepted
publisher: IEEE Communications Society
related_material:
  link:
  - relation: confirmation
    url: https://ieeexplore.ieee.org/document/9484599
status: public
title: Learning Coflow Admissions
type: conference
user_id: '63288'
year: '2021'
...
---
_id: '17082'
abstract:
- lang: eng
  text: Data-parallel applications run on cluster of servers in a datacenter and their communication triggers correlated resource demand on multiple links that can be abstracted as coflow. They often desire predictable network performance, which can be passed to network via coflow abstraction for application-aware network scheduling. In this paper, we propose a heuristic and an optimization algorithm for predictable network performance such that they guarantee coflows completion within their deadlines. The algorithms also ensure high network utilization, i.e., it's work-conserving, and avoids starvation of coflows. We evaluate both algorithms via trace-driven simulation and show that they admit 1.1x more coflows than the Varys scheme while meeting their deadlines.
author:
- first_name: Asif
  full_name: Hasnain, Asif
  id: '63288'
  last_name: Hasnain
- first_name: Holger
  full_name: Karl, Holger
  id: '126'
  last_name: Karl
citation:
  ama: 'Hasnain A, Karl H. Coflow Scheduling with Performance Guarantees for Data Center Applications. In: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). IEEE Computer Society; 2020. doi:https://doi.org/10.1109/CCGrid49817.2020.00010'
  apa: 'Hasnain, A., & Karl, H. (2020). Coflow Scheduling with Performance Guarantees for Data Center Applications. In 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). Melbourne, Australia: IEEE Computer Society. https://doi.org/10.1109/CCGrid49817.2020.00010'
  bibtex: '@inproceedings{Hasnain_Karl_2020, title={Coflow Scheduling with Performance Guarantees for Data Center Applications}, DOI={https://doi.org/10.1109/CCGrid49817.2020.00010}, booktitle={2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID)}, publisher={IEEE Computer Society}, author={Hasnain, Asif and Karl, Holger}, year={2020} }'
  chicago: Hasnain, Asif, and Holger Karl. “Coflow Scheduling with Performance Guarantees for Data Center Applications.” In 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). IEEE Computer Society, 2020. https://doi.org/10.1109/CCGrid49817.2020.00010.
  ieee: A. Hasnain and H. Karl, “Coflow Scheduling with Performance Guarantees for Data Center Applications,” in 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), Melbourne, Australia, 2020.
  mla: Hasnain, Asif, and Holger Karl. “Coflow Scheduling with Performance Guarantees for Data Center Applications.” 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), IEEE Computer Society, 2020, doi:https://doi.org/10.1109/CCGrid49817.2020.00010.
  short: 'A. Hasnain, H. Karl, in: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), IEEE Computer Society, 2020.'
conference:
  location: Melbourne, Australia
  name: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID)
date_created: 2020-06-06T07:40:45Z
date_updated: 2022-01-06T06:53:04Z
ddc:
- '000'
department:
- _id: '75'
doi: https://doi.org/10.1109/CCGrid49817.2020.00010
keyword:
- Coflow
- Scheduling
- Deadlines
- Data centers
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/abstract/document/9139642
project:
- _id: '4'
  name: SFB 901 - Project Area C
- _id: '16'
  name: SFB 901 - Subproject C4
- _id: '1'
  name: SFB 901
publication: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID)
publication_status: published
publisher: IEEE Computer Society
status: public
title: Coflow Scheduling with Performance Guarantees for Data Center Applications
type: conference
user_id: '63288'
year: '2020'
...