Pooria Namyar

Ph.D. Student, NSL@USC

  • (2019- ) Ph.D. in Electrical Engineering, University of Southern California, Los Angeles, USA.
  • (2019-2023) M.Sc. in Computer Science, University of Southern California, Los Angeles, USA.
  • (2014-2019) B.Sc. in Electrical Engineering, Sharif University of Technology, Tehran, IRAN.

I am currently on the job market.

[CV], [Research Statement], [Teaching Statement]

Sketch

I am a Ph.D. student in the ECE department at the University of Southern California. I am currently a member of Networked Systems Lab and very fortunate to be advised by Prof. Ramesh Govindan. My research is in the intersection of theory, systems, and machine learning. I focus on studying complex systems and enhancing their performance and availability at scale. Toward this goal, I develop (1) practical, scalable algorithms and optimization methods with formal guarantees and (2) systems and methods to identify, explain, and effectively address performance gaps in both handcrafted and learning-enabled heuristics.

Before joining USC, I completed my B.Sc. in Electrical Engineering at the Sharif University of Technology in 2019. During my undergraduate, I worked on the cloudification of telecommunication networks by leveraging SDN and NFV.

Feel free to drop me an email if you have any questions or want to discuss new ideas!

Work Experience

Graduate Research Assistant (Aug 2019 - present)
University of Southern California, Los Angeles, CA

Research Intern (Sep 2023 - present)
Google, Sunnyvale, CA
Mentors: Devdeep Ray, Yuliang Li, KK Yap, Nandita Dukkipati

Research Intern & Student Researcher (Jun 2022 - Jan 2023)
Microsoft Research, Redmond, WA
Mentors: Behnaz Arzani, Ryan Beckett, Srikanth Kandula

Research Intern & Student Researcher (Jun 2021 - Jan 2022)
Microsoft Research, Redmond, WA
Mentors: Behnaz Arzani, Dan Crankshaw, Srikanth Kandula

Awards

  • Google PhD Fellowship in Networking, 2024
  • Machine Learning and Systems Rising Star, MLCommons, 2024
  • Ming Hsieh Institute (MHI) Scholarship, University of Southern California, 2023 - 2024.
  • Graduate Student Annenberg Fellowship, University of Southern California, 2019 - 2023.

Selected Publications

  1. NSDI
    Solving Max-Min Fair Resource Allocations Quickly on Large Graphs
    Namyar, Pooria, Arzani, Behnaz, Kandula, Srikanth, Segarra, Santiago, Crankshaw, Daniel, Krishnaswamy, Umesh, Govindan, Ramesh, and Raj, Himanshu
    In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24) Apr 2024

    We consider the max-min fair resource allocation problem. The best-known solutions use either a sequence of optimizations or waterfilling, which only applies to a narrow set of cases. These solutions have become a practical bottleneck in WAN traffic engineering and cluster scheduling, especially at larger problem sizes. We improve both approaches: (1) we show how to convert the optimization sequence into a single fast optimization, and (2) we generalize waterfilling to the multi-path case. We empirically show our new algorithms Pareto-dominate prior techniques: they produce faster, fairer, and more efficient allocations. Some of our allocators also have theoretical guarantees: they trade off a bounded amount of unfairness for faster allocation. We have deployed our allocators in Azure’s WAN traffic engineering pipeline, where we preserve solution quality and achieve a roughly 3× speedup.

  2. NSDI
    Finding Adversarial Inputs for Heuristics using Multi-level Optimization
    Namyar, Pooria, Arzani, Behnaz, Beckett, Ryan, Segarra, Santiago, Raj, Himanshu, Krishnaswamy, Umesh, Govindan, Ramesh, and Kandula, Srikanth
    In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24) Apr 2024

    Production systems use heuristics because they are faster or scale better than their optimal counterparts. Yet, practitioners are often unaware of the performance gap between a heuristic and the optimum or between two heuristics in realistic scenarios. We present MetaOpt, a system that helps analyze heuristics. Users specify the heuristic and the optimal (or another heuristic) as input, and MetaOpt automatically encodes these efficiently for a solver to find performance gaps and their corresponding adversarial inputs. Its suite of built-in optimizations helps it scale its analysis to practical problem sizes. To show it is versatile, we used MetaOpt to analyze heuristics from three domains (traffic engineering, vector bin packing, and packet scheduling). We found a production traffic engineering heuristic can require 30% more capacity than the optimal to satisfy realistic demands. Based on the patterns in the adversarial inputs MetaOpt produced, we modified the heuristic to reduce its performance gap by 12.5×. We examined adversarial inputs to a vector bin packing heuristic and proved a new lower bound on its performance.

  3. SIGCOMM
    A Throughput-Centric View of the Performance of Datacenter Topologies
    Namyar, Pooria, Supittayapornpong, Sucha, Zhang, Mingyang, Yu, Minlan, and Govindan, Ramesh
    In Proceedings of the 2021 ACM SIGCOMM 2021 Conference Apr 2021

    While prior work has explored many proposed datacenter designs, only two designs, Clos-based and expander-based, are generally considered practical because they can scale using commodity switching chips. Prior work has used two different metrics, bisection bandwidth and throughput, for evaluating these topologies at scale. Little is known, theoretically or practically, how these metrics relate to each other. Exploiting characteristics of these topologies, we prove an upper bound on their throughput, then show that this upper bound better estimates worst-case throughput than all previously proposed throughput estimators and scales better than most of them. Using this upper bound, we show that for expander-based topologies, unlike Clos, beyond a certain size of the network, no topology can have full throughput, even if it has full bisection bandwidth; in fact, even relatively small expander-based topologies fail to achieve full throughput. We conclude by showing that using throughput to evaluate datacenter performance instead of bisection bandwidth can alter conclusions in prior work about datacenter cost, manageability, and reliability.

Other Publications

  1. NSDI
    Enhancing Network Failure Mitigation with Performance-Aware Ranking
    Namyar, Pooria, Ghavidel, Arvin, Crankshaw, Daniel, Berger, Daniel S, Hsieh, Kevin, Kandula, Srikanth, Govindan, Ramesh, and Arzani, Behnaz
    In 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI 25) 2025

    Cloud providers install mitigations to reduce the impact of network failures within their datacenters. Existing network mitigation systems rely on simple local criteria or global proxy metrics to determine the best action. In this paper, we show that we can support a broader range of actions and select more effective mitigations by directly optimizing end-to-end flow-level metrics and analyzing actions holistically. To achieve this, we develop novel techniques to quickly estimate the impact of different mitigations and rank them with high fidelity. Our results on incidents from a large cloud provider show orders of magnitude improvements in flow completion time and throughput. We also show our approach scales to large datacenters.

  2. NSDI
    Everything Matters in Programmable Packet Scheduling
    Alcoz, Albert Gran, Vass, Balázs, Namyar, Pooria, Arzani, Behnaz, Rétvári, Gábor, and Vanbever, Laurent
    In 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI 25) 2025

    Operators can deploy any scheduler they desire on existing switches through programmable packet schedulers: they tag packets with ranks (which indicate their priority) and schedule them in the order of these ranks. The ideal programmable scheduler is the Push-In First-Out (PIFO) queue, which schedules packets in a perfectly sorted order by “pushing” packets into any position of the queue based on their ranks. However, it is hard to implement PIFO queues in hardware due to their need to sort packets at line rate (based on their ranks). Recent proposals approximate PIFO behaviors on existing data-planes. While promising, they fail to simultaneously capture both of the necessary behaviors of PIFO queues: their scheduling behavior and admission control. We introduce PACKS, an approximate PIFO scheduler that addresses this problem. PACKS runs on top of a set of priority queues and uses packet-rank information and queue-occupancy levels during enqueue to determine whether to admit each incoming packet and to which queue it should be mapped. We fully implement PACKS in P4 and evaluate it on real workloads. We show that PACKS better approximates PIFO than state-of-the-art approaches. Specifically, PACKS reduces the rank inversions by up to 7× and 15× with respect to SP-PIFO and AIFO, and the number of packet drops by up to 60% compared to SP-PIFO. Under pFabric ranks, PACKS reduces the mean FCT across small flows by up to 33% and 2.6×, compared to SP-PIFO and AIFO. We also show that PACKS runs at line rate on existing hardware (Intel Tofino).

  3. HotNets
    End-to-End Performance Analysis of Learning-enabled Systems
    Namyar, Pooria, Schapira, Michael, Govindan, Ramesh, Segarra, Santiago, Beckett, Ryan, Kakarla, Siva Kesava Reddy, and Arzani, Behnaz
    In Proceedings of the 23rd ACM Workshop on Hot Topics in Networks 2024

    We propose a performance analysis tool for learning-enabled systems that allows operators to uncover potential performance issues before deploying DNNs in their systems. The tools that exist for this purpose require operators to faithfully model all components (a white-box approach) or do inefficient black-box local search. We propose a gray-box alternative, which eliminates the need to precisely model all the system’s components. Our approach is faster and finds substantially worse scenarios compared to prior work. We show that a state-of-the-art learning-enabled traffic engineering pipeline can underperform the optimal by 6\texttimes — a much higher number compared to what the authors found.

  4. HotNets
    Towards Safer Heuristics With XPlain
    Karimi, Pantea, Pirelli, Solal, Kakarla, Siva Kesava Reddy, Beckett, Ryan, Segarra, Santiago, Li, Beibin, Namyar, Pooria, and Arzani, Behnaz
    In Proceedings of the 23rd ACM Workshop on Hot Topics in Networks 2024

    Many problems that cloud operators solve are computationally expensive, and operators often use heuristic algorithms (that are faster and scale better than optimal) to solve them more efficiently. Heuristic analyzers enable operators to find when and by how much their heuristics underperform. However, these tools do not provide enough detail for operators to mitigate the heuristic’s impact in practice: they only discover a single input instance that causes the heuristic to underperform (and not the full set) and they do not explain why.We propose XPlain, a tool that extends these analyzers and helps operators understand when and why their heuristics underperform. We present promising initial results that show such an extension is viable.

  5. ToN
    Optimal Oblivious Routing With Concave Objectives for Structured Networks
    Chitavisutthivong, Kanatip, Supittayapornpong, Sucha, Namyar, Pooria, Zhang, Mingyang, Yu, Minlan, and Govindan, Ramesh
    IEEE/ACM Transactions on Networking 2023

    Oblivious routing distributes traffic from sources to destinations following predefined routes with rules independent of traffic demands. While finding optimal oblivious routing with a concave objective is intractable for general topologies, we show that it is tractable for structured topologies often used in datacenter networks. To achieve this, we apply graph automorphism and prove the existence of the optimal automorphism-invariant solution. This result reduces the search space to targeting the optimal automorphism-invariant solution. We design an iterative algorithm to obtain such a solution by alternating between convex optimization and a linear program. The convex optimization finds an automorphism-invariant solution based on representative variables and constraints, making the problem tractable. The linear program generates adversarial demands to ensure the final result satisfies all possible demands. Since the construction of the representative variables and constraints are combinatorial problems, we design polynomial-time algorithms for the construction. We evaluate the iterative algorithm in terms of throughput performance, scalability, and generality over three potential applications. The algorithm i) improves the throughput up to 87.5% for partially deployed FatTree and achieves up to 2.55\times throughput gain for DRing over heuristic algorithms, ii) scales for three considered topologies with a thousand switches, iii) applies to a general structured topology with non-uniform link capacity and server distribution.

  6. HotNets
    Minding the Gap between Fast Heuristics and Their Optimal Counterparts
    Namyar, Pooria, Arzani, Behnaz, Beckett, Ryan, Segarra, Santiago, Raj, Himanshu, and Kandula, Srikanth
    In Proceedings of the 21st ACM Workshop on Hot Topics in Networks 2022

    Production systems use heuristics because they are faster or scale better than the corresponding optimal algorithms. Yet, practitioners are often unaware of how worse off a heuristic’s solution may be with respect to the optimum in realistic scenarios. Leveraging two-stage games and convex optimization, we present a provable framework that unveils settings where a given heuristic underperforms.

  7. INFOCOM
    Optimal Oblivious Routing for Structured Networks
    Supittayapornpong, Sucha, Namyar, Pooria, Zhang, Mingyang, Yu, Minlan, and Govindan, Ramesh
    In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications 2022

    Oblivious routing distributes traffic from sources to destinations following predefined routes with rules independent of traffic demands. While finding optimal oblivious routing is intractable for general topologies, we show that it is tractable for structured topologies often used in datacenter networks. To achieve this, we apply graph automorphism and prove the existence of the optimal automorphism-invariant solution. This result reduces the search space to targeting the optimal automorphism-invariant solution. We design an iterative algorithm to obtain such a solution by alternating between two linear programs. The first program finds an automorphism-invariant solution based on representative variables and constraints, making the problem tractable. The second program generates adversarial demands to ensure the final result satisfies all possible demands. Since, the construction of the representative variables and constraints are combinatorial problems, we design polynomial-time algorithms for the construction. We evaluate proposed iterative algorithm in terms of throughput performance, scalability, and generality over three potential applications. The algorithm i) improves the throughput up to 87.5% over a heuristic algorithm for partially deployed FatTree, ii) scales for FatClique with a thousand switches, iii) is applicable to a general structured topology with non-uniform link capacity and server distribution.