I am a third-year Ph.D. student in the ECE department at the University of Southern California. I am currently a member of Networked Systems Lab and very fortunate to be advised by Prof. Ramesh Govindan. Generally, I am interested in every aspect of computer networks and systems. My recent focus is on datacenter networks, wide-area networks, network availability, and reliability.
Before joining USC, I completed my B.Sc. in Electrical Engineering at the Sharif University of Technology in 2019. During my undergraduate, I worked on the cloudification of telecommunication networks by leveraging SDN and NFV.
Feel free to drop me an email if you have any questions or want to discuss new ideas!
Graduate Research Assistant (Aug 2019 - present)
University of Southern California, Los Angeles, CA
Research Intern (Jun 2022 - present)
Microsoft Research, Redmond, WA
Mentors: Behnaz Arzani, Ryan Beckett, Srikanth Kandula
Research Intern (Jun 2021 - Jan 2022)
Microsoft Research, Redmond, WA
Mentors: Behnaz Arzani, Dan Crankshaw, Srikanth Kandula
Production systems use heuristics because they are faster or scale better than the corresponding optimal algorithms. Yet, practitioners are often unaware of how worse off a heuristic’s solution may be with respect to the optimum in realistic scenarios. Leveraging two-stage games and convex optimization, we present a provable framework that unveils settings where a given heuristic underperforms.
Oblivious routing distributes traffic from sources to destinations following predefined routes with rules independent of traffic demands. While finding optimal oblivious routing is intractable for general topologies, we show that it is tractable for structured topologies often used in datacenter networks. To achieve this, we apply graph automorphism and prove the existence of the optimal automorphism-invariant solution. This result reduces the search space to targeting the optimal automorphism-invariant solution. We design an iterative algorithm to obtain such a solution by alternating between two linear programs. The first program finds an automorphism-invariant solution based on representative variables and constraints, making the problem tractable. The second program generates adversarial demands to ensure the final result satisfies all possible demands. Since, the construction of the representative variables and constraints are combinatorial problems, we design polynomial-time algorithms for the construction. We evaluate proposed iterative algorithm in terms of throughput performance, scalability, and generality over three potential applications. The algorithm i) improves the throughput up to 87.5% over a heuristic algorithm for partially deployed FatTree, ii) scales for FatClique with a thousand switches, iii) is applicable to a general structured topology with non-uniform link capacity and server distribution.
While prior work has explored many proposed datacenter designs, only two designs, Clos-based and expander-based, are generally considered practical because they can scale using commodity switching chips. Prior work has used two different metrics, bisection bandwidth and throughput, for evaluating these topologies at scale. Little is known, theoretically or practically, how these metrics relate to each other. Exploiting characteristics of these topologies, we prove an upper bound on their throughput, then show that this upper bound better estimates worst-case throughput than all previously proposed throughput estimators and scales better than most of them. Using this upper bound, we show that for expander-based topologies, unlike Clos, beyond a certain size of the network, no topology can have full throughput, even if it has full bisection bandwidth; in fact, even relatively small expander-based topologies fail to achieve full throughput. We conclude by showing that using throughput to evaluate datacenter performance instead of bisection bandwidth can alter conclusions in prior work about datacenter cost, manageability, and reliability.