Abstract
Performance issues in cloud systems are hard to debug. Distributed tracing is a widely adopted approach that gives engineers visibility into cloud systems. Existing trace analysis approaches focus on debugging single request correctness issues but not debugging single request performance issues. Diagnosing a performance issue in a given request requires comparing the performance of the offending request with the aggregate performance of typical requests. Effective and efficient debugging of such issues faces three challenges: (i) identifying the correct aggregate data for diagnosis; (ii) visualizing the aggregated data; and (iii) efficiently collecting, storing, and processing trace data. We present TraVista, a tool designed for debugging performance issues in a single trace that addresses these challenges. TraVista extends the popular single trace Gantt chart visualization with three types of aggregate data - metric, temporal, and structure data, to contextualize the performance of the offending trace across all traces.
📄 Full Paper Available as PDF
This paper is available as a downloadable PDF.
📄 Download PDF
Comments (0)
No comments yet. Be the first to comment.