Construct boundaries and place labels for multi-class scatterplots

2021 
Drawing boundaries and adding text labels for each class in a multi-class scatterplot are two common steps that help people perceive and understand the class-level spatial and semantic information hidden in the scatterplot. However, massive numbers of data points, highly overlapping classes, widespread outliers, and extremely non-uniform point density lead to readability and scalability issues with existing methods. In this paper, we propose a set of methods that form a three-step framework to overcome these issues. Our methods make boundaries compact, readable, and controllable, and find a position for each label that matches human visual preference. In the first step, we use an MST-based clustering algorithm to further divide classes into clusters and to remove class-level outliers, which avoids distorting the boundaries. A stroke-based interaction is integrated into the algorithm, allowing the user to quickly correct the identified clusters or to materialize the clusters they have in mind. In the second step, we design a grid-based boundary construction pipeline that lets the user tighten each boundary around the main distribution region of its class in a controlled manner by gradually filtering out cluster-level outliers. Gridding improves scalability with respect to the number of data points and helps users gain insights by generating different class distributions based on a relative or absolute density threshold. In the third step, we combine three factors, namely the boundary of the target cluster, the boundary of the label, and the density distribution of the target cluster, to place the label closer to its visually ideal position. Rich illustrations and two case studies demonstrate the effectiveness of our methods.
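The grid-based filtering in the second step can be illustrated with a minimal sketch. This is not the paper's pipeline: the function names `grid_density` and `dense_cells`, the square cell layout, the sample points, and the rule that a relative threshold is a fraction of the densest cell's count are all assumptions made for illustration.

```python
from collections import Counter
from math import floor


def grid_density(points, cell_size):
    """Count how many points fall into each square grid cell."""
    counts = Counter()
    for x, y in points:
        counts[(floor(x / cell_size), floor(y / cell_size))] += 1
    return counts


def dense_cells(counts, threshold, relative=True):
    """Return the set of cells whose point count passes the threshold.

    relative=True: threshold is a fraction of the densest cell's count
                   (an assumed definition of "relative density").
    relative=False: threshold is a minimum point count per cell.
    """
    if not counts:
        return set()
    cutoff = threshold * max(counts.values()) if relative else threshold
    return {cell for cell, n in counts.items() if n >= cutoff}


# Hypothetical data: a tight group near the origin plus two stray points.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4), (1.2, 0.1), (5.5, 5.5)]
counts = grid_density(points, cell_size=1.0)

# Relative threshold: keep cells holding at least half the densest cell's count.
kept_relative = dense_cells(counts, threshold=0.5, relative=True)

# Absolute threshold: keep every cell holding at least one point.
kept_absolute = dense_cells(counts, threshold=1, relative=False)
```

Raising the threshold step by step drops sparse cells first, which is one way to picture a boundary tightening around the main distribution region. Because the work after binning depends on the number of occupied cells rather than the number of points, this kind of gridding stays cheap as the data grows.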