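Author: 苏有熊
I came across an excellent gallery of chart examples on GitHub and wanted to share it.
It shows a complete and varied range of chart types. Studying the examples and reading their explanatory text is a good way to deepen your understanding of visualization.
The original text is in English; where a translation reads awkwardly, rely on the English wording.
The full gallery is at:
https://vega.github.io/vega/
The source URL for each individual chart is listed below the chart's name.
The gallery is covered in two posts; this is the first.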
Bar Charts
Bar Chart
https://vega.github.io/vega/examples/bar-chart/
Stacked Bar Chart
https://vega.github.io/vega/examples/
A stacked bar chart depicts the sum of a series of quantitative values using layered bars, while still enabling inspection of individual series.
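The layering can be sketched as a running sum per category. This is a minimal illustration in plain Python, not the gallery's spec; the field names `x`, `series`, and `y` and the sample rows are assumptions, and the `y0`/`y1` outputs mirror the start and end offsets that a stacking step typically produces.

```python
from collections import defaultdict

def stack(rows, group_key, value_key):
    """Return copies of rows with y0/y1 offsets stacked within each group."""
    totals = defaultdict(float)  # running height of each bar so far
    stacked = []
    for row in rows:
        y0 = totals[row[group_key]]
        y1 = y0 + row[value_key]
        totals[row[group_key]] = y1
        stacked.append({**row, "y0": y0, "y1": y1})
    return stacked

# Hypothetical sample data: two series in each of two categories.
rows = [
    {"x": "A", "series": 0, "y": 28},
    {"x": "A", "series": 1, "y": 55},
    {"x": "B", "series": 0, "y": 43},
    {"x": "B", "series": 1, "y": 91},
]
stacked = stack(rows, "x", "y")
```

Each segment is drawn from `y0` to `y1`, so the top of the last segment in a category is that category's total while every segment keeps its own height.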
Grouped Bar Chart
https://vega.github.io/vega/examples/grouped-bar-chart/
This grouped bar chart facets the data into groups, then creates a bar chart for each sub-group.
Nested Bar Chart
https://vega.github.io/vega/examples/nested-bar-chart/
This nested bar chart depicts aggregated values across multiple categories. The input data is subdivided according to two fields (with uneven category membership). Each sub-group is then aggregated to show the average value of a third, quantitative field.
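The two-level subdivision followed by an average might look like the sketch below. The field names `a`, `b`, and `c` and the sample rows are hypothetical; the point is only that sub-groups with no rows never appear, which is what uneven category membership means in practice.

```python
from collections import defaultdict
from statistics import mean

def nested_mean(rows, outer, inner, value):
    """Average `value` within each (outer, inner) sub-group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[outer], row[inner])].append(row[value])
    return {key: mean(vals) for key, vals in groups.items()}

# Hypothetical sample data: category 1 has no "y" members.
rows = [
    {"a": 0, "b": "x", "c": 6.0},
    {"a": 0, "b": "x", "c": 2.0},
    {"a": 0, "b": "y", "c": 5.0},
    {"a": 1, "b": "x", "c": 8.0},
]
averages = nested_mean(rows, "a", "b", "c")
```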
Population Pyramid
https://vega.github.io/vega/examples/population-pyramid/
A population pyramid shows the distribution of age groups within a population. This example shows males and females across 150 years of U.S. census data. Drag the slider to see the U.S. population change over time (but watch out for missing data in 1890!).
Line & Area Charts
Line Chart
https://vega.github.io/vega/examples/line-chart/
Line charts are used to depict changing values, with line slopes conveying rates of change. Different interpolators change the curvature of the line. Options such as cardinal or Catmull-Rom interpolation can produce pleasing curves, but can also “hallucinate” maximum or minimum values that do not exist in the data. Use monotone interpolation for smooth curves that faithfully preserve monotonicity.
Area Chart
https://vega.github.io/vega/examples/area-chart/
An area chart uses a filled shape to show changes in a quantitative value.
Stacked Area Chart
https://vega.github.io/vega/examples/stacked-area-chart/
A stacked area chart depicts the sum of a series of quantitative values using layered areas, while still enabling inspection of individual series.
Horizon Graph
https://vega.github.io/vega/examples/horizon-graph/
By dividing an area chart into consecutive layers, horizon graphs present time-series data in a compact space while preserving resolution. Click the chart to change the number of layers. Though the chart size changes, the spatial resolution of the area chart stays constant.
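One way to picture the layering: each layer shows the part of the value that falls inside its band, clamped to the band height, and the layers are then drawn on top of one another. The sketch below assumes non-negative values and equal-height bands; the function name and sample values are illustrative, not taken from the example's spec.

```python
def horizon_layers(values, n_layers):
    """Split non-negative values into n_layers bands of equal height."""
    band = max(values) / n_layers
    return [
        # Portion of each value inside band i, clamped to [0, band].
        [min(max(v - i * band, 0.0), band) for v in values]
        for i in range(n_layers)
    ]

values = [0, 3, 6, 9, 12]  # hypothetical time series
layers = horizon_layers(values, n_layers=3)
```

Summing the layers column by column gives back the original series, which is why no resolution is lost when the chart is folded into a shorter space.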
Job Voyager
https://vega.github.io/vega/examples/job-voyager/
The Job Voyager is an interactive stacked graph of occupations based on 150 years of U.S. census data. Either click elements or type queries to explore. The search box supports regular expressions; for example, the query "ist$" finds all jobs ending with “ist”.
Circular Charts
Pie Chart
https://vega.github.io/vega/examples/pie-chart/
A pie chart encodes proportional differences among a set of numeric values as the angular extent and area of a circular slice.
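The angular encoding amounts to giving each value a share of the full circle proportional to its share of the total. A minimal sketch, with an illustrative function name and sample values:

```python
import math

def pie_angles(values):
    """Return (start, end) angles in radians for each slice."""
    total = sum(values)
    slices = []
    start = 0.0
    for v in values:
        end = start + 2 * math.pi * v / total
        slices.append((start, end))
        start = end
    return slices

slices = pie_angles([4, 6, 10])
```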
Donut Chart
https://vega.github.io/vega/examples/donut-chart/
A donut chart encodes proportional differences among a set of numeric values using angular extents.
Radial Plot
https://vega.github.io/vega/examples/radial-plot/
This radial plot uses both angular and radial extent to convey multiple dimensions of data. However, this approach is not perceptually effective, as viewers will most likely be drawn to the total area of the shape, conflating the two dimensions. This example also demonstrates one way to add labels to circular plots.
Dot & Scatter Plots
Scatter Plot
https://vega.github.io/vega/examples/scatter-plot/
Scatter plots are ideal for visualizing the relationship between two quantitative variables. This example plots horsepower vs. mileage for a data set of cars. A size encoding is used to additionally depict acceleration.
Scatter Plot Null Values
https://vega.github.io/vega/examples/scatter-plot-null-values/
A configurable scatter plot of movie statistics, including IMDB and Rotten Tomatoes review scores. Null values in one or more dimensions are depicted along the margins to better convey missing values. Tooltips are included for interactive inspection of individual movies.
Connected Scatter Plot
https://vega.github.io/vega/examples/connected-scatter-plot/
A connected scatter plot uses line segments to connect consecutive scatter plot points, for example to illustrate trajectories over time. This example shows the shifting relationship between the price of gas and the average number of miles driven in a year, adapted from “Driving Shifts Into Reverse” by Hannah Fairfield, The New York Times (May 2, 2010).
Error Bars
https://vega.github.io/vega/examples/error-bars/
A dot plot of average yields for a variety of barley strains, with error bars indicating the spread of values. Vega can visualize pre-calculated error ranges or apply a number of standard measures. Use the drop-down menu to visualize different measures of spread, including the 95% confidence interval of the mean (calculated via bootstrapping), standard error, standard deviation, and the interquartile range.
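To make the measures of spread concrete, here is a rough sketch using only Python's standard library. It is not how Vega computes them: the percentile bootstrap, the resample count, the fixed seed, and the sample values are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, stdev

def spread_measures(values, n_boot=1000, seed=0):
    """Standard deviation, standard error, and a bootstrap 95% CI of the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sd = stdev(values)         # sample standard deviation
    se = sd / math.sqrt(len(values))
    # Percentile bootstrap: resample with replacement, take the mean each time.
    boot = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lo = boot[int(0.025 * n_boot)]
    hi = boot[int(0.975 * n_boot) - 1]
    return {"mean": mean(values), "stdev": sd, "stderr": se, "ci95": (lo, hi)}

spread = spread_measures([2, 4, 4, 4, 5, 5, 7, 9])  # hypothetical yields
```

An error bar is then drawn from `mean - stderr` to `mean + stderr`, from `mean - stdev` to `mean + stdev`, or across the `ci95` pair, depending on the measure selected.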
Barley Trellis Plot
https://vega.github.io/vega/examples/barley-trellis-plot/
A trellis plot subdivides a chart into small multiples to isolate specific subsets and promote comparison. This example shows barley yields by variety at different sites, adapted from the original Trellis Display article by Becker et al.
Distributions
Histogram
https://vega.github.io/vega/examples/histogram/
A histogram subdivides a numerical range into bins, and counts the number of data points within each segment. The resulting bar chart provides a discrete estimate of the probability density function.
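The binning step can be sketched in a few lines. The parameter names `start`, `stop`, and `step` and the sample values are illustrative; bins are half-open except the last, which also includes the upper bound.

```python
import math

def histogram(values, start, stop, step):
    """Count values falling into equal-width bins covering [start, stop]."""
    n_bins = math.ceil((stop - start) / step)
    counts = [0] * n_bins
    for v in values:
        if start <= v <= stop:
            # Clamp so that v == stop lands in the last bin.
            i = min(int((v - start) // step), n_bins - 1)
            counts[i] += 1
    return counts

counts = histogram([1, 2, 2, 3, 5, 8, 9, 10], start=0, stop=10, step=2)
```

Dividing each count by `len(values) * step` would turn the bar heights into a density estimate whose total area is 1.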
Histogram Null Values
https://vega.github.io/vega/examples/histogram-null-values/
This example demonstrates a histogram over a numerical range, with a segment to show the prevalence of null values.
Probability Density
https://vega.github.io/vega/examples/probability-density/
Visual comparison of estimated probability distributions for a sample of numeric values: a normal (Gaussian) distribution parameterized by the mean and standard deviation, and a kernel density estimate. This example supports estimates of either probability density functions (pdf) or cumulative distribution functions (cdf), using Vega’s density transform.
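A kernel density estimate places a small bump on every sample point and averages the bumps. The sketch below uses a Gaussian kernel with a hand-picked bandwidth; both choices, along with the sample values, are assumptions for illustration and not a description of Vega's density transform.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function estimating the probability density of `sample`."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density

density = gaussian_kde([1.0, 2.0, 2.5, 4.0], bandwidth=0.5)
```

Evaluating `density` on a grid of x values gives the curve to plot; a smaller bandwidth follows the sample more closely, a larger one smooths more.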
Box Plot
https://vega.github.io/vega/examples/box-plot/
A box plot summarizes a distribution of quantitative values using a set of summary statistics. Here, the boxes show the interquartile range (IQR), with the white bar indicating the median value. The thin lines (“whiskers”) currently show the extent of the minimum and maximum values; other values, such as whiskers extending 1.5 * IQR from each end of the box, are often used as well. See the violin plot example for an alternative approach.
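The summary statistics behind both whisker conventions can be computed as follows. This is a sketch using Python's `statistics.quantiles` with the inclusive method; other quantile definitions give slightly different quartiles, and the sample values (with one deliberate outlier) are hypothetical.

```python
from statistics import quantiles

def box_stats(values):
    """Quartiles plus min/max whiskers and 1.5 * IQR (Tukey) whiskers."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1],
        # Tukey whiskers stop at the most extreme points inside the fences.
        "whisker_low": min(v for v in data if v >= low_fence),
        "whisker_high": max(v for v in data if v <= high_fence),
    }

stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
```

With min/max whiskers the upper whisker reaches the outlier at 30; with the 1.5 * IQR rule it stops at 8 and the outlier would be drawn as a separate point.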
Violin Plot
https://vega.github.io/vega/examples/violin-plot/
A violin plot visualizes a distribution of quantitative values as a continuous approximation of the probability density function, computed using kernel density estimation (KDE). The densities are additionally annotated with the median value and interquartile range, shown as black lines. Violin plots can be more informative than classical box plots.
Top K Plot
https://vega.github.io/vega/examples/top-k-plot/
A plot of the top-k film directors by aggregate worldwide gross. Performs an aggregation of all directors, ranks them, and filters to only the top results. See the Top-K Plot With Others example to see a variant that combines all remaining directors into an “Others” category.
Top K Plot With Others
https://vega.github.io/vega/examples/top-k-plot-with-others/
A plot of the top-k film directors, plus all other directors, by aggregate worldwide gross. Unlike the Top-K Plot example, this chart includes a category of all other directors aggregated together. The visualization spec first computes aggregates for all directors and ranks them. It then copies these ranks back to the source data using a lookup transform, and determines which directors belong in the “other” category before performing a final aggregation.
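The aggregate, rank, look up, relabel, and re-aggregate sequence can be mirrored in plain Python. This sketch follows the described order of steps but is not the Vega spec itself; the field names `director` and `gross` and the sample rows are hypothetical.

```python
from collections import defaultdict

def top_k_with_others(rows, key, value, k):
    """Sum `value` per `key`, keep the top k, and lump the rest into "Others"."""
    # 1. Aggregate per director.
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    # 2. Rank directors by total (1 = highest).
    ordered = sorted(totals, key=totals.get, reverse=True)
    rank = {name: i + 1 for i, name in enumerate(ordered)}
    # 3. Look the rank up for every source row, relabel, and aggregate again.
    final = defaultdict(float)
    for row in rows:
        category = row[key] if rank[row[key]] <= k else "Others"
        final[category] += row[value]
    return dict(final)

films = [
    {"director": "A", "gross": 100},
    {"director": "A", "gross": 50},
    {"director": "B", "gross": 120},
    {"director": "C", "gross": 30},
    {"director": "D", "gross": 20},
]
result = top_k_with_others(films, "director", "gross", k=2)
```

Dropping the rows labelled "Others" instead of summing them gives the plain top-k variant.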
Binned Scatter Plot
https://vega.github.io/vega/examples/binned-scatter-plot/
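A binned scatter plot is a more scalable alternative to the standard scatter plot. The data points are grouped into bins, and each bin is summarized with an aggregate statistic. Here we use a circular area encoding to depict the count of records, visualizing the density of data points. For higher bin counts, color might be used instead, though with some loss of accuracy in perceptual comparison.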
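The grouping step is a two-dimensional version of histogram binning. In this sketch each point is assigned to the lower-left corner of its bin and the points per bin are counted; the bin widths and sample points are illustrative assumptions.

```python
import math
from collections import Counter

def bin_2d(points, step_x, step_y):
    """Count points per rectangular bin, keyed by the bin's lower-left corner."""
    counts = Counter()
    for x, y in points:
        corner = (math.floor(x / step_x) * step_x,
                  math.floor(y / step_y) * step_y)
        counts[corner] += 1
    return counts

points = [(1, 1), (2, 3), (12, 4), (11, 1), (25, 18)]  # hypothetical data
bins = bin_2d(points, step_x=10, step_y=10)
```

Each bin is then drawn as one circle whose area is proportional to its count, so the number of marks depends on the number of bins, not the number of records.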
Contour Plot
https://vega.github.io/vega/examples/contour-plot/
A contour plot depicts the density of data points using a set of discrete levels. Akin to contour lines on topographic maps, each contour boundary is an isoline of constant density. Kernel density estimation is performed to generate a continuous approximation of the sample density. Vega uses the d3-contour module to perform density estimation and generate contours in the form of GeoJSON polygons.
Wheat Plot
https://vega.github.io/vega/examples/wheat-plot/
A wheat plot is an alternative to standard dot plots and histograms that incorporates aspects of both. The x-coordinate of a point is based on its exact value. The y-coordinate is determined by grouping points into histogram bins, then stacking them based on their rank order within each bin. While not scalable to large numbers of data points, wheat plots allow inspection of (and interaction with) individual points without overplotting. For a related approach, see beeswarm plots.
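The layout rule is simple enough to sketch directly: keep the exact value for x, and use the point's rank within its histogram bin for y. The bin width and sample values below are hypothetical.

```python
import math
from collections import defaultdict

def wheat_layout(values, step):
    """Assign each value its exact x and a stacked y within its bin."""
    seen = defaultdict(int)  # points already placed in each bin
    points = []
    for v in sorted(values):
        b = math.floor(v / step)
        points.append({"x": v, "bin": b, "y": seen[b]})
        seen[b] += 1
    return points

layout = wheat_layout([0.2, 1.7, 0.9, 1.1, 0.5], step=1)
```

Because values are processed in sorted order, points in a bin rise from left to right, which produces the slanted stalks that give the plot its name.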
Hypothetical Outcome Plots
https://vega.github.io/vega/examples/hypothetical-outcome-plots/
Rather than showing a continuous probability distribution, Hypothetical Outcome Plots (or HOPs) visualize a set of draws from a distribution, where each draw is shown as a new plot in either a small multiples or animated form. Here we use Vega’s timer event to produce animated frames.
This example, inspired by The New York Times, displays random draws for a simulated time-series of values (these could be sales or employment statistics). The noise signal determines the amount of random variation added to the signal. The trend signal determines the strength of a linear trend, where zero corresponds to no trend at all (a flat uniform distribution). When the noise is high enough, draws from a distribution without any underlying trend may cause us to “hallucinate” interesting variations! Viewing the different frames may help viewers get a more visceral sense of random variation.
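The role of the noise and trend settings can be imitated with a small simulation. This is only a sketch of the idea: the Gaussian noise model, the parameter names, and the fixed seed are assumptions, and the example's own spec may generate its draws differently.

```python
import random

def hop_frames(n_frames, n_points, noise, trend, seed=0):
    """Simulate frames of a time series: linear trend plus random noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        [trend * t + noise * rng.gauss(0.0, 1.0) for t in range(n_points)]
        for _ in range(n_frames)
    ]

# No underlying trend, only noise: any "pattern" in a frame is chance.
frames = hop_frames(n_frames=5, n_points=12, noise=1.0, trend=0.0)
# No noise: every frame is the same straight line.
flat = hop_frames(n_frames=2, n_points=4, noise=0.0, trend=0.5)
```

Showing the frames one after another, rather than a single summary, is what lets a viewer see how much the picture changes from draw to draw.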
Regression
https://vega.github.io/vega/examples/regression
A two-dimensional regression analysis models one data variable as a function of another. The resulting model produces a trend line that summarizes and extrapolates observed data. This example uses parametric regression models to predict IMDB users’ film ratings based on Rotten Tomatoes critics’ ratings. The regression options range from linear regression to other functions such as logarithmic, quadratic, and polynomial regression. Alternatively, see the loess regression example for a non-parametric approach to scatter plot smoothing.
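For the linear case, the trend line comes from ordinary least squares. A minimal sketch with made-up data that lies exactly on a line:

```python
def linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

The logarithmic variant can be obtained from the same function by fitting against `log(x)` instead of `x`; quadratic and higher-order polynomial fits need more than two coefficients and are not covered by this sketch.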
Loess Regression
https://vega.github.io/vega/examples/loess-regression/
Locally-estimated regression produces a trend line by performing weighted regressions over a sliding window of points. The loess method (for locally-estimated scatterplot smoothing) computes a sequence of local linear regressions to estimate smoothed points. The bandwidth parameter determines the size of the sliding window of nearest-neighbor points, expressed as a fraction of the total number of points included. Alternatively, see the regression example for regression results using parametric functions.
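A bare-bones version of the sliding-window idea is sketched below: for each x, take the nearest fraction of points given by the bandwidth, weight them with a tricube kernel, and fit a weighted line. The tricube weighting and the fallback to a weighted mean are common choices assumed here, and the sketch omits refinements a production implementation may include, so it should not be read as Vega's loess transform.

```python
import math

def loess(xs, ys, bandwidth=0.3):
    """Smooth ys by a weighted local linear fit around each x."""
    n = len(xs)
    k = max(2, math.ceil(bandwidth * n))  # window size in points
    smoothed = []
    for x0 in xs:
        # Indices of the k nearest neighbors of x0.
        near = sorted(range(n), key=lambda i: abs(xs[i] - x0))[:k]
        span = max(abs(xs[i] - x0) for i in near) or 1.0
        w = [(1 - (abs(xs[i] - x0) / span) ** 3) ** 3 for i in near]
        sw = sum(w)
        mx = sum(wi * xs[i] for wi, i in zip(w, near)) / sw
        my = sum(wi * ys[i] for wi, i in zip(w, near)) / sw
        sxx = sum(wi * (xs[i] - mx) ** 2 for wi, i in zip(w, near))
        sxy = sum(wi * (xs[i] - mx) * (ys[i] - my) for wi, i in zip(w, near))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to weighted mean
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

xs = list(range(10))
ys = [2 * x + 1 for x in xs]  # hypothetical, exactly linear data
fit = loess(xs, ys, bandwidth=0.5)
```

On exactly linear data the local fits reproduce the line; on noisy data a larger bandwidth averages over more neighbors and gives a smoother curve.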
End of Part 1.