
visualizing_data PDF download


Date: 2020-09-22 09:09  Source: http://www.java1234.com  Author: 小锋


Main content:

A Combination of Many Disciplines
Given the complexity of data, using it to provide a meaningful solution requires
insights from diverse fields: statistics, data mining, graphic design, and information
visualization. However, each field has evolved in isolation from the others.
Thus, visual design (the field of mapping data to a visual form) typically does not
address how to handle thousands or tens of thousands of items of data. Data mining
techniques have such capabilities, but they are disconnected from the means to interact
with the data. Software-based information visualization adds building blocks for
interacting with and representing various kinds of abstract data, but typically these
methods undervalue the aesthetic principles of visual design rather than embrace their
strength as a necessary aid to effective communication. Someone approaching a data
representation problem (such as a scientist trying to visualize the results of a study
involving a few thousand pieces of genetic data) often finds it difficult to choose a
representation and wouldn’t even know what tools to use or books to read to begin.
Process
We must reconcile these fields as parts of a single process. Graphic designers can learn
the computer science necessary for visualization, and statisticians can communicate
their data more effectively by understanding the visual design principles behind data
representation. The methods themselves are not new, but their isolation within individual
fields has prevented them from being used together. In this book, we use a process
that bridges the individual disciplines, placing the focus and consideration on how
data is understood rather than on the viewpoint and tools of each individual field.
The process of understanding data begins with a set of numbers and a question. The
following steps form a path to the answer:
Acquire
Obtain the data, whether from a file on a disk or a source over a network.
Parse
Provide some structure for the data’s meaning, and order it into categories.
Filter
Remove all but the data of interest.
Mine
Apply methods from statistics or data mining as a way to discern patterns or
place the data in mathematical context.
Represent
Choose a basic visual model, such as a bar graph, list, or tree.
Refine
Improve the basic representation to make it clearer and more visually engaging.
Interact
Add methods for manipulating the data or controlling what features are visible.
6 | Chapter 1: The Seven Stages of Visualizing Data
Of course, these steps can’t be followed slavishly. You can expect that they’ll be
involved at one time or another in projects you develop, but sometimes it will be four
of the seven, and at other times all of them.
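The seven stages listed above can be sketched as a chain of small functions. This is only an illustration, not the book's implementation: the tiny temperature data set, the function names, and the text-bar output are all assumptions made up for the example.

```python
# Hypothetical miniature data set (values invented for illustration).
RAW = """city,temp_f
Boston,41
Detroit,38
San Francisco,58
Miami,79
Unknown,
"""

def acquire():
    """Acquire: obtain the raw data (an in-memory string stands in for a file)."""
    return RAW

def parse(text):
    """Parse: give the text structure as (city, temperature-or-None) pairs."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        city, temp = line.split(",")
        rows.append((city, float(temp) if temp else None))
    return rows

def filter_rows(rows):
    """Filter: keep only the records that actually have a temperature."""
    return [(city, t) for city, t in rows if t is not None]

def mine(rows):
    """Mine: place the values in context by finding their minimum and maximum."""
    temps = [t for _, t in rows]
    return min(temps), max(temps)

def represent(rows, lo, hi, width=20):
    """Represent: map each value to a bar scaled between lo and hi."""
    bars = []
    for city, t in rows:
        n = round((t - lo) / (hi - lo) * width) if hi > lo else 0
        bars.append((city, t, "#" * n))
    return bars

def refine(bars):
    """Refine: sort by value and align the labels so the chart reads cleanly."""
    bars = sorted(bars, key=lambda b: b[1], reverse=True)
    pad = max(len(city) for city, _, _ in bars)
    return [f"{city.ljust(pad)} {t:5.1f} {bar}" for city, t, bar in bars]

def interact(lines, show_top=None):
    """Interact: let the caller control how many rows are visible."""
    return lines if show_top is None else lines[:show_top]

def run(show_top=None):
    rows = filter_rows(parse(acquire()))
    lo, hi = mine(rows)
    return interact(refine(represent(rows, lo, hi)), show_top)

if __name__ == "__main__":
    print("\n".join(run()))
```

Calling `run(show_top=2)` shows only the two warmest cities. A project that needs fewer stages would simply drop the corresponding functions from `run`.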
Part of the problem with the individual approaches to dealing with data is that the
separation of fields leads to different people each solving an isolated part of the problem.
When this occurs, something is lost at each transition, like a “telephone game”
in which each step of the process diminishes aspects of the initial question under
consideration. The initial format of the data (determined by how it is acquired and
parsed) will often drive how it is considered for filtering or mining. The statistical
method used to glean useful information from the data might drive the initial presentation.
In other words, the final representation reflects the results of the statistical
method rather than a response to the initial question.
Similarly, a graphic designer brought in at the next stage will most often respond to
specific problems with the representation provided by the previous steps, rather than
focus on the initial question. The visualization step might add a compelling and
interactive means to look at the data filtered from the earlier steps, but the display is
inflexible because the earlier stages of the process are hidden. Furthermore,
practitioners of each of the fields that commonly deal with data problems are often
unclear about how to traverse the wider set of methods and arrive at an answer.
This book covers the whole path from data to understanding: the transformation of a
jumble of raw numbers into something coherent and useful. The data under consideration
might be numbers, lists, or relationships between multiple entities.
Keep in mind that the term visualization is often used to describe the art
of conveying a physical relationship, such as the subway map mentioned near the
start of this chapter. That’s a different kind of analysis and skill from information
visualization, where the data is primarily numeric or symbolic (e.g., A, C, G, and T,
the letters of the genetic code, plus any additional annotations about them). The primary
focus of this book is information visualization: for instance, a series of numbers that
describes temperatures in a weather forecast rather than the shape of the cloud cover
contributing to them.
An Example
To illustrate the seven steps listed in the previous section, and how they contribute
to effective information visualization, let’s look at how the process can be applied to
understanding a simple data set. In this case, we’ll take the zip code numbering system
that the U.S. Postal Service uses. The application is not particularly advanced,
but it provides a skeleton for how the process works. (Chapter 6 contains a full
implementation of the project.)
What Is the Question?
All data problems begin with a question and end with a narrative construct that provides a clear answer. The Zipdecode project (described further in Chapter 6) was
developed out of a personal interest in the relationship of the zip code numbering
system to geographic areas. Living in Boston, I knew that numbers starting with a
zero denoted places on the East Coast. Having spent time in San Francisco, I knew
the initial numbers for the West Coast were all nines. I grew up in Michigan, where
all our codes were four-prefixed. But what sort of area does the second digit specify?
Or the third?
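One way to start exploring that question is to group codes by their leading digits and see which places fall together. The sketch below is an assumption-laden illustration: the six sample codes were chosen by hand to match the regions mentioned above, and the `group_by_prefix` helper is hypothetical rather than part of the Zipdecode project.

```python
from collections import defaultdict

# A handful of zip codes picked for illustration; not the full listing.
SAMPLE = {
    "02108": "Boston, MA",
    "02139": "Cambridge, MA",
    "48104": "Ann Arbor, MI",
    "48201": "Detroit, MI",
    "94103": "San Francisco, CA",
    "94110": "San Francisco, CA",
}

def group_by_prefix(codes, digits):
    """Group zip codes by their first `digits` characters."""
    groups = defaultdict(list)
    for code in sorted(codes):
        groups[code[:digits]].append(code)
    return dict(groups)

if __name__ == "__main__":
    for digits in (1, 2, 3):
        print(digits, group_by_prefix(SAMPLE, digits))
```

With one digit the sample splits into three broad regions; with three digits the two Michigan codes separate while the Boston-area and San Francisco codes stay together, which hints that each further digit narrows the area.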
The finished application was initially constructed in a few hours as a quick way to
take what might be considered a boring data set (a long list of zip codes, towns, and
their latitudes and longitudes) and create something engaging for a web audience
that explained how the codes related to their geography.
Acquire
The acquisition step involves obtaining the data. Like many of the other steps, this
can be either extremely complicated (e.g., trying to glean useful data from a large system) or very simple (reading a readily available text file).
A copy of the zip code listing can be found on the U.S. Census Bureau web site, as it
is frequently used for geographic coding of statistical data. The listing is a freely
available file with approximately 42,000 lines, one for each of the codes, a tiny portion of which is shown in Figure 1-1.
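For the simple case described here, acquisition can be as small as reading a text file into a list of lines. The following is a minimal sketch under stated assumptions: the file name `zips.txt` is hypothetical, the file is assumed to be UTF-8 text with one code per line, and nothing is assumed yet about the columns inside each line (that belongs to the parse step).

```python
from pathlib import Path

def acquire_lines(path, encoding="utf-8"):
    """Return the non-blank lines of a text file, without trailing newlines."""
    text = Path(path).read_text(encoding=encoding)
    return [line for line in text.splitlines() if line.strip()]

if __name__ == "__main__":
    # "zips.txt" is a placeholder name for a locally saved copy of the listing.
    lines = acquire_lines("zips.txt")
    print(len(lines), "lines read")
```

Returning raw lines keeps this stage separate from parsing, so the same function still works if the column layout turns out to differ from what was expected.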


 
