visualizing_data PDF Download - Java知识分享网 (Java Knowledge Sharing Network) - Free Java Resource Downloads

visualizing_data PDF Download

Download compiled by this site:

Link: https://pan.baidu.com/s/1ntLDR80NPZQaD97Xs_15PA

Extraction code: yrd2

Main content:

A Combination of Many Disciplines

Given the complexity of data, using it to provide a meaningful solution requires

insights from diverse fields: statistics, data mining, graphic design, and information

visualization. However, each field has evolved in isolation from the others.

Thus, visual design (the field of mapping data to a visual form) typically does not

address how to handle thousands or tens of thousands of items of data. Data mining

techniques have such capabilities, but they are disconnected from the means to interact with the data. Software-based information visualization adds building blocks for

interacting with and representing various kinds of abstract data, but typically these

methods undervalue the aesthetic principles of visual design rather than embrace their

strength as a necessary aid to effective communication. Someone approaching a data

representation problem (such as a scientist trying to visualize the results of a study

involving a few thousand pieces of genetic data) often finds it difficult to choose a representation and wouldn’t even know what tools to use or books to read to begin.

Process

We must reconcile these fields as parts of a single process. Graphic designers can learn

the computer science necessary for visualization, and statisticians can communicate

their data more effectively by understanding the visual design principles behind data

representation. The methods themselves are not new, but their isolation within individual fields has prevented them from being used together. In this book, we use a process that bridges the individual disciplines, placing the focus and consideration on how

data is understood rather than on the viewpoint and tools of each individual field.

The process of understanding data begins with a set of numbers and a question. The

following steps form a path to the answer:

Acquire

Obtain the data, whether from a file on a disk or a source over a network.

Parse

Provide some structure for the data’s meaning, and order it into categories.

Filter

Remove all but the data of interest.

Mine

Apply methods from statistics or data mining as a way to discern patterns or

place the data in mathematical context.

Represent

Choose a basic visual model, such as a bar graph, list, or tree.

Refine

Improve the basic representation to make it clearer and more visually engaging.

Interact

Add methods for manipulating the data or controlling what features are visible.
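
The seven stages listed above can be sketched as a chain of small functions. This is only an illustrative sketch, not the book's implementation: the sample data, the function names, and the choice of a text bar graph are all assumptions made for the example.

```python
# Hypothetical raw input: one "label<TAB>value" record per line.
RAW = "a\t3\nb\t9\nc\t-1\nd\t6\n"

def acquire():
    """Acquire: obtain the data (an in-memory string stands in for a file)."""
    return RAW

def parse(text):
    """Parse: give the data structure by splitting it into (label, value) pairs."""
    rows = []
    for line in text.splitlines():
        label, value = line.split("\t")
        rows.append((label, float(value)))
    return rows

def filter_rows(rows):
    """Filter: remove all but the data of interest (here, non-negative values)."""
    return [(label, value) for label, value in rows if value >= 0]

def mine(rows):
    """Mine: put the values in context by scaling each to the maximum."""
    top = max(value for _, value in rows)
    return [(label, value / top) for label, value in rows]

def interact(rows, visible):
    """Interact: let the caller control which labels are shown."""
    return [(label, value) for label, value in rows if label in visible]

def represent(rows, width=12):
    """Represent: choose a basic visual model (here, a text bar graph)."""
    return [f"{label} {'#' * round(share * width)}" for label, share in rows]

def refine(bars):
    """Refine: make the graph easier to read by putting the longest bar first."""
    return sorted(bars, key=lambda bar: bar.count("#"), reverse=True)

if __name__ == "__main__":
    rows = mine(filter_rows(parse(acquire())))
    for bar in refine(represent(interact(rows, {"a", "b", "d"}))):
        print(bar)
```

Because each stage is a separate function, a project that needs only some of the stages can drop the rest without touching the others.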

6 | Chapter 1: The Seven Stages of Visualizing Data

Of course, these steps can’t be followed slavishly. You can expect that they’ll be

involved at one time or another in projects you develop, but sometimes it will be four

of the seven, and at other times all of them.

Part of the problem with the individual approaches to dealing with data is that the

separation of fields leads to different people each solving an isolated part of the problem. When this occurs, something is lost at each transition—like a “telephone game”

in which each step of the process diminishes aspects of the initial question under

consideration. The initial format of the data (determined by how it is acquired and

parsed) will often drive how it is considered for filtering or mining. The statistical

method used to glean useful information from the data might drive the initial presentation. In other words, the final representation reflects the results of the statistical

method rather than a response to the initial question.

Similarly, a graphic designer brought in at the next stage will most often respond to

specific problems with the representation provided by the previous steps, rather than

focus on the initial question. The visualization step might add a compelling and

interactive means to look at the data filtered from the earlier steps, but the display is

inflexible because the earlier stages of the process are hidden. Furthermore,

practitioners of each of the fields that commonly deal with data problems are often

unclear about how to traverse the wider set of methods and arrive at an answer.

This book covers the whole path from data to understanding: the transformation of a

jumble of raw numbers into something coherent and useful. The data under consideration might be numbers, lists, or relationships between multiple entities.

It should be kept in mind that the term visualization is often used to describe the art

of conveying a physical relationship, such as the subway map mentioned near the

start of this chapter. That’s a different kind of analysis and skill from information

visualization, where the data is primarily numeric or symbolic (e.g., A, C, G, and T,

the letters of genetic code, plus additional annotations about them). The primary

focus of this book is information visualization: for instance, a series of numbers that

describes temperatures in a weather forecast rather than the shape of the cloud cover

contributing to them.

An Example

To illustrate the seven steps listed in the previous section, and how they contribute

to effective information visualization, let’s look at how the process can be applied to

understanding a simple data set. In this case, we’ll take the zip code numbering system that the U.S. Postal Service uses. The application is not particularly advanced,

but it provides a skeleton for how the process works. (Chapter 6 contains a full

implementation of the project.)

What Is the Question?

All data problems begin with a question and end with a narrative construct that provides a clear answer. The Zipdecode project (described further in Chapter 6) was

developed out of a personal interest in the relationship of the zip code numbering

system to geographic areas. Living in Boston, I knew that numbers starting with a

zero denoted places on the East Coast. Having spent time in San Francisco, I knew

the initial numbers for the West Coast were all nines. I grew up in Michigan, where

all our codes were four-prefixed. But what sort of area does the second digit specify?

Or the third?
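
One way to start exploring that question is to group codes by their leading digits and see which codes share a prefix. The sketch below is an assumption-laden illustration, not the Zipdecode code: the function name is hypothetical, and the six sample codes were picked by hand from Massachusetts, California, and Michigan to match the regions mentioned above.

```python
from collections import defaultdict

# A few hand-picked sample codes, kept as strings so leading zeros survive.
SAMPLE_ZIPS = ["02139", "02144", "94103", "94110", "48104", "48201"]

def group_by_prefix(zips, digits):
    """Group zip codes by their first `digits` characters."""
    groups = defaultdict(list)
    for code in zips:
        groups[code[:digits]].append(code)
    return dict(groups)

if __name__ == "__main__":
    # Widen the prefix one digit at a time to see how the groups split.
    for digits in (1, 2, 3):
        print(digits, group_by_prefix(SAMPLE_ZIPS, digits))
```

Storing the codes as strings rather than integers is deliberate: an integer would silently drop the leading zero that identifies the East Coast codes.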

The finished application was initially constructed in a few hours as a quick way to

take what might be considered a boring data set (a long list of zip codes, towns, and

their latitudes and longitudes) and create something engaging for a web audience

that explained how the codes related to their geography.

Acquire

The acquisition step involves obtaining the data. Like many of the other steps, this

can be either extremely complicated (e.g., trying to glean useful data from a large system) or very simple (reading a readily available text file).

A copy of the zip code listing can be found on the U.S. Census Bureau web site, as it

is frequently used for geographic coding of statistical data. The listing is a freely

available file with approximately 42,000 lines, one for each of the codes, a tiny portion of which is shown in Figure 1-1.
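
For the simple case, acquisition can be as small as reading the lines of a text file, either from disk or over a network. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the listing is assumed to be UTF-8 text, and no particular file name or URL for the listing is implied.

```python
import urllib.request
from pathlib import Path

def acquire_lines(source):
    """Return the lines of a text listing from a local path or an http(s) URL.

    Assumes the listing is UTF-8 encoded; adjust the encoding if it is not.
    """
    if source.startswith(("http://", "https://")):
        # A source over a network.
        with urllib.request.urlopen(source) as response:
            text = response.read().decode("utf-8")
    else:
        # A file on a disk.
        text = Path(source).read_text(encoding="utf-8")
    return text.splitlines()
```

A quick sanity check after acquiring is to compare `len(acquire_lines(...))` with the expected size of the listing, roughly 42,000 lines in this case.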
