Data Visualization
  • Date: 2024-12-22

Agile Data Science - Data Visualization



Data visualization plays a very important role in data science, and can be considered one of its modules. Data science includes more than building predictive models: it also covers explaining those models and using them to understand data and make decisions. Data visualization is an integral part of presenting data in the most convincing way.

From the data science point of view, data visualization is a highlighting feature that shows changes and trends.

Consider the following guidelines for effective data visualization −

    Position data along a common scale.

    Bars are more effective for comparisons than circles and squares.

    Use appropriate colors for scatter plots.

    Use a pie chart to show proportions.

    A sunburst visualization is more effective for hierarchical data.
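Two of the guidelines above (bars on a common scale, and pie charts for proportions) can be illustrated with a minimal sketch. The category names and values are hypothetical sample data invented for this illustration, and `compare_bar_and_pie` is a helper name assumed here, not part of Matplotlib.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, invented purely for illustration
categories = ["A", "B", "C", "D"]
values = [23, 17, 35, 25]


def compare_bar_and_pie(labels, amounts):
    """Draw the same data as a bar chart and as a pie chart, side by side."""
    fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(9, 4))

    # Bars share one baseline and one y-axis, so heights are directly comparable
    ax_bar.bar(labels, amounts, color="steelblue")
    ax_bar.set_ylabel("Value")
    ax_bar.set_title("Comparison on a common scale")

    # A pie chart shows each value as a share of the total
    ax_pie.pie(amounts, labels=labels, autopct="%1.0f%%")
    ax_pie.set_title("Proportions of the whole")

    fig.tight_layout()
    return fig, ax_bar, ax_pie


fig, ax_bar, ax_pie = compare_bar_and_pie(categories, values)
```

Calling `plt.show()` afterwards displays the figure in an interactive session, and `fig.savefig("comparison.png")` writes it to a file instead.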

Agile development needs a simple scripting language for data visualization, and since it fits well with data science work, Python is the suggested language.

Example 1

The following example demonstrates data visualization of GDP calculated in specific years. Matplotlib is a widely used library for data visualization in Python. The installation of this library is shown below −

[Image: installing the Matplotlib library]
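Assuming a standard Python setup where pip is available, Matplotlib is typically installed with `python -m pip install matplotlib`. The sketch below checks whether the library can be imported; `is_installed` is a hypothetical helper name used only for this illustration.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("matplotlib"):
    import matplotlib
    print("Matplotlib version:", matplotlib.__version__)
else:
    print("Matplotlib is not installed; try: python -m pip install matplotlib")
```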

Consider the following code to understand this −

import matplotlib.pyplot as plt

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

# create a line chart, years on x-axis, gdp on y-axis
plt.plot(years, gdp, color="green", marker="o", linestyle="solid")

# add a title
plt.title("Nominal GDP")

# add a label to the y-axis
plt.ylabel("Billions of $")
plt.show()

Output

The above code generates the following output −

[Image: line chart of nominal GDP by year]

There are many ways to customize charts, including axis labels, line styles and point markers. The next example goes a step further: it plots values against timestamps and formats the date labels on the x-axis so that they stay readable.
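Before moving on, the customization options mentioned above (axis labels, line styles and point markers) can be sketched as follows, reusing the GDP figures from Example 1. The particular style choices (dashed line, square markers, light grid) are arbitrary and only meant to show where each option goes.

```python
import matplotlib.pyplot as plt

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

fig, ax = plt.subplots()

# Dashed line with square markers instead of a solid line with circles
(line,) = ax.plot(years, gdp, color="tab:blue", linestyle="--", marker="s",
                  linewidth=1.5, label="Nominal GDP")

# Label both axes and add a title, a legend, and a light grid
ax.set_xlabel("Year")
ax.set_ylabel("Billions of $")
ax.set_title("Nominal GDP")
ax.legend()
ax.grid(True, alpha=0.3)
```

As in Example 1, `plt.show()` displays the result.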

Example 2

import datetime
import random
import matplotlib.pyplot as plt

# make up some data
x = [datetime.datetime.now() + datetime.timedelta(hours=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]

# plot
plt.plot(x,y)

# beautify the x-labels
plt.gcf().autofmt_xdate()
plt.show()

Output

The above code generates the following output −

[Image: line plot with date-formatted x-axis labels]
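Where more control over the date labels is wanted than `autofmt_xdate()` gives on its own, one possible approach is to set a tick locator and formatter from `matplotlib.dates`. The sketch below is a variation on Example 2, not a replacement for it; the fixed start time, the random seed, and the two-hour tick spacing are assumptions chosen so the output is repeatable.

```python
import datetime
import random

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

random.seed(0)  # fixed seed so the made-up data is repeatable

start = datetime.datetime(2024, 1, 1, 8, 0)  # arbitrary start time
x = [start + datetime.timedelta(hours=i) for i in range(12)]
y = [i + random.gauss(0, 1) for i in range(12)]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o")

# Place a tick every 2 hours and print it as hour:minute
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

# Rotate and right-align the date labels so they do not overlap
fig.autofmt_xdate()
```

Call `plt.show()` to display the figure.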