Performing Analysis of Meteorological Data

- Descriptive Roadmap to Data Analysis

Neha Nagpal
Jul 7, 2021

This blog provides a comprehensive roadmap for the analysis of meteorological data, a project based on data analysis. Data analysis can be described as a process of several steps in which raw data are transformed and processed in order to produce data visualizations and make predictions.

Basically, our goal is to transform the raw data into information and then convert it into knowledge. In this blog, we will clean the data, analyze it to test the hypothesis, and show the relevant visualizations.

The Null Hypothesis H0 is “Do the Apparent temperature and humidity, compared monthly across 10 years of the data, indicate an increase due to Global warming?”

Testing H0 means finding out whether the average Apparent temperature for a given month, say April, from 2006 to 2016 and the average humidity for the same period have increased or not. To do that, we will resample the data from hourly to monthly, then compare the same month over the 10-year period, supporting our analysis with appropriate visualizations using the matplotlib and/or seaborn Python libraries.

Let’s start the journey.

Step 1- Data Collection

We are going to read the weather dataset taken from Kaggle.

(Source URL: https://www.kaggle.com/muthuj7/weather-dataset)

  1. First of all, let's import the necessary Python libraries: numpy, pandas, matplotlib, seaborn and scikit-learn.
  2. Then use pd.read_csv() to read the dataset.
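The two steps above can be sketched roughly as follows. This is a minimal sketch, not the original notebook code: the file name weatherHistory.csv is an assumption about how the Kaggle download is saved, the helper load_weather() is a hypothetical name, and the three sample rows are made-up placeholders so that the snippet runs without the download. Only pandas is needed for this step.

```python
import io

import pandas as pd

# Made-up stand-in rows in the layout of the Kaggle CSV (placeholder values,
# not real readings); only three of the dataset's columns are shown.
SAMPLE_CSV = """Formatted Date,Apparent Temperature (C),Humidity
2006-04-10 12:00:00.000 +0200,10.0,0.80
2006-04-20 12:00:00.000 +0200,14.0,0.60
2006-05-10 12:00:00.000 +0200,18.0,0.70
"""


def load_weather(source):
    """Read the weather CSV from a file path or an open file-like object."""
    return pd.read_csv(source)


# With the downloaded file this would be load_weather("weatherHistory.csv"),
# where the file name is an assumption.
df = load_weather(io.StringIO(SAMPLE_CSV))
print(df.head())
```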

Let’s take a look at the first five rows of our dataset using the .head() function.

We can use more features such as .shape, .columns and .describe() to understand the data.

Preprocessing the data: check for null values and whether duplicate values are present.
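A quick look at the data could be done along these lines. The small DataFrame here is a made-up placeholder standing in for the loaded dataset, so the printed numbers mean nothing by themselves.

```python
import pandas as pd

# Placeholder frame standing in for the loaded dataset (values are made up).
df = pd.DataFrame(
    {
        "Formatted Date": [
            "2006-04-10 12:00:00.000 +0200",
            "2006-04-20 12:00:00.000 +0200",
            "2006-05-10 12:00:00.000 +0200",
        ],
        "Apparent Temperature (C)": [10.0, 14.0, 18.0],
        "Humidity": [0.80, 0.60, 0.70],
    }
)

print(df.head())     # first five rows (here the frame has only three)
print(df.shape)      # (number of rows, number of columns)
print(df.columns)    # column labels

# describe() summarises the numeric columns: count, mean, std, min,
# quartiles and max.
summary = df.describe()
print(summary)
```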
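The checks for null and duplicate values might look like the sketch below. The placeholder rows are made up and deliberately contain one missing value and one repeated row. Dropping them at the end is just one possible way to handle what the checks find, not necessarily what was done in the original project.

```python
import numpy as np
import pandas as pd

# Placeholder rows (made up): row 1 repeats row 0, row 2 has a missing value.
df = pd.DataFrame(
    {
        "Formatted Date": [
            "2006-04-10 12:00:00.000 +0200",
            "2006-04-10 12:00:00.000 +0200",
            "2006-04-20 12:00:00.000 +0200",
            "2006-05-10 12:00:00.000 +0200",
        ],
        "Apparent Temperature (C)": [10.0, 10.0, np.nan, 18.0],
        "Humidity": [0.80, 0.80, 0.60, 0.70],
    }
)

# Number of missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Number of rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())
print(duplicate_rows)

# One possible clean-up: drop repeated rows, then rows with missing values.
cleaned = df.drop_duplicates().dropna()
```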

Step 2- Exploratory Data Analysis

Here we perform initial investigations on the data in order to discover patterns, spot anomalies, test the hypothesis and check assumptions with the help of summary statistics and graphical representations.

To perform the analysis we need the data to be resampled, so we take only a few columns (parameters) from the dataset according to our need.

In order to resample the data, we first convert the Formatted Date column to datetime using the pandas function to_datetime(). We then set that column (Formatted Date) as the index of the DataFrame using the set_index() function. In the resampling step, “MS” denotes “Month Start”, and we compute the average of the apparent temperature and humidity for each month using the mean() function, which returns the mean of the values passed to it.
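The resampling step described above can be sketched like this. The rows are made-up placeholders. Passing utc=True to to_datetime() is an assumption on my side: it is one way to put timestamps that carry different UTC offsets (such as +0200) on a single timeline, which resample() needs.

```python
import pandas as pd

# Placeholder hourly-style readings (made up), two in April and one in May.
df = pd.DataFrame(
    {
        "Formatted Date": [
            "2006-04-10 12:00:00.000 +0200",
            "2006-04-20 12:00:00.000 +0200",
            "2006-05-10 12:00:00.000 +0200",
        ],
        "Apparent Temperature (C)": [10.0, 14.0, 18.0],
        "Humidity": [0.80, 0.60, 0.70],
    }
)

# Parse the text timestamps; utc=True converts every offset to UTC.
df["Formatted Date"] = pd.to_datetime(df["Formatted Date"], utc=True)

# resample() works on a datetime index, so make the date column the index.
df = df.set_index("Formatted Date")

# "MS" groups the rows by month and labels each group with the month start;
# mean() then averages each column within the month.
monthly = df[["Apparent Temperature (C)", "Humidity"]].resample("MS").mean()
print(monthly)
```

Note that converting to UTC can move a reading that sits right at a month boundary into the neighbouring month, which is worth keeping in mind when reading the monthly averages.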

Given:

The Null Hypothesis H0 is “Do the Apparent temperature and humidity, compared monthly across 10 years of the data, indicate an increase due to Global warming?”

The Alternative Hypothesis H1 is “Do the Apparent temperature and humidity, compared monthly across 10 years of the data, not indicate an increase due to Global warming?”

Step 3- Data Visualization

Let’s see how to plot different graphs, using the Python libraries imported above, to represent our analysis of the weather dataset.
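A line plot of the monthly averages over the whole period could be drawn roughly as below. This is a sketch with made-up placeholder values in place of the resampled Kaggle data, and the function name plot_monthly_averages() is hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Placeholder monthly averages (made-up values, not the Kaggle data), indexed
# by month start from January 2006 to December 2016.
months = pd.date_range("2006-01-01", "2016-12-01", freq="MS")
month_numbers = months.month.to_numpy()
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": 10 + 12 * np.sin(2 * np.pi * (month_numbers - 4) / 12),
        "Humidity": np.full(len(months), 0.7),
    },
    index=months,
)


def plot_monthly_averages(monthly):
    """Draw both monthly series on one set of axes and return the Axes."""
    fig, ax = plt.subplots(figsize=(14, 6))
    for column in ["Apparent Temperature (C)", "Humidity"]:
        ax.plot(monthly.index, monthly[column], label=column)
    ax.set_title("Monthly averages of apparent temperature and humidity")
    ax.set_xlabel("Month")
    ax.legend()
    return ax


ax = plot_monthly_averages(monthly)
# In an interactive session, plt.show() would display the figure.
```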

Observation: “Humidity” remains constant from 2006–2016, but “Apparent Temperature (C)” changes frequently from 2006–2016.

Now, let’s see plots of the average temperature and humidity for each of the months over the stretch of 10 years.

1. January- There is a huge change in temperature, while humidity stays similar throughout the years.

Similarly, we’ll use this code for the other months by replacing the month.

2. February- There is a huge change in temperature, while humidity is unchanged.
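Picking out a single month and plotting it year by year might look like this sketch. The data are made-up placeholders and plot_month() is a hypothetical helper; the month is selected through the month number of the datetime index.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Placeholder monthly averages (made-up values, not the Kaggle data).
months = pd.date_range("2006-01-01", "2016-12-01", freq="MS")
month_numbers = months.month.to_numpy()
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": 10 + 12 * np.sin(2 * np.pi * (month_numbers - 4) / 12),
        "Humidity": np.full(len(months), 0.7),
    },
    index=months,
)


def plot_month(monthly, month_number, month_name):
    """Plot one calendar month across the years; return its rows and the Axes."""
    # Keep only the rows whose index falls in the requested calendar month.
    one_month = monthly[monthly.index.month == month_number]
    fig, ax = plt.subplots(figsize=(14, 6))
    for column in ["Apparent Temperature (C)", "Humidity"]:
        ax.plot(one_month.index.year, one_month[column], marker="o", label=column)
    ax.set_title(f"Average apparent temperature and humidity in {month_name}")
    ax.set_xlabel("Year")
    ax.legend()
    return one_month, ax


january, ax = plot_month(monthly, 1, "January")
# For another month only the arguments change, e.g. plot_month(monthly, 2, "February").
```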

3. March

4. April

5. May

6. June

7. July

8. August

9. September

10. October

11. November

12. December

Replot

Heatmap
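One way a heatmap could be built from the monthly averages is sketched below. What exactly goes into the heatmap is an assumption here: this version puts years in rows and calendar months in columns and colours each cell by the average apparent temperature. The values are made-up placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder monthly averages (made-up values, not the Kaggle data).
months = pd.date_range("2006-01-01", "2016-12-01", freq="MS")
month_numbers = months.month.to_numpy()
monthly = pd.DataFrame(
    {"Apparent Temperature (C)": 10 + 12 * np.sin(2 * np.pi * (month_numbers - 4) / 12)},
    index=months,
)

# Reshape to a year-by-month table: one row per year, one column per month.
table = monthly.assign(year=monthly.index.year, month=monthly.index.month).pivot(
    index="year", columns="month", values="Apparent Temperature (C)"
)

fig, ax = plt.subplots(figsize=(12, 6))
sns.heatmap(table, cmap="coolwarm", ax=ax)
ax.set_title("Average apparent temperature (C) by year and month")
# In an interactive session, plt.show() would display the figure.
```

With this layout, reading down a column compares the same month across the years, which is the comparison the hypothesis asks for.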

CONCLUSION

H0 is not accepted because there is no change in Humidity from 2006–2016, so we accept H1.

From all the above visualizations it is clearly visible that there are some major variations in the plots for Apparent Temperature (C), Humidity and Density over the 10 years. Apparent temperature is the temperature that the human body feels due to humidity, wind speed, etc. With respect to the Null Hypothesis stated above, we observed that average humidity increases slightly. An increase in humidity makes the temperature feel warmer: as humidity increases, the apparent temperature increases. So we can see the various parameters which have been affected by global warming and how it is deteriorating the weather.
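The conclusion above is read off the plots. If you also want a number to go with it, one possible check (an addition of mine, not part of the original analysis) is to fit a straight line through one month's averages across the years and look at the slope. The sketch below uses made-up placeholder data with a built-in warming of 0.1 degrees per year and flat humidity, and yearly_slope() is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Placeholder monthly averages (made up): temperature rises by 0.1 C per year
# on top of a seasonal cycle, humidity stays flat.
months = pd.date_range("2006-01-01", "2016-12-01", freq="MS")
month_numbers = months.month.to_numpy()
years_since_start = months.year.to_numpy() - 2006
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": 10
        + 12 * np.sin(2 * np.pi * (month_numbers - 4) / 12)
        + 0.1 * years_since_start,
        "Humidity": np.full(len(months), 0.7),
    },
    index=months,
)


def yearly_slope(monthly, column, month_number):
    """Least-squares slope (units per year) of one calendar month's averages."""
    one_month = monthly.loc[monthly.index.month == month_number, column]
    years = one_month.index.year.to_numpy(dtype=float)
    # Shift the years to start at zero to keep the fit well conditioned.
    slope, _intercept = np.polyfit(years - years.min(), one_month.to_numpy(), 1)
    return slope


april_temperature_slope = yearly_slope(monthly, "Apparent Temperature (C)", 4)
april_humidity_slope = yearly_slope(monthly, "Humidity", 4)
print(april_temperature_slope, april_humidity_slope)
```

A slope close to zero means the month's average barely moved over the period. A slope by itself does not say whether a change is statistically significant.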

I am thankful to the mentors at https://internship.suvenconsultants.com for providing awesome problem statements and giving many of us a Coding Internship Experience. Thank you, www.suvenconsultants.com

Github Link- https://github.com/NEHA2713/Internship_Data_Analytics.git
