To kick off this blog, I think we should start with one of the cornerstones of modern data analysis, the Pivot table. It was my entry point into data analysis and I would believe that every person who ever touched on the subject of finding a pattern in data, will have come across this format.
According to wikipedia, pivot tables are as old as 1991, so they have got a lot of mileage on them and they stood the test of time. So with this being said, I will use a sample spreadsheet that I quickly generated using the python programming language (you can follow this download link, in case you do not have a sample spreadsheet handy yourself).
I will use Microsoft’s Excel 2016 to create a Pivot Table, probably the most commonly used program, but there is other software out there that can achieve very similar outputs, such as Libre Office, which is very often the standard office suite for Linux distributions (e.g. Ubuntu).
So let’s get started. You will find that once you know it, it is very easily done.
Open your spreadsheet and go to the sheet with your data in it. Find the tab Insert and click on the icon called Pivot Table.
From then on, Excel does essentially everything for you. It will select all data in range in this particular worksheet and a Pivot Table creation wizard will open and give you several options, as you can see below.
Let me just lose a word or two on the two most important ones.
Here you can select which data you want to use for your pivot table. As you can see in this image, Excel automatically chooses all data in range.
For those of you, who have never worked with spreadsheets and were tossed in at the deep end, do not despair, there are a lot of introductory tutorials out there from which I, myself, learned the above. A Google search alone on “Excel Notation” yields first pointers by Google itself with several tutorial links.
The choice you can make here is between data from a spreadsheet, by default the active worksheet, or an external data source.
With regards to spreadsheets, the choice which sheet to use, is really yours. You can even navigate to a different file and use a worksheet from there.
I will not get into the topic of external data source, as I have not had a need for it yet, so I am not clued up on it. I prefer honesty and I will try to direct you, instead, to an article that may give you more information about it, such as this one or that one.
Secondly, you can insert the Pivot table into a new worksheet or an existing one. Just one word of advice, do not mix your source data and pivot table. It is messy and achieves the opposite of what we are trying to achieve, namely, deduce sense from a large amount of data.
Once you have made your choices, click “OK”.
Here is what you will see inside the spreadsheet; the placeholder for your Pivot Table.
When you click inside it, on the right hand side you will see a dialog named “PivotTable Fields”. This is where the magic happens. Just drag and drop, for example, the box “Sales” and “Order number” into the box named “Values”. Go on “Order number” in the Pivot Table, Right-Click > Summarized Values By… > Count. Then drag “Order Date” into the “Rows” box and “Countries” into the “Filters” box.
And this is how you create a Pivot Table from your original data. Have a play with it and see where it takes you.
At this point, I believe, we covered the topic sufficiently to get you started on the first steps. Most of what comes next, I taught myself through need, based on the data that I had and what I wanted to know. For example, if you are worried that the Sales are dropping in a particular region, you might monitor the sales of a specific item over time in that region and extrapolate a trendline (to account for standard fluctuations in monthly sales).
The sample spreadsheet, I have supplied, is unfortunately not useful to try and deduct trends. The random generator of the python random module is about as pseudorandom as a pseudorandom generator can be 😉
I hope this was a comprehensive start to this exciting topic. As I said, this is only the beginning and I cannot wait to take a further deep dive into the topic.
If you have anything to add or suggestions on what topic to tackle in the future, leave a comment below. Thanks for reading 🙂
Links in the text