This is a Playo data exploration that I did for fun, using data I have been scraping for about 4 years now for a tool that I built to find a Playo venue quickly. I have been scraping data for Bangalore, Hyderabad and Chennai, and I wanted to see if I'd find something interesting in it.
My initial curiosity was whether the data would let me see the impact of COVID on playing patterns. The data isn't very rich, and I wasn't sure I'd be able to see much, but it was worth a try. I start by exploring the data and trying to understand it, and then proceed to see if I can find anything interesting in it.
Let's start with looking at the number of venues by city.
I know from playing and using the app in Hyderabad and Bangalore that Playo is much more popular and has a lot more venues in Bangalore than in Hyderabad. So, this graph is a little surprising. Also, the spike in May 2019 seems a little fishy.
I think Playo added a bunch of public sports venues to their database, in an attempt to just display them on their app (even if not bookable) or to allow some kind of mechanism for booking in future... But, most of these venues added in that period don't seem to have any other "usage" data in the current snapshot of data.
Looking at only the venues which have at least one user rating would make things a little bit clearer.
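A minimal sketch of that filter, in plain Python. The row layout is an assumption: `date` and `city` are hypothetical field names, while `ratingCount` mirrors the column name that shows up in the table later in this post. The sample rows are made up.

```python
from collections import Counter

# Made-up snapshot rows; `date` and `city` are assumed field names.
rows = [
    {"date": "2019-04", "city": "Bangalore", "name": "A", "ratingCount": 12},
    {"date": "2019-04", "city": "Bangalore", "name": "B", "ratingCount": 0},
    {"date": "2019-05", "city": "Bangalore", "name": "A", "ratingCount": 14},
    {"date": "2019-05", "city": "Bangalore", "name": "B", "ratingCount": 0},
    {"date": "2019-05", "city": "Bangalore", "name": "C", "ratingCount": None},
    {"date": "2019-05", "city": "Hyderabad", "name": "D", "ratingCount": 3},
]


def rated_venue_counts(rows):
    """Count venues with at least one rating per (date, city)."""
    counts = Counter()
    for row in rows:
        # Treat a missing rating count the same as zero ratings.
        if (row.get("ratingCount") or 0) > 0:
            counts[(row["date"], row["city"])] += 1
    return counts


counts = rated_venue_counts(rows)
```

In this toy data, venue C appears in the listing in May 2019 but has no ratings, so it doesn't inflate the count the way a bulk import of unrated venues would.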
We use the number of ratings as a proxy for the number of users... though it's a tricky measure.
Why does the number of ratings go down?
You can't remove your rating, and you can only vote once for a venue. A rating isn't linked to whether you actually played at that venue or not. I think there was a problem with owners voting for their own venues, or negative voting by competitors, or something like that.
Did some venues get removed, taking their ratings with them? Yes, that is indeed the case. We can guess that from the number-of-venues plot above, and also by looking at the specific changes in the data.
For instance, ~50 venues got removed in Bangalore in August 2021.
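One way to look for removals like this is to diff consecutive snapshots. The sketch below assumes each snapshot has already been reduced to a set of venue identifiers, keyed by a `YYYY-MM` label that sorts chronologically as a string. The identifiers and dates in the example are placeholders, not real data.

```python
def removed_between(snapshots):
    """For each pair of consecutive snapshots, return the venue keys that disappeared.

    `snapshots` maps a sortable date label to the set of venue keys listed then.
    """
    dates = sorted(snapshots)
    removed = {}
    for prev, curr in zip(dates, dates[1:]):
        # Keys present in the previous snapshot but missing from the current one.
        removed[curr] = snapshots[prev] - snapshots[curr]
    return removed


# Placeholder identifiers, for illustration only.
snapshots = {
    "2021-07": {"v1", "v2", "v3"},
    "2021-08": {"v1", "v4"},
    "2021-09": {"v1", "v4"},
}
removed = removed_between(snapshots)
```

Taking `len(removed[date])` for each date gives a removals-per-snapshot series that can be compared against dips in the total rating count.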
So, the ratings graph above doesn't give a good sense of users: besides ratings being a bad proxy for usage (a user can only vote once for a venue), the totals also drop whenever venues are removed. To improve the situation slightly, we can look only at venues that are still on the listing as of today, and see how ratings on those venues changed.
NOTE: The data wasn't scraped correctly: the IDs of the venues weren't stored. So, tracking a venue over time is a little hackish. We could use the link as the ID, but links could change too. We use (lat, lng) as the ID instead, because coordinates are less likely to change, and would change only in case of errors.
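Here is a rough sketch of the (lat, lng) keying, restricted to venues that are still listed in the latest snapshot. The field names `date`, `lat`, `lng` and `ratingCount` are assumptions about the row layout. The rounding step is also my own addition to absorb floating-point noise, and the rows are made up.

```python
from collections import defaultdict


def venue_key(row, ndigits=5):
    """Build a stand-in ID from coordinates (rounding is an assumption)."""
    return (round(row["lat"], ndigits), round(row["lng"], ndigits))


def rating_history(rows):
    """Return {venue_key: [(date, ratingCount), ...]} for venues in the latest snapshot."""
    latest = max(row["date"] for row in rows)
    still_listed = {venue_key(r) for r in rows if r["date"] == latest}
    history = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["date"]):
        key = venue_key(row)
        if key in still_listed:
            history[key].append((row["date"], row["ratingCount"]))
    return dict(history)


# Made-up rows: one venue gets renamed, another disappears from the listing.
rows = [
    {"date": "2021-04", "name": "Venue A", "lat": 12.97, "lng": 77.75, "ratingCount": 23},
    {"date": "2021-04", "name": "Venue B", "lat": 12.90, "lng": 77.60, "ratingCount": 5},
    {"date": "2021-09", "name": "Venue A - Renamed", "lat": 12.97, "lng": 77.75, "ratingCount": 43},
]
history = rating_history(rows)
```

The renamed venue keeps a single history because its coordinates didn't change, while the venue missing from the latest snapshot is dropped. Two different venues at the same coordinates would be merged under this scheme.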
To try and see how COVID affected playing, we plotted the graphs above with total rating counts across a city, and ratings for some specific venues. We can see some flat portions on the graphs around the time of the 1st and 2nd waves, but most of these numbers already seem to have been plateauing, so it's hard to notice anything significant, really.
To see if we can notice something more significant, let's look at the venues that seem to be growing/upcoming and see how they were affected.
| | name_curr | name_init | ratingCount_curr | ratingCount_init | countDiff |
---|---|---|---|---|---|
163 | Eesha Badminton Academy | Eesha Badminton Academy | 116.0 | 33.0 | 83.0 |
36 | Aptha Badminton Academy | Aptha Badminton Academy | 236.0 | 194.0 | 42.0 |
413 | Prakash Badminton Academy | Prakash Badminton Academy | 58.0 | 20.0 | 38.0 |
351 | NRC Badminton Arena | NRC Badminton Arena | 95.0 | 57.0 | 38.0 |
591 | The Majesstine Sports | The Majesstine Sports | 160.0 | 123.0 | 37.0 |
403 | Play Zone - Kasturinagar | Play Zone - Kasturinagar | 204.0 | 172.0 | 32.0 |
474 | Score Bengaluru Sports Park | Score Bengaluru Sports Park | 150.0 | 126.0 | 24.0 |
387 | Panchajanya Badminton & Fitness Academy | Panchajanya Badminton & Fitness Academy | 103.0 | 82.0 | 21.0 |
367 | Nova Badminton Academy - Arehalli | Nova Badminton Academy - Arehalli | 237.0 | 216.0 | 21.0 |
656 | Whitefield United - Cap Life | Whitefield United | 43.0 | 23.0 | 20.0 |
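A table like this can be built by joining an earlier and a later snapshot on the venue key and sorting by the difference in rating counts. This is a plain-Python sketch and not necessarily how the table above was produced. The function name, the dict layout, and the integer keys (borrowed from the row labels above as placeholders) are all assumptions. The names and counts in the example are taken from three rows of the table.

```python
def top_growing(initial, current, n=10):
    """Join two snapshots on venue key and rank venues by growth in rating count."""
    table = []
    for key, curr in current.items():
        init = initial.get(key)
        if init is None:
            continue  # venue absent from the earlier snapshot; no baseline to compare
        table.append({
            "name_curr": curr["name"],
            "name_init": init["name"],
            "ratingCount_curr": curr["ratingCount"],
            "ratingCount_init": init["ratingCount"],
            "countDiff": curr["ratingCount"] - init["ratingCount"],
        })
    table.sort(key=lambda r: r["countDiff"], reverse=True)
    return table[:n]


# Keys are placeholders; names and counts come from the table above.
initial = {
    163: {"name": "Eesha Badminton Academy", "ratingCount": 33.0},
    36: {"name": "Aptha Badminton Academy", "ratingCount": 194.0},
    656: {"name": "Whitefield United", "ratingCount": 23.0},
}
current = {
    163: {"name": "Eesha Badminton Academy", "ratingCount": 116.0},
    36: {"name": "Aptha Badminton Academy", "ratingCount": 236.0},
    656: {"name": "Whitefield United - Cap Life", "ratingCount": 43.0},
}
top = top_growing(initial, current, n=2)
```

Keeping both `name_init` and `name_curr` makes renames visible, as with the last row of the table.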
The venue seems to have gotten about 30 votes in its first month on Playo (Apr 2021), and the ratings were pretty high. I'm guessing these were ratings by friends of the venue owners, because I remember going to the venue some time in July/August of 2021 and being utterly disappointed by it. That is reflected in the sudden drop in ratings as soon as the lockdowns in Bangalore eased a little after the second wave.
Not really sure what happened here. There was a huge spike in the number of ratings, taking the avgRating down drastically, before the number of ratings seems to have been reset. I wonder if there were some bad actors at play?
The height of each skyscraper is proportional to the number of ratings the venue has.
Zoom in and hover over the venues to see more info about the venue.
Pan around to Hyderabad and Chennai to see how they compare to Bangalore.