Finding Hidden Community Connections at Scale
Introduction
Pop-sociology suggests that we are only Six Degrees of Separation from any other person on our planet. It's a neat concept that speaks to the interconnectedness of modern society: a web of relationships and connections that spans the globe. But what if we applied the same logic to communities rather than individuals? How would we go about it? Why would it be useful for market research? And would we find the results similarly fascinating?
This blog post will explain the basics of using Network Analysis for market research, in under five minutes.
The Science
Network Theory is the field of mathematics concerned with representing a set of entities and their relationships. We have:
- A set of nodes: for our purpose, a node can represent an online community.
- A set of edges: lines that join our communities, representing a connection. These edges may be weighted to show the strength of connection.
We need a way of measuring the connection between our online communities. We’re interested in how many individuals reside in both spaces, so we use a measure called Jaccard Similarity. For two communities with user sets A and B:
J(A, B) = |A ∩ B| / |A ∪ B|
It’s just the overlap divided by the union, for those interested. In practice, the sets we’re dealing with will contain hundreds of thousands of users. What matters is that the score tells us how many individuals frequent both communities, whilst controlling for the sizes of those communities.
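To make the measure concrete, here is a minimal Python sketch of Jaccard Similarity on two tiny sets of usernames. The function name and the usernames are invented for illustration; real communities would have far larger sets.

```python
def jaccard_similarity(users_a: set[str], users_b: set[str]) -> float:
    """Overlap divided by union; returns 0.0 when both sets are empty."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


# Hypothetical toy data: usernames active in each community.
community_a = {"ana", "ben", "cy", "dee"}
community_b = {"cy", "dee", "eli"}

# 2 shared users out of 5 distinct users overall.
score = jaccard_similarity(community_a, community_b)
print(round(score, 2))  # 0.4
```

Because the union sits in the denominator, a huge community does not automatically score highly against everything just because it contains a lot of people.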
The Data (Reddit Goldmine!)
Reddit is set up in a way that is perfect for our needs here. Its subreddit format gives us a structured environment, whilst still encompassing almost any imaginable niche. I mean, did you know that there’s a subreddit called r/BarBattlestations, and another called r/Offensive_Wallpapers?
Neither did I until writing this blog.
Reddit also gives us the ability to track individual posting habits, and therefore measure the overlap between communities, making it a true goldmine for network analysis.
Each relevant subreddit will become a node in our network. Each link will represent a large number of individuals frequenting both subs.
Let’s Build a Network for Market Research: An Example
Imagine you are working with a fashion house or fashion retailer to understand the adjacent interests of their core audience (this could apply to almost any consumer-facing industry). Sure, you already know these consumers are interested in streetwear, because they’ve shown interest in the brand or store. But what else are they interested in? What niche opportunities (collaborations, marketing campaigns, events and so on) could be opened up that would land squarely with your target market?
- Step 1: Choose a central subreddit as the core node in our network. Let's start with r/Streetwear.
- Step 2: Gather data from our central subreddit's most engaged users - we like to work with 10,000+ people. Analyse their posting history to uncover adjacent subreddits they contribute to. Calculate Jaccard similarities to identify core sectors branching from our centre.
- Step 3: Repeat the process for each added subreddit to reveal a richer network. Explore smaller subreddits within sectors and uncover detailed connections across the space.
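The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our production pipeline: it assumes the user sets for each subreddit have already been collected into a dictionary (no Reddit API calls are shown), and the names `build_network`, `threshold` and `max_depth`, along with all of the toy data, are invented for the example.

```python
from collections import deque


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(core, members, threshold=0.2, max_depth=2):
    """Expand outwards from `core`, breadth first.

    `members` maps subreddit name -> set of usernames (assumed pre-collected).
    Returns {(sub_a, sub_b): jaccard_weight} for every link at or above
    `threshold`, following links up to `max_depth` steps from the core.
    """
    edges = {}
    depth = {core: 0}
    queue = deque([core])
    while queue:
        current = queue.popleft()
        if depth[current] >= max_depth:
            continue  # deep enough: do not expand this node further
        for other, users in members.items():
            if other == current:
                continue
            weight = jaccard(members[current], users)
            if weight < threshold:
                continue
            # Sort the pair so each undirected link is stored once.
            edges[tuple(sorted((current, other)))] = weight
            if other not in depth:
                depth[other] = depth[current] + 1
                queue.append(other)
    return edges


# Hypothetical toy data, not real Reddit figures.
toy_members = {
    "r/streetwear": {"u1", "u2", "u3", "u4"},
    "r/sneakers": {"u3", "u4", "u5"},
    "r/thrifting": {"u5", "u6"},
    "r/cooking": {"u9"},
}
network = build_network("r/streetwear", toy_members)
for pair, weight in sorted(network.items()):
    print(pair, round(weight, 2))
```

In this toy run, r/thrifting only appears because it overlaps with r/sneakers, which is exactly the kind of second-step connection that Step 3 is designed to surface. The threshold and depth are judgment calls that would need tuning for a real project.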
Once we’re satisfied with the depth of our network, we can map out the end product as a pretty data visualisation.
Here’s the real r/Streetwear network, as per the data at the time of writing:
Finding the Value
There are different ways of drawing value from networks, depending on the project aims and the industry we’re working in:
- Discover a clear segmentation of key clusters that surround our topic of interest. These clusters can range from the expected (Outfit Sharing), to the less obvious (Playboi Carti). By seeing the communities gathered in this way, we can have some level of confidence that our target audience can be segmented as such, at least when split by their adjacent interests.
- The most surprising connections that are revealed can also be interesting for our research. Is there anything in our example output that surprised you? For me, I’d have to say that I didn’t expect r/Antiques to be a mere two steps from our core node, r/Streetwear. This is one of a handful of intriguing connections here, and these unexpected connections seem to occur no matter what the case study is. Perhaps these niche pockets of life can tell us something about our audience and the way the industry might be shifting (we could run this analysis over time too!).
- Learn something from the connection strengths. For example, how strong is the connection from streetwear to ‘Thrifting’ vs ‘Clothing Startups’? (We haven’t included connection strengths in our visual here.) Perhaps there’s a hypothesis that needs to be confirmed or rejected. This is a way of doing that at scale.
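Two of the ideas above, "how many steps from the core node" and "which connection is stronger", are simple to compute once the network is stored as weighted edges. The sketch below assumes a dictionary of `(community_a, community_b): weight` pairs. The function names are hypothetical and the weights are invented purely for illustration; they are not taken from the real r/Streetwear network.

```python
from collections import deque


def hop_distances(edges, core):
    """Number of links on the shortest path from `core` to each reachable node."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    dist = {core: 0}
    queue = deque([core])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def strongest_links(edges, node):
    """Communities linked directly to `node`, strongest connection first."""
    linked = [
        (weight, a if b == node else b)
        for (a, b), weight in edges.items()
        if node in (a, b)
    ]
    return [(other, weight) for weight, other in sorted(linked, reverse=True)]


# Made-up weights for illustration only.
toy_edges = {
    ("streetwear", "thrifting"): 0.30,
    ("streetwear", "clothing_startups"): 0.12,
    ("thrifting", "antiques"): 0.15,
}

steps = hop_distances(toy_edges, "streetwear")
ranked = strongest_links(toy_edges, "streetwear")
print(steps["antiques"])  # 2
print(ranked)  # [('thrifting', 0.3), ('clothing_startups', 0.12)]
```

Hop distance ignores the weights entirely, so it is best read alongside the connection strengths rather than instead of them.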
Outro (Join our free Inner Circle!)
Congratulations! You're now equipped with the basics of network analysis for market research. With this methodology, we can really pull apart the hidden communities stemming from our industry, brand, or even product of interest.
We recently used this approach with a client to break down the online spiritual and wellness communities in great detail. I’m consistently amazed at the quirky sub-communities that pop up - especially when they have 100k+ members.
If you have a research brief that you think might benefit from some network analysis, don't hesitate to reach out to us.
To stay ahead of the rapidly changing AI landscape, we invite you to join our free Inner Circle. Inside, you'll find advice and powerful case studies to help you apply AI and big data to market research. We promise not to bore you with generic 'Here's 10 ChatGPT prompts for better insights' drivel. Instead, we want to help you cut through the noise by sharing what we consider to be truly unique and best-in-class tech approaches.
Thanks for reading, and take care.
Satori Team.