11 February, 2007
Last week we looked at information about data mining. There are some good resources there, but more are always appreciated.
So, now we’re going to talk about applying this knowledge: what can you actually do with data mining?
Paul Barnett posted a comment to discuss this topic a bit. He wrote:
Odd thing data mining. In my ten year holiday from computer games (94-2004) I went off to be a creative consultant with bricks and industry. There is a wealth of data mining out there and most of it is, frankly, worthless.
A great example is the loyalty cards that superstores have in the UK. Every single transaction recorded. I mean that amount of data must be worthwhile, right? I mean after you pay to make the cards, record them, adjust your cash tills to scan the cards in, send new cards to people, change them into keyfob swipable ones, set up direct mailing, figure out how to produce vouchers, advertise the heck out of the fact you have a loyalty card and then spend a ton of money gleaning all the information.
Turns out people shop about every few days, they buy milk a lot and now and then buy washing powder.
Of the people I ended up working with, almost all of the big and clever companies just knew their market. I mean they actually just knew it. They could tell you what would and wouldn’t work, they basically understood what they had to focus on. And when I ran into a company that didn’t, the only thing I used to find with any regularity was this:
They had just lost faith in their gut
They had just lost the core people that ‘knew’ the market
So when people talk about data mining I am sort of preprogrammed to raise an eyebrow. Is there really data out there that we don’t know? I mean really, is there? And if so what bleeding use is it going to be for us?
I think you’re right in that most of the good designers know what a typical player does in a typical play session. If not, time to pick a new profession. I also think anyone with an IQ over room temperature knows the whole grocery “loyalty” card thing is just dumb because it’s not going to reveal much of interest.
But, when we collect metrics on an online game, we’re actually not particularly interested in “normal” activity. No, what we really want to see is abnormal activity. This gives us more insight into what’s going on in our game.
For example, say that the average number of experience points (or in-game currency) earned per player per hour jumps suddenly one day. This probably indicates that either a location rewarding too many experience points is being farmed hard-core, or an exploit was found. Either case is a good reason to dig deeper into the data to find the cause. Continuing the example, if use of a particular zone also increased, you know where to start looking for your problems. Observing individual players (or reviewing individual logs if you record them) can then help pinpoint the problem instead of hoping for a lucky break.
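To make that example a bit more concrete, here is a rough Python sketch of the two checks: flag a day whose average XP per player-hour jumps well above the recent norm, then see which zone gained the most share of play time. Everything in it is an assumption for illustration only: the function names, the seven-day window, the three-standard-deviation threshold, the zone names, and the sample numbers are all made up, and a real game would pull these figures from its own logs.

```python
from statistics import mean, stdev


def find_spikes(daily_values, window=7, threshold=3.0):
    """Return indices of days whose value is more than `threshold` standard
    deviations above the mean of the preceding `window` days."""
    spikes = []
    for i in range(window, len(daily_values)):
        history = daily_values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any increase as a spike.
            if daily_values[i] > mu:
                spikes.append(i)
            continue
        if (daily_values[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


def zone_share_changes(before, after):
    """Compare each zone's share of total play time across two periods.
    Returns (zone, change in share) pairs, largest increase first."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    changes = {}
    for zone in set(before) | set(after):
        share_before = before.get(zone, 0) / total_before
        share_after = after.get(zone, 0) / total_after
        changes[zone] = share_after - share_before
    return sorted(changes.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: average XP earned per player per hour, one value per day.
xp_per_player_hour = [1020, 980, 1005, 995, 1010, 990, 1000, 1015, 1600]

# Hypothetical player-hours spent per zone, before and on the spike day.
hours_before = {"Crypt": 100, "Forest": 500, "Harbor": 400}
hours_after = {"Crypt": 450, "Forest": 300, "Harbor": 250}

spike_days = find_spikes(xp_per_player_hour)
print("Spike days:", spike_days)

if spike_days:
    top_zone, change = zone_share_changes(hours_before, hours_after)[0]
    print("Zone with the biggest gain in play-time share:", top_zone)
```

With these made-up numbers the last day is flagged and the "Crypt" zone shows the biggest gain, which tells you where to start looking. The threshold is a judgment call: set it too low and you chase noise, too high and you miss a slow-building exploit.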
The goal of data mining is to allow you to see these problems at a glance in a summary instead of poring over individual logs to figure out if someone’s cheating or not. Having good data mining helps you catch issues sooner rather than later.
Paul followed up with:
So the data mining we are talking about is actually how to catch stuff that is throwing the game balance out of whack?
Might be interesting to ask how people think we are best off doing that, perhaps that will generate the data mining answers that will be helpful?
From what I have seen people just get better at finding ways to game the system, should we even be bothered that it happens?
If you set general time to get to level X at Z hours. How do you adjust when people have hint books, web sites, guild members, item flow down, buffing and all manner of other stuff.
Wouldn’t the limits become worthless after a while? Or are these metrics just for early play balance and are there to be discarded as the game matures?
Do you really need metrics to know where the current population gravity is within WOW?
So, let’s discuss this issue a bit more. Participation required! :) I’m interested in hearing everyone’s thoughts about this. Is data mining useful, or is it just modern snake oil used to sell middleware? Can we rely on designer instinct as we have before, or has the time come to actually have supporting data for things like balance?