This is the second in a series of posts (previous post: [Community tools] Filenames) about various issues facing tools that support fan artists, machinima artists, addon authors, fansites and those wanting to get information from sources that are not easily available to the general player.
Introduction
While some of that information is available via the web API, a lot of it is hidden in database tables in WoW’s DB2 file format (henceforth called DB2s). Even though it is hard to reach, this information helps many players, of all playstyles, make decisions about their gameplay (for example by reading guides on fansites), and it is even considered vital in some playstyles, e.g. with the omnipresent simcrafting.
More examples
These files contain information for pretty much all the various systems in WoW. One example is the somewhat complex ItemBonus system, which is (still) not supported in the official APIs (and as of the last official communication isn’t planned either) but is required to properly show stats and effects on items outside of the game. Information on spells beyond a simple name/description is not available through the API either. The same goes for game/world maps, creature/item model information and much more.
Target audience
Access to DB2s is limited to those who have the resources to maintain tools that extract and read them, or who have access to such tools. The number of publicly available tools that support opening DB2s is very low right now. Many aren’t maintained and will likely break the next time the DB2 format is updated, with the only hope being that someone in the community comes in and updates them. DB2 data is used in many different projects that players encounter/use on a daily basis, and if you’re not a project with enough resources (e.g. a hobbyist, like I was when I started out), you’re unlikely to easily have access to this information if one of the public tools were to go away. Blizzard benefits from these projects too: players, and even Blizzard itself, often use or refer to them, and they wouldn’t have been possible without the community doing the work.
How would it work?
One way to go about it is to have it work in a similar way to how the WoW client can export interface files with the console command exportInterfaceFiles on the login screen. In the same way, the client could export/dump DB2s into a generic format (e.g. CSV) with a command named something like exportDatabaseFiles/dumpDatabaseFiles/dumpDB2s.
As the game already reads these files, most of the information to do this is already present in the client, with the exception of column names and handling for columns the client doesn’t use. Those two things would have to be added as well as serialization to (for example) CSV.
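To give an idea of how small the serialization step is once column names are known, here is a minimal Python sketch. The function name, the column names and the rows are all made up for illustration; this is not how the client does or would do it, and the real tables have far more columns.

```python
import csv
import io


def rows_to_csv(column_names, rows):
    """Serialize table rows to CSV text, with the column names as header."""
    buffer = io.StringIO()
    # The csv module takes care of quoting fields that contain commas/quotes.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(column_names)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table contents, for illustration only.
columns = ["ID", "Name_lang", "Quality"]
rows = [(1, "Example Sword", 4), (2, "Shield, Sturdy", 3)]
print(rows_to_csv(columns, rows))
```

A generic format like this can be opened by pretty much any language or spreadsheet tool, which is the main reason CSV is used as the example here.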
Additionally, since WoW is already capable of handling various startup arguments (see: starting WoW with -help), this idea as well as the interface export could also be set as a startup argument to allow for better automation instead of having to manually enable the console, toggling it and then entering the command.
Another solution could be to supply table schemas and format definitions/explanations with the game itself in a readable format. That would leave the implementation of reading DB2 files up to the developer in question, but it’d be easier than relying on reverse engineering and manual analysis of field meanings.
I’m sure there are other and probably better ways of going about it.
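As a rough illustration of what a supplied schema would enable, the sketch below decodes fixed-size binary records using a list of column names and types. The schema, the column names and the record layout are all hypothetical; the actual DB2 format is considerably more complex than this, so treat it only as a picture of the idea.

```python
import struct

# Hypothetical schema: (column name, struct format code).
# This is NOT a real DB2 layout, just an illustration.
SCHEMA = [("ID", "I"), ("Quality", "B"), ("ItemLevel", "H")]


def decode_records(data, schema):
    """Decode tightly packed little-endian records described by a schema."""
    record = struct.Struct("<" + "".join(code for _, code in schema))
    names = [name for name, _ in schema]
    # iter_unpack walks the buffer one fixed-size record at a time.
    return [dict(zip(names, values)) for values in record.iter_unpack(data)]


# Two made-up records packed with the same layout as the schema.
sample = struct.pack("<IBH", 1, 4, 60) + struct.pack("<IBH", 2, 3, 45)
for row in decode_records(sample, SCHEMA):
    print(row)
```

With an official schema, the names and types in SCHEMA would come from Blizzard instead of from community guesswork.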
Why not add this to the web-based game data APIs instead?
The game data APIs are rather limited in the information they contain and I think it might be way more work to make all the data from DB2s available in the web APIs, which might be better suited for dynamic data that changes often (like auction house postings, character information, or the feedback from the thread by fellow council member Ortho here: Adding toys (and more) to the WoW API) as well as information only available on the server’s end.
Granted, the web APIs are much easier to talk to than having to install WoW, run a command and wait for things to export, but a client-side export is likely much less of a burden to implement than having to constantly deal with data format changes, additions and removals for the various APIs. Naturally I have no insight into how the APIs work internally, so I could be wrong in my assumption. If it ends up being easier to implement there, that could be the better way.
Possible concerns
Datamining
This wouldn’t make the lives of dataminers easier as the big names already have more than enough resources to deal with these files. They’re also unlikely to replace their current methods with the method described here. The only difference is that there would be an official source of column names, but most of those have already been named by the community as well and aren’t exactly harmful to have publicly known.
Internal information
DB2s have been getting datamined with diffs automatically posted to fansites for many years now. They don’t really leak or contain any valuable internal or (outside of the occasional WIP system during PTR – which are posted about regardless) unreleased information since devs take care of what they ship to the client these days, either by not putting certain information in there at all or by encrypting certain rows. Any information they do contain will get posted regardless of these being officially exportable or not.
Hotfixes
Hotfixes are sent down to the client when/after connecting to a game server, so those might be missing when running an export command as early as the login screen or even before that. Hotfixes that are still cached from previous sessions would be available if the player had logged in at least once before running the command. If the command still works on the character selection screen, that would theoretically solve it, but that isn’t something that can be easily automated. However, even if hotfixes were left out of (an initial implementation of) this, it would still be a great addition.
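Conceptually, hotfixes are an overlay on top of the base table: a hotfixed row replaces or removes the row with the same ID. A small Python sketch of that idea follows. The record shape (the "ID", "row" and "deleted" keys) is an assumption made up for this example and not the actual hotfix format.

```python
def apply_hotfixes(base_rows, hotfixes):
    """Overlay hotfix entries on base rows, matching on the row ID."""
    merged = {row["ID"]: row for row in base_rows}
    for fix in hotfixes:
        if fix.get("deleted"):
            # A hotfix can remove a row entirely.
            merged.pop(fix["ID"], None)
        else:
            # Otherwise it adds a new row or replaces an existing one.
            merged[fix["ID"]] = fix["row"]
    return [merged[row_id] for row_id in sorted(merged)]


# Hypothetical data, for illustration only.
base = [{"ID": 1, "Quality": 3}, {"ID": 2, "Quality": 4}]
fixes = [
    {"ID": 1, "row": {"ID": 1, "Quality": 5}},
    {"ID": 2, "deleted": True},
]
print(apply_hotfixes(base, fixes))
```

An export without hotfixes would simply be the base rows; tools that care could apply an overlay like this themselves later.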
In closing
It’d be great to have this information easily available without having to rely on community tools and by extension reverse engineering. Hopefully this can be taken into consideration. If any fellow council members have ideas on how to make this more easily accessible or on anything else regarding this topic, feel free to drop a reply.