[Community tools] Filenames

As these council forums are meant for discussing and giving feedback on “everything related to WoW”, inside and outside of the game, this is the first in a series of planned posts intended to draw attention to, and suggest solutions for, various issues facing the community-made tools that support parts of the WoW community.

Community tools? What?

Members of the WoW community create and maintain third-party tools that addon authors, fan/machinima/cosplay artists and many others use to get assets such as maps, building models, creature models, textures and sounds out of WoW.

While I won’t name any tools in this post to minimize the risk of it getting removed for breaking any rules, many players interact with these tools, or see their results, on a daily basis. Think of the WoW machinima/fan art you might see featured on YouTube, Twitter or fansites. This includes “official” machinima commissioned by Blizzard, such as the WoW esports trailer for the MDI. I could go on listing examples from cosplayers, addon authors, roleplayers and even D&D campaign creators, but I don’t want this post getting too long.

The removal of filenames

Having files with filenames is the most obvious way to keep artists making art with game assets, machinima creators making machinima that isn’t fully recorded in-game, add-on authors making add-ons that use non-interface files, fansites making high-res screenshots of models, and people sensitive to specific sounds finding which ID they need to block, among many other uses.

However, Blizzard has been steadily removing filenames for a while. The history and specifics follow.

A few expansions ago it became apparent that Blizzard had started a project to replace filenames throughout WoW’s data with file IDs (numbers). We don’t know the exact reasoning for this, but one could assume it has to do with optimization, making files slightly smaller, or even combating datamining to an extent.

WoW clients have always stored game data in a proprietary format which uses hash/lookup-based retrieval of files. Up to 6.0, the MPQ files contained a list of the filenames they held, which allowed mapping those hashes to human-readable names. In 6.0, the system was replaced by a new cross-game technology called CASC, which no longer contained that list, making all files unnamed by default.

Until 7.0, a separate list of IDs => filenames was still included. After its removal, the only ways to name a file by its official name were:

  1. Finding a filename in another file that refers to the file (e.g. terrain files still had filenames at this point, but these were slowly removed)

  2. Finding the filename inside of the file (models only)

  3. Guessing the name correctly

  4. Trying different combinations of words/characters until you get a filename matching the expected hash/lookup (brute forcing)
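
As a rough illustration of methods 3 and 4, here is a minimal Python sketch. The hash function is a stand-in (a truncated MD5), not the hash the game client actually uses, and the name normalization, folder names and word lists are assumptions made up for the example.

```python
import hashlib
from itertools import product


def name_hash(filename):
    """Stand-in lookup hash (truncated MD5); NOT the hash the game client uses."""
    # Assumption: names are normalized (uppercase, backslashes) before hashing.
    normalized = filename.upper().replace("/", "\\")
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()[:16]


def brute_force(known_hashes, folders, words, extensions):
    """Try every folder/word.extension combination; keep those whose hash is known."""
    found = {}
    for folder, word, ext in product(folders, words, extensions):
        candidate = f"{folder}/{word}.{ext}"
        digest = name_hash(candidate)
        if digest in known_hashes:
            found[digest] = candidate
    return found


# Hypothetical example: one unnamed file whose hash is known from the game data.
known = {name_hash("creature/arthas/arthas.m2")}
matches = brute_force(
    known,
    folders=["creature/arthas", "creature/illidan"],
    words=["arthas", "illidan"],
    extensions=["m2", "blp"],
)
print(matches)
```

The important part is the membership check: a guess only counts once its hash matches one that ships with the game, which is why this approach stops working as soon as those hashes are gone.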

Then in patch 8.2, the hashes/lookups for game assets were removed (outside of interface files, which still have a list) and replaced by referencing files only by their ID. With them went the community’s ability to guess or bruteforce filenames (methods 3 and 4 in the list above). This change affected community tools across the board and to this day is responsible for breaking a few of them that were never updated to deal with it. For example, tools that could previously go from a filename (e.g. arthas.m2) to a file can now only do so by ID (122965).

To help combat this change and allow community tools to keep using filenames, I set up, together with other members of the community, a way for anyone to suggest mappings from IDs to filenames, which certain members can then verify and add to a single mapping (the listfile) for everyone to use. This listfile, containing almost 1.5 million filenames, has become the standard source most community tools rely on for showing filenames to users.
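
To make that concrete, here is a minimal Python sketch of how a tool might consume such a mapping. The `id;filename` line format, the function names and the full paths are assumptions for illustration; only the arthas.m2 ID comes from the example earlier in this post.

```python
def parse_listfile(text):
    """Parse lines of the assumed form 'id;filename' into an ID -> name dict."""
    by_id = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        file_id, sep, name = line.partition(";")
        if not sep or not file_id.isdigit():
            continue  # skip malformed lines instead of failing
        by_id[int(file_id)] = name
    return by_id


def build_name_index(by_id):
    """Reverse mapping (case-insensitive) so tools can go from a name to an ID."""
    return {name.lower(): file_id for file_id, name in by_id.items()}


# Hypothetical listfile contents; the paths are made up.
sample = """122965;creature/arthas/arthas.m2
999999;hypothetical/example/texture.blp
not a valid line
"""

by_id = parse_listfile(sample)
by_name = build_name_index(by_id)
print(by_id[122965])
print(by_name["creature/arthas/arthas.m2"])
```

With both directions available, a tool can keep accepting a filename from the user and translate it to the ID the game data now requires.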

More recently, in 9.2, model files stopped containing their filename as well, which was one of the last ways we were able to name files by their official name.

Many members of the community (and by extension, Blizzard) rely on these tools for their art, machinima, projects and, for some, even their livelihood. With the removal of model names in 9.2.0 in particular, it is becoming extremely difficult to keep providing filenames to the community this way. While we’re looking into better ways to do this going forward (e.g. through GitHub or through improved naming tools), this is unlikely to improve things significantly.

Possible solutions

I feel like Blizzard can take some responsibility here and at the very least consider one of the following solutions, or come up with a better one, to support these parts of the WoW community:

Solution 1: Bring back FileDataComplete.db2

This file existed up to 7.0 and gave us a list of filenames for each file. It was removed, likely as part of the effort to remove filenames from the game in general and switch to IDs instead. After it was removed, we had to switch to guessing/bruteforcing filenames.

Solution 2: Give the community a filename list through another way

This could be a list of names intermittently updated on some FTP server, PRs submitted to the listfile on GitHub, an addition to the web API, or even a text file distributed with the game or made available on a CDN.

Solution 3: Bring back filename hashes/lookups

These were removed from the root manifest back in 8.2 (for retail). While it would still be up to the community to guess filenames, we would again be able to verify (and to an extent bruteforce) these names and automate the process. This would still leave us without names for the many files we can’t guess properly, so it is not the preferred method, but it’s something. It shouldn’t take a huge effort or add much data, as interface files still have these; it would just be a matter of also shipping them for other types of files, though maybe not all (see possible concerns).

I’m sure there are other ways of going about it, and probably better ones too.

Possible concerns with the solutions

Datamining

Whether filenames were removed to deter dataminers in the ongoing anti-datamining arms race or to optimize the game/patches, the removal did make the life of dataminers slightly more annoying, particularly when datamining the latest content. To keep this ‘annoyance’ and avoid helping the datamining of spoilers, Blizzard could skip shipping filenames/hashes during PTR, hold back filenames/hashes for files added in the latest patch (or even the latest expansion), and leave out filenames for encrypted files, similar to what is already being done for the ManifestInterfaceData database table used in interface exports. This would still give the community proper filenames for many files, with a delay for the latest content.
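
A minimal Python sketch of that kind of delayed release is below. The `FileEntry` fields, the patch tuples and the function name are hypothetical; this is not how ManifestInterfaceData or any Blizzard system is known to work, just an illustration of the filtering idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    file_id: int
    name: str
    added_in: tuple  # hypothetical (major, minor) patch, e.g. (9, 2)
    encrypted: bool


def publishable_names(entries, current_patch, delay_expansion=False):
    """Return ID -> name only for files that are safe to name publicly."""
    result = {}
    for entry in entries:
        if entry.encrypted:
            continue  # never name encrypted files
        if delay_expansion:
            # Hold back everything added in the current expansion.
            if entry.added_in[0] >= current_patch[0]:
                continue
        elif entry.added_in >= current_patch:
            continue  # hold back files added in the latest patch
        result[entry.file_id] = entry.name
    return result


# Made-up entries for the example.
entries = [
    FileEntry(1, "old.m2", (7, 0), False),
    FileEntry(2, "previous_patch.blp", (9, 1), False),
    FileEntry(3, "brand_new.m2", (9, 2), False),
    FileEntry(4, "secret.m2", (7, 0), True),
]
print(publishable_names(entries, current_patch=(9, 2)))
print(publishable_names(entries, current_patch=(9, 2), delay_expansion=True))
```

Either setting keeps names for older content flowing to the community while the newest and encrypted files stay unnamed.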

Leaks/unannounced content

In the past, files meant for upcoming content were sometimes added early (e.g. the kultirasquest items back in Legion). While great for hype, this was likely unwanted. While it would take significant effort, a possible solution could be to limit the files that get shipped to those referenced from other files. To this day, each patch adds more files that are never used or are exclusively for developer/internal use, and while I personally love looking at these (p.s. why is the amazing Hearthstone Tavern still unused this hurts me greatly), straight up not shipping these files in the first place would not only prevent more leaks, but also save on patch/client size.
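
The “only ship referenced files” idea boils down to a reachability check over file references. Here is a minimal Python sketch, assuming a hypothetical `references` map from each file ID to the IDs it refers to and a set of root files the client always needs; neither reflects actual game data.

```python
from collections import deque


def reachable_files(roots, references):
    """Breadth-first walk: every file reachable from the roots via references."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for target in references.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Made-up IDs: file 1 is a root; files 5 and 6 are never referenced from it.
all_files = {1, 2, 3, 4, 5, 6}
references = {1: [2, 3], 2: [4], 5: [6]}

shipped = reachable_files([1], references)
unused = all_files - shipped
print(sorted(shipped))
print(sorted(unused))
```

Anything left in `unused` is a candidate for not shipping at all, which is where both the leak prevention and the size savings would come from.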

Patch size/duration

Some of the solutions would add a few dozen megabytes to each patch, which would make patching take slightly longer for users on slower connections. However, not shipping unused files, as suggested in the previous section, would compensate for this.

Concerns with no solution

If this issue were to remain unsolved, with all the signs pointing to it only getting worse over time, it would significantly harm the ability of the earlier mentioned groups of artists/devs to work with data/assets from WoW and create things with them.

As someone who started their programming career around 2008 working with data from WoW (which back then was way more accessible) and who has spent many hours dabbling with assets in Blender, I think it would be a true shame if this were to become even harder and less accessible than it is today.

While I personally don’t rely on making art/machinima/addons for a living, I have friends that do, as well as friends that have moved up in the world thanks to WoW and its (once very open) data and healthy community tools. Hell, I know of Blizzard employees that started out using tools like these and eventually got a job at Blizz thanks to the work they were able to do with the tools!

However, nowadays new devs are regularly scared off by the increasing complexity of developing tools that work around the kind of obfuscation this filename removal project is part of. That in turn scares off new artists, who run into hard-to-find assets and other issues. It will also make it harder for Blizzard to commission external artists in the future, and it affects other parts of the community as well.

In closing

I’m hoping we can at least start a discussion about the pros and cons of all of this and maybe collaboratively come up with solutions that tackle these issues without necessarily making things like datamining easier. I understand it’s a tough problem, but I think it’s one worth solving, even if it takes a while. Hoping fellow community council members/Blizz devs have some ideas about this. Also, a big thanks goes out to my friends and various people affected by the changes mentioned in this post for proof-reading and contributing to this post. Thanks for reading!

It would be super-appreciated and a huge support if we were allowed to see file names once again, at least for non-sensitive files. The lack of naming is just one part of the difficulty for data-handling sites supporting content creators these days, and it shouldn’t be underestimated just how much models, databases, and other resources are used by the creator community to promote the game, and provide resources that even developers rely on. We’ve done this for decades, but lately it’s become that much harder.

With the Alpha underway, I feel this is an important topic to consider.

I don’t have any additional insight to add to this thread, just wanted to bring it back to the front of people’s minds.
