Ultimate StarCraft 2 Optimization Guide (CPU)

This guide is meant for people with 3 or more CPU cores. It focuses solely on the game's primary bottleneck and will not waste your time walking through graphics settings; everyone reading this is perfectly capable of testing those on their own machine without me getting in the way.

StarCraft 2 only uses 2 threads at any given time. If, for example, you have a 4-core processor like most i5s (newer ones have 6 cores), Microsoft's thread scheduler will juggle the game across all 4 cores. But in CPU-bound scenarios, and StarCraft 2 is very CPU limited, even "maxed out" the CPU can only ever hope for about 50% utilization across those 4 cores. This gets worse on an i7, which adds 4 more logical processors handling threads.
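If you want to see this for yourself, here is a rough Python sketch (assuming you have the third-party psutil package installed via pip) that prints per-core load once a second while the game runs. On a 4-core chip you would expect roughly two cores' worth of work bouncing around, with the overall average hovering near 50%:

```python
# Watch per-core load while SC2 runs. Assumes `pip install psutil`.
import psutil

for _ in range(30):                                    # sample for ~30 seconds
    per_core = psutil.cpu_percent(interval=1.0, percpu=True)
    avg = sum(per_core) / len(per_core)
    print(" ".join(f"{p:5.1f}" for p in per_core), f"| avg {avg:5.1f}%")
```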

Hyperthreading is often misconceived as a pure software trick. That is false. Think of it as a Texas Instruments calculator crunching math on the side: extra hardware that shares the cache and floating point unit of a main core. The reason people often get better performance after disabling hyperthreading on an i7 is that most game engines only use about 4 threads, and trying to spread work further only degrades performance. Only a handful of games can actually take advantage of hyperthreading; The Witcher 3, Crysis 3, and Battlefield 4 are examples. Logical processors, the hardware behind hyperthreading, perform at roughly 25% of a full-fledged core. They only make sense for games if the game is programmed to run more threads asynchronously without penalty. The fact that most games don't is precisely why, for years and years, an i5 was "enough for gaming." That was taken as received wisdom, along with specific RAM speed recommendations that ignored how much faster an overclocked memory controller could run. From the public mind you tend to get approximations, not specifics.

Developers do not list the maximum number of threads a game can take advantage of before the scheduler starts needless juggling, which dumps level 3 cache as work is passed between cores. This is context switching. Microsoft's engineers have taken an "it's good enough" approach to thread scheduling for years, so to do better you need to understand what is happening and take matters into your own hands.

What can be done to get better performance in StarCraft 2? You can overclock the CPU cores; everyone knows that. The cache and integrated memory controller, however, are pretty esoteric. RAM frequency, primary timings, and tertiary timings can be set using online charts, and doing it correctly is a tedious process. Crashing a machine while finding overclock limits can lead to silent data corruption, nooblings. It can even corrupt a boot partition. You can also go to insane lengths trimming OS overhead, stripping bloat in every possible way short of hacking the kernel and reverse engineering it to untangle its excessive dependencies and often needless processes. Cortana, is that you? That, however, is beyond the scope of this piece. People are notoriously lazy, let's face it, so the main focus here is as plug-and-play as possible with minimal input. You can't realistically expect the average person, even handed the path, to dig through the Task Scheduler, regedit, or other drudgery; it's a genuinely painful, long process, and the mix of services, tasks, and hardware varies in ways I cannot reasonably foresee. To take it to a polished conclusion you need passion, dedication, some intuition, and a bit of dirt under your nails, ideally with some help.

What's more, this option works on any PC without overclocking or massive changes. There is a free program (with a paid option) that allows manual control of thread scheduling behavior: Process Lasso. I do not represent the company, and I encourage those who appreciate their work to support it. I find it far more capable than Task Manager for this.

Playing with process priority can introduce more problems than it's worth for the average person, who understandably won't test every little change, and the effects are largely program-dependent. What matters more is knowing your CPU architecture (core count, presence of logical cores) relative to the game's needs, so you can avoid context switching. In fact, I do not recommend disabling hyperthreading in the BIOS: used correctly, this software makes that unnecessary at best and wasteful at worst, since the logical cores can still help out by absorbing background OS overhead. Learn what a CPU-demanding program can take advantage of by benchmarking it at different settings, especially at lower CPU clocks, so you get a visible performance scaling curve. Most of the time experience will get you the desired result without doing this, but it's worth getting familiar with for future optimization targets. Once you know it, you know it, like 1+1.

A 4-core, 8-thread i7 like this machine makes the most of its hardware if StarCraft 2 is pinned to CPU 0 and CPU 2. CPUs 1 and 3 are the weaker logical siblings that share hardware with 0 and 2, and they should be avoided for the game's threads. Ryzen CPUs with SMT want the same treatment as an i7. If you have a 4- or 6-core i5 with no hyperthreading, CPUs 0 and 1 are perfectly fine, since there are no logical siblings in between. If you have fewer than 3 cores (old dual-core chips from the Phenom era), there are no gains to be had. The reason old benchmarks showed the game running better on 3-core than 2-core processors is that, even with context switching, the OS itself leeches CPU cycles and competes with the game for a 2-core budget. As core count goes up, so does the penalty from context switching.
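For those who would rather script it than click through a UI, here is a minimal sketch of the same idea in Python using psutil (it needs `pip install psutil` and an elevated prompt; the executable name and the CPU numbers are assumptions on my part, so adjust them to your own machine):

```python
# Pin StarCraft II's process to one thread per physical core on a 4c/8t chip.
import psutil

TARGET = "SC2_x64.exe"   # assumption: the 64-bit client's process name; check Task Manager
AFFINITY = [0, 2]        # on a 4c/4t i5, something like [0, 1] is fine instead

for proc in psutil.process_iter(["name"]):
    if (proc.info["name"] or "").lower() == TARGET.lower():
        proc.cpu_affinity(AFFINITY)   # roughly what a Process Lasso affinity rule does
        print(f"Pinned PID {proc.pid} to CPUs {AFFINITY}")
```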

Another trick, especially if you purpose-built a machine for StarCraft 2 like I did, is to take the entire list of background services and processes shown in Process Lasso's UI and permanently assign them to cores that will not be handling the game's 2 threads (check back periodically, since they pop up semi-randomly and won't all be running at any one time). The CPUs are laid out in a staggered order: CPU 0 is a full-fledged core, CPU 1 is its weaker logical sibling, CPU 2 is the next full core, CPU 3 its logical sibling, and so on. With an 8-thread or larger CPU you can easily avoid giving work to the logical CPUs that share resources with the full cores running the game's threads. This is by far the most profound change you can make to improve frame consistency where CPUs struggle. I also recommend disabling Process Lasso's logging completely, as it just adds overhead.
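Here is a hedged sketch of the "everything else goes on the spare CPUs" idea, again with psutil. The game executable name and the CPU list are assumptions for a 4-core/8-thread layout, so adjust them, and expect some protected system processes to refuse the change:

```python
# Push background processes onto the CPUs the game is not using.
import psutil

GAME = "SC2_x64.exe"             # assumption -- adjust to your executable name
SPARE_CPUS = [1, 3, 4, 5, 6, 7]  # everything except the game's CPUs 0 and 2

for proc in psutil.process_iter(["name"]):
    name = (proc.info["name"] or "").lower()
    if name == GAME.lower():
        continue                 # leave the game's own affinity alone
    try:
        proc.cpu_affinity(SPARE_CPUS)
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        pass                     # protected system processes will refuse; skip them
```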

Congratulations! Within the limits of the game's code, you now have the maximum amount of CPU resources isolated for the game itself. You may notice even loading the game and browsing the menus is more responsive, sometimes wildly so if the CPU is fairly weak.

In numerous scenarios this game will tank below 60 fps no matter how potent your CPU is. I've bought binned, high-end, cutting-edge consumer CPUs, and no matter what, if the game drops even 1 frame while vsync keeps the GPU synchronous with the monitor's refresh, you get frame duplication, also called stutter. Essentially, 1 dropped frame becomes at least 2, because the displayed frame has to be held for a whole extra refresh interval; the monitor doesn't magically slow down just because the renderer can't keep up. This is why G-Sync was marketed using StarCraft 2 in one of its demos. I personally tried getting FreeSync to work in the past and couldn't, though it's reported to work at present. For a truly smooth experience you will need G-Sync or FreeSync.
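The arithmetic behind that claim is easy to check. This tiny toy calculation (my own illustration, nothing from the game itself) shows how a frame that misses the 16.7 ms budget at 60 Hz gets held for a whole extra refresh interval:

```python
# With double-buffered vsync, displayed frame time snaps up to the next refresh interval.
import math

REFRESH_HZ = 60
interval_ms = 1000 / REFRESH_HZ          # ~16.7 ms per refresh

for render_ms in (15.0, 17.0, 25.0, 34.0):
    shown_ms = math.ceil(render_ms / interval_ms) * interval_ms
    print(f"render {render_ms:5.1f} ms -> displayed for {shown_ms:5.1f} ms "
          f"({1000 / shown_ms:5.1f} fps)")
```

A frame that takes 17 ms instead of 16.7 ms gets shown for two full refreshes, so that moment plays back at 30 fps even though the renderer only barely missed 60.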

If you are one of those people who doesn't notice screen tearing with vsync off, or the stutter with vsync on in certain situations, maybe don't bother. That, or buy some glasses, because even the broadcast games Blizzard puts out have screen tearing. It's honestly disgusting and, worse, unnecessary.

How do you test the end result? Well, you've got some replays, right? I will eventually standardize a very intense 4v4 replay we can all use so there is parity across systems for comparison. With the camera locked to a given player's view and the majority of what's on screen being deterministic (ragdoll physics should be turned off), you can measure whether the tweak is working. The most blatant proof is jumping in-game from the loading screen, where the shader caching hitch in the first second or so depends heavily on the CPU, and in 4v4 games where army clashes reach epic intensity. Some people are upset that the patched-in countdown hides that initial hitch; for worker splits I personally appreciate it. The before-and-after difference also stands out when speeding up a replay, since the freed-up, isolated resources go to the game, and it particularly struggles at 8x speed. Fraps gives you a free framerate counter, but use whatever you want.
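If you want numbers rather than eyeballing it, here is a small hedged helper. The log format and file names are assumptions; a Fraps or PresentMon frametime export may need massaging into one millisecond value per line first. It reports the average and the 1% low FPS, which is where the affinity tweak should show up:

```python
# Summarize a before/after pair of frametime logs (one frame time in ms per line).
import statistics

def summarize(path):
    with open(path) as fh:
        frame_ms = [float(line) for line in fh if line.strip()]
    fps = sorted(1000.0 / ms for ms in frame_ms)
    low = fps[: max(1, len(fps) // 100)]           # worst 1% of frames
    print(f"{path}: avg {statistics.mean(fps):.1f} fps, "
          f"1% low {statistics.mean(low):.1f} fps")

summarize("frametimes_before.csv")   # placeholder names for your own logs
summarize("frametimes_after.csv")
```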

6 Likes

Kudos for the insights and the work you put into this, but isn't this a bit of overkill for most people with more or less modern computers? I mean, SC2 is 10 years old and was designed to run on computers that were common then... With a more or less current CPU and GPU, I would imagine SC2 runs smoothly without any magic.

1 Like

A modern CPU's instructions per clock and raw clock rate haven't changed much across lithographic jumps for a long time.

I appreciate that the jump from 14nm to 5nm is, by ratio, pretty impressive, but it took YEARS to achieve and is still basically limited to engineering samples. All that extra transistor budget went toward more cores and more cache, and we're at a point in the physics of "how low can you go" where quantum tunneling, i.e. leaky silicon, is a real fear for x86 manufacturing.

If the game can only use 2 threads on, say, a 12-thread CPU, the scaling really only comes from single-threaded performance. Haswell, for example, is pretty ancient, but per core it's not wildly different from Intel's 9th gen, and considering it's 4th generation, the game doesn't gain very much from upgrading.

With fairly modern hardware binned by Silicon Lottery (the company) and very fast overclocked DDR4, I'm still seeing drops. I will say this game loves cache. If I recall correctly, it's based on a janky dual-threaded implementation bolted onto a modified Warcraft 3 engine. An engine like that will only accept so much brute force within the limits of x86.

There is a YouTube channel called Coreteks. In one of his videos he mentioned an A.I. that can rearrange the Windows kernel's scheduling so that, in practice, otherwise truly single-threaded programs actually benefit from throwing moooore coooores at them. I suspect that with lithographic progress as slow as it is, and the hardware market not really needing faster CPUs except for poorly written or simply old software (Crysis 1 comes to mind, which has the same problem: 2 threads only, without a mod to allow a third), there are incentives to leave artificial performance bottlenecks in place for the sake of demand.

"Smoothly" is somewhat debatable, and a topic I think would be more productively investigated with actual metrics. Because of its age, no one is really benchmarking StarCraft 2 anymore beyond the odd RAM frequency video or forum post. There is a reason for that.

Show me a PC that can play Crysis 1 on the levels oozing with A.I. scripts without massive drops you can see as a number on the screen, short of liquid nitrogen cooling and a willing friend to keep that house of cards going.

StarCraft 2 is no different in certain scenarios, most notably 4v4 late game. Most people aren't throwing money at specially binned chips with fewer nanoscale defects just to squeeze out more performance, and I can only assume most people playing StarCraft aren't running monster machines (which hit stupidly diminishing returns anyway). This is not a graphics rendering issue: since the HD 530, Intel's integrated graphics have leaned on tech licensed from Nvidia, effectively 16nm sitting next to 14nm as of Skylake's launch, and the same architecture, apart from transistor pitch, level 3 cache, and a better-clocked IMC, survived all the way to 8th gen. 9th gen isn't much different; it just dumps the logical cores and no longer eats the performance penalty from Windows' Spectre patch. The point is, even integrated graphics are more than enough at the low end for this game, but CPUs have stubbornly been the problem, even to the modern day.

That is especially true on notebooks. Mine runs a comparatively slow 4.3 GHz core / 4.3 GHz IMC with 2133 CAS 10 RAM (tightened down the line), a gaming-optimized OS, and every trick in the book. In 3DMark 11 at 100% hardware parity I clawed back the equivalent of 187 MHz of free clocks, scaled to Haswell. Add practicing good data hygiene with lots of backup options and disabling a bit of checksumming; I don't tolerate corruption anyway. This is my weakest machine and wasn't procured specifically for StarCraft 2, but it isn't supposed to hit 4.3 GHz in 2-core mode either; 99% of examples are locked at 3.7 GHz in 2-core mode, and by golly, fewer still could actually cool the thing with 8 threads maxed without, say, a 3D-printed custom lower chassis and a lot of other voodoo. The lowest FPS I've seen on it is mid 30s, with absolutely maxed-out colliding 4v4 armies. The cream of the crop isn't going to do a whole lot better, bro. It's like a V8 running on 2 cylinders. Modernity doesn't really solve the problem.

3 Likes

Which video is it? Can you link it from somewhere?

As in a before-and-after proof video? I haven't bothered making one. Generally when I go out of my way to help the community it goes unnoticed anyway, like my 2-plus-hour video on how to modify Windows 10 so it isn't a complete bloatfest. Hell, I even made a hardware-agnostic driver image that worked on any PC, with all the OS changes already done; the only thing left was hand-tuning specific programs' thread behavior to match whatever CPU architecture you had, and it included a guide on the desktop.

Unfortunately that project ended up getting locked out by Microsoft's software engineering team. Even with a legitimate key activated through the user's BIOS for a legal Windows install, and Control Panel reporting it as activated, the Windows version I had built on timed out. It was essentially a beta build, but there was no warning of any kind. It was sort of a "congratulations, you spent literally a month straight on this project, only to have stupidity ruin it."

If you want a before-and-after video I suppose I can make one on my week off, filmed externally and showing there is hardware parity all the way down to RAM tertiary timings.

If you want to see the video of my OS before and after, it's on an older 7200 RPM drive not currently in this machine, sitting in a drawer next to it. I can plug it into my father's machine and upload it to YouTube if that interests you. It's from a now-extinct Windows build, however.

If I were to redo this project I'd be doing it on a new Windows build anyway. Nvidia offers no serviceable workaround for running its newer drivers on older versions of Windows, specifically once Turing hit the scene. For the longest time I tried to avoid the Spectre mitigation slowdown, which afflicts all Intel CPUs until 9th gen. In truth the mitigation code eventually became efficient enough to overcome most of that penalty, but it depends on the workload. The trend in Microsoft's OS is ever deeper background service dependency. For example, there was a time when you could delete Cortana. Eventually that became counterproductive for performance, because Windows would endlessly loop trying to re-execute her base files and actually degrade performance. Trying to remove her now, for science, yields a semi-broken UI that is basically impossible to navigate or fix without knowing what you are doing. Everything is becoming more and more interlinked, making it harder to squeeze wasted clock cycles out of the whole thing.

Returning this project to its former glory would take me another month or so. I would have to benchmark every little change, and the Task Scheduler alone balloons with each build's release; I'd have to investigate every entry to find out what disabling it potentially breaks, and whether that's acceptable for what I essentially turn into a pure gaming machine.

I assume most people have more than one way of browsing the internet and paying for things, like their phone, so security is not my main concern in this project. My users are also expected to understand data hygiene practices, which I explain fully, since I turn off some checksums; they're honestly weak to begin with and don't stop silent data corruption under, say, an improper power-off anyway.

1 Like

What are you talking about? I asked where that video is, the one about the AI that lets a single-core app use more cores. I couldn't find it; there are too many videos and none of the titles tell me which one it is... You did all that? If so, I think it's great!

You need to post this on Reddit; the StarCraft 2 forum is dead!

Also, many people are reluctant to change anything because they're afraid of breaking it, or it's too technical for them...

I've been up for a long time and I'm about to crash. Maybe I'll edit it in later if you haven't found it by then. I forget exactly which one it is; fortunately he doesn't have too many videos. It has to do with CPUs.

1 Like

I doubt it will ever be released to the public. It would wildly change how StarCraft runs; it would even make it painless for Blizzard to increase map sizes and maybe even the supply cap, no problem. Unrealistic, however. CPU demand probably has a lot more influence on those limits than we would like.

So what is the point of mentioning it in an optimization guide? I thought you brought up that video so people could find it and use it. That's a pity. But never mind... the other tips are useful anyway.

Anyway, if you have FPS problems in SC2 you can just:

  • OC the CPU, since it is a single-core game; even 4 vs 5 GHz is worth around 10 fps
  • OC the RAM; 2133 MHz vs 3200 MHz is roughly 20 fps in games today, and past 3200 the gains shrink but are still there. If you have slow RAM, tightening timings can also help, but it is risky
  • get the newest GPU; I heard the 2000 series helped a lot in SC2, and when I tested it I had almost 144 fps minimum

On my old PC, which was still more than enough for SC2 (4 GHz i7, GeForce 780), I had drops under 30 fps...

I already clarified that the game isn't a "single core" application. If you aren't going to bother reading but still want to post, you reveal yourself on both counts.

The point of mentioning it is that if it ever did become available, it would transform how Blizzard could handle this now decade-old product, which they still update regularly. I've already delivered: the material logged here is sufficient to help Windows-based machines, with the opening caveat about minimum hardware, improve performance mostly in 4v4 play.

Right now I'm looking at replacing luma-based SMAA, since the in-game FXAA is absolute garbage and multisampling hasn't been supported since a bygone era, with ZERO way to force it even using fancy tools that expose some pretty esoteric options. I managed to get the Resident Evil 1 HD remake on Steam looking better than FXAA using third-party software and a forum's resources, which otherwise wasn't feasible.

I recently discovered FidelityFX. Unfortunately it's not plug and play, and the files required to get it working come with an unfortunately poor (now essentially broken) guide on how to implement it. ReShade works for injecting SMAA; it's not perfect, but FidelityFX is a better algorithm. I might spend my weekend trying to get it working with StarCraft II. If it works the way I plan, it would be worth eventually coding into the game itself; it's 100% open source. It reads like they're using 32-bit floating point precision, whereas the driver-level AMD implementation uses 16-bit. The difference is negligible, I've read. We'll just have to see.

Upgraded rig: Ryzen 3700X @ 4.35 GHz, 3733 CAS 16 1T (timings tuned all the way down for Samsung modules), with the game on its own M.2 SSD and the OS on its own as well.

Even in 4v4 this game can drop into the 40s with a real CPU. The single-threaded performance of AMD's new monster is actually jaw-dropping; clock for clock it's beating Intel, like the old days. Even an i9 binned by Silicon Lottery still couldn't maintain 60 fps in the worst-case scenario. In 1v1, however, the average is wickedly high: 180 fps until maybe late game, where it drops to 140 fps if things aren't too crazy.

In other news, I managed to find the files required to get FidelityFX working. It's a sharpening technique that is far more intelligent than the dumb methods. With this GPU I'm also able to use SMAA's predication mode instead of the plain luma mode. The combination is excellent.

Blizzard, good job adding the countdown before matches. It hides the shader caching pause at the start that made it trickier, particularly on slower CPUs, to micro the SCVs for the opening mineral split.

I may make another thread specifically about making StarCraft prettier than is possible out of the box.

I am very glad that you switched to AMD Ryzen!!! I need to get rid of Intel's sh*tty, overpriced CPUs too lolol

Hey Lucifer1776,

Could you please give us a more specific optimization guide? We really need it to get better fps in games lol, since it's so bad @@
Thanks so much,

There are other guides that cover everything else. Mine is unique to the internet.

Hey, do you have a step-by-step guide to go with your post? That would be really nice! @Lucifer1776

Well, the topic is titled ULTIMATE STARCRAFT 2 OPTIMIZATION GUIDE, so one would expect a guide that covers everything, but instead it's just a wall of text full of technical background and very little actual, intelligible content (i.e. that there are programs called Process Lasso and SweetFX one can use).

So the original model looks like this. It would be easy to port it into Unreal 5.
https://i.imgur.com/pE6cmTl.jpg

If you want to understand what HyperThreading is, you need to understand a bit about how the CPU works.
Each CPU core has a pool of execution resources that actually execute instructions, and different types of execution resources are capable of executing different types of instructions.

The CPU’s scheduler tries to fill as many of these execution resources as possible in order to extract the most performance out of the CPU core. Because if you have execution resources doing nothing, you’re not getting the most performance out of your CPU.

The trouble is, the types of instructions the software wants to run don't always line up perfectly with the types of available execution resources.

One of the ways your CPU increases the probability of finding instructions that can fill the available execution resources is out-of-order execution. Basically, instead of executing instructions in strict linear order, the scheduler picks instructions out of order, so it can pull them from further ahead in the thread's instruction stream.

Another way they do it is by using SMT (HyperThreading). Or basically, allow the scheduler to pick instructions from two threads instead of just one.

The goal is the same in both cases: Give the scheduler a wider pool of instructions that it can pick from, to increase the probability that there will be instructions available that can fill what would otherwise be idle execution units.

But here’s the thing about SMT: If the CPU is already filling most of its execution resources with one thread, it means there won’t be many execution resources available for a second thread.

So, SMT doesn’t increase performance much for software that is well optimised for the CPU you are using. It helps for software that doesn’t tend to fill as many of the CPU’s execution resources with one thread.
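If you're curious whether SMT is even enabled on your machine, a quick sketch like this (assuming Python with psutil installed) shows the physical versus logical count. The "siblings are adjacent CPU numbers" layout is typical on Windows but not guaranteed everywhere, which is also why the guide above pins the game to CPUs 0 and 2:

```python
# Compare physical cores against logical CPUs to detect SMT/HyperThreading.
import psutil

physical = psutil.cpu_count(logical=False)
logical = psutil.cpu_count(logical=True)
print(f"{physical} physical cores, {logical} logical CPUs")

if physical and logical and logical > physical:
    print("SMT is on; on a typical Windows layout CPUs 0/1, 2/3, ... share a core.")
else:
    print("No SMT detected; consecutive CPU numbers are separate physical cores.")
```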

One unfortunate thing about Starcraft II is that it’s compiled with Intel’s compiler.

Well, I mean, Intel’s compiler is one of the best compilers out there - as long as you’re running the compiled software on an Intel CPU.

See, when Intel’s compiler compiles any code, it adds a check to see if it’s running on a CPU that returns a GenuineIntel vendor ID.

If you have a CPU that does not return a GenuineIntel vendor ID (such as a Ryzen CPU), the compiled software will tell your CPU to run a slower code path.

If you are able to change what the vendor ID says on your Ryzen CPU to say GenuineIntel instead of AuthenticAMD, it will run the faster code path.

And there’s no reason why it can’t. It’s not a compatibility issue. Intel has just deliberately decided to make non-Intel CPUs run slower code paths.
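If you want to check which vendor string your own machine reports (the identifier the dispatcher keys off), something like this works on Windows. Note that it only reads the ID; PROCESSOR_IDENTIFIER or platform.processor() carrying the vendor string is the usual behavior rather than a guarantee:

```python
# Print the CPU identifier string and flag the vendor ID it contains.
import os
import platform

ident = os.environ.get("PROCESSOR_IDENTIFIER", "") or platform.processor()
print("CPU identifier:", ident)

if "GenuineIntel" in ident:
    print("GenuineIntel: the dispatcher's fast code paths apply.")
elif "AuthenticAMD" in ident:
    print("AuthenticAMD: an Intel-compiled binary may pick a slower fallback path.")
else:
    print("Unrecognized or missing vendor string.")
```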

SMT is useless if, in practice, the scheduler just throws away the core-local level 3 cache by handing threads off between cores in pursuit of higher throughput, because: Microsoft.

I'm perfectly aware of what SMT is, and StarCraft can technically benefit from it if you're sporting, say, a single-core Pentium 4. That is literally the only scenario.

mcupdate_GenuineIntel.dll can be forcibly deleted from System32 if you know how. That is required to take advantage of custom microcode allowing an i7 4720HQ to clock, cooling permitting, well beyond ordinary.

I built a new machine a few days ago with a 3700X and noticed how absurdly fast it boots. How would we change things to force the Intel compiler's fast path on Ryzen?

Do I simply remove the AMD analog of that same file in System32, and the game takes the Intel path? Experiment pending, as I'm at work. It is slower on Ryzen, and you nailed why. I'm impressed.

1 Like