Severe lag spikes in raid (especially during peak hours)

Pingu-icecrown · August 17, 2021, 7:51am

I’d like to forward this thread over to the Technical Support forums: Your Servers Are Crap and Guilds Are Dying Because of It

I came across it because my entire guild and I have been having severe lag spikes this past week during our raid nights. We’ve been progressing on Mythic Sylvanas and didn’t have any issues on our first night (this past Wednesday). On Sunday (yesterday) and Monday (today), though, we ran into severe lag spikes: input lag (ability casts delayed or not going off at all) and rubberbanding, which sometimes caused players to be hit (and killed) by mechanics they had dodged seconds prior, as if their character hadn’t moved or was in another location. Here’s an example from tonight: https://clips.twitch.tv/ThirstyWrongVanillaBudStar-uQ9uSse-gs73Je7-

The thread I linked above mentions many instances of lag in Sanctum of Domination, but the particular issues I’ve been experiencing are mostly limited to Mythic Sylvanas. I’m aware of other guilds encountering these issues, though while we were raiding we checked some streams of guilds pulling Mythic Sylvanas and didn’t notice any of them lagging (they were all guilds on different servers). So it seems to vary which servers it’s affecting at a given time. We pretty much always get a few seconds of lag on the pull, but the bigger issue is the severe lag spikes we get when many adds spawn, such as when the Dark Sentinel adds are up, or when all the chains spawn around the room during the intermission in phase 1. It’s not consistent on when or if it happens, though tonight it was happening on every pull for at least 2-3 hours. It started to subside towards the end of the night.

All in all, the majority of our raids yesterday and today were near-unplayable. We’ve also seen some lag out of combat when sitting around, almost like post-pull lag.

I ended up using Looking Glass to do a traceroute to my server (Icecrown) and these were the results:

TRACEROUTE:
traceroute to 108.24.0.76 (108.24.0.76), 15 hops max, 60 byte packets
1 Blizzard Blizzard 0.405 ms 0.368 ms 0.367 ms
2 24.105.18.130 (24.105.18.130) 1.173 ms 1.186 ms 1.187 ms
3 137.221.105.10 (137.221.105.10) 1.186 ms 1.204 ms 1.221 ms
4 137.221.66.16 (137.221.66.16) 0.519 ms 0.536 ms 0.537 ms
5 137.221.83.80 (137.221.83.80) 672.504 ms 672.511 ms 672.515 ms
6 * * *
7 137.221.68.32 (137.221.68.32) 6.193 ms 6.204 ms 6.201 ms
8 4-1-3.ear3.LosAngeles1.Level3.net (4.71.135.105) 5.584 ms 5.457 ms 5.438 ms
9 ae-2-3601.edge7.LosAngeles1.Level3.net (4.69.219.45) 5.748 ms 5.706 ms 5.649 ms
10 TWC-level3-40G.Miami.Level3.net (4.68.62.182) 6.582 ms 6.573 ms 6.567 ms
11 ae203-0.CMDNNJ-VFTTP-301.verizon-gni.net (130.81.58.127) 63.749 ms 63.764 ms 63.420 ms
12 pool-108-24-0-76.cmdnnj.fios.verizon.net (108.24.0.76) 68.878 ms 70.771 ms 71.075 ms

MTR:
Start: Tue Aug 17 05:40:28 2021 Blizzard
1.|-- Blizzard 0.0% 10 0.3 0.3 0.2 0.6 0.0
2.|-- 24.105.18.130 0.0% 10 0.8 0.6 0.5 0.8 0.0
3.|-- 137.221.105.10 0.0% 10 1.1 0.9 0.7 1.4 0.0
4.|-- 137.221.66.16 0.0% 10 0.7 0.6 0.5 0.7 0.0
5.|-- 137.221.83.80 0.0% 10 159.2 583.4 98.0 1077. 332.7
6.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
7.|-- 137.221.68.32 0.0% 10 5.9 7.0 5.9 14.5 2.6
8.|-- 4-1-3.ear3.LosAngeles1.Level3.net 0.0% 10 5.5 5.6 5.4 5.8 0.0
9.|-- ae-2-3601.edge7.LosAngeles1.Level3.net 0.0% 10 5.9 8.4 5.7 23.6 6.0
10.|-- TWC-level3-40G.Miami.Level3.net 0.0% 10 6.5 6.7 6.4 8.1 0.0
11.|-- ae203-0.CMDNNJ-VFTTP-301.verizon-gni.net 0.0% 10 63.8 63.8 63.5 64.4 0.0
12.|-- pool-108-24-0-76.cmdnnj.fios.verizon.net 0.0% 10 69.2 68.4 66.7 69.9 0.9

As you can see in the traceroute, 137.221.83.80 is the IP right before the node that’s timing out, and it has high latency (672 ms). It had even higher latency on a different traceroute I did earlier. I also want to mention this thread, as it brings up some similar issues: [Wave Broadband] High Latency/Connection Issues - #74 by Vortimer-proudmoore
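
For anyone who wants to compare traceroutes taken at different times, here is a minimal Python sketch that pulls the hop number, host, and average round-trip time out of text like the above and reports the slowest hop that answered. The function names `parse_traceroute` and `slowest_hop` are made up for this example, and it assumes the plain `hop host (ip) X ms X ms X ms` layout that Looking Glass printed here; other traceroute variants may format lines differently.

```python
import re

# A few lines copied from the traceroute above.
SAMPLE = """\
traceroute to 108.24.0.76 (108.24.0.76), 15 hops max, 60 byte packets
4 137.221.66.16 (137.221.66.16) 0.519 ms 0.536 ms 0.537 ms
5 137.221.83.80 (137.221.83.80) 672.504 ms 672.511 ms 672.515 ms
6 * * *
7 137.221.68.32 (137.221.68.32) 6.193 ms 6.204 ms 6.201 ms
"""

# Matches a probe time such as "672.504 ms".
RTT = re.compile(r"(\d+(?:\.\d+)?) ms")


def parse_traceroute(text):
    """Return (hop, host, avg_rtt_ms) tuples; avg_rtt_ms is None on timeout."""
    hops = []
    for line in text.splitlines():
        parts = line.split()
        # Hop lines start with the hop number; skip the header and blanks.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        rtts = [float(value) for value in RTT.findall(line)]
        avg = sum(rtts) / len(rtts) if rtts else None
        hops.append((int(parts[0]), parts[1], avg))
    return hops


def slowest_hop(hops):
    """Return the answering hop with the highest average RTT, or None."""
    answered = [hop for hop in hops if hop[2] is not None]
    return max(answered, key=lambda hop: hop[2]) if answered else None


hops = parse_traceroute(SAMPLE)
for number, host, avg in hops:
    print(number, host, "no reply" if avg is None else f"{avg:.1f} ms avg")
print("slowest:", slowest_hop(hops))
```

Running the same function over several saved traceroutes would show whether the same hop stands out every time or only during raid hours.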

In the post I linked to from that thread, 137.221.83.84 (a Blizzard server in Las Vegas) was suspected of being a cause of the issues there. That address is nearly identical to the one I mentioned above, so I’m guessing that the problem we’re experiencing is related to what’s being discussed in that thread. I’m not a Wave ISP customer though, and as far as I’m aware no one in my guild is either, which is why I didn’t post in that thread.

To be clear, I’m not really encountering connection issues outside of raiding. I haven’t done any raids other than Mythic Sylvanas in the past few days, though I did do a heroic raid on Friday that was pretty laggy as well (with the raid consisting of players from various WoW servers). Everyone in the raid had the same lag issues, so it certainly wasn’t just me. We also tried a soft reset of the instance (everyone zoning out and remaking the group) in the hopes that it’d change the instance server we were connected to and fix the issues, but it didn’t help at all.

Seems to me like there are some widespread issues occurring with either Blizzard’s servers or some of the servers along the route to Blizzard’s servers, and they’re affecting latency in raid for a variety of players.

Vortimer-proudmoore · August 17, 2021, 3:38pm

The Wave ISP problem turned out to be on their end due to a port security configuration change they pushed to their customers renting modems & routers. That in turn caused Blizzard servers (in Las Vegas & New York) to reject up to 90% of packets originating from Wave ISP (they’ve since corrected that).

If most of your team is on the same ISP and rents modems/routers, you could check with the ISP to see if new software was loaded last weekend that might have changed port security settings. If your team is all on different ISPs, that doesn’t seem to fit the problem Wave users were experiencing. It sounds more like a raid-instance server loading issue.

If it is a connection issue, then using a VPN to get a different route did work around the Wave problem. You could try that (worth a shot) until an actual fix is identified and implemented for your specific problem. ProtonVPN was suggested as a free alternative in that thread you linked. I had one with my McAfee Anti-Virus that worked as a temporary fix.

As an aside, when the traceroute shows three asterisks it means that server is configured to not respond to a TR/MTR/Ping request, but it is still (probably) relaying packets. The most insightful thing I’ve picked up from Looking Glass is the value right after the IP address in the MTR, as that is the percentage of packets being dropped on that hop. Your MTR shows zero dropped packets on every hop except hop 6 (and that one is “hidden”), but it does have some inconsistent delay/lag on hop 5. For comparison, my MTR when Wave’s security was hosed showed 90% packet loss on hop 6 of my path.
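
If you end up collecting a lot of these, reading the columns by eye gets tedious. Below is a rough Python sketch of the check described above: it reads each MTR row, takes the value right after the host as the loss percentage, and flags hops that are hidden, dropping packets, or slow on average. It assumes the usual MTR report column order (loss, sent, last, avg, best, worst, stdev); the names `parse_mtr` and `flag_hops` and the 5% and 100 ms thresholds are arbitrary choices for illustration, not anything Blizzard or MTR defines.

```python
import re

# Rows copied from the MTR output earlier in the thread.
SAMPLE = """\
4.|-- 137.221.66.16 0.0% 10 0.7 0.6 0.5 0.7 0.0
5.|-- 137.221.83.80 0.0% 10 159.2 583.4 98.0 1077. 332.7
6.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
7.|-- 137.221.68.32 0.0% 10 5.9 7.0 5.9 14.5 2.6
"""

# hop number, host, loss (with or without "%"), packets sent, last, avg
ROW = re.compile(
    r"^\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%?\s+(\d+)\s+([\d.]+)\s+([\d.]+)"
)


def parse_mtr(text):
    """Return one dict per hop with hop, host, loss, sent and avg fields."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # not a hop row (for example the "Start:" line)
        hop, host, loss, sent, _last, avg = match.groups()
        rows.append({
            "hop": int(hop),
            "host": host,
            "loss": float(loss),
            "sent": int(sent),
            "avg": float(avg),
        })
    return rows


def flag_hops(rows, loss_pct=5.0, avg_ms=100.0):
    """Return (hop, host, reasons) for hops worth a closer look.

    The thresholds are illustrative assumptions only.
    """
    flagged = []
    for row in rows:
        reasons = []
        if row["host"] == "???":
            reasons.append("no reply (hidden hop)")
        elif row["loss"] >= loss_pct:
            reasons.append(f"{row['loss']:.0f}% loss")
        if row["avg"] >= avg_ms:
            reasons.append(f"{row['avg']:.1f} ms average")
        if reasons:
            flagged.append((row["hop"], row["host"], reasons))
    return flagged


rows = parse_mtr(SAMPLE)
for hop, host, reasons in flag_hops(rows):
    print(hop, host, ", ".join(reasons))
```

One thing to keep in mind when reading the flags: in the MTR posted above, hops 7 through 12 report much lower averages than hop 5, so a slow reply from a single hop does not by itself show that traffic passing through it is being delayed.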

Good luck with a resolution to your problem! Hope a VPN provides a workaround until it’s fixed.