CSV Errors when taking a backup


Hi,

We have a Windows Server 2012 R2 failover cluster with 5 Hyper-V nodes in it. The servers are HP DL380 G6 servers, and each runs a set of VMs. The VMs are stored on 3 CSV volumes presented from our HP 3PAR SAN via iSCSI. The Windows servers have the latest updates (including the hotfixes relating to CSV volumes). We are using Altaro to take backups of our VMs onto a NAS device. The connections are via 2 x 10G NICs configured as an LACP team. Graphing suggests these NICs are nowhere close to saturation (traffic peaks at maybe 1.1 Gbps). VMQs are disabled on these interfaces.

The problem we're having is that, when full backups run, we get several CSV-related errors such as the following:

- Cluster Shared Volume 'volume11' ('hp3par-lun22_hvcl02-vol11') has entered a paused state because of '(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished.

- Cluster Shared Volume 'volume10' ('hp3par-lun21_hvcl02-vol10') is no longer accessible from this cluster node because of error '(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.
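
For reference, both codes appear to be timeouts rather than access-denied or corruption errors. Below is a minimal Python sketch for decoding them, assuming the standard Windows definitions (NTSTATUS 0xC00000B5 is STATUS_IO_TIMEOUT, and Win32 error 1460 is ERROR_TIMEOUT). The lookup table, the function name `decode_csv_error`, and the "8 characters means hex NTSTATUS" rule are illustrative assumptions, not part of any cluster tooling:

```python
import re

# Only the two codes seen in this thread; values are the standard Windows definitions.
KNOWN_CODES = {
    0xC00000B5: "STATUS_IO_TIMEOUT (NTSTATUS): the I/O did not complete within the time-out period",
    1460: "ERROR_TIMEOUT (Win32): the operation returned because the time-out period expired",
}


def decode_csv_error(message: str) -> str:
    """Find the first '(code)' token in a CSV event message and describe it."""
    match = re.search(r"\(([0-9A-Fa-f]+)\)", message)
    if match is None:
        return "no error code found"
    token = match.group(1)
    # Assumption: an 8-character token is a hex NTSTATUS, anything else is a decimal Win32 code.
    if len(token) == 8:
        code = int(token, 16)
    elif token.isdigit():
        code = int(token)
    else:
        return f"unrecognised code format: {token}"
    return KNOWN_CODES.get(code, f"unknown code: {token}")


if __name__ == "__main__":
    print(decode_csv_error("has entered a paused state because of '(c00000b5)'"))
    print(decode_csv_error("no longer accessible because of error '(1460)'"))
```

If that reading is right, both events say that I/O to the volume timed out on the reporting node, which would fit a load or path problem more than a permissions one.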

When this happens, our VMs go offline and, needless to say, there's a business impact.

As implied above, I've researched this and tried all sorts of solutions suggested in different threads (latest updates, disabling VMQs, etc.), but nothing seems to help. At this point, I've concluded the following:

- It's not a SAN performance issue. We have another cluster on the same SAN that works fine.

- It's not a network performance issue; the links are under 50% usage.

- I suspect it's related to load. It only seems to happen when full backups occur. When Altaro uses changed block tracking (a bit like a differential backup), these errors don't occur since the load is less.

- It does seem to be related to redirected I/O, because the reporting node is typically not the owner node. In the first error above, for example, the owner node is node 3 but the error was reported by node 1. Altaro told me that redirected I/O is to be expected when a backup is taken, due to the way VSS works. Unfortunately, we therefore can't avoid it.

At this point, I'm at a bit of a loss as to what to try next. Any help would therefore be appreciated.

Thanks & regards,

Joe

Have you run the validation wizard against the cluster to see if it alerts on any warnings? Any warnings need to be looked at.

You should not be disabling VMQ on 10 GE NICs. It is a common issue on 1 GE NICs, but it has not been an issue on 10 GE.

It is recommended to use MPIO for iSCSI instead of NIC teaming.

Is the NAS device on its own NIC?

You may want to run your configuration past Altaro to see what they have to say. They may have suggestions on how to configure it for your particular environment.


. : | : . : | : . tim


