VRAM Management
I am wondering if there has been a change in the way GPU VRAM is managed during an IRay render.
I have a utility called GPU-Z and I have used it ever since I bought my 1070 GPU with 8GB VRAM. I used to be able to judge when I was getting close to the limit by watching the utility report the memory usage as the scene loads and the render starts. However, what I appear to be seeing now is that the memory usage leaps to near maximum, suggesting that I can't add more characters to the scene. However, when I did add another character (total now 3), the memory in use hardly changed. I am pretty certain that was not the case when I first bought the card.
Just to be sure, I tried the scene with a single character and no props, clothing or buildings/rooms and the total memory was more than half the total 8GB. The easy conclusion is that adding all those props and two more clothed characters would send the memory way over the limit but it didn't. I can only assume that there is some on-the-fly compression going on.
Comments
Try installing the latest NVIDIA driver. I had many problems with my GTX 970, and yesterday evening I installed the latest update and I have no problems now.
The thing is ... I don't think it is actually a problem. I think that is how it is supposed to work - to use as much as is available and compress when necessary. I update the NVidia drivers quite regularly, especially since Microsoft seems to want to install their driver versions whenever they like.
If you run Windows 10, VRAM is now "shared" among all cards as one large chunk, but only for "Windows programs" (DirectX and OpenGL) operating through Windows via the WDDM drivers.
What remains is available for "direct access", which is what IRAY uses to talk to CUDA and VRAM. Windows will greedily hog up to 3GB on EACH 11-12GB card, reserved for itself, and up to 2GB on cards with 4-8GB of VRAM. It uses all cards as if they were one, by threads, even without SLI. Even if it only needs 0.1GB, it still locks out the rest. There is no way to change this at the moment. (There are no settings or "do not use" options. There is a TCC mode, but that is not a "setting" for WDDM; it is an alternate driver mode.)
Linux and Mac do not have this issue at the moment, but Mac will soon have it too. On Linux, I assume you will have some control over it.
I would hope that IRAY/NVIDIA will get together with Microsoft and create some kind of "forced mode" other than TCC mode, which is currently the only way to stop Windows from allocating that memory on your GPU. (Daz is not stable in TCC mode. TCC mode treats your GPU as if it were just CUDA hardware and not an actual video card, so Windows will not "borrow" or "lock out" that memory. IRAY handles it, but not Daz's version, and not Daz while using IRAY.)
Windows 8.1 and lower do not have this issue, as far as I know. The new WDDM drivers are a Windows 10 thing. (Made for the sake of making games faster and allowing more to be done with the new 8+ core CPUs.)
Not sure if I understand correctly but I have a 970 and a 1070. The 970 is the "System" device used by Windows as the default for the display, etc. The 970 and the CPU are disabled in the DAZ Studio IRay settings (Advanced). So I assume I'm getting all 8GB for my renders?
No - something is definitely wrong. Scenes I have easily rendered in the past are now dropping to CPU. I have updated the NVidia drivers in the last couple of hours and that has made no difference (except for a Blue Screen Windows 10 crash). I am using Scene Optimiser to reduce textures and hiding or removing anything that does not need to be in camera view.
VRAM is a nightmare. GPU prices are way beyond my means to keep upgrading and 8GB should be enough - it has been for at least a year now.
Windows 10, with the new WDDM drivers, if that is what you have, will use a set level of VRAM. It reserves it for itself and for games (DirectX and OpenGL) on EACH of your GPUs. It doesn't matter whether the GPU is the actual display GPU or not, although it will use MORE on the display GPU. (Shown as dedicated memory, which is per card.)
If IRAY used Windows to access CUDA and VRAM, it would have access to all VRAM as if it were one large chunk. However, it doesn't. IRAY talks directly to the card and asks it "How much is FREE?", to which the card replies with a figure about 2-4GB short of what it could actually have free (if the rest were not being reserved by Windows).
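For anyone who wants to ask the card the same "how much is free" question outside of Daz Studio, here is a minimal Python sketch. CUDA itself exposes free and total memory through `cudaMemGetInfo`; this sketch instead shells out to the `nvidia-smi` tool that ships with the NVIDIA driver, which is an assumption made for convenience and may not report exactly what an Iray CUDA context sees. The function names and the sample output are hypothetical and invented for illustration.

```python
import csv
import io
import subprocess

# Fields understood by `nvidia-smi --query-gpu`; values are reported in MiB.
QUERY_FIELDS = "index,name,memory.total,memory.used,memory.free"


def parse_smi_csv(text):
    """Turn nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:  # skip blank lines
            continue
        index, name, total, used, free = (field.strip() for field in record)
        rows.append({
            "index": int(index),
            "name": name,
            "total_mib": int(total),
            "used_mib": int(used),
            "free_mib": int(free),
        })
    return rows


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (needs an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_smi_csv(result.stdout)


# Hypothetical sample output, invented for illustration only.
SAMPLE = "0, GeForce GTX 970, 4096, 300, 3796\n1, GeForce GTX 1070, 8192, 1200, 6992\n"

for gpu in parse_smi_csv(SAMPLE):
    print(f"GPU {gpu['index']} ({gpu['name']}): {gpu['free_mib']} of {gpu['total_mib']} MiB free")
```

On a machine with the driver installed, calling `query_gpus()` before and during a render would give numbers to compare against GPU-Z and the Daz Studio log, though which of those tools is "right" is exactly what this thread is arguing about.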
IRAY needs to assign graphics directly to each GPU whose CUDA cores it is using, or it is not "fast". Games do not. They use the "other parts of the chip" besides CUDA and don't mind waiting a microsecond to "move an image" if needed. Normally the images just stay where they are, and the game passes processing code to WDDM, which manages what is doing what.
Hence TCC mode, which hides the card from Windows as a display device.
https://social.technet.microsoft.com/Forums/windows/en-US/15b9654e-5da7-45b7-93de-e8b63faef064/windows-10-does-not-let-cuda-applications-to-use-all-vram-on-especially-secondary-graphics-cards?forum=win10itprohardware
https://answers.microsoft.com/en-us/windows/forum/windows_10-hardware-winpc/windows-10-does-not-let-cuda-applications-to-use/cffb3fcd-5a21-46cf-8123-aa53bb8bafd6
Both Microsoft and Nvidia have been made aware of this issue. Hopefully Nvidia slaps Microsoft with a golden ruler that says, "Yield to hardware access (IRAY), or we will cut you off from using our GPUs' VRAM this way at all." It does this to Radeon cards too, but those are mostly used for games at the moment, so they only see the "gains" of WDDM drivers. (I joked and said that I figured an AMD guy was in on the WDDM development, knowing it would cripple the business side of things. Not many businesses upgrade to Windows 10 unless they need to.)
This is why it was once dangerous to start writing/accessing RAM at a hardware level. Windows couldn't see what was going on, and it would ALSO try to write there. Which is why they encouraged/demanded that you use Windows to do all your direct access to RAM. (Which was slow, defeating the purpose of direct access.) Now it just has a "watchdog", which is a part of the chip that self-monitors "reservations", so Windows is now the one that is slower, due to this. Thus the need for faster RAM and CPUs. It is a funny virtual peeing contest between developers of Windows and developers who are forced to work within Windows.
Here are my 3 Titans, rendering... (ATTACHED IMAGES) Notice the "dedicated GPU memory" per card. Also the "reported activity". All three cards are running full force, but WDDM can't see that, because IRAY is talking directly to the cards. None of the numbers are accurate or add up to anything realistic. The numbers are the same when I am NOT rendering, too. GPU2 is the one that my screen is hooked up to, so it has even MORE memory reserved/unavailable. (It is a 65" 4K TV.)
Notice the older Titan, GPU1, with the same 12GB, has 1GB less VRAM reserved. Why??? Who knows... Newer cards seem to have more "reserved" and left unavailable for rendering.
In TCC mode, as opposed to WDDM mode, the cards showed 11.85GB available, but Daz crashes after 1 render if the cards are set up in TCC mode. (I assume it is fighting with OpenGL, which is done through the Windows WDDM drivers, while you tell it to use all three cards for CUDA in TCC direct-access rendering, so it can't "clean up" the memory. That is what the log files indicate. There need to be individual settings for OpenGL and IRAY, but there are not. They are all lumped in as one setting in the rendering area.)
IRAY works fine in TCC mode; Daz doesn't. (Windows enjoys TCC mode too.) I have asked Daz if they could look into supporting TCC mode, as it is a requirement for professional development. (I assume they are in no rush, because you can use IRAY-Remote-Render, which works in TCC mode, on another computer.) However, I need THIS computer and my remote one to both work in TCC mode. I don't just need faster rendering, I also need faster DAZ development of the same size scenes, which my cards are ALSO used to render, but limited by this WDDM limit. As you see, 12GB minus the 5.2GB reserved leaves 6.8GB, which is what ALL my scenes are limited to at the moment. Less, actually, because the DAZ program itself is part of that WDDM reservation. The scene in DAZ makes Windows reserve even more, limiting the "free memory" for rendering. It is a catch-22.
Many thanks for all that information - I get most of it. And here was I thinking that I had got around the reservation issue by dedicating the 970 to the display and leaving the 1070 for IRay only. :(
Nevertheless, it is still performing worse than it did until recently - in terms of VRAM management. I can't render scenes that would not have been a problem before.
In the advanced settings for IRay in DS, you can specify texture compression, which will help with getting larger scenes onto a card. Overall, textures take up the most VRAM, and the standard size for most image maps these days tends to be 4096x4096. Even if the same map is used on two different surfaces, it's loaded twice. It's been discussed here and there by many. Here is my rundown on it if it helps - https://mattymanx.deviantart.com/journal/Rendering-large-scenes-in-Daz-Studio-with-Iray-715978708
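To put rough numbers on why textures dominate, here is a small Python sketch of the arithmetic. It assumes an uncompressed map with 4 channels at 8 bits each and no mipmaps; how Iray actually stores textures in VRAM is not stated in this thread, so treat the figures as a ballpark only. The function names are made up for the example.

```python
def texture_bytes(width, height, channels=4, bytes_per_channel=1):
    """Uncompressed size of one image map in bytes (assumed RGBA, 8 bits per channel)."""
    return width * height * channels * bytes_per_channel


def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / (1024 ** 2)


full = texture_bytes(4096, 4096)  # a typical 4096x4096 map
half = texture_bytes(2048, 2048)  # the same map with each side halved

print(to_mib(full))      # 64.0
print(to_mib(half))      # 16.0
print(to_mib(2 * full))  # 128.0, if the same map ends up loaded twice
```

Under those assumptions, halving each side of a map cuts its footprint to a quarter, which is why reducing texture sizes frees so much room.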
Yes - I use Scene Optimizer to reduce texture sizes. All my characters have been run through that to reduce the textures by half.
Wow... I did not realise that. I would have thought that Iray would use instances of the same texture used multiple times. That's very disappointing.
S.K.
I would greatly appreciate if someone could educate me on where to find the "official" recognition and explanation of the "Windows 10 reserving big chunks of VRAM" issue. As a software guy, I'm always keeping in mind that appearances can be deceiving, and what seems obvious might not always be that way under the hood.
And after a lot of research all I've found is a few industry forum posts from a couple of years ago that declare it's a problem, with users making Microsoft and others aware of the problem, but I've never found an official acceptance (from Microsoft, NVIDIA, or any other reliable industry tech source) that it really is a problem. In fact, what I've seen is a lot of people in forums saying the contrary: it's not a problem, and it's a mix of mis-reporting memory monitors, incorrect understanding of what the "reserved memory" reports are saying, incorrect pagefile settings, and memory that, while reported to be reserved, isn't actually reserved in practice. But without a reliable industry confirmation, I take both sides' pronouncements with a big grain of salt. I even tried to search thru the Windows WDDM docs to find a mention of this being a possibility, but came away just shaking my head in confusion. My respect goes out to anyone who can make sense of all that. Which is just one more reason I take these discussions with a grain of salt. How many of us really understand what's going on?
And being someone who really (really) dislikes Windows, I certainly understand those who assume it's just another Windows mess, and therefore it's true by default. I've seen a lot of industry posts that are limited to "Oh really? I'm not surprised. Windows sucks". But I also understand that this stuff is very very complicated under the hood, so I usually take internet forum proclamations like this with a grain of salt. Personally, I don't care what the answer is, but I like to be objective, and really would like to find out the truth, whatever it is. If it is a problem, it's just another mess to add to my long list of "I hate Windows" reasons, and not much I can do about it. On the other hand, if it's actually a myth, I'd like to put it to bed and move on.
Thanks much.
Indeed, it would be good to get clarity. For example, do we believe monitors like GPU-Z? I ask because mine has reported a render running at 7.9GB VRAM (though I seriously doubt the scene was that huge) and I have no idea whether any Windows reservation is accounted for in that figure. Also, I do wonder whether the reservation is spread or repeated over multiple cards. I have a 970 and a 1070 installed and the 970 is configured to run my display, play videos, etc., while the 1070 is only there for IRay.
The sense I'm getting from people is that they believe that the "reserved" VRAM numbers are only a temporary thing, and as soon as the application requires more VRAM Windows gives it up. Or something like that..
OK, that would explain some of the confusion.
This Microsoft TechNet discussion has been going on for 18 months with little to no clarification from the Microsoft techies involved in the discussion. The latest post is a few days ago and still complaining about the Windows 10 VRAM reservation (especially on systems with multiple GPUs).
https://social.technet.microsoft.com/Forums/en-US/15b9654e-5da7-45b7-93de-e8b63faef064/windows-10-does-not-let-cuda-applications-to-use-all-vram-on-especially-secondary-graphics-cards?forum=win10itprohardware
Yes, and that's one of the posts I'm referring to. Started in 2016 by a user of an unnamed 3D app I believe, and it's one of the few posts that keep getting referenced over and over as apparently the source of the issue. Everything else I've found seems to be referencing those few posts, pretty much assuming them to be true.
I'm in no way trying to discredit those posts, only to find objective verification from a trusted vendor or other tech resource. Like I say, sometimes what seems obvious isn't the way it actually works under the hood. And choosing the right things to test to really verify your belief can be difficult.
I'm especially concerned that none of the tech publications (at least the ones I could find) seem to mention this as an issue, or have tested and verified it.
I agree - as I said, clarification would be welcome. However, that discussion is under the auspices of Microsoft Technet (which is why I linked it) and the responses from those representing Microsoft technical support do nothing to clarify the issue beyond stock responses and pointing the finger anywhere but Microsoft. There were some interesting questions raised in the discussion but none were answered adequately. When I have looked at this in the past (it is an issue which crops up repeatedly) I have noticed that the only reporting of it is on forums so I have yet to find what you refer to as an official or trusted source. I can only conclude that most people use these GPU cards for gaming (or mining, these days) and few for IRay.
@ the VRAM numbers that matter for DAZ Studio Iray
When you are rendering with Nvidia Iray all that really matters are the "available VRAM" numbers that are indicated in the DAZ Studio log screens.
https://www.daz3d.com/forums/discussion/172866/quick-guide-finding-information-about-vram-directly-in-daz-studio
Just have a look at those numbers during render time on your own system.
2017-06-01 07:11:49.303 Iray INFO - module:category(IRAY:RENDER): 1.0 IRAY rend info : CUDA device 0 (GeForce GTX 1080 Ti): compute capability 6.1, 11 GiB total, 9.17244 GiB available"
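If you would rather not read the numbers off by eye, a line like the one above can be pulled apart with a short Python sketch. The regular expression is written against this one sample line only; other Daz Studio or Iray versions may word the line differently, so the pattern and the function name are assumptions for illustration.

```python
import re

# The sample line quoted in this thread.
LOG_LINE = (
    "2017-06-01 07:11:49.303 Iray INFO - module:category(IRAY:RENDER): 1.0 IRAY rend info : "
    "CUDA device 0 (GeForce GTX 1080 Ti): compute capability 6.1, "
    "11 GiB total, 9.17244 GiB available"
)

# Assumed layout: "CUDA device N (NAME): ..., X GiB total, Y GiB available"
PATTERN = re.compile(
    r"CUDA device (?P<index>\d+) \((?P<name>[^)]+)\): .*?"
    r"(?P<total>[\d.]+) GiB total, (?P<available>[\d.]+) GiB available"
)


def parse_vram(line):
    """Return device index, name, total, available and unavailable GiB, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    total = float(match.group("total"))
    available = float(match.group("available"))
    return {
        "index": int(match.group("index")),
        "name": match.group("name"),
        "total_gib": total,
        "available_gib": available,
        "unavailable_gib": total - available,
    }


info = parse_vram(LOG_LINE)
print(info["name"], round(info["unavailable_gib"], 3), "GiB not available")
```

For the sample line this works out to roughly 1.8 GiB reported as not available on an 11 GiB card.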
- - -
Is there any DAZ Studio and windows 10 user out there who can post a screenshot of this line where it reads
11 GiB total, 11 GiB available"
yes? no?
- - -
If you just have one GPU (video card) and your display is plugged into that card, you may have some VRAM used to run the display, and some VRAM may be used by other "system" software.
But if a second "rendering" GPU is plugged into a separate PCIe slot, the full amount of VRAM should be available for rendering when only the "rendering" GPU is used for rendering.
That was how it worked with Windows 7. With Windows 10 this changed.
A large amount of VRAM is now no longer "available" to the render engines, no matter whether it is a display or a rendering GPU.
That is why people are upset.
- - -
My impression is that it is a small number of people on the DAZ Studio forum who keep repeating that this issue is a myth.
So far I have not seen a single screenshot of the Iray DAZ Studio log file from people claiming that this issue does not exist that proves that on their system Iray can use all the 11 GiB of VRAM of a GTX 1080 Ti set up as separate rendering GPU.
If you can provide a screenshot of the DAZ Studio log file of your working system then please do so.
- - -
This isn't an "us vs. them" issue, it's a search for the truth. I don't think anyone is calling it a myth, just looking for some validation by the developers who are familiar with the internal workings.
Sounds like your assumption is that the "available" VRAM numbers mean that the "unavailable" VRAM is totally inaccessible by any applications. Are we absolutely certain that's true? From what I've read in other postings, that is not what it means, and in fact it can be grabbed by the application. So if that's the case, posting those numbers may not be relevant.
All I'm asking is for some references to folks from Microsoft or NVIDIA or some trusted tech publications that can validate the issue. I'm not saying it's a myth, just asking for some proof that it's really an issue. Like I say, I don't care what the answer is, I just want the facts.
Thanks
OctaneRender developers stated that all CUDA-related applications are affected.
OctaneRender and Iray both rely on CUDA.
https://developer.nvidia.com/cuda-zone
The facts that matter to DAZ Studio Iray users are the numbers the DAZ Studio Log files give us.
The facts that matter for Octane Render users are the numbers indicated by the internal render engine tools.
The official channels and threads I bookmarked have also not provided further information.
What we can observe is that since the Windows 10 fall 2017 update, the Windows Task Manager provides more information about multiple GPUs.
https://blogs.msdn.microsoft.com/directx/2017/07/21/gpus-in-the-task-manager/
My personal impression after reading through that blog entry is that "available VRAM" is part of a complex system.
However, what matters to users of render engines is the actual amount of VRAM available for use in CUDA applications like Iray and Octane.
- - -
edited - removed more detailed speculation about why there are no further official statements about this issue.
In short, my impression is that there may not be a simple fix to make Windows 10 behave the same way Windows 7 did.
Microsoft, Nvidia, Intel and AMD may need to work together to come up with a display driver model that distinguishes between GPU devices that have a display plugged in and additional GPU devices that are just used for rendering.
Okay, thanks. So is it your understanding that only the Octane Render developers have stated the issue, and no other tech developers? Do you have a link to the Octane Render statements? I'm not questioning their statements, just curious what they actually said. Like you said, this stuff is hugely complex, and what might seem like one thing could be something else entirely, and very difficult to test.
I'm not sure how you would even test the issue. Two absolutely identical machines (including CPU, hardware, drivers, etc.), with one running W7 and the other running W10, and both render the exact same scene (which also uses a lot of VRAM), but one crashes to CPU and the other renders on GPU?
Anyway, do you know of any other tech "authorities" who have mentioned it as a problem? I would think the NVIDIA folks and anyone else using CUDA would be all over it. As well as the tech journals, and industry professionals who rely on GPU rendering noticing the loss of VRAM. But I can't find anything official.
Thanks again for any input
We risk repeating exactly the same information that is already posted in the older threads.
The original threads are still available. However, it seems they are now part of the private area of the Otoy forum.
Still, I understand that people do not want to read through several pages of two-year-old threads that may require them to create an account on a third-party site.
December 2015:
"A few people have noticed that on Windows 10 a large chunk of device memory is unavailable. This occurs even on GPUs which are not connected to a screen.
We can confirm this affects any CUDA application on any type of GPU. GPUs with more VRAM will have a larger amount of unusable memory. On 6GB cards, a bit over 1GB is unusable. CUDA applications effectively are not able to use this memory.
At the moment we don't know any workaround yet. If you often render scenes which use most VRAM on your cards, you may need to delay upgrading to Windows 10."
Source:
https://render.otoy.com/forum/viewtopic.php?f=12&t=51992
- - -
Nvidia provided the following feedback in June 2016 to Otoy:
"It appears that in Win 10, with the Windows Display Driver Model v2, processes will be assigned budgets for how much memory they can keep resident. What we are noticing is that WDDMv2 started to impose a limit on total process allocation size. This is briefly mentioned here:
https://msdn.microsoft.com/en-us/library/windows/hardware/dn932169(v=vs.85).aspx "
Source: https://render.otoy.com/forum/viewtopic.php?f=12&t=51992&start=20#p279386
If you do not believe that with Windows 7 rendering GPUs were able to use the full amount of VRAM, then you have to install Windows 7 and test that.
But it really should be enough that you open up the DAZ Studio log file and check this line:
2017-06-01 07:11:49.303 Iray INFO - module:category(IRAY:RENDER): 1.0 IRAY rend info : CUDA device 0 (GeForce GTX 1080 Ti): compute capability 6.1, 11 GiB total, 9.17244 GiB available"
That the available VRAM on a dedicated rendering GPU is not equal to the total VRAM is all the "proof" you need to positively confirm that there is an issue.
- - -
Apologies...I didn't know that Otoy is the Octane Render site. And yeah, it's a private forum.
Anyway thanks for the info. Looks like the Otoy and NVIDIA statements are from 2-3 years ago, and NVIDIA was talking about WDDM v2, but I think WDDM is up to v2.4 now in the most recent Windows 10 April update (1803)? And I saw the MS article on "Process Residency Budget", but honestly it made my head explode. No clue what it's actually saying, or whether it still applies in recent updates.
I guess I still don't have a warm fuzzy feeling that I fully understand the issue and its status, especially since there doesn't seem to be an industry upheaval on something that should be a big deal, especially to render professionals. And a bunch of posters in the NVIDIA and other forums have been saying that it's absolutely not true.
Anyway, I guess we'll see what the future holds.
Thanks again for all the info
FWIW, there is a nice explanation of the new Windows 10 Task/Performance Manager, which now shows detailed GPU information, written by the Microsoft lead engineer responsible for the GPU scheduler and memory manager. And it says the following about "Dedicated VRAM":
"Dedicated memory represents memory that is exclusively reserved for use by the GPU and is managed by VidMm. On discrete GPUs this is your VRAM, the memory that sits on your graphics card."
On my machine, with an 11GB GTX-1080ti, after I load a scene it says "6.6/11.0GB" of Dedicated Memory. Which, according to him, means that there is the full 11.0 GB of dedicated memory available, and the particular scene is taking only 6.6 GB of that. BTW, that number matches what's shown in GPU-Z for Dedicated Memory Usage.
So unless I'm mistaken he seems to be saying that the entire VRAM is exclusively dedicated for use by the GPU. Unless VidMm does something goofy that he's failing to mention. Though I'd think they'd be VERY aware of how they programmed VidMm, especially on something important like this. And if they were aware that only, say, 8GB is left after W10 grabs its own chunk, it would say "6.6/8.0 GB" wouldn't it?
Anyway, here's the link to the blog post. Even if you don't believe my interpretation, the writeup is surprisingly well written if you want to understand the new Task/Performance Manager and what it all means.
https://blogs.msdn.microsoft.com/directx/2017/07/21/gpus-in-the-task-manager/
Thanks. I think that might be a step closer to an understanding. :)
I did a little more investigating, and merged a few scenes into DAZ Studio to see if I could get my VRAM Dedicated Memory usage up close to the limits of the VRAM in my 1070 and 1080ti. And the results are shown in the images below. The merged scenes used a whopping 35GB of my system RAM, but both GPUs were rendering full bore, no crashing to CPU. And my 8GB 1070 was using almost 7GB, and my 11GB 1080ti was using almost 10GB.
I'm just not seeing the issue. Supposedly W10 would take something like 2GB of the 1080ti VRAM? I suppose I could merge some more scenes to see if I can get over 10GB of VRAM usage in the 1080ti just to prove the point.
This seems to confirm my numbers from GPU-Z ... i.e. that in excess of 7 of the 8GB of VRAM on my 1070 was being used. So my original point was more about how accurate are the GPU-Z numbers (I notice that you use other utilities) and also how does the memory management work. It seems to me that as much of the VRAM as possible is used and that tends to push the numbers close to the limits but adding more to the scene doesn't have the expected effect of exceeding those limits and tripping over into CPU mode. So some form of memory management and dynamic compression algorithm seems to be in play here.
The images I posted were from the Windows 10 task/performance manager, which has been updated recently to show GPU info. And you would think that those would be the most reliable numbers, since the Windows 10 VidMm is the scheduler for all of this.
And it also seems to say that each GPU will be loaded up to near its capacity, and the lower VRAM unit won't limit the VRAM usage of the higher VRAM unit. I thought that was a big concern by some folks, that if you put a 1080ti with a 1070 for example, the VRAM usage of each GPU will be limited to the 8GB limit of the 1070. Or am I mis-remembering?
That has never been the case - each card on a system will be used if the scene fits and not otherwise, regardless of any other cards.
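The per-card rule described above can be written down as a tiny Python sketch. It only models the rule as stated in this thread (a card takes part if the scene fits in what that card has available, independent of the other cards); it is not how Iray is actually implemented, and the function name and the "available" figures are hypothetical.

```python
def devices_that_fit(scene_gib, available_gib_by_device):
    """Return the devices whose available VRAM can hold the whole scene."""
    return [
        name
        for name, available in available_gib_by_device.items()
        if scene_gib <= available
    ]


# Hypothetical available VRAM per card, in GiB (nominal sizes used for simplicity).
cards = {"GTX 970": 4.0, "GTX 1070": 8.0, "GTX 1080 Ti": 11.0}

print(devices_that_fit(3.5, cards))   # ['GTX 970', 'GTX 1070', 'GTX 1080 Ti']
print(devices_that_fit(7.0, cards))   # ['GTX 1070', 'GTX 1080 Ti']
print(devices_that_fit(12.0, cards))  # [] (nothing fits, so the render drops to CPU)
```

In this model the smaller card simply sits out when the scene is too big for it, and the larger cards carry on.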
That's more or less the way I understood it from the start. In fact I thought that the scene size would be limited by the smaller VRAM size which is why I never enable my 4GB GTX-970 (only use it for system/display purposes). My 1070 has 8GB and I didn't want that to be curtailed by the smaller 970.