Dual Video Cards... NVLink or not?
Hi all! I'm sorry, I feel like I have been so needy lately, but this place is the best place to ask questions about this because you are all the most expertey experts on DAZ around... so here goes...
I recently picked up another Quadro card, and now that I'm looking at installing it, I'm debating whether to connect the two with an NVLink bridge. I can easily Google what the benefits and drawbacks are, but I wanted to ask here because I am specifically trying to get a better render experience in DAZ Studio.
I don't really need the pooled memory benefit of linking the two (each card has 48 GB, which has been fine), so from a processing perspective, is there an obvious benefit to linking these two that I'd miss out on if I just installed them independently?
Also, I have read the threads where you all talked about the benefits of SLI vs not, so if this is no different I can just go back and refer to those instead of wasting your time. I just don't know what the best path is here.
Comments
As far as I know, the benefit of linking, for Iray, is expanding the amount of memory available for materials. There is actually a hit on processing speed. By the sound of it, you would not benefit from adding an NVLink bridge.
OK perfect! I saw this was the case with SLI, and had assumed it would be similar with NVLink, but I'm glad I asked because now I feel better about it. I do appreciate it! Thanks!
...but wouldn't the extra set of cores boost the speed, or is that only if the cards are unlinked?
Do you not have enough RAM for most of your scenes?
If the answer is that you don't have enough, then NVLink might be one answer.
If you rarely run out of RAM, why waste the cash?
If NVLink is not active, then each card will contribute to the render as long as the scene fits into its memory; enabling NVLink will pool memory for materials but will lower performance. Of course it is possible to switch NVLink off, but if it's never or rarely going to be used, there is little point in spending the money.
Your main drawback in having two cards with 48 GB VRAM each is that you're going to need to keep an eye on your VRAM usage, depending on how big your scenes are. Once you start using around 67% of the total VRAM on each of those cards during a render, there's a chance you might exceed even 256 GB of system RAM, which is the equivalent of a max-specced TRX40 or X299 system. One of the reasons I opted not to get two of those cards is that the system needed to fully utilize 96 GB of VRAM for Iray rendering would probably cost as much as (or even more than) both Quadro cards. 512 GB of RAM is what you would need: 8 × 64 GB sticks (probably ECC), which are expensive by themselves, plus the motherboard and CPU to go along with them really start to add up. The other reason, as you were inquiring about: NVLink wouldn't really be worth it unless you absolutely needed the extra memory.
...like rendering a large-scale city street scene with, say, 40 different individual G3s, all with different hair, fully clothed in different styles, a number of vehicles, various props, as well as dozens of emissive lights and atmospheric effects for depth.
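(For anyone wondering where that 67% figure comes from, here's a rough back-of-the-envelope sketch in Python. The 4:1 host-RAM-to-VRAM ratio is an assumption, roughly in line with the numbers reported further down this thread; real scenes vary a lot.)

```python
# Back-of-the-envelope: at what VRAM load does the implied system RAM
# exceed what the machine can hold? The 4:1 RAM-to-VRAM ratio is an
# assumption; ratios reported in this thread range from about 3.8x to 5x.

TOTAL_VRAM_GB = 96        # two 48 GB Quadros
SYSTEM_RAM_GB = 256       # max for a typical TRX40/X299 build
RAM_TO_VRAM_RATIO = 4.0   # assumed; varies per scene

vram_ceiling_gb = SYSTEM_RAM_GB / RAM_TO_VRAM_RATIO
print(f"VRAM ceiling: {vram_ceiling_gb:.0f} GB "
      f"(~{vram_ceiling_gb / TOTAL_VRAM_GB:.0%} of total VRAM)")
# -> 64 GB, i.e. about 67% of the 96 GB total
```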
lol. Does anyone even render heavy scenes like that in Iray? I know some do big scenes with a LOT of optimizing and such, but rendering some nonsense like that with caustics, etc., would probably take weeks.
Oh, this is an interesting point, thank you! I only have 128 GB of system RAM, and I didn't realize that the system RAM requirements would scale beyond the VRAM consumption. I suppose it makes a lot of sense when I think about it. I've only ever gone above 48 GB a few times, and it's only ever because I use lots of very high-density custom items I literally model out in ZBrush and pump right into DAZ with zero thought of optimization or how resource-heavy they would be. The only reason I got the card in the first place was so that I could be super lazy and just have cluttered scenes, the tradeoff being that I can work very quickly and not worry about organization or anything like that.
I agree with the other comments as well... if you are just using DAZ the way it's intended, or are even semi-responsible with how you manage your resources, then 48 GB will never be anything other than overkill. LOL.
...I've come close to exceeding hardware limits before.
When "optimising" a large scene involves far more effort compared to setting it up, (positioning/posing the characters ,placing the props, etc.) and far longer than the render process, more VRAM is better particularly when rendering in very large format for creating high quality art prints. True, I probably don't need 96 GB but 48 could come in handy. currently, even with the markup, two 3090s are still less than a single A6000 ATM (the latter which are going for on average around 1,500$ above the MSRP of 4,650$).
I used to paint on canvas, sometimes fairly large canvases, until serious bone and joint arthritis made it pretty much impossible without downing Advil™ like candy (which has other implications). This is why 3D has become my new artistic medium.
I use SLI on 1080s and NVLink on RTX 2080s. I render huge scenes at 6,000 to 10,000 pixels and never wait overnight for a render.
Using NVLink will actually cause a small loss of performance. So unless you really need the VRAM, you do not need NVLink. In general, NVLink purely exists to expand your VRAM pool.
How much of a performance hit is hard to say, because we just do not have anybody with the equipment to test this. If by chance you ever do get an NVLink bridge, it would be totally rad if you ran a performance comparison and posted the results for us. However, we do have test results in other render engines that may give us an idea. This chart comes from V-Ray.
In the larger scene, the render time without NVLink was 328 seconds; with NVLink it took 348 seconds. So, 20 seconds longer in a scene that took 328, or about a 6% slowdown. That is not a big difference, but it may be noticeable with larger and longer renders.
Also, as was said earlier, you will not be able to make use of all of that VRAM without having more RAM. Like before, it is hard to give exact numbers because of the way the VRAM pooling works. You do not get 100% more VRAM; some data is still duplicated across the cards. Currently, geometry is duplicated. So if your scene uses a ton of geometry, the benefit of NVLink is decreased. But if you use a lot of texture data, then yes, you may get more benefit. All of this is predicated on how much RAM you have. With 128GB, I think you will likely run out of RAM before hitting the 48GB of VRAM on your cards.
It has already been shown that a user ran out of 64GB of RAM with a 24GB 3090. In fact, they hit the 64GB barrier with only 17GB of VRAM in use. In other words, they were unable to use all of their 3090's VRAM, at least with the scene they were creating. Just going by that video, if you were to duplicate their scene, you would easily run out of your 128GB of RAM before getting even close to the 48GB of VRAM. You might hit about 35GB.
Again, as I said, exactly how much RAM you use can vary wildly. Since geometry is duplicated rather than pooled, it would be quite logical to simply use lower subdivision on the models in use. With over a dozen figures in the picture, I would assume you are not close enough to see how much SubD each of them has. That goes for many of the props as well.
So basically, everything points to no: do not use NVLink with your current system. You would likely want to have 256GB of RAM, as was suggested earlier, AND you would need to be building scenes larger than 48GB of VRAM. Only if you do both of these will adding NVLink make sense.
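To make that arithmetic concrete, here is a toy sketch of the pooling math under the assumptions already stated: geometry is duplicated on both cards, only the remainder pools, and host RAM scales with VRAM at roughly the 3.75:1 ratio implied by the 64GB-RAM/17GB-VRAM report. All numbers are illustrative, not measurements.

```python
# Toy model of NVLink memory pooling for Iray. Assumes texture data
# pools across the two cards while geometry is duplicated on each,
# as described above. Purely illustrative.

def effective_pool_gb(per_card_vram_gb, geometry_gb):
    """Usable scene budget for two NVLinked cards: geometry lives on
    BOTH cards, so only the remainder of each card pools."""
    poolable_per_card = per_card_vram_gb - geometry_gb
    return geometry_gb + 2 * poolable_per_card

def ram_needed_gb(vram_used_gb, ratio=3.75):
    """Host RAM implied by a VRAM load; 3.75x is an assumption taken
    from the 64 GB RAM / 17 GB VRAM report above."""
    return vram_used_gb * ratio

pool = effective_pool_gb(per_card_vram_gb=48, geometry_gb=20)
print(f"Effective pooled budget: {pool} GB")                        # 76 GB, not 96
print(f"Host RAM needed to fill it: {ram_needed_gb(pool):.0f} GB")  # ~285 GB
```

In other words, even in this generous case you would blow well past 128GB, and even past 256GB, of system RAM before the pooled VRAM filled up.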
Yeah, it's $6,000+. I'm not buying the PNY version, which is around $5,000-$5,500. I'm not sure which one my builder is going with, but he recommended I stay away from PNY. It's probably going to be a Lenovo, depending on what's available.
I was thinking about trying out some of Bob Ross's stuff a while back, but I just don't have the necessary space for it. Does Photoshop have a way to simulate a wet-on-wet painting effect? I liked the paintings he did when he would go over part of or all of the canvas with black (Liquid Black or Midnight Black, I forget) before painting anything else. Those turned out really nice. It's amazing what he could paint using only half a dozen blotches of different colors and just mixing them. It's too bad his son didn't pick up the torch; I think he had a falling out with the executives of his father's company.
Rendering four G8 figures (SubD 2) with lightweight clothing, hair, and architecture resulted in 6.3 GB of VRAM and 32.5 GB of RAM being used.
Thank you for all of the awesome information! I think this, here, is the biggest thing. I was limited by my motherboard... (honestly, I should have gone with more, but I was hitting other money constraints at the time). So I'm stuck with 128 for now. I'm really just trying to decrease the render time.
I'd be happy to run a test with the two cards, but since the NVLink adapter is like 250 bucks or something crazy, I'll probably just skip out on that for now.
I DO use super high-density meshes, but I think the biggest issue is the textures. I typically crank out 8K UVs and textures for each of my props, and I maybe have 10-15 per scene, but I could obviously always scale that down. I just haven't yet, because if I crank it out at super high res, then no matter where it is, I never have a noticeable loss of detail. It also affords me a certain amount of flexibility when it comes to poor texture and detail work, since I can usually just let SP do the whole thing procedurally and not worry too much about the outcome. Still, even with all of those textures, I don't often fall back to CPU. If I do, I know there's something I can do on my side to optimize.
I really do appreciate all of the details, and the time. My mind has been completely put at ease with regard to the best thing to do... it just feels strange to have these two things and not link them, but I totally get it. Thanks again!!
...so that's almost a factor of 5 (32.5 GB / 6.3 GB ≈ 5.2). I'm using G3, so I would love to see a test using four G3s at the same SubD (I only have 24 GB of system memory, so I can't do it myself).
Looking at the graphs, it's not a major falloff in performance between linked and non-linked cards. But yeah, system memory is the real issue. W7 Pro only supports up to 192 GB, so that would mean cutting it tight on heavier scenes, as 4 × 48 = 192. Too bad they couldn't make Iray work like the old Reality/Lux setup, where once the scene information was passed to the render engine, the DAZ program and the scene file could be shut down to save system resources during the render.
G3 and G8 are similar enough that I would say the results would be very similar. The difference would be in how the characters are set up. Most G3 characters predate several newer Iray surface settings and only have a few textures per surface. Now you have characters that may use numerous textures and surfaces, plus chromatic SSS. A number of G8 characters also offer 8K textures on various maps. All of this adds up to additional texture data. So in this regard, older G3 characters might actually use less RAM, but again, this is more a result of using less texture data than a true difference between 3 and 8. The same goes for clothing and hair; some new hair and clothing devour memory.
This is what adds to memory more than anything else. When I tested 4.11 versus 4.14 a while back, the memory difference was only around 50MB or so in my simple scenes. I didn't test anything very complicated, but I believe the real driver of the extra memory is coming simply from the assets themselves.
I suppose I could do some test by using the same skin on G3 and G8 and see how it goes. I can certainly say that I routinely go past 24GB. My previous build had 32GB, and I blew past 24GB quite often. I am very glad I have 64GB now. If I ever get my hands on a 3090, I know I can at least use around 17GB of it until I build again. I think my PC can do 128 but I would have to ditch all my current RAM sticks to do that. I would prefer to wait for DDR5 to start to become common.
On that note, I think that DDR5 will be a big boost to CPU rendering. I could be wrong about that, but I think it will help CPU rendering a lot. No, it will never approach any half-decent GPU, but with faster CPUs and DDR5, I think CPU rendering may become more bearable. However, GPUs are looking at making a strong leap next generation, so it is all relative.
Thank you for this thread!
I'll just have to plan for 256 GB memory in my next workstation.
And what's the max available in any laptops these days? I would imagine that it would be no more than 128 GB.
...supposedly one of the recent versions of Iray improved CPU rendering times...
Take, say, a Threadripper 3990X with 256 GB of DDR4-4400 memory; yeah, that would probably be pretty quick. It would be fun to watch a Carrara render on one. It would probably be like a psychedelic light show.
...now where did I put that lotto ticket?
You would have to be building scenes far beyond what VRAM is available for those Threadrippers to make sense. We actually have a Threadripper 3970X on the benchmark. It is roughly as fast as a GTX 1080. At first this sounds awesome, but the 1080 released in 2016. Every single RTX card that exists renders Iray faster than the 1080. Yes, every single one, including the slowest RTX, the 2060, and it is not even that close. We do not have 3050 or 3050 Ti results yet, but I am confident that even they will be faster than the 1080 at rendering Iray.
The 3990X may have double the cores, but it likely will not double the performance of the 3970X. Even if it does, we are looking at around 5 iterations per second in the best-case scenario in our little Iray bench. This is better, for sure; it would push the 3990X past the RTX 2060 and 2070 on the charts. Again, this is purely hypothetical; we have no Iray benchmarks for a 3990X yet.
But the 3990X also costs $5,000, and the motherboards these things use, plus the ECC RAM, add a lot to the cost. You can buy the Nvidia A6000 for $5,000 as well and render significantly faster. It is important to note that the size of the scenes you can build with these setups will actually be similar! Iray compresses data as it loads it onto the GPU, which is why the amounts of RAM and VRAM are so drastically different in the first place. In other words, if you are building a scene that uses 128GB on a 3990X, that scene would fit in an A6000's 48GB of VRAM, making the 3990X redundant for this task. The A6000 will easily be 5 times faster. So you would need to be building scenes that exceed 128GB and then some to push past what 48GB of VRAM can handle. The A6000 can possibly even handle a scene that fills 256GB of RAM, depending on how the scene is created, and if you somehow need more VRAM, you can NVLink two A6000s to get more than 48GB. With NVLink, two A6000s would never run out of VRAM on a 256GB RAM system.
So think about that for a moment. There are a number of people excited about the Threadrippers because they might bring GPU-like capacity to massive renders that GPUs could not handle because of VRAM. But in truth it is more complicated than that because of how Iray compresses data. We have seen it compress data sometimes by a factor of 5 to 1, which honestly is kind of amazing. Because of that compression, super expensive CPUs might not be as attractive as once thought.
I believe we have reached a point where VRAM capacity can actually rival the monster CPU-only builds in memory capacity; it is just that those GPUs are at the very high end right now. But I bring up CPU rendering because some people are limited to it. If DDR5 does bring big gains, that could make for some interesting possibilities, like making CPU+GPU more viable. As many of us know, using the GPU plus the CPU often leads to very small gains over GPU only, even when the CPU is pretty strong. It is possible that DDR5 might help in this area, allowing the CPU and GPU to swap data faster and thus render faster. But we will just have to see. I could be totally wrong.
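Here is the compression argument as a tiny sketch. The 5:1 factor is the best case mentioned above, not a guarantee; actual compression depends entirely on the scene.

```python
# Illustrative only: does a CPU-sized scene fit on a GPU after Iray's
# data compression? 5:1 is the best-case factor mentioned above.

def fits_on_gpu(host_scene_gb, vram_gb, compression=5.0):
    compressed_gb = host_scene_gb / compression
    return compressed_gb, compressed_gb <= vram_gb

for scene_gb in (64, 128, 256):
    size, fits = fits_on_gpu(scene_gb, vram_gb=48)
    print(f"{scene_gb:>3} GB scene -> ~{size:.0f} GB on GPU, "
          f"fits in 48 GB: {fits}")
# 64 -> ~13 GB, fits; 128 -> ~26 GB, fits; 256 -> ~51 GB, does not
```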
I have a 3970X, and when renders fall back to CPU, it doesn't feel significantly faster than my previous i7.
...just put that up as an extreme case for CPU rendering. I'd never actually build a rig like that if I had the resources (well, save for the 256 GB of system memory to support an A6000 or, more likely, dual NVLinked 3090s once prices fall back to earth). More likely I'd look at a 12C/24T 2920X with 128 GB of system memory that would support a single 3090 whenever the latter becomes "affordable" again, as I do other things than just render in Iray.
While I know some have been giving last rites to Carrara, it still is a fairly decent piece of software and, as I mentioned, would benefit greatly from a fast, high-core-count/high-memory setup.
There are also still people working with 3DL using Wowie's AweShader script and turning out some pretty incredible, lifelike images. That would also be a good reason to build a system around a high-core-count CPU with a decent amount of memory. Crikey, with Parris's IBL Master and AoA's Advanced lights, I got a fairly heavy 3DL scene (reflections, transparency, transmaps) at 1,500 × 1,125 with full GI to render at fairly close to realistic quality in less time than it would have taken either with UE or in Iray, even with my Titan X (and that was still on a four-core, first-generation i7 with only 12 GB of system memory).
Consider what scenes you are falling back on, and what GPU you have. I am guessing that when you had an i7, you probably had a smaller-capacity GPU as well. If so, then with your current build, when you drop to CPU you are dropping much larger scenes than you did back when you had that i7. A larger scene is going to take longer. Complex shaders will take longer, too. I would bet that you have scenes with a lot more complex shaders in place than you did previously. DAZ products, shaders, and textures have all slowly become more complex over time, and things like chromatic SSS take far longer to render than mono SSS. Just turning on chromatic can cause a large hit to rendering speed.
An easy test would be to render an old scene that you created in your i7 days on the new CPU.
Another thing to consider is that CPU boosting behavior can impact a speed test. Intel CPUs can have boost periods of around 30 seconds, but on some motherboards this time limit can actually be disabled, allowing the Intel chip to boost indefinitely. In fact, some board makers disable the time limits by default to gain a performance advantage.
A 3970X is a very beefy CPU, and thus needs excellent cooling. It is also possible the chip was throttling, or something was throttling it. There are all kinds of things that can be going on here. But without some real numbers, how fast a render feels can be the result of many different circumstances.
Hello everyone.
First, sorry for my English; I am using a translator. :D
Second, apologies for bringing up a topic that has been around for a while.
The fact is that I got an Nvidia RTX 4090 in the hope of reaching commercial render times.
But I can't render with the GPU.
I am attaching an image of the system monitor.
Am I doing something wrong or are there hardware limitations?
Because of the memory issue, I had thought about getting an Nvidia A6000, or two 3090s with NVLink, but I'm afraid of spending the money and then having it not work.
Thank you for reading.
Monitoring is better with GPU-Z.
But first check in Render Settings > Advanced that your GPU is listed and ticked.
Then, when doing a render, look in your log: Help > Troubleshooting > View Log File. It will mention if the render falls back to CPU and, after the render is finished, which devices contributed.
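If you'd rather not eyeball the whole log, a small script can scan it for suspicious lines. This is just a sketch: the path below is the default log location on a standard Windows install, and the keyword list is a guess at phrases worth flagging, so adjust both to match your setup and what your own log actually contains.

```python
# Scan the DAZ Studio log for lines that hint at CPU fallback or
# memory trouble. LOG_PATH is the default on Windows installs;
# KEYWORDS are assumed phrases -- tune them to your own log.
import os

LOG_PATH = os.path.expandvars(r"%AppData%\DAZ 3D\Studio4\log.txt")
KEYWORDS = ("fall back", "falling back", "fallback",
            "out of memory", "device failed")

with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line_no, line in enumerate(log, 1):
        if any(k in line.lower() for k in KEYWORDS):
            print(f"{line_no}: {line.rstrip()}")
```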
Yes, use GPU-Z (https://www.techpowerup.com/download/techpowerup-gpu-z/) to monitor GPU load. Never use Task Manager to monitor card performance, especially when you have multiple cards; you won't get correct info there.
Then, if you find no GPU load with GPU-Z, check your DS log to see if you ran out of VRAM...
And if you want more cards, going for an A6000 will be better than two 3090s, even with NVLink. You'll get a real boost.
Thank you very much for the information.
With the GPU-Z, I do see that the card is working.
One last question: it seems to me that DAZ can use more than one Nvidia card on the same board. In other words, if I add an A6000, will DAZ use both the 4090 and the A6000, so the memory would be unified and the render times dramatically reduced?
DS can use multiple cards, but if they're different cards memory is not pooled, only computation power. Memory is only pooled if you're using identical cards with Nvlink, and even then it's only for materials.
If you have a 3090 with 24 GB of VRAM and an A6000 with 48 GB, you don't have 72 GB of VRAM total to use; you have 48 GB max, and the cards will be used depending on whether your scene fits in their VRAM: if the scene fits in 24 GB, both cards render it; if it only fits in 48 GB, only the A6000 renders it; if it fits in neither, the render falls back to CPU.
Renders will indeed be much faster than with only the 3090.
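A minimal sketch of that rule, purely for illustration (this is not how Iray is actually queried; the cards and sizes are the hypothetical ones from above):

```python
# Which devices contribute when cards are NOT pooled: each GPU joins
# only if the whole scene fits in its own VRAM; if none fit, the
# render falls back to CPU. Purely illustrative.

def contributing_devices(scene_gb, cards):
    devices = [name for name, vram_gb in cards if scene_gb <= vram_gb]
    return devices or ["CPU (fallback)"]

cards = [("RTX 3090", 24), ("A6000", 48)]
for scene_gb in (20, 30, 60):
    print(f"{scene_gb} GB scene -> {contributing_devices(scene_gb, cards)}")
# 20 GB -> both cards; 30 GB -> A6000 only; 60 GB -> CPU fallback
```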
Thank you very much, you have clarified my doubts.
I understand that the biggest benefit of having multiple cards lies mainly in the increase in rendering speed, and NVlink is not necessary.
So, if I add another 4090 (which does not implement NVlink), would it increase the rendering speed?
Tell me if I'm wrong.
As far as I understand, NVLink is no longer an option if one uses the latest versions of DS with relatively recent drivers; something about Nvidia disabling the option for non-pro cards.
The 'consumer' cards that supported NVLink were the 3090/3090 Ti, 2080/2080 Ti, and 2070 Super.