Comments
That isn't an explanation of anything. Iray does not use all features of the card, and there is no reason to expect it to push the power draw to the maximum.
With RTX 3090, both Daz Studio versions easily hit 330W out of 350W limit (~94%).
With RTX 4090, both Daz Studio versions are hitting 285W out of 450W limit (~63%).
In both cases GPU utilization is 100%, but it is clear that the RTX 4090 has hit some performance bottleneck due to architectural differences, or it would be pulling a lot more power.
Look at it this way -- you can have 100% CPU utilization calculating pi with SuperPI, which uses legacy x87 floating-point instructions, or you can have 100% CPU utilization doing the same calculation on AVX-512 SIMD units. I will let you guess which one is going to use more power despite showing the same 100% CPU usage.
~31 iterations per second at 285W does not sound bad at all, and simply scaling that to a theoretical 100% of power consumption would end up at ~50 iterations per second. I assume it would not scale that well, and I guess we'll never find out, because only the new, slower Iray will be updated to take full advantage of the new architecture. (I actually did not know it is that much slower -- I simply kept using 4.16 because I am too lazy to switch my mesh lights.) Well, maybe they'll also fix the performance bottleneck then; otherwise it does feel like we got cheated out of a lot of potential performance.
The 22 iterations at 285W would be 36 iterations at 450W, pretty good as well, but considerably less than 50.
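For what it's worth, here is the naive power-scaling arithmetic from the last two posts as a small Python sketch. This is only the linear extrapolation being discussed -- real hardware won't scale iteration rate proportionally with power draw (clocks, voltage and memory bandwidth all behave non-linearly), so treat the outputs as optimistic upper bounds; the wattages and rates are just the figures quoted above.

# Naive linear extrapolation of Iray iteration rate with power draw.
# Assumes perfect scaling, which a real GPU will not achieve.
def scaled_rate(rate_now, watts_now, watts_target):
    return rate_now * watts_target / watts_now

# Figures quoted in the thread for an RTX 4090 with a 450 W power limit:
print(round(scaled_rate(31.0, 285, 450), 1))  # ~48.9 it/s for the 4.16 result
print(round(scaled_rate(22.0, 285, 450), 1))  # ~34.7 it/s for the 4.21 result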
Not exactly. Iray makes FULL use of a supported GPU's internal processing resources (including Tensor cores when AI denoising/upscaling is active) - hence why you see ~100% GPU activity during rendering. The reason for the markedly lower power draw than - say - 100% GPU activity during gaming is the lack of outside-of-GPU processing needed on the card's part to do its thing. Game rendering necessitates maintaining a constant stream of fresh data coming from the CPU over the PCIe bus that the GPU has to unpack and integrate into its processing pipeline. And that sort of constant high-bandwidth data transfer takes real watts to accomplish.
Once Iray finishes loading in scene data and starts the actual rendering process, the high-bandwidth work is done. Hence the lower watts observed, and why those watts will never be as high as seen in gaming, no matter how optimized Iray is for running on specific generations of Nvidia hardware. Unfortunately, those reduced wattage numbers aren't something that you will be able to turn into a rendering performance uplift with some hidden performance tweak later on.
Not all 100% usage is equal. It only takes one aspect of the GPU hitting its max to get a 100% report, and that report is not specific about what is actually being bottlenecked. There are many components on these GPUs; they are like their own mini computer within a computer (though they certainly are not so small anymore).
This card should not be running at only 285 watts in Iray. That stat alone is a sign of a bottleneck of some kind, as the 4090 should be drawing more than that when rendering.
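If anyone wants to see this for themselves, a quick way is to log utilization and power draw side by side while a render is running. A minimal Python sketch, assuming nvidia-smi is on your PATH (the query fields used are standard nvidia-smi options):

# Poll nvidia-smi once per second during a render: you will typically see
# utilization.gpu pinned at 100% while power.draw sits well below power.limit.
import subprocess
import time

FIELDS = "utilization.gpu,power.draw,power.limit,clocks.sm"

def sample():
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + FIELDS, "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

for _ in range(10):
    print(sample())  # e.g. "100 %, 285.31 W, 450.00 W, 2520 MHz"
    time.sleep(1)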
That goes against every other rendering software out there, and the performance indicates this is not quite right, either. The 4090 doubles the 3090 in most render applications, and sometimes does more than double. Puget ran their benchmark suite, and the 4090 manages this in every rendering test. They even tested with two 3090s in many of these tests.
https://www.pugetsystems.com/labs/articles/NVIDIA-GeForce-RTX-4090-24GB-Content-Creation-Review-2374/
Check out Vray, which has an RTX-enabled mode and a CUDA-only mode. The CUDA-only mode is interesting: even with only CUDA cores, the 4090 still outperforms two 3090s.
(It is wild to see the 2080ti at the bottom of these charts.)
This is a direct quote from the article:
Iray traditionally is right there with these render engines when it comes to generational performance gains. So everything is pointing to something being off.
It shouldn't be a surprise. The surprise is that Lovelace even works at all. Iray is using CUDA 11.2.2, but Nvidia's own documentation specifically states that CUDA 11.8 is REQUIRED to run CUDA on Lovelace. So that statement, combined with the benchmarks and data we have, all indicates that the 4090 should be running faster in Iray. Much faster. My guess is that Iray is giving the 4090 instructions not designed for it; the 4090 has new features previous cards do not have. Without instructions that use those features, you have transistors just sitting there doing nothing. The rest of the GPU can be working hard -- the parts it shares in common with Ampere's design, thus the 100% -- but the new parts are basically idle because nothing is telling them what to do.
It is like hiring a bunch of new people but not telling them what to do. The old staff are still working hard, but the new staff wander around doing nothing. When the boss finally gives them instructions, the team will begin to work faster as a group (hopefully). That is the 4090 right now with Daz Iray.
In other news, the 4080 12GB got "unlaunched" (Nvidia's words, not mine). It got cancelled. Surprise! So the 4080 16GB is now the only other Lovelace card launching that we know of. What this means for the rest of the product stack is anybody's guess at this point. This move is pretty unusual, to say the least. Then again, the 4080 12GB was pretty strange to begin with, and I am glad they called it off. They should never have named it that, but that is another discussion.
Where did you see them saying this, if I may ask? Because based on my own reading of official Nvidia documentation (the Cuda Best Practices Guide to be exact) the exact opposite is the case due in large part to the Binary Compatibility paradigm they are currently adhering to (which also doubles as the technical explanation for why Lovelace GPUs are already being successfully used with Iray versions that theoretically don't support them yet.)
Sort of. Unlike the jump from Pascal (10XX) to Turing (20XX) GPUs back in 2018, most - if not all - of the hardware upgrades seen in the Ada die design over its predecessor are concerned with better/more optimized management of existing physical processing pipelines (aka CUDA/RT/Tensor cores and the like) rather than the introduction of brand new ones. So rather than thinking of it as there being certain parts of the GPU die not being used, it's more that there are certain parts of the GPU die that can now be used simultaneously with other parts, or even for multiple purposes, that prior CUDA software versions simply don't know about.
Very well. I was wrong. I can admit to being wrong about something, unlike some.
I wouldn't be so sure that simultaneous instructions add over 100 watts to power draw. That seems excessive to me; there has to be more to it than that.
@RayDAnt, @outrider42
Ada Lovelace CUDA capability version has been bumped to 8.9.
It means that in order to fully utilize new capabilities, existing CUDA code (OptiX / Iray) will have to be recompiled using CUDA Toolkit 11.8 nvcc compiler.
Even though it is possible to have binary compatibility by compiling for lowest common denominator that will never work optimally and be able to fully utilize an architecture.
The same is with CPUs -- you can optimize for Core architecture, but that code will perform worse on Netburst and vice versa.
Ada Lovelace architecture is different enough to require specific targeting. CUDA executables can contain code for more than one architecture; you just need to specify which ones you want, and the compiler will generate different versions of the same kernels for different architectures, tailored to their resources (caches, shared memory size, number of SMs, block size, thread count, etc.).
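To make that concrete, here is roughly what it looks like on the build side. This is only an illustrative sketch (the .cu file name is made up), using standard nvcc -gencode options: one entry per target architecture, plus embedded PTX so future GPUs can still JIT-compile the kernels. sm_86 is Ampere (RTX 30xx) and sm_89 is Ada Lovelace (compute capability 8.9), which requires the CUDA 11.8 toolkit's nvcc.

# Build a "fat" binary containing kernels compiled for both Ampere and Ada,
# plus PTX for forward compatibility. The source file name is hypothetical.
import subprocess

subprocess.run(
    [
        "nvcc", "-O3",
        "-gencode", "arch=compute_86,code=sm_86",       # Ampere (RTX 30xx)
        "-gencode", "arch=compute_89,code=sm_89",       # Ada Lovelace (RTX 40xx)
        "-gencode", "arch=compute_89,code=compute_89",  # embed PTX as a fallback
        "-o", "render_kernels",
        "render_kernels.cu",
    ],
    check=True,
)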
What I expect to happen is a new OptiX release, followed by a new Iray and a new driver release. Only then will we see the true speed and power draw.
System/Motherboard: X570 AORUS PRO
CPU: AMD Ryzen 7 5800X @stock
GPU: RTX 2070 Super @stock
System Memory: Corsair VENGEANCE 32 GB @ 3600MHz
OS Drive: Samsung 980 PRO Gen.4, 500GB, NVMe
Asset Drive: Seagate Expansion Desktop 4TB
Power Supply: Corsair TX550M, 550W
Operating System: Windows 11 Pro 22H2
Nvidia Drivers Version: 522.25
Daz Studio Version: 4.20.0.17 64-bit
Benchmark Results
Iteration Rate: 1800 / 286.126 = 6.29 iterations per second
Loading Time: 289.25 - 286.126 = 3.124 seconds
I hope I did this the right way :)
So, I upgraded. It was just the right time, I was able to find a 4090 unlike when the 3090s came out, and I had the funds. The main thing I wanted was more VRAM - but obviously a speed increase would be nice too...
My original:
System Configuration
System/Motherboard: Gigabyte B550 Vision D
CPU: AMD Ryzen 9 3900XT stock
GPU: GIGABYTE GeForce RTX 2080 SUPER GAMING OC WHITE 8G / stock
System Memory: Corsair Vengeance RGB Pro 64GB DDR4 3200 stock
OS Drive: Samsung SSD 970 EVO 1TB
Asset Drive: Samsung SSD 970 EVO 1TB (identical, but separate from OS)
Power Supply: Corsair HX1200i, 1200 Watt Platinum
Operating System: Windows 10 Pro 21H2
Nvidia Drivers Version: Studio 517.40
Daz Studio Version: 4.21.0.5 Pro Edition 64-bit
Benchmark Results
Total Rendering Time: 5 minutes 56.8 seconds
rend info : CUDA device 0 (NVIDIA GeForce RTX 2080 SUPER): 1800 iterations, 8.921s init, 344.948s render
Iteration Rate: 5.22 iterations per second
Loading Time: 11.852 seconds
And after the upgrade - everything is the same except the GPU:
System Configuration
System/Motherboard: Gigabyte B550 Vision D
CPU: AMD Ryzen 9 3900XT stock
GPU: GIGABYTE GeForce RTX 4090 Windforce 24G / stock
System Memory: Corsair Vengeance RGB Pro 64GB DDR4 3200 stock
OS Drive: Samsung SSD 970 EVO 1TB
Asset Drive: Samsung SSD 970 EVO 1TB (identical, but separate from OS)
Power Supply: Corsair HX1200i, 1200 Watt Platinum
Operating System: Windows 10 Pro 21H2
Nvidia Drivers Version: Studio 522.25
Daz Studio Version: 4.21.0.5 Pro Edition 64-bit
Benchmark Results
Total Rendering Time: 1 minute 32.94 seconds
rend info : CUDA device 0 (NVIDIA GeForce RTX 4090): 1800 iterations, 8.432s init, 82.253s render
Iteration Rate: 21.88 iterations per second
Loading Time: 10.687 seconds
So, not sure if those are necessarily good numbers - but I'm pleased. 4-5x the speed and more VRAM for bigger scenes.
It should be noted that my 4090 was running at about 275W according to HWInfo64, so all the comments above re: non-optimal GPU usage would seem to be correct.
2 x 3080s are getting me 28 iterations per second; going to watch this space and see if the 4090 can blow past that once it is fully utilized.
Hi,
I would like to ask the people that got the RTX 4090: besides the dForce issue, do you have any other problems while using Daz Studio with the new card?
I got a very good deal on the 4090 in my country, so I'm tempted to buy one right now and sell the old card. I don't use dForce anyway.
Thank you very much.
I already posted my benchmark with Daz Studio 4.16.0.3, which gave me 31.79 iterations per second, so you don't have to wait for it to be fully utilized -- it is already faster, at least in 4.16.
There is some flickering on transparent surfaces (eyelashes, etc.) when zooming and panning in the Iray preview; other than that, nothing major, and dForce will be fixed eventually.
One interesting tidbit: while playing older games, like Mass Effect 3 (Legendary Edition), on the 4090 at 1080p, the GPU clock varies between 210 and 795 MHz and the RAM between 50-100 MHz, while the temperature is 40 degrees Celsius. It is ridiculous how little power the card needs to run the game. If nothing else, it can run simpler stuff much more efficiently and pretty much silently.
System Configuration
System/Motherboard: ASUS X99-Deluxe II
CPU: Intel Core i7-6950X @ 4.2 Ghz
GPU: 2 x NVIDIA Titan X, GP102 Pascal, 12GB/Stock, GeForce GTX SLI BRIDGE @ SLI
System Memory: 8 x 8GB DDR4-2666, Corsair Vengeance LPX black, Rev S @ 2666Mhz
OS Drive: 512GB Samsung 950 Pro, M.2 PCIe
Asset Drive: 1TB Samsung 850 Pro Series, SATA3
Power Supply: 1200W - Corsair Professional Series HX1200i, 80Plus Platinum
Operating System: Windows 10 Pro 22H2 19045.2130
Nvidia Drivers Version: 522.30
Daz Studio Version: 4.21.0.5 Pro
Benchmark Results
Total Rendering Time: 4:19 = 259 seconds
1800 Iray iterations
Rendering Performance: 6.94 [DEVICE_ITERATION_COUNT (sum of all values) / DEVICE_RENDER_TIME (largest value)] iterations per second
Loading Time: ? [(TRT_HOURS * 3600 + TRT_MINUTES * 60 + TRT_SECONDS) - DEVICE_RENDER_TIME (largest value)] seconds
Loading time is unclear to me. From open to a workable scene? 13 seconds.
The scene does not max out the hardware; it runs at about 70% (complex scenes run at 100%, e.g. 4.2 GHz).
System Configuration
System/Motherboard: ASUS X99-Deluxe II
CPU: Intel Core i7-6950X @ 4.2 Ghz
GPU: 2 x NVIDIA Titan X, GP102 Pascal, 12GB/Stock, GeForce GTX SLI BRIDGE @ SLI
System Memory: 8 x 8GB DDR4-2666, Corsair Vengeance LPX black, Rev S @ 2666Mhz
OS Drive: 512GB Samsung 950 Pro, M.2 PCIe
Asset Drive: 1TB Samsung 850 Pro Series, SATA3
Power Supply: 1200W - Corsair Professional Series HX1200i, 80Plus Platinum
Operating System: Windows 10 Pro 22H2 19045.2130
Nvidia Drivers Version: 522.30
Daz Studio Version: 4.21.0.5 Pro
Benchmark Results
Total Rendering Time: 4 minutes 20.93 seconds
2022-10-21 17:10:16.418 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : Device statistics:
2022-10-21 17:10:16.418 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 1 (NVIDIA TITAN X (Pascal)): 809 iterations, 1.613s init, 256.166s render
2022-10-21 17:10:16.419 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA TITAN X (Pascal)): 868 iterations, 1.453s init, 256.189s render
2022-10-21 17:10:16.419 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CPU: 123 iterations, 0.891s init, 256.669s render
Rendering Performance: 6.898 [DEVICE_ITERATION_COUNT (sum of all values) / DEVICE_RENDER_TIME (largest value)] iterations per second
Loading Time: ? [(TRT_HOURS * 3600 + TRT_MINUTES * 60 + TRT_SECONDS) - DEVICE_RENDER_TIME (largest value)] seconds
Loading time is unclear to me. From open to a workable scene? 13 seconds.
The scene does not max out the hardware; it runs at about 70% (complex scenes run at 100%, e.g. 4.2 GHz).
The log reports two times. The Total Rendering Time includes loading, so we don't want that. Then look for the line that reads 1800 iterations; on that line there is a "__.___s render" figure. That is your actual render time in seconds. Just subtract it from the total rendering time and you have the loading time.
@m43foto11 Here's how the math works out for you...
Your Total Rendering Time: 4min 20.93sec = 240sec + 20.93sec = 260.93s
Your 3 individual Actual Rendering Times:
CUDA1 256.166s
CUDA0 + 256.189s
CPU + 256.669s
TOTAL = 769.024s divided by 3 = 256.341s average
Total Rendering Time - Actual Rendering Time:
260.930s
- 256.341s
= 4.589s = Loading Time
Total Rendering Time - Loading Time:
260.930s
- 4.589s
= 256.341s = 4min 16.341sec
Fwiw (we're talking literal milliseconds here) it is only the longest actual rendering time that's relevant here (since that is what dictates when the rendering process completes as a whole), not an average, in multi GPU/CPU situations. So in this case the math would be:
Total Rendering Time - Longest Actual Rendering Time = Loading Time
or
260.93 seconds - 256.669 seconds = 4.261 seconds
so
Loading Time: 4.261 seconds
Also fyi @m43foto11 your rendering performance number should be 7.01 iterations per second, since:
Rendering Performance: 1800 / 256.669 = 7.01
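Putting the whole recipe from the last few posts in one place, here is a minimal Python sketch of the benchmark math (the per-device render times and the total are the numbers from the post being discussed):

# Iteration rate and loading time per this thread's formula: both are based on
# the LONGEST per-device render time in the Iray log, not the average.
def benchmark_stats(total_seconds, device_render_times, iterations=1800):
    longest = max(device_render_times)
    return {
        "iterations_per_second": round(iterations / longest, 2),
        "loading_time": round(total_seconds - longest, 3),
    }

# 4 min 20.93 s total; render times for CUDA device 1, CUDA device 0 and the CPU
print(benchmark_stats(260.93, [256.166, 256.189, 256.669]))
# -> {'iterations_per_second': 7.01, 'loading_time': 4.261}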
@RayDAnt Thanks for pointing that out, as I forgot it's just the longest actual rendering time, and not the average.
@m43foto11 If you don't use your CPU for rendering and only use your 2 x nVidia Titan X GPUs, then you could bump up your Rendering Performance to at least 7.0261 iterations per second (1800 / 256.189s).
In my country, the price for the RTX 4090 model that I purchased 10 days ago for 2,338 USD has already gone up to between 2,524 USD and 2,544 USD.
If any of you plan on upgrading, do it soon, because it seems they are in short supply and the prices don't seem like they will be going down soon.
I'm feeling more and more irritated by the slowdown in rendering performance with each new version of DS.
Back when this thread first started, I put together an Excel sheet to calculate iterations based on the formula given by RayDAnt. A while later, I started keeping the test run data in there, which means I can go back and compare. Today, I ran a few more benchmarks on the nVidia 517.48 drivers.
System Configuration
System/Motherboard: ASUS X99-S
CPU: Intel i7 5960X @3GHz
GPU: Zotac 3090 RTX + EVGA 2080Ti
System Memory: 32GB KINGSTON HYPER-X PREDATOR QUAD-DDR4
OS Drive: Samsung M.2 SSD 960 EVO 250GB
Operating System: Windows 10 Pro
So with both cards, from 4.15 to 4.20 I have lost 3 iterations/second.
From 4.20 to 4.21 I have lost a further 4 iterations/second.
That's about a 20-second increase on the benchmark from 4.15 to 4.21.
What worries me is that this is unlikely to change. If someone upgrades from, say, a 3090 to a 4090 expecting a 2x increase in iterations/sec but gets a 1.8x increase, they are not going to complain, because it's still faster and you can't completely predict the exact increase. But it could be faster still if Iray or DS weren't slowing down with each version. I am looking forward to seeing what the 4090 cards can do once DS/Iray is updated specifically for them, but it's not looking good for people who can't afford to upgrade.
I've tried Genesis 9 out on DS 4.15 and it seems ok. I'll have to do some quality comparisons between 4.15 and 4.20 to see what difference there is, but for most rendering I'll be sticking with 4.15 as long as possible.
System Configuration
System/Motherboard: TUF GAMING X670E-PLUS
CPU: AMD 7900x (stock)
GPU: GeForce RTX 4090 GAMING OC 24G (stock)
System Memory: Corsair VENGEANCE 64GB (2x32GB) DDR5 DRAM 5200MHz
OS Drive: Seagate FireCuda 530 4TB Internal SSD NVMe
Asset Drive: Using the same drive as above.
Power Supply: GIGABYTE UD1000GM PG5 (1000 watt PSU)
Operating System: Windows 10 PRO version 21H2 build 19044.2130
Nvidia Drivers Version: 522.25 OCT/12/2022
Daz Studio Version: DAZ Studio 4.21 (Win 64-bit)
Benchmark Results
DAZ_STATS
IRAY_STATS
Iteration Rate: 21.719 iterations per second
Loading Time: 3.169s
Seems like my results are very similar to what everyone else is getting.
Yes, ~10 iterations per second less than with Daz Studio 4.16.0.3 Release.
What is the best version of Daz Studio for rendering speed? I have the previous versions of Daz Studio, but I don't know which to install. Is version 4.16.0.3 better than 4.15? Does the rendering speed start to slow down between 4.16 and 4.20?
Thank you.
It depends on what your rendering goals are. Iray (separately from Daz Studio) is constantly being updated to improve what it is able to accomplish in terms of visual realism/complexity on a per-iteration basis, almost always at the expense of slower per-iteration rendering speeds. If you want the most realistic-looking renders (taking advantage of the latest tech advancements therein) you're gonna want the most recent version of Iray regardless of iteration rates. This benchmark/thread is only useful for gauging rendering performance between different GPUs running the same version of Iray. Not the same GPUs running different versions of Iray.
Before 4.20 most of the render times were really about the same with a few exceptions. Each version might add something new, so having a recent version can be helpful.
4.16 is the last version before they jumped to 4.20. So I would use that one.
You can also use the beta and directly compare them. So if you have 4.16 on the main branch, you can still have 4.21 in beta form. I think this is the way to go if possible. The betas update more frequently, so you can keep the beta branch up to date for all the new stuff that comes along, while still keeping 4.16 for its pure rendering speed.
As for the quality of the render...I just don't see a difference in most pics. The only time I see any difference is when ghost lights are involved, and that is a whole thread in itself. If there are any changes to Iray otherwise, they are so subtle most people are not going to spot them without a magnifying glass. And to be perfectly honest, I thought my renders in 4.16 looked better, too.
But you don't have to take any one's word for it. Since you have 4.16, you have the ability to directly make this comparison for yourself and judge it yourself. You can render the same scene in 4.16 and then 4.21 and see how they compare in both time and quality.
Same here. Already planning out the specs for the system I will build next year. The CPU isn't as important to me, so I will probably get a current-gen i5 since I am currently running an i5-6600K, BUT I do intend to get the 4090 as well as a beefy Seasonic 1600W PSU to handle pairing it with my current 3090 and NVLink. The Outervision PSU calculator comes out with 1219 W recommended, so a 1600W Seasonic Platinum should do the trick. The room I run my PC in has a circuit that can handle a total of 2300W, so I should be able to get by.