Licensing Agreement | Terms of Service | Privacy Policy | EULA
© 2024 Daz Productions Inc. All Rights Reserved.
Comments
Thank you for posting a 4060ti benchmark. We finally have a number now.
Intriguing. My 3060 ran the bench in 272.67 seconds, which is 6.6 iterations per second. We have 3060ti numbers, but they are outdated: the 3060ti was breaking 10.6 iterations with DS 4.14. Before somebody freaks out, remember that the version of Iray in DS 4.14 ran this benchmark faster. It would be nice to get a new benchmark of a 3060ti with DS 4.21, or the 3070 for that matter.
Still, the 4060ti is faster than a 3060: not drastically, but noticeably. 22 seconds in this bench could become a much larger gap in other scenes. The benchmark is not absolute; there is variance from one scene to another because of their makeup. One scene might be heavy on geometry but light on shaders, while another is the opposite. Geometry favors the RT cores; shading favors the CUDA cores and raw compute power. At any rate, we have always said the main selling point of the 4060ti 16GB is the VRAM. If VRAM is not an issue for you, then you have other options: go cheaper with the 3060 12GB, or bump up to the 4070 12GB for more speed.
Thank you for posting!
I'm glad I chose the RTX 4070, since the VRAM is not that important in my case.
I'm gonna post the results of my current 2070 Super, to compare it with 4070 in a few days.
Since it is hard to compile all these numbers, I scrounged around and gathered some numbers for GPUs that might be of interest now. Most of these are from the recent DS 4.21. Please remember these numbers are not absolute and can vary between scenes and versions of DS. This is just a general idea of what to expect.
Iteration rates (iterations per second)
3060 6.6
4060ti 7.23
4070 12.71
4070ti 14.175
4080 16.5 to 19.6
3090 16.7
4090 28.5
A5000 14.36
Titan RTX (Turing) 8
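If you want to turn the list above into head-to-head ratios, here is a minimal sketch. The rates are copied from the list; using the midpoint of the 4080's 16.5 to 19.6 range is my own assumption:

```python
# Iteration rates (it/s) from the list above; the 4080 uses the midpoint of its range.
rates = {
    "3060": 6.6, "4060ti": 7.23, "4070": 12.71, "4070ti": 14.175,
    "4080": 18.05, "3090": 16.7, "4090": 28.5,
    "A5000": 14.36, "Titan RTX": 8.0,
}

def speedup(a, b):
    """How many times faster card `a` renders than card `b`."""
    return rates[a] / rates[b]

print(f"4090 vs 4060ti: {speedup('4090', '4060ti'):.2f}x")  # -> 3.94x
print(f"4090 vs 4070ti: {speedup('4090', '4070ti'):.2f}x")  # -> 2.01x
print(f"3090 vs 3060:   {speedup('3090', '3060'):.2f}x")    # -> 2.53x
```

This is where the "four times faster" and "2.5 times" figures below come from (3.94x and 2.53x, to be exact).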
Putting all these in one spot really shows how far apart these GPUs can be in the Lovelace line. It is frankly surprising just how wide the gaps are.
The 4060ti is really close to a Titan RTX, which is faster than a 2080. On its own that sounds pretty cool. Until you see the rest of the numbers.
The 4060ti is somehow roughly half as fast as the 4070ti. The gap to the 4070 is also pretty high.
The 4090 is a ridiculous FOUR TIMES FASTER THAN THE 4060ti. This also means the 4090 is TWICE as fast as the 4070ti. The 4070ti is literally half, offering half the VRAM and speed. Wow.
The 3090 is 2.5 times faster than the 3060, so the gap was much smaller for Ampere, especially considering this is the 3060 and not the 3060ti.
Hopefully this might help some people who are looking to buy a GPU for Iray. The 4060ti 16gb is an interesting card, but the price is still pretty high. The 4070 is only $100 more and much faster, but you do give up 4gb of VRAM. The 4070ti is twice as fast as the 4060ti, which is just crazy. The 4080 is about 2.5 or so times faster. And the 4090 is on another planet. You really need to know how much VRAM you are going to need to justify a 4060ti over a 4070.
Honestly this all just makes the 4090 look like a deal, LOL. What madness is this? But that isn't really news, there have been reviews joking about this. It is just crazy to see the numbers spell it out this way.
Thank you outrider42 for the summary!
As promised, here is the same PC with the brand new 4070!
I've also done a comparison with a real-life use I'd make, a scene from my comics, which is 4K, indoor, two G8 figures, 1600 iterations, long detailed hair, low light and many reflective surfaces.
It took the 2070 Super 46 minutes and 28 seconds. The RTX 4070 renders it in 18:42.
It's a +150% improvement, compared to the +118% improvement in the benchmark scene. So I'd say the benchmark scene is still a valuable tool for estimating real-world performance.
About the GPU market... I paid €530 for the RTX 2070 Super in December 2019. Four years and two generations later, it's €710 for an RTX 4070 that does +150% better for +34% more money. But to be fair, I think the 2070S was something like 20% worse at launch in Daz Studio, so it would actually be more like +180% for +34% more money, which doesn't sound as bad.
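For anyone who wants to redo that arithmetic, here is a quick sketch using the times and prices from the post above (the scene figure comes out at +148% before rounding):

```python
# Percent improvement from render times: old_time / new_time - 1, as a percent.
def pct_faster(old_seconds, new_seconds):
    return (old_seconds / new_seconds - 1) * 100

old = 46 * 60 + 28   # 2070 Super: 46 min 28 s on the comic scene
new = 18 * 60 + 42   # RTX 4070:   18 min 42 s on the same scene
print(f"Scene speedup:  +{pct_faster(old, new):.0f}%")   # -> +148%

price_increase = (710 / 530 - 1) * 100   # EUR 530 in 2019 -> EUR 710 now
print(f"Price increase: +{price_increase:.0f}%")          # -> +34%
```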
Very nice. It is good to see that you got an even bigger boost from the upgrade in your own scene versus the benchmark. I think it makes sense to skip a generation, too.
It is interesting to look at how the 2070 Super compares to the other Turing cards. So you got 5.7 iterations per second. The Titan RTX is the fastest Turing, and it gets 8 iterations. I find this interesting because the 2070 Super is essentially the 2070ti that we never got. The 4090 gets 28.5 iterations while the 4070ti gets 14.175. We can see how much bigger the gap is between the Lovelace tiers than Turing here.
We can also see the gap between the 2070S and Titan RTX is just 2.3 iterations, even though it is several tiers down the product stack. This goes back to my comment about the 4060ti being just an iteration faster than the 3060. While that is not great at all, it is a difference, and as LenioTG showed, you may get better results in other scenes besides the benchmark. So the gap between the 3060 and 4060ti could be bigger in your scenes, or maybe not. The benchmark is just a rough guide. There is no one size fits all benchmark for Iray because every scene is different and can place different demands on the GPU. Before RTX, this was really easy, but after RT cores came along things got more complicated. But RT cores also render Iray much faster.
Not sure where to find the load time. My system is pre-built with no indication of component manufacturer.
System Configuration
System/Motherboard: Lenovo
CPU: Intel core i7 13700KF @stock
GPU: Nvidia RTX 4080 @stock
System Memory: DDR5 32GB @ 5600MHz
OS Drive: SSD 1T
Asset Drive: Seagate Barracuda 4T
Power Supply: 850W
Operating System: Win 11 build: 22621.2134
Nvidia Drivers Version: 536.99 Studio
Daz Studio Version: 4.21.0.5 64bit
Optix Prime Acceleration: STATE (Daz Studio 4.12.1.086 or earlier only)
Benchmark Results
DAZ_STATS
IRAY_STATS
Iteration Rate: CUDA device 0 (NVIDIA GeForce RTX 4080): 1800 iterations, 1.759s init, 109.165s render
Loading Time: ((TRT_HOURS * 3600 + TRT_MINUTES * 60 + TRT_SECONDS) - DEVICE_RENDER_TIME) seconds
IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA GeForce RTX 4080): 1800 iterations, 1.759s init, 109.165s render
You should see two different lines in the help log, one says TOTAL RENDERING TIME, and the second is DEVICE RENDER TIME. They are fairly close together in the log, with just a couple lines between them. The difference between these two should be the time it took to load the scene into VRAM. The load time is just a reference, as many things can influence how quickly the scene loads. It can be interesting information, but is not vital.
Your iteration rate (1800/109.165) is 16.5 per second, which is where many 4080s sit. That is very close to the rate my 3090 hits (16.7).
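If anyone wants to script this instead of eyeballing the help log, here is a minimal sketch. The log excerpt is illustrative (the Total Rendering Time value is made up for the example), but the line formats match what DS writes:

```python
import re

log = """\
2023-09-13 02:08:05.751 [INFO] :: Total Rendering Time: 1 minutes 51.23 seconds
2023-09-13 02:08:06.924 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA GeForce RTX 4080): 1800 iterations, 1.759s init, 109.165s render
"""

# TOTAL RENDERING TIME may include hours/minutes; fold it all down to seconds.
m = re.search(r"Total Rendering Time:\s*(?:(\d+) hours? )?(?:(\d+) minutes? )?([\d.]+) seconds", log)
h, mins, s = (float(x) if x else 0.0 for x in m.groups())
total = h * 3600 + mins * 60 + s

# DEVICE RENDER TIME comes from the per-device stats line.
iters, render = re.search(r"(\d+) iterations, [\d.]+s init, ([\d.]+)s render", log).groups()
print(f"Iteration rate: {int(iters) / float(render):.2f} it/s")  # 1800 / 109.165
print(f"Loading time:   {total - float(render):.3f} s")          # total minus device render
```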
Thanks. I don't see an entry called device render time. One would think a 4080 would be faster than a 3090. Interesting how it all works.
If you open the log immediately after the render has stopped and close the render window, you will see this at the bottom:
2023-09-13 02:08:05.729 [INFO] :: Finished Rendering
2023-09-13 02:08:05.751 [INFO] :: Total Rendering Time: 1.93 seconds
2023-09-13 02:08:05.778 [INFO] :: Loaded image: r.png
2023-09-13 02:08:05.825 [INFO] :: Saved image:
2023-09-13 02:08:06.924 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : Device statistics:
2023-09-13 02:08:06.924 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA GeForce RTX 3090): 32 iterations, 0.660s init, 0.639s render
2023-09-13 02:08:06.924 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 1 (NVIDIA GeForce RTX 3060): 17 iterations, 0.474s init, 0.548s render
The total rendering time is a few lines above the device stats as shown. I have 2 devices so I have 2 lines for them.
Yeah, I pointed out the 4080 basically equalling the 3090 when we first got people posting benchmarks. The 4080 is also significantly cut down compared to the 4090, a lot more than they normally cut an 80-class card (just look at the 4080 vs 4090 core counts). But there is a software issue, too. The Iray dev team has admitted that the 4000 series is not fully utilized, which lines up with the fact that the 4000 series uses freakishly little power when rendering Iray. The new Iray 2023 promises to fix this, but right now Daz is not interested in giving this update to customers because previous Iray updates broke things like Ghost Lights. So they are scared of something like that happening again, and they are waiting (according to Richard) for a "major" update for DS before adding it in. What does that actually mean? It might be a while. They promised DS5 was coming "soon" over 2 years ago. I hope we are not waiting for DS5 here.
I don't know if that is a good strategy, Iray has already updated 3 times since they released Iray 2023, so I am not sure how that is supposed to work. I have argued that they can at least try to drop Iray 2023 in the beta branch of DS and let us test it for them if they are so scared, since that is what betas are designed for. If it breaks something...don't release it to the general public. Iray 2023 also promises a long list of bug fixes...there may be things that get fixed by Iray 2023 that Iray broke previously. Why not find out?
If you really want to test the new Iray, it is included in Nvidia's Omniverse. Unlike DS, Omniverse is kept up to date with Iray. I have not tried Omniverse myself, since there is not a direct path to it from DS.
At any rate, some day in the future, perhaps near, perhaps far given Daz history, the 4080 should render a bit faster.
I think, that is exactly what we are waiting for.
There have been signs, not updating Iray being one of them, whispering that we might see DS5 before Santa this year.
If DS5 is coming *soon* I would expect DS4 to stop getting updates at all for an extended period. We just got 4.21.1.80 in August. Perhaps if that is the final update DS4 ever receives, then maybe a December DS5 could happen. But I frankly have zero faith in this company on this.
I also don't want to be forced to get DS5 just for the new Iray. There is nothing stopping them from putting Iray 2023 in DS4 as its final release and ending DS4 on a high note. DS5 is almost certainly going to have some serious teething issues, so I don't think it is right to push this unproven DS5 on people.
Again, the damage Iray did when it broke products like ghost lights has already been done. There is no point to holding Iray 2023 back.
Iray 2023 doesn't just improve performance, it greatly enhances the caustic sampler. BTW, we already have problems with caustics acting weird in the current DS...so having this improved sure would be nice. The memory leaks might be addressed (I don't know for sure, but they say they have lots of bug fixes.) This isn't just a desire to render faster.
Probably would be if using the beta. I got about 17% faster renders with a 40 series card using the beta. Currently only have the beta installed.
Deal, schmeal: my Gigabyte AERO 4090 was $1749.00 + tax at Micro Center and my credit card is still crying, but I have a feeling that going from 6.6 to 28 iterations will be very nice. Got my 180° power adaptor today, so I will be installing it in the next few days. I already have the 4TB drive and 64GB of memory installed. Will be running one last set of benchmarks with the new SSD & memory and the 3060 12GB before installing the new GPU.
BTW, what is the latest test file?
Here
About to make a big upgrade jump myself, although on the non-gpu side (goodbye 8700K - it's been a good 6+ years... hello Xeon W5-3435 + Asus Sage SE aka goodbye my wallet...) Since it's been ages since my own last tests, will be sure to do a full gauntlet of GPU and CPU tests before/after the switch. Should be fun seeing how things have changed over the past year or so (really haven't had the time for DAZ/Iray stuff lately.) Might actually get to revamping the thread some at the same time as well...
My last RTX 3060 12GB Benchmark (with the new SSD and memory). Next one will be with the RTX 4090 tomorrow.
System Configuration
System/Motherboard: Gigabyte X570 Aorus PRO WIFI
CPU: AMD Ryzen 7 3700X @ 4.3 GHz (mildly overclocked)
GPU: EVGA 3060 12GB @ stock speed
System Memory: 64 GB (2x32GB) G.Skill Trident Z Neo DDR-4 3600 @ 3600
OS Drive: Samsung 870 EVO 4TB SSD
Asset Drive: Same
Power Supply: EVGA Supernova 1200 P2 1200 watt Platinum PSU
Operating System: Windows 10 22H2 (OS Build 19045.3448)
Nvidia Drivers Version: Nvidia Studio Driver 536.23
Daz Studio Version: 4.21.0.5 Pro Edition (64-bit)
Benchmark Results
DAZ_STATS:
2023-09-24 00:24:09.697 [INFO] :: Finished Rendering
2023-09-24 00:24:09.734 [INFO] :: Total Rendering Time: 5 minutes 2.88 seconds
IRAY_STATS
2023-09-24 00:24:56.031 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA GeForce RTX 3060): 1800 iterations, 1.819s init, 298.902s render
Iteration Rate: 5.942947702060222 iterations per second
Loading Time: 2.896 seconds
It's Alive, IT'S ALIVE!
After an "Amazon Special" ATX 3.0 adaptor scared the daylights out of me (no video AT ALL when I first rebooted after the card install), she is up and running (installed the Gigabyte 4-into-1 adaptor and she booted right up). Man, this thing is fast. Ran the card with Silent mode turned on instead of overclocked. Everything exactly the same as the last run except for an RTX 4090 instead of an RTX 3060.
System Configuration
System/Motherboard: Gigabyte X570 Aorus PRO WIFI
CPU: AMD Ryzen 7 3700X @ 4.3 GHz (mildly overclocked)
GPU: Gigabyte RTX 4090 AERO @ stock speed (no factory overclock)
System Memory: 64 GB (2x32GB) G.Skill Trident Z Neo DDR-4 3600 @ 3600
OS Drive: Samsung 870 EVO 4TB SSD
Asset Drive: Same
Power Supply: EVGA Supernova 1200 P2 1200 watt Platinum PSU
Operating System: Windows 10 22H2 (OS Build 19045.3448)
Nvidia Drivers Version: Nvidia Studio Driver 536.23
Daz Studio Version: 4.21.0.5 Pro Edition (64-bit)
Benchmark Results
DAZ_STATS:
2023-09-24 21:37:26.869 [INFO] :: Finished Rendering
2023-09-24 21:37:26.905 [INFO] :: Total Rendering Time: 1 minutes 25.41 seconds
IRAY_STATS
2023-09-24 21:37:37.316 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA GeForce RTX 4090): 1800 iterations, 1.816s init, 81.424s render
Iteration Rate: 22.1065 iterations per second
Loading Time: 3.986 seconds
Next is turning on the factory overclock BIOS and seeing what we get.
Hmm...both your times are a little off the pace for what we normally see. My 3060 can run this in 272 seconds, so 26 seconds faster. Now your 4090 is putting up a time that looks to be one of the slower 4090 times we've seen. But I think you are using a different Iray version.
Do you have the Daz beta? It is more updated than the version of DS you have. I can understand you may not want to upgrade your primary DS, but you can install the beta without causing any conflicts. The beta might be slightly faster. So I think it is worth checking out.
I think the numbers are skewed by DS version and 4090 clocks/power limits.
I got about 22 iterations with a 450w reference card at stock clocks.
I get about 26 in the beta at stock settings.
Beta +100 core and +1500 vram which isn't a big ask for a 4090 in Studio gets about 28. I managed 29.5 but that was with pushing the card more than I normally would and increasing the voltage.
Hello, I tried benchmarking with my PC.
System Configuration
System/Motherboard: Aorus Ultra
CPU: i9 9900k
GPU: 3x MSI Suprim 4090
System Memory: 64GB
OS Drive: 2TB M.2 SSD
Asset Drive: Same as OS Drive
Power Supply: 1600W
Operating System: Windows 11 22H2
Nvidia Drivers Version: 537.42
Daz Studio Version: 4.21.0.5
Benchmark Results
2023-10-05 01:31:17.057 [INFO] :: Total Rendering Time: 31.64 seconds
2023-10-05 01:31:21.710 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info: CUDA device 0 (NVIDIA GeForce RTX 4090): 566 iterations, 1.552s init, 27.862s render
2023-10-05 01:31:21.710 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info: CUDA device 1 (NVIDIA GeForce RTX 4090): 619 iterations, 1.519s init, 27.908s render
2023-10-05 01:31:21.710 Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info: CUDA device 2 (NVIDIA GeForce RTX 4090): 615 iterations, 1.636s init, 27.745s render
Iteration Rate:
GPU 0: 20.31 iterations per second (Fewer iterations, but the same card. Possibly due to my monitors? I use 2x4K monitors.)
GPU 1: 22.18 iterations per second
GPU 2: 22.17 iterations per second
Loading Time: 3.732 seconds
I'm satisfied, but it's not powerful enough for rendering animations with VDB, haha! :'(
Fwiw, if you want truly accurate performance figures for each of your GPUs individually, you'll need to run the benchmark separately with just a single card enabled for rendering at a time. Due to the way Iray implements load balancing, there is no guarantee that multiple GPUs/CPUs used in a single render job will get a consistent, proportionally equivalent share of the overall workload to perform. As evidenced by your results, this is least of an issue when dealing with multiple identical rendering devices (eg. 3 MSI 4090 Suprims), in which case margin of error is likely to be the limiting factor anyway. However, there's no guarantee that a sudden unrelated burst of traffic across the PCIe bus couldn't throw the balance off, making individual runs the best way to go, unless benching multiple rendering devices in tandem is the goal, in which case some variation between card workloads due to load balancing is inevitable.
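To make the bookkeeping concrete, here is a sketch using the triple-4090 numbers quoted above: per-device rates divide each card's iterations by its own render time, while the combined rate divides total iterations by the slowest device's time:

```python
# (iterations, render seconds) per CUDA device, from the triple-4090 post above.
devices = [(566, 27.862), (619, 27.908), (615, 27.745)]

# Each card's individual rate.
per_device = [its / t for its, t in devices]

# Combined rate: all 1800 iterations finish when the slowest device finishes.
combined = sum(its for its, _ in devices) / max(t for _, t in devices)

for i, r in enumerate(per_device):
    print(f"GPU {i}: {r:.2f} it/s")
print(f"Combined: {combined:.2f} it/s")
```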
Finally laid the 8700K to rest (it served me well over the past 6+ years, although its poor stock TIM performance was always a bother). Did a last round of benchmarks before pulling it all apart.
First the old specs:
System Configuration
System/Motherboard: Gigabyte Z370 Aorus Gaming 7 (custom watercooled)
CPU: Intel 8700K (custom watercooled with MB) @ stock (MCE enabled) - used for display output
GPU: Nvidia RTX A5000 (custom watercooled) @ stock (WDDM driver mode)
GPU: Nvidia RTX A5000 (custom watercooled) @ stock (WDDM driver mode)
GPU: Nvidia Titan RTX (custom watercooled) @ stock (WDDM driver mode)
System Memory: Corsair Vengeance LPX 32GB DDR4 @ 3000Mhz
OS Drive: Samsung Pro 980 2TB NVME SSD
Asset Drive: Sandisk Extreme Pro Portable SSD 4TB
Power Supply: Corsair AX1500i 1500 watts
Operating System: Windows 11 Pro version 22H2 build 22621.2361
Nvidia Drivers Version: 536.23 SRD
Daz Studio Version: 4.21.1.80 Beta 64-bit
And old results:
Benchmark Results: RTX A5000 #1, RTX A5000 #2, Titan RTX, CPU (8700K)
Total Rendering Time: 1 minutes 2.90 seconds
CUDA device 0 (NVIDIA RTX A5000): 695 iterations, 1.473s init, 58.659s render
CUDA device 1 (NVIDIA RTX A5000): 698 iterations, 1.697s init, 58.012s render
CUDA device 2 (NVIDIA TITAN RTX): 386 iterations, 1.798s init, 57.945s render
CPU: 21 iterations, 0.880s init, 58.894s render
Iteration Rate: 30.686 iterations per second
Loading Time: 4.24 seconds
Benchmark Results: RTX A5000 #1, RTX A5000 #2, Titan RTX
Total Rendering Time: 55.59 seconds
CUDA device 0 (NVIDIA RTX A5000): 713 iterations, 1.382s init, 51.419s render
CUDA device 1 (NVIDIA RTX A5000): 712 iterations, 1.446s init, 51.025s render
CUDA device 2 (NVIDIA TITAN RTX): 375 iterations, 1.572s init, 51.084s render
Iteration Rate: 35.007 iterations per second
Loading Time: 4.17 seconds
Benchmark Results: RTX A5000 #1, RTX A5000 #2
Total Rendering Time: 1 minutes 7.98 seconds
CUDA device 0 (NVIDIA RTX A5000): 892 iterations, 1.252s init, 63.524s render
CUDA device 1 (NVIDIA RTX A5000): 908 iterations, 1.214s init, 64.267s render
Iteration Rate: 28.336 iterations per second
Loading Time: 3.71 seconds
Benchmark Results: RTX A5000 #1, Titan RTX
Total Rendering Time: 1 minutes 26.63 seconds
CUDA device 0 (NVIDIA RTX A5000): 1170 iterations, 1.372s init, 82.721s render
CUDA device 2 (NVIDIA TITAN RTX): 630 iterations, 1.364s init, 82.565s render
Iteration Rate: 21.760 iterations per second
Loading Time: 3.91 seconds
Benchmark Results: RTX A5000 #2, Titan RTX
Total Rendering Time: 1 minutes 26.22 seconds
CUDA device 1 (NVIDIA RTX A5000): 1174 iterations, 1.353s init, 82.336s render
CUDA device 2 (NVIDIA TITAN RTX): 626 iterations, 1.515s init, 82.078s render
Iteration Rate: 21.862 iterations per second
Loading Time: 3.88 seconds
Benchmark Results: RTX A5000 #1
Total Rendering Time: 2 minutes 9.33 seconds
CUDA device 0 (NVIDIA RTX A5000): 1800 iterations, 1.953s init, 125.154s render
Iteration Rate: 14.382 iterations per second
Loading Time: 4.18 seconds
Benchmark Results: RTX A5000 #2
Total Rendering Time: 2 minutes 7.92 seconds
CUDA device 1 (NVIDIA RTX A5000): 1800 iterations, 1.339s init, 124.358s render
Iteration Rate: 14.475 iterations per second
Loading Time: 3.56 seconds
Benchmark Results: Titan RTX
Total Rendering Time: 3 minutes 49.8 seconds
CUDA device 2 (NVIDIA TITAN RTX): 1800 iterations, 1.625s init, 225.221s render
Iteration Rate: 7.992 iterations per second
Loading Time: 4.58 seconds
Benchmark Results: CPU (8700K)
Total Rendering Time: 1 hours 6 minutes 7.10 seconds
CPU: 1800 iterations, 2.853s init, 3961.845s render
Iteration Rate: 0.454 iterations per second
Loading Time: 5.26 seconds
Then the new specs:
System Configuration
System/Motherboard: Asus Pro WS W790E-SAGE SE
CPU: Intel Xeon 3435X (custom watercooled) @ "stock" (semi-overclocked) motherboard settings
GPU: Nvidia RTX A5000 (custom watercooled) @ stock (WDDM driver mode) - used for display output
GPU: Nvidia RTX A5000 (custom watercooled) @ stock (WDDM driver mode)
GPU: Nvidia Titan RTX (custom watercooled) @ stock (WDDM driver mode)
System Memory: V-Color RDIMM 128GB (4x32) DDR5 @ 6000mt/s
OS Drive: Samsung Pro 990 2TB NVME SSD
Asset Drive: Sandisk Extreme Pro Portable SSD 4TB
Power Supply: Corsair AX1500i 1500 watts
Operating System: Windows 11 Pro version 22H2 build 22621.2361
Nvidia Drivers Version: 537.42 SRD
Daz Studio Version: 4.21.1.80 Beta 64-bit
And the new results:
Benchmark Results: RTX A5000 #1, RTX A5000 #2, Titan RTX, CPU (W5-3435X)
Total Rendering Time: 56.42 seconds
CUDA device 0 (NVIDIA RTX A5000): 690 iterations, 1.364s init, 51.621s render
CUDA device 1 (NVIDIA RTX A5000): 668 iterations, 1.175s init, 52.252s render
CUDA device 2 (NVIDIA TITAN RTX): 362 iterations, 1.504s init, 51.742s render
CPU: 80 iterations, 0.727s init, 52.192s render
Iteration Rate: 34.448 iterations per second
Loading Time: 4.17 seconds
Benchmark Results: RTX A5000 #1, RTX A5000 #2, Titan RTX
Total Rendering Time: 57.17 seconds
CUDA device 0 (NVIDIA RTX A5000): 728 iterations, 1.145s init, 53.100s render
CUDA device 1 (NVIDIA RTX A5000): 694 iterations, 1.046s init, 53.131s render
CUDA device 2 (NVIDIA TITAN RTX): 378 iterations, 1.205s init, 52.125s render
Iteration Rate: 33.879 iterations per second
Loading Time: 4.04 seconds
Benchmark Results: RTX A5000 #1, RTX A5000 #2
Total Rendering Time: 1 minutes 10.45 seconds
CUDA device 0 (NVIDIA RTX A5000): 920 iterations, 1.056s init, 66.369s render
CUDA device 1 (NVIDIA RTX A5000): 880 iterations, 0.962s init, 66.824s render
Iteration Rate: 26.936 iterations per second
Loading Time: 3.62 seconds
Benchmark Results: RTX A5000 #1, Titan RTX
Total Rendering Time: 1 minutes 30.57 seconds
CUDA device 0 (NVIDIA RTX A5000): 1149 iterations, 0.951s init, 87.021s render
CUDA device 2 (NVIDIA TITAN RTX): 651 iterations, 0.934s init, 86.827s render
Iteration Rate: 20.685 iterations per second
Loading Time: 3.55 seconds
Benchmark Results: RTX A5000 #2, Titan RTX
Total Rendering Time: 1 minutes 28.19 seconds
CUDA device 1 (NVIDIA RTX A5000): 1167 iterations, 1.185s init, 83.426s render
CUDA device 2 (NVIDIA TITAN RTX): 633 iterations, 1.252s init, 84.242s render
Iteration Rate: 21.367 iterations per second
Loading Time: 3.95 seconds
Benchmark Results: RTX A5000 #1
Total Rendering Time: 2 minutes 27.86 seconds
CUDA device 0 (NVIDIA RTX A5000): 1800 iterations, 10.422s init, 135.091s render
Iteration Rate: 13.324 iterations per second
Loading Time: 12.77 seconds
Benchmark Results: RTX A5000 #2
Total Rendering Time: 2 minutes 10.54 seconds
CUDA device 1 (NVIDIA RTX A5000): 1800 iterations, 1.262s init, 126.968s render
Iteration Rate: 14.177 iterations per second
Loading Time: 3.57 seconds
Benchmark Results: Titan RTX
Total Rendering Time: 3 minutes 59.44 seconds
CUDA device 2 (NVIDIA TITAN RTX): 1800 iterations, 8.988s init, 228.143s render
Iteration Rate: 7.890 iterations per second
Loading Time: 11.297 seconds
Benchmark Results: CPU (W5-3435X)
Total Rendering Time: 17 minutes 38.81 seconds
CPU: 1800 iterations, 1.448s init, 1055.065s render
Iteration Rate: 1.706 iterations per second
Loading Time: 3.75 seconds
To sum up in table form (along with power stats using the AX1500i's logging features) that's:
That's a whopping 3.76x jump in CPU rendering performance! I am interested to see that CPU rendering is now helping rather than hurting rendering performance with all GPUs active (a previously discussed side effect of Iray's scheduler/load-balancing algorithms: huge gaps in rendering performance between devices can lead to the faster devices receiving fewer, or even having to recalculate, iterations).
Power draw is higher across the board (excepting the idle stats, which is not surprising since I kept the 8700K at a fixed 4.7 GHz clock speed), in some cases very significantly so. Not sure how I feel about that, although with all the chipset bandwidth of W790 hanging around, I guess I shouldn't be too surprised.
ETA: One thing I've left out here is relative temps. Due to the 8700K's poor stock thermal interface quality (the solution to which was delidding, something I wasn't prepared to do on a production machine), it would steady-state at around 93C after 10 minutes of rendering. Using the exact same cooling loop setup, the W5-3435X maxes out at only 47C, despite pulling 90% more power for a 3.76x performance increase. Not too damn bad. Now, if only CPU rendering weren't literal orders of magnitude slower than GPU...
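Back-of-envelope on that claim, using the two CPU-only iteration rates from the runs above and the stated ~90% power increase (the power ratio is taken at face value, not measured here):

```python
old_rate, new_rate = 0.454, 1.706   # 8700K vs W5-3435X CPU-only it/s, from the posts above
power_ratio = 1.9                   # "90% more power", per the post (assumed, not measured)

cpu_speedup = new_rate / old_rate
perf_per_watt_gain = cpu_speedup / power_ratio
print(f"CPU speedup:        {cpu_speedup:.2f}x")          # -> 3.76x
print(f"Perf/W improvement: {perf_per_watt_gain:.2f}x")   # -> 1.98x
```

So the new CPU is not just faster; it also delivers roughly twice the rendering work per watt.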
Am curious how your new system memory is contributing once a scene is in memory? > System Memory: V-Color RDIMM 128GB (4x32) DDR5 @ 6000mt/s
Like heavier scene loads, morph handling, and general DS responsiveness?
Saw the mt/s designation for speed; I hadn't looked at new system parts yet, so that was interesting.
I think that's twice what I have for speed, same memory size. Envious :)
I thought I had a pretty high-end PC. Reading your specs, I am not even close, haha. Anyway, congrats. You should almost make a video to show how fast that all works!
Some 13700k numbers for comparison using the current DS beta
13700K stock, 64gb (2x32) DDR5 6600 CL32, cpu contact frame, Alphacool core 1 Block and custom loop
1.46 iterations per second
max cpu power 226.7w, max cpu temp 84.6c
Thank you for your detailed explanation! Actually, what I was trying to understand was the performance difference between using three 4090s and just one 4090. In my mind, I thought that with three 4090s, I could achieve results three times faster than with a single 4090. However, when I did my renders, it felt like this wasn't quite the case.
I noticed that you use the ASUS PRO WS-W790E-SAGE SE motherboard. I was actually interested in that one. Could you share your thoughts on it?
I've been considering the AMD version to utilize a Threadripper. To be honest, I'm a bit reluctant about using Intel Xeon, but the AMD version only offers PCIe 4.0 and DDR4, while the Intel version supports both PCIe 5.0 and DDR5.
Considering that my goal is to utilize all seven PCIe slots with 4090s for extended animations, in your opinion, which option is better for Daz?
By the way, do you have any recommendations for a PC case that would be suitable for this setup?
I haven't had much of a chance to evaluate things yet (I decided to start over from scratch on everything software-related this time around, from the OS to Daz Studio and installed content libraries). Plus, I am a firm proponent of keeping your installed Daz content folders slim (only include content needed for a specific project; Daz Studio/DIM supports multiple runtime folders for a reason), so I have never really needed to deal with most of the user-interaction performance issues people tend to have with Daz. So I can't really speak too well to that, although I haven't yet felt the need to tweak things like the Draw settings (normally I am all over those) to improve usability. So there is that. It's definitely a huge improvement over the DDR4-3000 I was coming from, but I suspect any major improvements are more to do with the new overall architecture of the chipset than with RAM speed (fwiw, the mt/s designation, mega-transfers per second, is entirely about sounding smart...)
It'd be even faster if I had waited an additional year to upgrade - or ten (this particular machine has been in a constant state of gradual upgrade since 2007.) That's just the way these things work.
I really should do a full write-up on this PC some time (Tower 900 builds are rare enough as it is, and I don't think I've seen a single other attempt at turning one into a performance-focused workstation, despite how capable they are of that). Yes, it's a bulky, expensive platform. But if you're living in the realm of real upgradable workstation chassis already, that's kind of all a moot point anyway, so...
Your longest GPU render time worked out to be 27.908 seconds, which puts your setup's combined iteration rate at 64.5. Judging by the performance figures you typically see in this thread for single 4090s, that is pretty much spot-on 3x what a single card delivers. It's worth mentioning that these are proportional differences we're talking about here. 30 seconds vs 90 to complete a single task might not seem all that different, but as soon as that single 30-second job morphs into a 30-minute multi-step process, that 3x advantage becomes functionally HUGE.
I think that, for the particular use-case we're talking about here (3D rendering), whatever gets you the most, simultaneously usable (that's a HUGE deal no one ever thinks about until later) IO is the way to go. So can definitely vouch for the Sage series in general. As to the AMD or Intel divide? I'd say go with whichever platform is the fastest/most up-to-date. Right now, that's Intel's. But by this time next year it'll probably be AMD's. Decide by determining how long you're willing to wait to start putting your system together.
7 GPUs is gonna require dual PSUs just for having adequate connections, so you're gonna need something capable of that. I haven't ever really looked into that (4 high-performance GPUs is my limit for a still vaguely conventional form-factor PC workstation; above that, I think you're better off just going full-on rackmount). But I suspect that if you search around, you'll discover that there's a pretty limited set of case options available, potentially making your selection process easier.
Iray scales pretty well, but not perfectly. The efficiency drops a tiny amount with each extra GPU, it also depends on the hardware involved, and possibly the scene in question as well. I think I read somewhere that Iray maintains a solid 90%+ even with 8 GPUs, which is pretty impressive. I think when you get down to 30 second renders, it gets very difficult to properly judge scaling, as even the slightest hitches could appear to be a large performance gap. If you can test a larger more complex scene, I think that would show scaling better.
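A sketch of that scaling math, using the combined 64.5 it/s from the triple-4090 post and the roughly 22.1 it/s a stock 4090 posted earlier in this thread on the same DS version (treat the single-card figure as approximate):

```python
def scaling_efficiency(multi_rate, single_rate, n_gpus):
    """Fraction of ideal linear scaling actually achieved."""
    return multi_rate / (single_rate * n_gpus)

single = 22.1   # approximate stock single-4090 rate from earlier in the thread
multi = 64.5    # combined rate of the 3x 4090 system
eff = scaling_efficiency(multi, single, 3)
print(f"Scaling efficiency with 3 GPUs: {eff:.0%}")  # -> 97%
```

With these figures the triple setup lands around 97% of perfect linear scaling, consistent with the 90%+ mentioned above.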
And like raydant said, you need to have a lot more hardware to handle 7 big 4090s at once. For 7 GPUs you could consider a former mining chassis to house them. Given how that market has died down, you might find one on the cheap. Interestingly a number of former mining operators actually transformed their mining business into a GPU server business, renting their GPUs out for tasks like rendering 3D or AI.
Running multiple 4k monitors probably will pull some resources from that GPU, and your iteration count shows that. After all, the GPU is rendering what the display has, plus your render in Iray.
Beta 4.21.1.104
EVGA 3060 Black
5800X
(No Overclocking)
GPU ONLY
(NVIDIA GeForce RTX 3060): 1800 iterations, 0.690s init, 276.875s render - 6.5 Iterations/sec
CPU + GPU
CUDA device 0 (NVIDIA GeForce RTX 3060): 1596 iterations, 1.036s init, 251.003s render - 6.358
CPU: 204 iterations, 0.592s init, 252.310s render - 0.809
CPU + GPU overall - 7.13 Iterations/sec
BTW, the benchmark scene will not queue to the Iray Server! It has no problem streaming to the preview or a render window.
It's a vague 'integer out of range' error, and the transfer to the server aborts immediately.
(The server uses Iray 2023.0.2 (367100.3997) and generally gives the error 'deprecated canvas parameters identified', but that does not stop the queuing.)
So far only the benchmark scene has failed to queue.