Daz Studio Iray - Rendering Hardware Benchmarking


Comments

  • skyeshots Posts: 148
    edited March 2021

    I have an NVLink coming for large scenes if needed, though most fit fine on 24 GB.

    Let's get to it:

    Test 1 – Before Driver and Daz Update

    System/Motherboard: MSI MPG Z490 Carbon EK X
    CPU: i9-10850K @ 3.6 GHz
    GPU: MSI RTX 3090 x3 (Ventus 3 OC/Identical)
    System Memory: 32 GB Corsair Dominator Platinum DDR4-3466
    OS Drive: Samsung 970 EVO SSD 1TB – M.2 NVMe
    Asset Drive: Same
    Operating System: Win 10 Pro, 1909
    Nvidia Drivers Version: 460.89
    Daz Studio Version: 4.14

    Device statistics:
    CUDA device 0 (GeForce RTX 3090):      609 iterations, 0.238s init, 32.952s render
    CUDA device 2 (GeForce RTX 3090):      574 iterations, 0.860s init, 31.575s render
    CUDA device 1 (GeForce RTX 3090):      617 iterations, 0.336s init, 32.948s render
    2021-01-23 20:03:54.831 Saved image: C:\Daz 3D\Applications\Data\DAZ 3D\\DAZStudio4 Temp\render\r.png
    2021-01-23 20:03:54.834 Finished Rendering
    2021-01-23 20:03:54.859 Total Rendering Time: 35.89 seconds
    Loading Time GPU 1: 2.938 Seconds
    Loading Time GPU 2: 4.315 Seconds
    Loading Time GPU 3: 2.942 Seconds

    Iteration Rate (effective): 1800/35.89 = 50.1532 iterations per second

     

    Test 2 – After Studio Driver Install and Daz 4.15.02 Update

    System/Motherboard: MSI MPG Z490 Carbon EK X
    CPU: i9-10850K @ 3.6 GHz
    GPU: MSI RTX 3090 x3 (Ventus 3 OC/Identical)
    System Memory: 32 GB Corsair Dominator Platinum DDR4-3466
    OS Drive: Samsung 970 EVO SSD 1TB – M.2 NVMe
    Asset Drive: Same
    Operating System: Win 10 Pro, 1909
    Nvidia Drivers Version: 460.89 Studio Drivers
    Daz Studio Version: 4.15.02

    2021-01-23 21:46:05.933 Saved image: C:\Users\\Desktop\Daz with temps.jpg
    Device statistics:
    CUDA device 0 (GeForce RTX 3090):      599 iterations, 1.990s init, 30.729s render
    CUDA device 1 (GeForce RTX 3090):      603 iterations, 1.964s init, 30.775s render
    CUDA device 2 (GeForce RTX 3090):      598 iterations, 1.917s init, 30.828s render
    2021-01-23 21:46:16.652 Saved image: C:\Users\\Desktop\output.duf.png
    2021-01-23 21:46:02.344 Finished Rendering
    2021-01-23 21:46:02.369 Total Rendering Time: 35.18 seconds
    Loading Time GPU 1: 4.451 Seconds
    Loading Time GPU 2: 4.405 Seconds
    Loading Time GPU 3: 4.90 Seconds

    Iteration Rate (effective): 1800/35.18 = 51.1654 iterations per second
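
    For anyone reproducing these figures, here is a minimal Python sketch of the arithmetic used in this thread (the helper names are just for illustration, not part of Daz Studio):

    ```python
    # Effective iteration rate: total iterations across all devices divided by
    # the reported "Total Rendering Time". Per-GPU loading time: Total Rendering
    # Time minus that device's own render time.
    def effective_iteration_rate(total_iterations, total_render_seconds):
        return total_iterations / total_render_seconds

    def per_gpu_loading_time(total_render_seconds, device_render_seconds):
        return total_render_seconds - device_render_seconds

    # Test 2 numbers from the log above:
    iterations = 599 + 603 + 598                         # 1800
    print(effective_iteration_rate(iterations, 35.18))   # ~51.17 iterations per second
    print(per_gpu_loading_time(35.18, 30.729))           # ~4.45 s for device 0
    ```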

    All (3) cards should bear similar thermal loads, so I have some work to do in terms of temps. I noticed this time the test was fast enough that the louder fans, the ones I have set for 50 degrees, only stepped in for a brief moment and then powered down. This benchmark should probably be updated to a 4K image so we can collect more data here. It's not enough time for the cards to warm up.

    A few points, re: the 30 series cards:
    PCIe allows 32 GB/s at x16 and 16 GB/s at x8. The 3090 does 19.5 GB/s, which is somewhere in between (wccftech.com). With (3) cards on Z490, you may end up with 8x in the top slot and 4x/4x in the lower two. Here, the bottom (2) cards establish the maximum allowable transfer rates at PCIe 3 x4 (8 GB/s). This substantially increases the load times (about double) but, as others here have predicted, did not affect rendering time. Excluding bandwidth from this, each card scored close to 20 iterations per second after the update, which is similar to the single and (2) card tests at 16x and 8x, respectively. The scaling is there in terms of actual rendering only.
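
    For context, a rough sketch of theoretical PCIe bandwidth by generation and lane width (assumes the standard per-lane line rates and ignores protocol overhead beyond encoding; sustained transfer rates in practice are lower):

    ```python
    # Theoretical PCIe bandwidth per direction in GB/s.
    # Gen3: 8 GT/s per lane with 128b/130b encoding; Gen4 doubles the line rate.
    PER_LANE_GBPS = {
        3: 8.0 * (128 / 130) / 8,     # ~0.985 GB/s per lane
        4: 16.0 * (128 / 130) / 8,    # ~1.969 GB/s per lane
    }

    def pcie_bandwidth(gen, lanes):
        return PER_LANE_GBPS[gen] * lanes

    for gen, lanes in [(3, 16), (3, 8), (3, 4), (4, 16)]:
        print(f"PCIe {gen}.0 x{lanes}: ~{pcie_bandwidth(gen, lanes):.1f} GB/s")
    ```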

    Edit: Attached Thermal Report for Test 2

    Temps from Daz Benchmark.PNG
    3 rtx 3090.PNG
    Post edited by skyeshots on
  • outrider42 Posts: 3,679

    Artini said:

    Thanks a lot. Probably will need to use it on the new computer.

    Does anybody use Daz Studio on dual Intel CPU on Windows 10 Pro?

    I had the opportunity to run some tests a couple of years ago on dual Intel Xeon CPUs,

    but could not get Daz Studio to use the second CPU on Windows 7 Pro at that time.

    I am still hoping that in the future one could render in Iray on multiple CPUs

    and have much more RAM available for the scenes.

    I wouldn't get my hopes up too much. The 3090 may be expensive, but it is still cheaper than any high-end CPU-based build for Iray.

    There are benchmarks for CPUs on the list from this scene, and they offer a glimpse of what performance you can expect. The Threadripper 3970x is the fastest CPU on the list. Unfortunately we don't have an updated benchmark; I would like to see if it benefits from Daz 4.14 like GPUs do. Anyway, the chip hit 2.623 iterations per second. That places it in the range of a 1070 and 1080. That does sound good...except the 3970x costs $2000. The 3090 hit 14 iterations per second on 4.12. That is over 5 times faster, and the 3090 actually costs less.

    Next I looked up the AMD Epyc 7742. Nobody has tested that for Iray, but we do have Blender rendering benchmarks, and the 3970x has Blender marks as well, so we can use these to estimate them against each other.

    In this test the dual 7742 finished in 26.92 seconds versus 44.89 for the 3970x. A pretty solid improvement. This same test has been done with the 3080, using CUDA.

    The 3080 rendered the scene in 25.89 seconds, just slightly faster, which is kind of surprising. However, the 3080 is less than $1000 compared to an Epyc build that would be over $10,000. Also, the 3080 has a secret weapon: OptiX. Iray also uses OptiX now. With OptiX the 3080 can flex its ray tracing cores. So this CUDA bench does not reflect what the 3080 can actually do.

    With OptiX, the test is rendered in 11.91 seconds. This was with the 3080; the 3090 is even faster (in Blender the 3090 is right about 20% faster), but I wanted to use this site's numbers for consistency. Also, the performance levels of each GPU reflect what we see with Iray extremely well. I would even bet that Daz Iray would render this same scene almost exactly the same.
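
    A quick sketch of the relative speedups implied by those render times (just the division, using the Blender figures quoted above):

    ```python
    # Relative speedups versus the Threadripper 3970X, from the render times above
    # (lower time = faster).
    times = {
        "2x Epyc 7742 (CPU)":         26.92,
        "Threadripper 3970X (CPU)":   44.89,
        "RTX 3080 (CUDA only)":       25.89,
        "RTX 3080 (OptiX/RT cores)":  11.91,
    }
    baseline = times["Threadripper 3970X (CPU)"]
    for name, seconds in times.items():
        print(f"{name}: {baseline / seconds:.2f}x the 3970X")
    ```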

    This speed increase is logical: we have a thread that tested strand hair with RTX, and the RTX cards were indeed seeing some pretty wild speed increases over pure CUDA-based Iray. People who had Turing GPUs and the previous version of Iray still installed were able to directly compare it to the new Iray RTX. The numbers were getting comical.

    Still, it looks like Epyc could deliver some great performance, but that comes at a price. A pair of 3090s in Nvlink mode would deliver about 48GB of VRAM (as geometry data is duplicated on both GPUs, it will not be the full 48) as well as incredible performance.

  • Dim Reaper

    It's been quite a while since I posted any benchmarks here, but as I was changing my DS beta from 4.14 to 4.15 I thought this might be a good opportunity to do a comparison of three versions of DS using the same hardware, drivers, OS etc.  I'm currently waiting for a 3090 to be available at something close to a standard price so it's nice to see the speed increase from 4.12 to 4.15 for now.  I am just wondering though why the big increase in loading time on my system for 4.15 compared to 4.14.

     

    All three tests carried out on:

    System Configuration

    System/Motherboard: ASUS X99-S

    CPU: Intel i7 5960X @3GHz

    System Memory: 32GB KINGSTON HYPER-X PREDATOR QUAD-DDR4

    OS Drive: Samsung M.2 SSD 960 EVO 250GB

    Asset Drive: 2TB WD CAVIAR BLACK  SATA 6 Gb/s, 64MB CACHE (7200rpm)

    Operating System: Windows 10 1909 OS Build 18363.1256

    Nvidia Drivers Version: 460.79

     

    Benchmark Results – 2080Ti

    Daz Studio Version: 4.12.086 64-bit

    Total Rendering Time: 4 minutes 14.22 seconds

    CUDA device 0 (GeForce RTX 2080 Ti): 1800 iterations, 3.899s init, 246.330s render

    Iteration Rate: (1800/246.33) = 7.307 iterations per second

    Loading Time: ((0+240+14.2)-246.33) = 7.870 seconds

     

    Benchmark Results – 2080Ti

    Daz Studio Version: 4.14.0.8 64-bit

    Total Rendering Time: 3 minutes 46.24 seconds

    CUDA device 0 (GeForce RTX 2080 Ti): 1800 iterations, 2.945s init, 219.671s render

    Iteration Rate: (1800/219.67) = 8.194 iterations per second

    Loading Time: ((0+180+46.2)-219.67) = 6.530 seconds

     

     

    Benchmark Results – 2080Ti

    Daz Studio Version: 4.15.02 64-bit

    Total Rendering Time: 3 minutes 40.50 seconds

    CUDA device 0 (GeForce RTX 2080 Ti): 1800 iterations, 8.408s init, 208.864s render

    Iteration Rate: (1800/208.86) = 8.618 iterations per second

    Loading Time: ((0+180+40.5)-208.86) = 11.640 seconds
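
    The loading-time arithmetic above in sketch form, for anyone checking their own logs (a hypothetical helper that just converts the reported minutes/seconds and subtracts the device render time):

    ```python
    # "Total Rendering Time: 4 minutes 14.22 seconds" -> 254.22 s.
    # Loading time = total rendering time - device render time.
    def to_seconds(minutes, seconds):
        return minutes * 60 + seconds

    total = to_seconds(4, 14.22)      # the 4.12.086 result above
    render = 246.330                  # from the CUDA device statistics line
    print(total - render)             # ~7.89 s loading time (the post uses 240+14.2 and gets 7.870)
    print(1800 / render)              # ~7.31 iterations per second
    ```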

  • skyeshots Posts: 148

    Dim Reaper said:

    It's been quite a while since I posted any benchmarks here, but as I was changing my DS beta from 4.14 to 4.15 I thought this might be a good opportunity to do a comparison of three versions of DS using the same hardware, drivers, OS etc.  I'm currently waiting for a 3090 to be available at something close to a standard price so it's nice to see the speed increase from 4.12 to 4.15 for now.  I am just wondering though why the big increase in loading time on my system for 4.15 compared to 4.14.

    I notice this as well.

    Before Update:

    CUDA device 0 (GeForce RTX 3090):      609 iterations, 0.238s init, 32.952s render
    CUDA device 2 (GeForce RTX 3090):      574 iterations, 0.860s init, 31.575s render
    CUDA device 1 (GeForce RTX 3090):      617 iterations, 0.336s init, 32.948s render

    Loading Time GPU 0: 2.938 Seconds
    Loading Time GPU 2: 4.315 Seconds
    Loading Time GPU 1: 2.942 Seconds

    Versus After Update:

    CUDA device 0 (GeForce RTX 3090):      599 iterations, 1.990s init, 30.729s render
    CUDA device 1 (GeForce RTX 3090):      603 iterations, 1.964s init, 30.775s render
    CUDA device 2 (GeForce RTX 3090):      598 iterations, 1.917s init, 30.828s render

    Loading Time GPU 0: 4.451 Seconds
    Loading Time GPU 1: 4.405 Seconds
    Loading Time GPU 2: 4.90 Seconds

    ____________________________________________

    From these two samples there is an average 444% increase in init time and an average 38% increase in loading time.
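
    A sketch of that percentage arithmetic, pairing the before/after values by CUDA device index (averaging per-device increases; small differences from the figures above come down to rounding):

    ```python
    # Average per-device percentage increase between the two runs above.
    before_init = {0: 0.238, 1: 0.336, 2: 0.860}
    after_init  = {0: 1.990, 1: 1.964, 2: 1.917}
    before_load = {0: 2.938, 1: 2.942, 2: 4.315}
    after_load  = {0: 4.451, 1: 4.405, 2: 4.90}

    def avg_pct_increase(before, after):
        pct = [(after[k] - before[k]) / before[k] * 100 for k in before]
        return sum(pct) / len(pct)

    print(avg_pct_increase(before_init, after_init))   # ~448% average init-time increase
    print(avg_pct_increase(before_load, after_load))   # ~38% average loading-time increase
    ```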

  • artphobe Posts: 97
    edited January 2021

    System Configuration
    System/Motherboard: Gigabyte B550M Aorus Pro
    CPU: Ryzen 5 5600x  (stock)
    GPU: EVGA GeForce RTX 3070 FTW3 Ultra 8GB GDDR6 2010 MHz (stock)
    System Memory: Crucial Ballistix Gaming Memory 16 GB (2 x 8 GB) DDR4 3600 MHz C16 (stock)
    OS Drive: Samsung 860 Evo 1TB
    Asset Drive: Samsung 850 Evo 256GB
    Operating System: Windows 10 Pro
    Nvidia Drivers Version: 460.89
    Daz Studio Version: 4.14.0.10 Pro Edition 64bit
    Optix Prime Acceleration: N/A

    Benchmark Results
    2021-01-26 00:02:41.630 Finished Rendering
    2021-01-26 00:02:41.658 Total Rendering Time: 2 minutes 37.61 seconds

    Edit: Will rerun and upload.

    Post edited by artphobe on
  • Artini Posts: 9,455
    edited January 2021

    A bit disappointed, considering the price of the system.

    Benchmark Results

    Rendering Time: 1 minutes 15.59 seconds
    CUDA device 0 (Quadro RTX 6000):      579 iterations, 5.432s init, 66.669s render
    CUDA device 1 (Quadro RTX 6000):      571 iterations, 4.066s init, 65.997s render
    CUDA device 2 (Quadro RTX 6000):      569 iterations, 4.312s init, 65.482s render
    CPU:      81 iterations, 3.561s init, 66.473s render

    System Configuration

    2 x Intel Xeon Gold 6240 @ 2.6 GHz

    768 GB RAM

    3 x Quadro RTX 6000

    Windows 10 Pro for Workstations

    Nvidia Quadro driver 461.09

    Daz Studio Version: 4.15.0.2 Pro Edition (64-bit)

     

    Post edited by Artini on
  • vukiol Posts: 66

    Artini said:

    A bit disappointed, considering the price of the system.

    Benchmark Results

    Rendering Time: 1 minutes 15.59 seconds
    CUDA device 0 (Quadro RTX 6000):      579 iterations, 5.432s init, 66.669s render
    CUDA device 1 (Quadro RTX 6000):      571 iterations, 4.066s init, 65.997s render
    CUDA device 2 (Quadro RTX 6000):      569 iterations, 4.312s init, 65.482s render
    CPU:      81 iterations, 3.561s init, 66.473s render

    System Configuration

    2 x Intel Xeon Gold 6240 @ 2.6 GHz

    768 GB RAM

    3 x Quadro RTX 6000

    Windows 10 Pro for Workstations

    Nvidia Quadro driver 461.09

    Daz Studio Version: 4.15.0.2 Pro Edition (64-bit)

     

    wow, what a rig! 

  • Artini Posts: 9,455

    Second run (only 3 x GPU)

    Benchmark Results

    Rendering Time: 1 minutes 22.43 seconds
    CUDA device 0 (Quadro RTX 6000):      649 iterations, 4.415s init, 74.440s render
    CUDA device 1 (Quadro RTX 6000):      583 iterations, 6.550s init, 72.110s render
    CUDA device 2 (Quadro RTX 6000):      568 iterations, 10.744s init, 67.840s render

    System Configuration

    2 x Intel Xeon Gold 6240 @ 2.6 GHz

    768 GB RAM

    3 x Quadro RTX 6000

    Samsung 860 QVO SSD 1 TB

    Windows 10 Pro for Workstations

    Nvidia Quadro driver 461.09

    Daz Studio Version: 4.15.0.2 Pro Edition (64-bit)

  • Artini Posts: 9,455

    vukiol said:

    Artini said:

    A bit disappointed, considering the price of the system.

    Benchmark Results

    Rendering Time: 1 minutes 15.59 seconds
    CUDA device 0 (Quadro RTX 6000):      579 iterations, 5.432s init, 66.669s render
    CUDA device 1 (Quadro RTX 6000):      571 iterations, 4.066s init, 65.997s render
    CUDA device 2 (Quadro RTX 6000):      569 iterations, 4.312s init, 65.482s render
    CPU:      81 iterations, 3.561s init, 66.473s render

    System Configuration

    2 x Intel Xeon Gold 6240 @ 2.6 GHz

    768 GB RAM

    3 x Quadro RTX 6000

    Windows 10 Pro for Workstations

    Nvidia Quadro driver 461.09

    Daz Studio Version: 4.15.0.2 Pro Edition (64-bit)

     

    wow, what a rig! 

    I was just involved in the tests of this computer, and Daz Studio seems to be a good test object,

    so I have decided to post the results here.

    I would never consider / be able to afford buying such an expensive computer for myself.

    Looking at the other test results, the RTX 3090 seems to me a better choice for use with Daz Studio.

     

  • outrider42 Posts: 3,679
    Iray does not benefit from Quadro, other than the generally high VRAM spec. Quadro can be used in TCC mode, which would grant the user the entire VRAM bank, unlike the gaming cards. This requires another GPU to run a display. But otherwise Iray does not use any Quadro specific features.

    The Quadro RTX 6000 should be the same as a RTX Titan. The only difference being clockspeed. So for Iray use the Titan is a better buy. But now the 3090 is out, and it surpasses the RTX Titan and is even cheaper. Of course gamers hate on the 3090, but in a way that is to our advantage as it is possible to buy a 3090 at times.

    If you have an opportunity to do one more benchmark, it would be cool if you bench just one RTX 6000, so we could compare it more directly to the RTX Titan and 2080ti.
  • Artini Posts: 9,455

    What is wrong with RTX 3090 for games?

    I do not have the money to buy it myself, but my son has already bought it and he plays games quite a lot.

     

  • Artini Posts: 9,455

    Do you know what this NVLink Peer Group Size does in Render Settings Advanced?

     

  • Artini Posts: 9,455
    edited January 2021

    Rendering Time: 3 minutes 31.37 seconds

    CUDA device 0 (Quadro RTX 6000):      1800 iterations, 4.230s init, 203.838s render

    Set0a.jpg
    Post edited by Artini on
  • Artini Posts: 9,455
    edited January 2021

    Rendering Time: 3 minutes 29.34 seconds

    CUDA device 1 (Quadro RTX 6000):      1800 iterations, 4.228s init, 201.877s render

    Set1a.jpg
    Post edited by Artini on
  • Artini Posts: 9,455
    edited January 2021

    Rendering Time: 3 minutes 30.89 seconds

    CUDA device 2 (Quadro RTX 6000):      1800 iterations, 4.202s init, 203.410s render

    Set2a.jpg
    Post edited by Artini on
  • outrider42 Posts: 3,679
    edited January 2021
    For gamers, the issue with the 3090 is the price. The 3080 has an MSRP of $750. The 3090 only offers gamers about 10% more performance, while it costs a staggering 100% more. Also the VRAM is nice, but not necessary at this time. While it can be argued the 3080 should have had 12gb, the 3090 with 24gb is kind of overkill. So the price is just too high, and the 3090 has been ripped in just about every gaming review. People already complained about the 2080ti costing $1200, and the 3090 goes even higher. But the 2080ti was quite a bit more powerful than the 2080, and here the 3090 is just 10% faster than the 3080.

    The Nvlink Peer Group is specific to Nvlink. You set it to the number of GPUs you have linked. This allows the GPUs to pool their VRAM. The 3090 can only Nvlink in pairs. So this would be another advantage for Quadro, since Quadro can link up to 4 I believe. So if you had a Nvlink for your Quadro build you could potentially pool all 3 GPUs together for up to 72gb of VRAM.
    Post edited by outrider42 on
  • Artini Posts: 9,455

    Ok, thank you very much for the explanation. I will try to use this Nvlink and see what happens.

    I doubt I will have enough time to prepare a scene that fills up 72gb of VRAM with distinctive items,

    but maybe I could try to make a group of 5 Genesis 8 figures, copy it a couple of times, and see

    how many of them will fit in VRAM.

  • Artini Posts: 9,455

    Since the computer has 768 GB of RAM, it will be interesting to see if one can take 512 GB of it as a RAM disk.

    I do not know if it is possible on Windows 10 at all. It has been ages since I last used RAM disks.

     

  • outrider42 Posts: 3,679
    edited January 2021

    Remember nvlink shares texture data but not mesh data. So you still have a limit of 24gb for mesh data. The texture data in theory can go up to 72. You need a three-way nvlink connector between the cards as well. There should be a small performance penalty for using nvlink. That is what testing in other render engines has found.

    It was indeed quite small. Also, in the few Nvlink tests we have seen here the difference was so small as to not be an issue. It's on one of these pages; I may try to dig it up.

    This was interesting because in the other render engines the difference was a bit more of a noticeable hit to performance. It still was not a large hit; I think it was less than 10% in V-Ray with two 2080 Tis.

    But maybe it is also possible that for Iray the cards will not actually share data unless they need to, which, if true, would save performance and be really convenient (as you wouldn't need to disable and enable Nvlink for the best performance between scenes of different sizes). But that is purely a guess.
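
    Under the model described above (texture data pooled over NVLink, mesh/geometry data duplicated on every card), a back-of-envelope budget looks like this. This is just a sketch of the reasoning in this thread, not measured behaviour, and the pooled figure is an upper bound before any overhead:

    ```python
    # Rough VRAM budget for an NVLink peer group: textures can be pooled across
    # the linked cards, while mesh data is still capped by a single card's VRAM.
    def nvlink_budget(cards, vram_per_card_gb):
        return {
            "texture_pool_gb": cards * vram_per_card_gb,   # upper bound, before overhead
            "mesh_limit_gb": vram_per_card_gb,
        }

    print(nvlink_budget(3, 24))   # three Quadro RTX 6000s: ~72 GB textures, 24 GB mesh cap
    print(nvlink_budget(2, 24))   # two RTX 3090s: ~48 GB textures, 24 GB mesh cap
    ```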

    Post edited by outrider42 on
  • outrider42 Posts: 3,679
    edited January 2021

    As for building a large scene, one thing you can do is turn texture compression way down. This will effectively disable the texture compression. By default the thresholds are surprisingly low, so Iray actually compresses most of the textures that we use. The setting is found in the Iray advanced render settings, and you can turn the threshold up to something crazy like 10,000. Basically, any texture larger than the threshold size you specify gets compressed, so if you turn it to 10K, pretty much no texture currently available will be compressed. So turn that setting up and load a bunch of different Genesis 8 characters if you have access to them, and maybe you'll get there.

    You can also turn the subD levels up on the Genesis characters to like 4 or 5. At those levels they can become quite heavy on any machine.

    You can use an app like MSI Afterburner to monitor how much VRAM is being used. I know some software does not correctly report how much VRAM is used when using NVLink.

    If you CPU rendered, then in theory you could have access to that full memory bank, assuming your software and OS can support that much. But how fast can these Xeons render by themselves? That would be another test to try if you have the time. We certainly do not have a benchmark for them. I wonder how they stack up against the Threadrippers?

    Post edited by outrider42 on
  • skyeshots Posts: 148

    outrider42 said:

    Iray does not benefit from Quadro, other than the generally high VRAM spec. Quadro can be used in TCC mode, which would grant the user the entire VRAM bank, unlike the gaming cards. This requires another GPU to run a display. But otherwise Iray does not use any Quadro specific features.

     

    The Quadro RTX 6000 should be the same as a RTX Titan. The only difference being clockspeed. So for Iray use the Titan is a better buy. But now the 3090 is out, and it surpasses the RTX Titan and is even cheaper. Of course gamers hate on the 3090, but in a way that is to our advantage as it is possible to buy a 3090 at times.

     

    If you have an opportunity to do one more benchmark, it would be cool if you bench just one RTX 6000, so we could compare it more directly to the RTX Titan and 2080ti.

    This is a great conversation. I received my 4th RTX 3090 today, but I had looked very closely at just doing a pair of RTX A6000 cards before making this call (not to be confused with the Quadro RTX 6000 used above). The RTX A6000 cards are a step up from the 3090s, with 48 GB VRAM each (96 with NVLink), more cores and the addition of ECC GDDR6. I received quotes at $4450 each, but they spec out so close to a 3090 it is kinda hard to justify, especially without benchmarks to consider. Cost becomes an even larger factor if you plan to build out for a 4-card setup (est. $12K addl.)

    The RTX A6000 and 3090s are so similar that the physical NVLink bridge between my 3090s is the same one used for the A6000 in 1-slot-spaced 2U servers. When I was researching, I reached out to Nvidia directly for an answer as far as Iray support and optimizations go. They touch on the gaming subject here; however, they did not compel me enough to make a purchase. Conversation excerpt below:

    _____________________

    [12:19:37 PM]Inamdar: Thank you for your patience. Here is the link for A6000 specs : https://www.nvidia.com/en-us/design-visualization/rtx-a6000/
    [12:19:41 PM]Inamdar: NVIDIA has transferred the sales and support of the Iray plugin products—Iray for 3ds Max, Iray for Maya, Iray for Rhino, and Iray Server—to the Iray integration partners, Lightworks, migenius, and 0x1 software.
    …[12:23:52 PM]Inamdar: NVIDIA® RTX™ A6000 is high end card for rendering and 3D visualization. Geforce 3090 card is mainly designed for gaming.
    [12:24:24 PM]Inamdar: https://www.techpowerup.com/gpu-specs/rtx-a6000.c3686

    _____________________

    It would be great to see some side-by-side benchmarks here between the 3090 and a6000. I believe the a6000 will take a very modest lead, contingent on a similar subsystem.

    a6000 quote.PNG
  • outrider42 Posts: 3,679
    edited January 2021

    Yeah, the A6000 is the full chip that the 3090 is based on, and the 3090 has most of the performance that the A6000 has. Only the best samples of GA102, with all the cores working and at a high clock speed, qualify to be an A6000. So the A6000 should beat the 3090, but it will indeed be close. So in my view, the only reason to buy an A6000 over a 3090 is going to be about VRAM capacity and Nvlink. The 3090 can only link up in pairs, so the max VRAM you can possibly get caps at 48, no matter how many 3090s you have. However I would wager that 48 would be enough for most people, though again, we do need to consider that mesh data is NOT shared with Nvlink. That is the situation currently. It is possible that down the road Iray Nvlink may get updated to share mesh data as well...but I would not count on it happening.

    The 3090 is a very strange GPU; in fact, I don't think Nvidia even knows how to market it. When the 3090 was introduced by Jensen Huang himself, he described it as being for content creators before talking about gaming at all. He even compared it directly to the RTX Titan. And the 3090 is largely a Titan class GPU, but it lacks the Titan class software support that Titans get. There is suspicion that Nvidia did not make the 3090 a Titan because of how much power it draws. The amount of power it uses would be too high for many workstations to use. If you look at the A6000, you will see it actually uses LESS power than the 3090 even though it has more active cores and twice the VRAM. The max power draw on the A6000 caps at 300 Watts, which is high, but not as high as the 350 Watts of the 3090 Founder's Edition, and 3rd party vendors can push the 3090 to nearly 400 Watts. I suppose this could be another consideration in the A6000 versus the 3090.

    The Titan can use TCC mode, while the 3090 cannot, and the Titan also has other features that bridge into Quadro territory. The Quadro cards, or rather, I should say formerly called Quadro cards, have additional features and validations for workstations and professional software. However, NONE of these features actually impact Iray performance. So when it comes to Quadro for Iray, the only real reason to consider them is for their VRAM.

    I hate that Nvidia dropped the "Quadro" naming scheme. I don't know what to call them. I will just keep calling them Quadro as a whole to avoid confusion.

    And yeah, migenius has taken over development of Iray. So for questions specific to Iray it is now best to contact them. https://www.migenius.com/contact

    Migenius also has their own benchmarks for Iray, but they have not updated their list in quite a long time. But it is worth looking at, because they list a variety of hardware, including servers, such as the Google V100x8, that is eight V100s. Their benchmark is a totally different scene and setup, and it does not use Daz. So their results are not directly comparable to ours. Iteration counts will vary greatly from scene to scene.

    I was unaware of Lightworks and 0x1's involvement. It looks as if these two are only involved in the development of their own software plugins. I could be wrong, but I think migenius is doing the development, as they are providing Iray plugins for other software. This possibly means that Daz is dealing with migenius for their support with Iray, but again I could be wrong. It would be cool if Daz could tell us; I don't think it is any kind of industry secret or anything.

    Post edited by outrider42 on
  • Artini Posts: 9,455

    Thanks for all of the tips.

    My own system has only a GTX 1080, so I am not used to thinking big when creating scenes.

     

  • Artini Posts: 9,455
    edited January 2021

    Thanks a lot for this tip about texture compression - I was not aware of it.

    It looks like, when I changed the threshold to 8K, it renders even faster.

    Below is Babina 8 HD rendered with subdivision level 5. Rendering time: 9 minutes 23.18 seconds

    Just have to figure out how to get rid of those ugly rectangular reflections that come from

    https://www.daz3d.com/boss-pro-light-set-for-portraits-promos

    BabinaHD04pic01.jpg
    Post edited by Artini on
  • Artini Posts: 9,455
    edited January 2021

    Ok, one more benchmark, running only on the CPU. It was not that bad. I thought that it would take ages.

    This time all Intel cores were used at 100% - a big improvement over my previous test with 2 Xeons.

    Previously I ran the tests on Windows 7 Pro and only one Xeon CPU was used during the testing.

    Benchmark Results

    Rendering Time: 22 minutes 33.84 seconds

    IRAY   rend info : CPU:      1800 iterations, 3.113s init, 1347.549s render

    System Configuration

    2 x Intel Xeon Gold 6240 @ 2.6 GHz

    768 GB RAM

    Windows 10 Pro for Workstations

    Daz Studio Version: 4.15.0.2 Pro Edition (64-bit)

    BenchCPUscr2.jpg
    Post edited by Artini on
  • Artini Posts: 9,455

    These results raise more questions.

    How does one create a scene in Daz Studio that will occupy more than 700 GB and not go mad?

     

  • outrider42 Posts: 3,679

    Thank you for taking the time to do these extra tests. It adds to our database and may help somebody make an informed decision on their hardware choice.

    I am not as excited about the Xeon results, if these are accurate. To put this into perspective, the GTX 970 ran the same bench 2 minutes faster, and that was with an older version of Daz. The 970 was just upper mid range in 2014, and two 18 core Xeons from 2019 are unable to beat that. Both of the Threadrippers benched so far beat that score by a pretty wide margin. The 32 core 3970X did it in 11.5 minutes, so this one AMD chip is nearly twice as fast as these two Xeons together. The 24 core 3960X scored in the 14 minute range, so that is a very healthy lead for it. And all of these benches came on older versions of Daz, which have been verified to be slower than 4.14. That is not good for Intel. Threadripper has a 64 core chip we still have not seen benched (and probably never will), and AMD also has Epyc for their server class chips.

    I understand that there are larger Xeons out there, but this felt underwhelming to me. These Xeons have 36 cores between them but were unable to even come close to a Threadripper with fewer cores. That doesn't bode well for Xeon versus AMD Epyc, at least for Iray. It would be really cool to see an Epyc build Iray benchmark. Maybe you can get your hands on one of those some day to play with.
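
    For comparison, the dual-Xeon result above works out as follows (a quick sketch using only the numbers already posted in this thread):

    ```python
    # Iterations per second implied by the CPU results discussed above.
    results = {
        "2x Xeon Gold 6240":       1800 / 1347.549,   # CPU-only bench above, ~1.34 it/s
        "Threadripper 3970X":      2.623,             # reported earlier in the thread
        "GeForce RTX 3090 (4.12)": 14.0,              # reported earlier in the thread
    }
    for name, rate in results.items():
        print(f"{name}: {rate:.2f} iterations per second")
    ```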

    I can't imagine what a 500GB scene would be like. Keep in mind this little scene here is quite simple with just G8 and a few shiny balls. Plus this test does not run to full convergence, it stops at 1800 iterations, so the image will take longer to finish completely. If it takes 25 minutes to do this simple scene, a scene complex enough to take up 500GB would probably take days or even a week to render. You could easily get a large scene with highly detailed foliage. Take some of the big outdoor scenes and prop them side by side several times over.

  • Artini Posts: 9,455

    Thank you for your kind words. You are absolutely right. I have no access to Threadripper or AMD Epyc.

    It was discussed in the past because of the attractive price of Threadripper, but it just ended there.

    Do you know if anybody has tried to put as many different Genesis 8 characters as possible in one scene

    and render them in Iray? I just wonder what the limit is.

    I will probably try to make some small crowd and see how it goes on 3 Quadros

    if Nvlink will work on this computer.

     

  • outrider42 Posts: 3,679

    The only limits are what the computer can handle. I have seen posts where somebody out of curiosity tried posting as many Genesis models as they could to see how many they could fit into 'X' GB of VRAM. I think somebody got into the 30's. I could be way off, but I remember there were a lot of T posed Genesis 8 models, LOL. This was an attempt to shove as many G8s as possible, though, not an attempt to actually use more VRAM or RAM like you will be.

    I really do think that by turning down the texture compression and cranking up the subD that you will easily churn out a massive GB eating scene. Especially if the characters have a lot of big textures. Some PA models have 8192 pixel size textures, and some have textures in TIFF that are nearly 100MB per texture. I know some bluejaunte characters have 8192 size textures.

    My own experiments found cranking up the subD could really jump up VRAM use. I had a test I did that involved subd and found that I went from 3135MB of VRAM at subd level 2 all the way up to 6799MB at subd level 5 during the render. There were no other changes to the scene. This scene was 5000 pixels tall with nothing but a Genesis 8 character in it. This VRAM was being reported on my secondary GPU, meaning there is no other data on this GPU besides the scene it is rendering in Iray. So pushing the subd up to 5 more than doubled the VRAM required to render.

    So doing that on a few G8 models should easily add up. And remember that mesh data does not get pooled with Nvlink, so this would be a problem for the 3090 as using lots of mesh data would fill its 24GB faster. I did not test the texture compression, but I bet the results would be pretty high on memory use as well. Like I said earlier, Iray compresses texture a lot by default and most of us don't realize it.

  • Frinkky Posts: 388

    outrider42 said:

    The only limits are what the computer can handle. I have seen posts where somebody out of curiosity tried posting as many Genesis models as they could to see how many they could fit into 'X' GB of VRAM. I think somebody got into the 30's. I could be way off, but I remember there were a lot of T posed Genesis 8 models, LOL. This was an attempt to shove as many G8s as possible, though, not an attempt to actually use more VRAM or RAM like you will be.

    I really do think that by turning down the texture compression and cranking up the subD that you will easily churn out a massive GB eating scene. Especially if the characters have a lot of big textures. Some PA models have 8192 pixel size textures, and some have textures in TIFF that are nearly 100MB per texture. I know some bluejaunte characters have 8192 size textures.

    My own experiments found cranking up the subD could really jump up VRAM use. I had a test I did that involved subd and found that I went from 3135MB of VRAM at subd level 2 all the way up to 6799MB at subd level 5 during the render. There were no other changes to the scene. This scene was 5000 pixels tall with nothing but a Genesis 8 character in it. This VRAM was being reported on my secondary GPU, meaning there is no other data on this GPU besides the scene it is rendering in Iray. So pushing the subd up to 5 more than doubled the VRAM required to render.

    So doing that on a few G8 models should easily add up. And remember that mesh data does not get pooled with Nvlink, so this would be a problem for the 3090 as using lots of mesh data would fill its 24GB faster. I did not test the texture compression, but I bet the results would be pretty high on memory use as well. Like I said earlier, Iray compresses texture a lot by default and most of us don't realize it.

    The jump in polygons from subd 2 -> subd 5 is an increase of 64x (261,888 -> 16,760,832). I'm not surprised your VRAM skyrocketed!

    As an aside, the only thing that matters with regard to texture VRAM usage is the dimensions of the texture map. A 4K jpg will use the same as a 4K tiff, png, tga etc. All textures are converted to nvidia's own format, which is then compressed, so original format is irrelevant (assuming the same bit depth I guess). This is why scene optimisers work by shrinking image dimensions and not just using a more heavily compressed format.
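
    A rough sketch of both points, assuming uncompressed 8-bit RGBA textures (this ignores mipmaps and whatever compression Iray applies) and using the quad counts quoted above:

    ```python
    # Uncompressed in-memory texture size depends only on dimensions, channels and
    # bit depth - the file format (jpg/tiff/png/tga) no longer matters once decoded.
    def texture_mb(width, height, channels=4, bytes_per_channel=1):
        return width * height * channels * bytes_per_channel / (1024 ** 2)

    print(texture_mb(4096, 4096))   # ~64 MB for an 8-bit RGBA 4K map
    print(texture_mb(8192, 8192))   # ~256 MB for an 8-bit RGBA 8K map

    # Each SubD level multiplies the quad count by 4, so level 2 -> level 5 is 4**3 = 64x.
    base_quads = 16368              # implied by the 261,888 quads at subd 2 quoted above
    for level in (2, 5):
        print(level, base_quads * 4 ** level)   # 261,888 and 16,760,832
    ```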
