DS & Titan RTX
JeffG
Thinking of buying one of these monsters:
https://www.nvidia.com/en-us/titan/titan-rtx/
... and I was wondering if DS/Iray takes full advantage of the card's features and what sort of performance I might expect.
Comments
Not yet, it doesn't. Daz is dependent upon Iray support, which isn't ready yet. Fuchs at the NVIDIA Iray forums said they demo'd an RTX Iray renderer in Solidworks recently, and that he expects it to "come this spring". After a general release we'll still have to wait for Daz support, as the API will change. You'll still get pretty blistering speed from the CUDA cores on that Titan though!
As above, the card will work, and it will still be faster than just about every other off the shelf mainstream GPU.
But the new tech (the RT cores) on those cards will not be utilised until Iray is updated to use them. Once that happens, you can expect a big jump in performance.
The problem with the Titan is that unless you need that much VRAM it's not really worth the price. The 2080Ti is almost as powerful for a lot less money.
Bought one myself in early January. Iray performance is currently in the neighborhood of 5-10% better than a 2080ti. And to confirm what others have already said, its hardware is not yet fully utilized for Iray rendering (am expecting to see something like a 200-400% performance uplift when that happens...)
Based on my own testing, 24GB is currently absolute overkill for DS/Iray rendering. With that said, 11GB (the next highest amount available) is uncomfortably low - especially when you consider how high-performance the card itself is at rendering compared to virtually anything else. As it stands, my stock Titan RTX has around 14x the rendering performance of my mildly OC'd i7-8700K (at less than twice the power consumption, I might add.) This means that ever falling back to CPU is pretty much a non-option for me from a time=money perspective. And since that 11GB limit is only going to get more restrictive as time goes by, a 2080ti is imo far too likely to become a giant paperweight for the class of 3D rendering it is otherwise perfect for to be worth the investment - assuming you plan to make full use of what Daz Studio plus Iray have to offer artistically.
Well, if you ask me 24GB of VRAM doesn't make sense .. there's out-of-core rendering for that. As for speed, just 2x 2060 will perform much better. That is, overall the Titan RTX has a horrible performance/price ratio, and for the same price you can do much better using multiple GPUs and out-of-core rendering.
EDIT. Of course OOCR also means using a more featured PBR engine such as Cycles or Octane rather than Iray.
Yeah, unfortunately (or perhaps fortunately since OOC rendering can lead to massive performance drops on high-end GPUs - thereby negating their high-endedness) out-of-core rendering isn't even an option where Iray is concerned. And since Daz is (at least currently) just Iray...
In the OTOY forum they state the performance drop to be about 10% .. and afaik there's Octane for daz studio out there .. Though I admit I'm not up to date with the subject since I mainly use Blender right now.
And you always have the option to export to Blender or Maya for better animation tools and rendering anyway. I mean, in its current state DS doesn't exactly seem like a high-end professional animation suite to me. It is rather entry level. So what's the point of using a Titan?
https://render.otoy.com/forum/viewtopic.php?f=9&t=53895
https://render.otoy.com/forum/viewtopic.php?f=44&t=69571
https://www.daz3d.com/daz-to-maya
http://diffeomorphic.blogspot.com/
Maybe it's just me, but I don't think there's such a thing as too much VRAM. More is better. That being said, Nvidia has historically introduced a mainstream card with the same or nearly the same VRAM as its Titan series. In the past, that would be the Ti model. Of course they started this time with the release of the 2080 Ti at 11 gig, so I don't know what the designation of the lesser-priced 24 gig card will be, or if we'll have to wait for the 2180 series to see it happen. That will probably be the case. I think I'm set for a while with my 1080 Tis. What I'm hoping is that by the time Iray is updated to fully utilize an RTX, Nvidia will have moved on to the next model, at which point I can get some slightly older RTXs at a discount on eBay.
Strange things happen when you do mixed CPU/GPU rendering on hardware with VASTLY different performance specs.
I've been doing a lot of testing with the new Titan in conjunction with the 6c/12t i7-8700K (still a very good, high performing CPU despite being a generation old) in my primary production machine. And in its current RTX-unsupported state, the Titan RTX is approximately 1400% as fast as the CPU for Iray rendering (and that's with a slight all-core 4.7 overclock on the CPU to boot.) Due to the size of this performance gap, merely enabling the CPU as an additional rendering device in my system causes the Titan's effective performance to drop by more than 10% (due to it needing to constantly wait on the CPU's own render contributions to clear.) And that's with Iray's current load sharing implementation (where separate device memory isn't shared.) In an Iray-equivalent application with OOCR, where the Titan RTX would be bottlenecked by the i7-8700K's memory throughput and PCI-E lane traversal, I can only imagine how much larger the performance drop on the Titan itself would be (possibly 60+ percent, given that current gen PCI-E x16 max throughput is only about 1/3 the speed of the Titan's own NVLink connector, and Nvidia's own design indicates that is how much bandwidth it takes to effectively share memory across multiple devices.)
The most obvious way to try remedying this would be to get a much faster CPU like a 64-thread Ryzen 2990WX (which some parts of the internet tell me is the fastest CPU available right now) whose rendering performance is at least somewhere in the same ballpark as the Titan RTX. The thing is, the 2990WX is only about 370% as fast as my mildly overclocked i7-8700K. Which in turn would put it at just a little over 1/4 the rendering performance of a Titan RTX. And since full RTX support looks likely to triple the rendering performance of these cards in the near future (see this recent Puget Systems study), this would put the performance disparity between a 2990WX and a Titan RTX at around 1134% - which isn't all that different from what the i7-8700K vs. Titan RTX gap is in my system right now. Plus none of this addresses the aforementioned issue of PCI-E x16 bottlenecking (which not even a move to PCI-E 4.0 would resolve, since going from 3.0 to 4.0 only doubles throughput.)
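To put rough numbers on that (all the ratios below are the approximate figures quoted above, not benchmarks, and the 3x RTX uplift is still an assumption):

    # Rough scaling sketch using the approximate figures quoted above.
    titan_vs_8700k = 14.0      # Titan RTX ~14x an OC'd i7-8700K in Iray today (no RT core accel)
    r2990wx_vs_8700k = 3.7     # Ryzen 2990WX ~370% of the same i7-8700K

    titan_vs_2990wx = titan_vs_8700k / r2990wx_vs_8700k
    print(round(titan_vs_2990wx, 1))                            # ~3.8x today

    assumed_rtx_uplift = 3.0                                    # assumed ~3x once RT cores are used
    print(round(titan_vs_2990wx * assumed_rtx_uplift * 100))    # ~1135%, i.e. the ~1134% figure above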
Another issue worth considering is power consumption. I don't know what kind of a workload OOCR puts on a CPU. But assuming it's similar to normal CPU-based rendering, that's gonna be a consistent 100%. At full tilt, my mildly overclocked i7-8700K pulls 140 watts of power (as measured from the AC outlet) over idle. Meanwhile the Titan RTX pulls around 290 watts. While that may seem like a lot more, remember that the Titan (in its current configuration in my system) is 14x faster at rendering than the i7-8700K. This makes that 290 watts equivalent to roughly 1960 watts (almost 2 kilowatts!) of CPU power. Meaning that using an i7-8700K class CPU for a significant part of the rendering load is roughly 6.75 times more expensive (in terms of operating cost) than just using a Titan RTX. Similarly, a Ryzen 2990WX tops out somewhere around 500 watts, making that 290 watts equivalent to 1890 watts, or roughly 6.5 times as expensive.
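Here's that wattage math spelled out (the 140W/290W figures are my own wall measurements from above; the 2990WX's ~500W and ~3.7x numbers are estimates):

    # Energy cost at equal rendering output.
    cpu_watts, titan_watts = 140, 290       # measured over idle at the wall
    titan_vs_cpu_speed = 14.0               # Titan RTX vs i7-8700K in Iray

    cpu_equivalent_watts = cpu_watts * titan_vs_cpu_speed
    print(cpu_equivalent_watts)                             # 1960 W (~2 kW) of 8700K-class CPU to match it
    print(round(cpu_equivalent_watts / titan_watts, 2))     # ~6.76x the operating cost

    # Same comparison for a ~500 W Ryzen 2990WX at ~3.7x the 8700K's rendering speed:
    print(round(500 * (titan_vs_cpu_speed / 3.7) / titan_watts, 1))   # ~6.5x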
And speaking of cost, yeah, $2500 does seem pretty ridiculous for a GPU. However an i7-8700K is a $350 part at 1/14th the performance. Similarly the Ryzen 2990WX is an $1800 part at just over 1/4th. Meaning that that $2500 is equivalent to anywhere from $4900 to $6804 spent on CPUs for equivalent performance. And while this may seem like an excellent justification for getting a 2080ti (since it has almost the same performance as a Titan RTX at half the price), remember that this still leaves you to the mercies of potentially extreme PCI-E and CPU bandwidth performance limitations when dealing with GPUs that are drastically more powerful at rendering than today's CPUs in an OOCR use case.
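And the dollars-for-equal-throughput version of the same comparison (street prices at time of writing, same speed estimates as above):

    # Dollars of CPU needed to match one Titan RTX's rendering throughput.
    titan_price = 2500
    cpu_options = {
        "i7-8700K":     (350,  1.0 / 14),   # (price, speed relative to a Titan RTX)
        "Ryzen 2990WX": (1800, 3.7 / 14),
    }
    for name, (price, rel_speed) in cpu_options.items():
        print(name, round(price / rel_speed))
    # i7-8700K:     ~$4900
    # Ryzen 2990WX: ~$6800 (the ~$6804 figure above)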
I don't know. It doesn't seem like it's worth the cost. Iray gets most of its punch from the number of CUDA cores available. The Titan RTX has 4608 cores, which is certainly a lot. My current build, even though the CPU isn't the greatest, has 10752 cores available and it's mighty fast. At the present time, it's hard to overwhelm 11 gig of VRAM, which is still a lot these days. It won't be in a couple of years when scenes get more complex, but at the moment that's enough VRAM. You can get three 1080 Tis on eBay for $1600-1800. So for me, it's a much better value than either a multi-card 2080 Ti system (same amount of VRAM) or a single Titan RTX (far fewer cores, more VRAM).
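For what it's worth, here's the raw core-count value math behind that (prices are the rough eBay/list figures mentioned above; this ignores per-core differences between Pascal and Turing, so it's only a crude sketch):

    # Raw CUDA core count per dollar (very rough - ignores per-core architectural differences).
    gtx_1080ti_cores = 3584
    triple_1080ti = (3 * gtx_1080ti_cores, 1700)    # (cores, approx. eBay price for three cards)
    titan_rtx     = (4608, 2500)

    for label, (cores, price) in {"3x 1080 Ti": triple_1080ti, "Titan RTX": titan_rtx}.items():
        print(label, cores, round(cores / price, 1), "cores per dollar")
    # 3x 1080 Ti: 10752 cores, ~6.3 cores per dollar
    # Titan RTX:   4608 cores, ~1.8 cores per dollar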
While RTX will accelerate raytracing pretty significantly, it will only have an effect on geometry operations. Other aspects of rendering will still be handled by the regular CUDA cores. Those of you expecting 200 to 400% will probably be disappointed unless you are rendering scenes with very simple shaders. Texture processing (such as SSS shading) will not be accelerated by RTX. Octane has been doing a bit of tricky marketing, letting its users believe they will see monstrous speedups, but in actuality, in most cases, the gains will be more modest and highly dependent on your scene. The developers at Redshift released a statement about RTX on their Facebook group page to prepare users for more reasonable expectations: https://www.facebook.com/groups/RedshiftRender/permalink/531192430663305/ If RTX makes its way into Daz Studio, it is unlikely to be the substantial performance increase we are seeing in gaming. At least not in this first iteration.
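To illustrate why the gains are so scene-dependent, here's a quick Amdahl's-law-style sketch (the raytracing shares and the 10x RT core factor below are purely illustrative numbers, not measurements):

    # Only the raytracing portion of a frame is accelerated by the RT cores;
    # shading/texturing still runs on the regular CUDA cores.
    def overall_speedup(raytrace_share, rt_core_speedup):
        return 1.0 / ((1.0 - raytrace_share) + raytrace_share / rt_core_speedup)

    for share in (0.9, 0.5, 0.3):   # fraction of frame time spent raytracing (illustrative)
        print(f"{share:.0%} raytracing -> ~{overall_speedup(share, 10):.1f}x overall")
    # 90% -> ~5.3x, 50% -> ~1.8x, 30% -> ~1.4x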
Imo it's worth it if you're looking at it as a long-term business investment - especially when you factor in the (admittedly still uncertain) prospects of upcoming RTX acceleration. My last Nvidia GPU purchase lasted a good ten years (I only retired the card from my system recently because it had fallen behind my current CPU's iGPU performance-wise.) If 3D rendering or other similarly taxing multimedia creation projects are a part of your livelihood, then long-term usability should be your focus in buying hardware. And for the reasons already stated, I honestly think the Titan RTX is an EXCELLENT buy from that perspective.
Right up until they release the cheaper, better performing Titan RTX Mark II that is... ;)
Heh. There's no such thing as a long-term investment where video card technology is concerned.
I don't know. 10+ years seems like long-term to me... In fact I'd argue that the two PC components that CAN hold up as long-term investments (as long as they're high quality enough) are GPUs and PSUs. My 10+ year old GPU will still work in my current cutting edge PC. I just choose not to use it because it isn't worth the effort to have it installed.
I'm sorry, but that's a wishful argument at best. The best Nvidia desktop graphics card on the market 10 years ago was the GTX 275. That thing couldn't render a thumbnail in Iray today. Go back even just 5 years and the 780 Ti was king of the hill, but its meager 3GB of VRAM would be eating dust today. If you had bought a GTX Titan just 5 years ago (the year it was introduced), today you would be stuck with a slow card with only 6GB of RAM, scraping the bottom of the barrel in modern software and games. In those 5 years, the Titan has gone from 6GB to 24GB. A 4x increase. CPU and GPU technology has been advancing at breakneck speed and is the last thing you want to consider a long-term investment. PSU? Yes. Disk drives? A little less so. Even RAM isn't immune to the march of technology. Definitely not silicon. The best argument for buying the highest-end hardware is that you need to get work done fast and now. If you need the RAM and you need the speed to do work now, that alone could justify the cost. If you are a hobbyist and you can afford the best, you don't need to justify it any more than any other luxury item. We want what we want.
As you say yourself, mixing GPU and CPU is bad, so just don't. As for PCI-E lanes you're right. With OOCR you can't use risers, since performance there will drop to nearly zero. So you have to find a mobo with multiple x8 or x16 connectors. And definitely go for quad-channel, fast, overclocked RAM to optimize performance. In this scenario I can't see how a drastic drop in performance would be possible.
Texture memory copying takes up only a small part of the rendering process. The main bottleneck is the x8 or x16 connector, so it's not a big deal. The OOC RAM itself is shared among the cards.
EDIT. And as others pointed out, graphics cards become obsolete really fast. So staying in the mid-range gets you a much better price/performance ratio over time.
Doesn't matter how many Cuda cores you have if they aren't being used.
They get used in Iray rendering.
The thing is, if someone is able to afford building an entire high-end desktop (HEDT) machine like the one you're describing, then chances are that affording a Titan RTX isn't going to be an issue either. And that still isn't addressing the fact that even the highest-end HEDT part of today isn't going to be able to hold a candle to the rendering performance of a single Titan RTX or 2080ti once RTX comes fully online in applicable software later this year.
Furthermore, building an HEDT system for the purpose of "matching" its CPU/system memory performance to that of a Titan RTX/2080ti in rendering would mean spending a minimum of $2600 on CPU/motherboard/RAM alone. Whereas a system like mine (more than capable of overseeing the rendering process but not directly taking part in it) can be had for around $700. $700 + $2500 = $3200 for an i7-8700K/Titan RTX based build capable of rendering at 100% of its performance capability all of the time is a significantly better investment than $2600 + $1200 = $3800 for a Ryzen 2990WX/2080ti based build capable of rendering at 100% of its performance capability only some of the time.
I don't think you grasp the enormity of the bandwidth scaling factors we're talking about here. PCI-E 3.0 x16 (the current highest speed standard) has a max theoretical bidirectional bandwidth of less than 32GB per second. The NVLink interconnect on Titan RTX/2080ti/high-end Quadro RTX cards (which is already significantly scaled down from its research-oriented GPU predecessors) is a 100GB per second connection. And since 32 is approximately 1/3 of 100, the best-case scenario you could ever see for an OOCR implementation of VRAM sharing would be a roughly -60% (technically -66% by my math) performance hit compared to VRAM pooling over NVLink, let alone compared to no VRAM sharing at all (courtesy of a high-capacity VRAM card like a Titan RTX.) And not even the upcoming move to PCI-E 4.0 will be enough to eliminate this bottleneck (since 4.0 is only going to double PCI-E to 64GB per second - which is still significantly less than 100GB per second.)
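Just to spell out the numbers (theoretical peaks, rounded):

    # Theoretical peak interconnect bandwidth (GB/s, bidirectional, rounded).
    pcie3_x16 = 32
    pcie4_x16 = 64       # PCI-E 4.0 simply doubles 3.0
    nvlink_turing = 100  # NVLink on Titan RTX / 2080 Ti / high-end Quadro RTX

    print(round(pcie3_x16 / nvlink_turing, 2))        # 0.32 -> roughly a third of NVLink
    print(round(1 - pcie3_x16 / nvlink_turing, 2))    # 0.68 -> the roughly -60/-66% worst case above
    print(round(pcie4_x16 / nvlink_turing, 2))        # 0.64 -> PCI-E 4.0 still well short of NVLink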
It's a 100% factual argument, because GPUs and PSUs are the ONLY two computer components whose physical interconnect standards (PCI-E and ATX respectively) have remained fully forwards/backwards compatible over the past 10+ years (even USB is now falling by the wayside in this respect, since its latest announced protocol advancements include dropping native USB 1.0/2.0 pin support in favor of increased 3.0+ bandwidth.) And with PCI-E 4.0 already announced and still fully backwards compatible with all previous generations, this distinction is only going to continue.
Yes, but none of this has anything to do with compatibility. Compatibility is about whether or not something works in combination with something else AT ALL - not relative performance. My 10+ year old custom passively cooled, high BUILD quality (otherwise it wouldn't have stayed functional for this long) 8800GT can still be used in my current pc in a variety of computing tasks (if only for a marginal performance increase.) Making it, and other graphics cards like it, a good long-term investment (at least in computer hardware timespan terms.)
Your 10-year-old GPU is not compatible with Iray today; it could not render in Iray at all. And likely, 10 years from now, your Titan RTX will be incompatible with most of the technology of the future.
It doesn't need to be compatible with most of the technology in the future in order to be a good long-term investment. All it needs is to maintain compatibility for at least the next 2-3 generations/years worth of new graphics hardware - something it is guaranteed to do (unless it breaks) because of PCI-E 4.0 coming out and the fact that it supports TCC driver mode over its NVLink connector.
You're talking about PCI-E vs NVLink speed. But again, OOC texture copying takes up only a small part of the rendering process. So yes, PCI-E is a bottleneck compared to NVLink, but this only results in an overall 10-20% performance drop, as stated in the OTOY forums. Also, two/three-card mobos are quite cheap, so you don't really need an HEDT motherboard to match the Titan's performance; that would be pointless.
Is that 10-20% based on any testing using a Titan RTX or 2080ti utilizing FULL RTX hardware based raytracing acceleration? Because if the answer is no, then those numbers mean nothing in this particular situation. These cards are in a completely different league performance-wise than almost anything that's come before them.
Non-HEDT systems don't have enough dedicated PCI-E lanes to fully drive two (much less three) graphics cards at full x16 speed (your motherboard may have three x16-sized slots on it, but the reality is that those slots will drop down to x8 or even just x4 if you fully populate them with graphics cards.) Neither can they drive quad-channel memory. And regardless, not even an HEDT system is gonna have a prayer of "matching" Titan RTX level graphics performance.
I can't follow you. RTX has nothing to do with OOC textures - I mean, if we're talking about rendering engines. If we're talking about games, then it's another story and OOC doesn't apply there, of course. If you mean a rendering engine using RTX for raytracing, then it depends. If it is tile-based, then the system will transfer the textures for one tile and RTX will work just fine.
I know, and x8 is good enough for OOC transfers. As is dual-channel RAM. For example, with two x8 cards and dual-channel 2666 RAM you will have about 8GB/s of transfer per card and about 40GB/s of bandwidth on the RAM. Of course it's not NVLink, but it'll work fine for transferring textures.
EDIT. It seems to me that you overestimate the bandwidth needed for rendering. To get the idea, consider that most HD games run just fine at full speed on PCI-E 2.0, which is half the speed of PCI-E 3.0. Also consider that CPU-based engines work at full speed with DDR3 RAM, which is about a 10GB/s transfer rate. So you see that bandwidth is only a small factor in rendering.
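Here are the rough numbers behind that, for reference (theoretical peaks; the DDR4 figure is the standard 8-bytes-per-transfer-per-channel estimate):

    # Rough per-card and system-RAM bandwidth for the "x8 is good enough" argument.
    pcie3_x8 = 8.0                          # GB/s per direction, per card (approx.)
    ddr4_2666_dual = 2 * 2666e6 * 8 / 1e9   # channels x transfers/s x 8 bytes
    print(pcie3_x8)                         # ~8 GB/s of transfer budget per card
    print(round(ddr4_2666_dual))            # ~43 GB/s (the ~40 GB/s figure above)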
RTX acceleration has to do with the speed at which an RTX-supporting card is able to complete whatever graphics workload it has in front of it. Which in the latest round of testing in Iray-like software looks to be an average of around +300%. With an especially high performance dedicated graphics card like a Titan RTX or RTX 2080ti (or Titan V for that matter), where every part of the internal rendering pipeline is an order of magnitude more efficient than what even the fastest HEDT system is capable of achieving (especially in terms of memory transfer bandwidth between system RAM and add-in cards like GPUs), that is logically going to lead to major losses in overall rendering performance any time the GPU is put in the situation of needing to wait for the CPU to supply it with more data. Which is what OOCR is formulated around constantly doing.
What happens when the speed at which the CPU can transfer new tile data to the GPU is significantly slower than the speed at which the GPU is able to fully render those tiles? Because that is what I'm talking about here. A constantly overloaded CPU/PCI-E communications bus (leading to a sluggish operating system) and a GPU constantly idling while waiting for new data to process. I.e. significant losses of overall GPU rendering performance compared to render scenarios where GPU memory usage is kept strictly in-house.
First, let me say that I find this discussion very interesting and useful, since it seems to me that you're a competent and to-the-point reference. So thank you.
I wasn't aware that Octane can use RTX already. It is a shame that NVLink doesn't seem to be working yet, so I guess it's a work in progress. Thank you for pointing out this information.
If that happens, then OOC will not be usable anymore. The whole point of OOC technology is that rendering takes much longer than texture transfers. That's what happens with the current cards. Whether RTX will change that remains to be seen. If we consider that with GTX the performance drop is 10-20%, then it means that texture transfers take about 10-20% of the rendering time. Now, if RTX increases rendering performance by 300%, then we get a 30-60% performance drop with OOC. So yes, in this case we lose a lot. Unless we can speed up the transfer rate.
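Rough math, in case it helps (an Amdahl-style estimate; the 10-20% transfer share and the 3x speedup are the assumptions under discussion, not measured values - a more careful version lands a bit below 30-60%, though in the same ballpark):

    # If compute gets 3x faster but OOC texture transfers don't, the transfer share of
    # the (now shorter) render time grows accordingly.
    def ooc_share_after_speedup(transfer_share, compute_speedup):
        compute = (1.0 - transfer_share) / compute_speedup   # compute portion shrinks
        return transfer_share / (transfer_share + compute)   # transfers stay the same

    for share in (0.10, 0.20):
        print(f"{share:.0%} today -> ~{ooc_share_after_speedup(share, 3.0):.0%} with a 3x speedup")
    # 10% -> ~25%, 20% -> ~43%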
EDIT. For the sake of completeness, it remains to be seen whether that 300% gain applies to real production scenes with large textures and complex shaders. Since RTX accelerates only the raytracing part, with complex shaders it may only get as far as a 50% gain, as @drzap pointed out, and in that case OOC may work fine enough.
https://www.facebook.com/groups/RedshiftRender/permalink/531192430663305/
Oh absolutely. In fact, see this post from about 2 months ago where I brought up this very issue specifically in the context of Iray rendering:
Keep in mind that this 10x (+1000%) performance prediction serves merely as an upper limit of the increased performance you could expect to see from RTX hardware in scenes without complex shaders.
So yes. RTX acceleration is only going to enhance performance on a scene-by-scene basis. However, in the scenes where it does have a noticeable impact, that impact is potentially going to be huge (within multiple orders of magnitude - as this most recent Puget Systems study indicates.) No longer having raytracing as the most time-consuming part of the rendering pipeline is gonna mean that most performance slowdowns will be due to complex shading or insufficient texture transfer bandwidth - which is where my trepidations with OOCR come in.
OK, 4 months have passed since the last post here. I purchased two of these Titan RTXs and would like to know if there have been any updates to Iray (RT cores) in the meantime.
Nvidia released its first official RT Core-supporting version of Iray about two weeks ago. But as of right now, the only app updated to include it is Iray Server. Daz Studio needs another update (beyond the brand new 4.11 release) to implement it. Imo, expect to see it in the next (first) 4.12 beta release.
Thanks for the info. And do you have any information on when 4.12 will be released? And since Iray in DS hasn't been updated for the Titans yet, does that mean I will render with them more slowly than with my current two 2080 Ti's? And do you know if Daz Studio/Iray will support NVLink for rendering sometime?