Iray revelation!
Ok, so I’ve been working with Iray on my latest comic, and the rendering times in some scenes were really slow. I have a homemade rig with a 4790 i7, 16 gigs of RAM and a GTX 970. So I figured I’d add a second video card. I went on Ebay and got a GTX 780 Ti SC. I threw it in, and guess what? My times still sucked. I mean, 3 ½ hours to get the scene to 10%! I tested my system with the SickleYield benchmark. With both GPUs and the CPU, it rendered in 2 ½ min. With only the GPUs, it did it in 1:50! So I knew both cards were working. So WTF?
I noticed on my monitor that when rendering a scene, my CPU was maxed at 100% and ran at 81 C. I didn’t understand that, because I wasn’t using the CPU for the render in Daz. Then I did some research. The Nvidia site says that if the GPU cards run out of memory, the CPU gets recruited to render. So on large scenes, even if you don’t check the CPU box in the render options, it still gets used. Maxed out.
I still didn’t get it. True, my scenes in some of my comics are large, like 250 mb. But so what? The combined RAM in my cards was 7 gig!
Then I got this Iray Memory Assistant: http://www.daz3d.com/iray-memory-assistant. If you are using Iray, you absolutely need this! Here’s why: when I ran it, I saw that my available GPU RAM for the scene was -2 gig! Huh? Then I GOT IT! Iray loads the entire scene into GPU memory, even if the objects are not in the viewport. The objects, and all of their Iray materials! It doesn’t just look at what the camera is focused on; it loads things well outside the frame. The Nvidia site states that if the GPU memory runs over, render times are significantly slower.
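To see how a modest-looking scene file can blow past GPU memory, here's some rough back-of-the-envelope Python. The map counts and sizes below are made-up illustrations, not measurements from any actual scene:

```python
# Rough estimate of uncompressed texture memory on the GPU.
# The 3 bytes/pixel figure and all sizes here are illustrative assumptions.

BYTES_PER_PIXEL = 3  # RGB, uncompressed

def texture_mb(width, height, maps=1):
    """Uncompressed GPU memory for one surface's texture set, in MB."""
    return width * height * BYTES_PER_PIXEL * maps / (1024 ** 2)

# One 4096x4096 surface with diffuse, bump, normal, and specular maps:
per_surface = texture_mb(4096, 4096, maps=4)  # 192 MB

# A big environment set plus a couple of characters can easily carry
# dozens of texture sets like that:
scene_total_gb = 40 * per_surface / 1024  # 7.5 GB - more than most cards
print(f"{per_surface:.0f} MB per surface, {scene_total_gb:.1f} GB total")
```

None of this accounts for Iray's in-VRAM compression (discussed further down the thread), but it shows why a 250 mb scene file is no guide to GPU memory use.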
So I began deleting every object in the scene that wasn’t in the viewport, until the memory assistant said I was no longer over-utilizing the GPU memory. The scene file ended up being around 50 mb. Making objects invisible doesn’t work; you have to actually decrease the size of the scene file. A few things happened:
- The renders became ridiculously fast! A scene that previously took 9 hours to get to 60% finished in 45min!!
- The lighting became much more effective. I had to turn the lights down, turn the ISO down, speed up the f-stop, etc.
- My CPU was now running at 20-30 % at a reasonable temp.
So, my advice to speed things up in a large scene is to delete absolutely everything that isn’t in the viewport.
One other thing, and I’m sure this is old news, but I just figured it out: a lot of my scenes use emissive surfaces, sometimes exclusively. If those are your only “lights” and the camera headlamp is set to auto, the headlamp will be on. The auto function only turns the headlamp off if there are other lights in the scene, and it doesn’t consider an emissive surface to be a light. So don’t use auto; just turn it off.
If I figure anything else out, I’ll let you guys know. This IRAY thing might just be practical after all.
Comments
It may also be worth your while making resized versions of texture files, at least for items that you use often and that are going to be small in the render - textures are usually the main consumers of memory for DS scenes.
Not a bad idea. The thing that blew me away was that although the file size was only 250 mb, when it loaded into Iray it was over 9 gig. In this particular scene, I was using http://www.daz3d.com/suburban-shopping-mall. I converted all of the materials using the Iray Uber Base. Add a few characters, and the renders are almost unusably slow. This is very different from 3Delight, where it doesn't matter how big the scene is; the render time is unaffected. Anyway, here's where I found out about size slowing things down:
http://irayrender.com/fileadmin/filemount/editor/PDF/iray_Performance_Tips_100511.pdf
Okay, okay.....I'm going to get my scripts and the instructions for setup finished up so I can upload them somewhere. It's a simple couple of scripts, to automatically resize ALL the textures used (or just the ones on selected objects) and to select which ones are used. It does require you to install ImageMagick, but that's pretty simple. And then you can automatically resize your textures by half and/or quarter size (and they get saved right where the originals are, so easy to find.)
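The actual scripts aren't posted yet, but the core idea can be sketched in a few lines of Python that build ImageMagick `convert` command lines. The file path and the `_50` naming scheme here are just my illustration; the real scripts may name and locate things differently:

```python
import os
import subprocess

def resize_commands(texture_paths, percent=50):
    """Build ImageMagick 'convert' commands that save reduced copies
    next to the originals (e.g. foo.jpg -> foo_50.jpg at half size).
    The output naming scheme is an assumption for this sketch."""
    commands = []
    for path in texture_paths:
        root, ext = os.path.splitext(path)
        out = f"{root}_{percent}{ext}"
        commands.append(["convert", path, "-resize", f"{percent}%", out])
    return commands

cmds = resize_commands(["/textures/wall_diffuse.jpg"], percent=50)
# To actually run them (requires ImageMagick on the PATH):
# for cmd in cmds:
#     subprocess.run(cmd, check=True)
```

`-resize 50%` scales both dimensions by half, which is what matters for the memory math discussed below.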
That sounds pretty cool! And it all ends up looking the same? Is it linear? If you make all of your textures say 25%, does that make the file size as seen by the GPU a quarter the size?
No... cut the resolution in half and you reduce the memory used to about 1/4 of the original, so a quarter of the resolution would be about 1/16th. You are dealing with area, so it's going to be inverse square...
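The area math is easy to sanity-check in a couple of lines of Python:

```python
def relative_memory(scale):
    """Fraction of original texture memory after scaling both dimensions.
    Memory tracks pixel count, which scales with area."""
    return scale ** 2

# Half resolution: (1/2)^2 = 1/4 of the memory.
# Quarter resolution: (1/4)^2 = 1/16.
half = relative_memory(0.5)      # 0.25
quarter = relative_memory(0.25)  # 0.0625
```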
Would using an iRay section plane, rather than hiding stuff, be useful in reducing memory requirements?
Keep in mind the reason the render engine is doing that is that stuff 'out of frame' may still impact what's in view; reflections, lighting, etc. So it really has no easy way of determining if an object is going to have any impact on what you see.
So it's up to us to cut down the weeds... and also determine what effects are good enough.
Like I was doing a scene in a forest, and obviously wanted to get rid of a lot of stuff out of frame. But it drastically changed the lighting from mostly shadowed with bright spots to a lot more evenly lit. I decided the even lighting was good for what I was trying to do and went with it. Alternatively, I could have come up with some other lighting scheme to capture the general effect without the considerable overhead of accurately having light go through several trees.
Will, you just needed to replicate the profile of your canopy to get the look you wanted. A simple OpenGL render from the top view, desaturated and applied to a plane at the correct height for the canopy. You'd want to remove the section of the plane/profile that was actually being done by the "in view" objects.
Kendall
It depends on the type of image, and how well the GPU compression works with it (as well as how the compression settings in Iray are set.)
Not exactly. Half resolution (say, going from 4k x 4k to 2k x 2k) will cut the number of pixels (and therefore the number of bytes) to 1/4th. However, depending on the image itself (and how Iray compresses it in-memory on the GPU), the compression may do better or worse on top of that. Still, especially for larger images, the reduction in pixels will always cut the GPU memory for an image considerably. My tests showed image storage memory dropping to about 25%-30% of the original when using half-resolution images. It also helps to adjust your compression settings based on the actual sizes of the images you are using. Reducing images below 256x256 probably won't help much, and the loss of clarity at that size would be considerable.
Unless I'm missing something, it doesn't really matter what the compression does, overall, because when it's time to use the image, it's the uncompressed/inflated size that matters. And that's about 3 bytes/pixel. Going by that, the inverse square holds... any additional reductions due to other factors are 'gravy'.
Also remember that GPU RAM is NOT combined. Iray attempts to load the entire scene into each GPU so your scene must fit into the GPU with the least amount of RAM. In your case, 3GB is your limit - the amount of RAM in the 780ti.
Both cards will be used when the scene's size is less than 3 GB. Only the 4 GB card will be used for scenes between 3 GB and 4 GB. Neither card will be used at over 4 GB.
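A sketch of that rule in Python, assuming (as stated above) that Iray loads the whole scene into every card it uses, so a card only participates if the scene fits in its VRAM:

```python
def gpus_used(scene_gb, card_vram_gb):
    """Return the VRAM sizes of the cards that can take the scene.
    An empty list means the render falls back to CPU only."""
    return [vram for vram in card_vram_gb if scene_gb <= vram]

cards = [4, 3]  # GTX 970 (4 GB) and GTX 780 Ti (3 GB), as in this thread

small = gpus_used(2.5, cards)   # fits both cards
medium = gpus_used(3.5, cards)  # only the 4 GB card
large = gpus_used(4.5, cards)   # neither card: CPU fallback
```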
Great clarification.
Well, that makes sense. But still, the scenes end up using a lot more RAM than the file size. Something else I noticed: I've been using TechPowerUp GPU-Z to check on card utilization. Both cards' memories are maxed, but the GPU load on the GTX 970 reads 100%, and on the GTX 780 it reads 0%. The rendering happens faster, so I don't understand that. I have both cards checked in Advanced settings.
What do you mean?
Remember...when rendering, ALL the images will be decompressed/inflated...at 3 Bytes per pixel.
Also remember that the scene file does not contain the textures or the assets (the figure and morph data, etc.) so it is a totally ineffective indicator of memory use.
Using Optix and setting Optimization 'for speed' also take up a bit of that precious GPU memory.
If your scene is just on the limit then turning Optix off and Optimizing 'for memory' may be enough to keep your GPU active.
This is incorrect. The images will use some form of compression. DXT5 will compress at a ratio of 4:1. DXT1 for RGB (no alpha) will compress at a ratio of 6:1.
The images are decompressed in memory for actual access. They are indeed temporarily stored in a compressed format in the VRAM, but are decompressed into a buffer to be matched to the UV coordinates.
Kendall
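Putting numbers on those ratios for a single 4k texture, using the 3 bytes/pixel uncompressed figure from earlier in the thread (the ratios are the ones quoted above; actual in-VRAM sizes depend on Iray's settings):

```python
def compressed_mb(width, height, ratio, bytes_per_pixel=3):
    """In-VRAM size (MB) of a texture stored at a fixed compression ratio."""
    return width * height * bytes_per_pixel / ratio / (1024 ** 2)

raw_mb = 4096 * 4096 * 3 / (1024 ** 2)        # 48 MB uncompressed
dxt5_mb = compressed_mb(4096, 4096, ratio=4)  # 12 MB at 4:1
dxt1_mb = compressed_mb(4096, 4096, ratio=6)  # 8 MB at 6:1 (RGB, no alpha)
```

So storage is a lot cheaper than the raw size, but, as the next posts explain, a texture can temporarily exist in both forms at once.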
That's what I thought...so even if you do use compression, you need to allow for the maximum size that they can swell to, when decompressed.
Sorta. Just because the textures are compressed in VRAM doesn't mean that they'll actually ever be uncompressed for use. Unless there is a hit generated for the surface that holds the texture, it will never be spooled for access. So it is entirely possible to have hundreds of compressed textures stored in the VRAM that never get accessed during the rendering process, thus they only ever take the space of compression. However, on the other side, it is also possible that a texture is used so often that it exists in VRAM constantly in both compressed AND uncompressed form. In this case, the texture can take as much as 1.5x its uncompressed size if the compression algorithm can only get to a ratio approaching 1:1 (like 1.2:1).
Once decompressed, there is a "time to live" applied to the texture, which determines when the texture will be deleted from access memory if not accessed. During this time, the texture exists twice: once compressed for storage, and once uncompressed for access. Obviously, deleting the texture immediately after access only to have to uncompress it again for immediate reuse would be a YUGE waste of processing. If it is not accessed within a "reasonable" amount of time, the memory is returned to the pool for reuse. This is why it is so important that unused textures be removed and used textures be as small as they can be. With thousands of cores hitting various areas of the scene, there is a good chance that a great many textures will be uncompressed at any given time, and may be accessed often enough to "live" in memory for significant periods of time.
Kendall
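Since the actual Iray internals aren't public, here's only a toy Python sketch of the time-to-live idea described above. The TTL value and the eviction policy are pure guesses for illustration:

```python
import time

class TextureCache:
    """Toy model of the TTL scheme: a texture is 'decompressed' on first
    hit and stays in access memory until it goes unused for ttl seconds.
    This is an illustration of the idea, not how Iray actually works."""

    def __init__(self, ttl=2.0):
        self.ttl = ttl
        self.decompressed = {}  # texture name -> last-access timestamp

    def access(self, name):
        # First hit decompresses; later hits just refresh the timestamp.
        self.decompressed[name] = time.monotonic()

    def evict_stale(self):
        # Return stale textures' names and free their access memory.
        now = time.monotonic()
        stale = [n for n, t in self.decompressed.items()
                 if now - t > self.ttl]
        for n in stale:
            del self.decompressed[n]  # memory goes back to the pool
        return stale
```

The point of the model: a hot texture stays resident in both forms, so a busy scene can hold many textures at compressed-plus-uncompressed cost simultaneously.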
No wonder it's so easy to run out of memory and drop to CPU only...it's even more complex than I was thinking it was.
And it shows why geometry and procedural textures are 'cheaper'. Also, tiny tiling textures will be 'cheaper' than large, fully mapped ones... thinking specifically of floors and walls.
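Rough numbers behind "tiling is cheaper", again assuming 3 bytes/pixel uncompressed: a tiled surface reuses one small map in memory no matter how big the floor is.

```python
def map_mb(width, height, bytes_per_pixel=3):
    """Uncompressed in-memory size of one texture map, in MB."""
    return width * height * bytes_per_pixel / (1024 ** 2)

full_map = map_mb(4096, 4096)  # one big unique floor map: 48 MB
tile = map_mb(256, 256)        # one small tile, repeated: 0.1875 MB
```

That's a factor of 256 for the same surface coverage, at the cost of visible repetition.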
Something I just noticed...because some characters have options, there may be several sets of maps for certain body parts. Let's say there is a scar/no scar option. If the scar loads as the default and then you switch to the no scar version, it is very possible to end up with both sets of maps loaded, because not all the surfaces using that map have been swapped out! So if the lips are still using the 'scar' map...it's loaded for the lips, but the 'no scar' map is loaded for the rest of the face.
I think only Nvidia knows for certain, but it seems likely that at a minimum they use some form of run-length encoding on the unpacked images in memory. I did a test over a year ago for this, then got bored and didn't complete it. But basically, it takes comparing a basic black/white checkerboard scene (no compression, so no artifacts) against a richly textured scene of the same dimensions, and noting any difference in memory sizes. If anyone wants to pick this up, be my guest. Remember to restart D|S for each test.
If Iray is doing compression in addition to, or instead of, RLE, then it's their own proprietary process and likely part of a trade secret, and we may never know. Since Iray is an iterative progressive renderer, it requires access to pixels over the entire frame at all times; the entire scene must be available for each sample. I do agree with Kendall that textures may not be processed in memory until the first ray hit. I've never witnessed this firsthand, as my scenes tend to be a single character or object, with everything in the scene visible to the camera.
Iray does use several techniques that can affect memory usage, and we're probably seeing the effect of those as well. Three big contributors are texture compression (however they do it), geometry instancing optimization (the Memory/Speed thingie), and that wacky OptiX Prime ray-trace enhancer. So any tests need to accurately track the use of these features as well.
It's even possible that the "speed/memory" optimization affects this. If the compression scheme is block-based, it would be possible to convert a texel address to a block, and 'decompress' only that block of the image (saving memory) or decompressing the whole image (increasing speed?). Depending on the compression, it might only have to decompress until the texel address is reached.
Without knowing exactly how the compression is implemented internally, there's no way to know exactly HOW much compression helps/hurts memory/speed.
Also, I think the texture compression is independent of Iray, and is used throughout the card's rendering system. Iray just uses it and exposes the configuration parameters.
Not all images will be hit at once, but any assigned to a particular area (let's say the lips), which would include bump, normal, color, and any other control maps, WILL be 'maxed out' together. It's looking like texturing for Iray should be much more complex than it currently is, in order to maximize efficiency and minimize memory usage.
THIS is the operative statement. I've made this point publicly in the past. PAs are still creating content using workflows designed for 3DL and firefly -- using bitmapped textures for everything. I am sure that things will start to move toward using procedurals and more optimal MDL types, but ATM that is just not the case. Part of it is that there aren't enough "provided" procedurals to cover many basic things in DS, leaving the PAs in a situation where they either develop completely new procedurals, buy them for redistribution, or require a purchase of another PA's products.
EDIT: I am of the opinion that DAZ needs to create a library of procedurals (stone, marble, etc) and sell those like they did the V4/M4 Morph packs. This would allow PAs to leverage a known set of procedurals and can "require" a standard pack of shaders. If the user doesn't want to buy those shaders, then (like before) they end up with default clay.
Kendall
I just did a quick test... and even small tiling textures will cut memory usage a lot. But compression artifacts will multiply. A small 256 x 256 TIFF was much 'cleaner' when tiled than the JPG version, and in the end it came to about the same overall memory usage while rendering as the much smaller (file size) JPG! Being under 512, they are below even the lowest 'trigger' point for the default compression settings.
That isn't really even optimal for 3DL... hasn't been for a while. It's not as great a hit/concern for memory usage, because you have the luxury of access to as much memory as the system can make available/has installed, but it's not optimal.
The Speed/Memory optimization switch in D|S refers to Iray's geometry instancing interface, which is either flattened or instanced. Iray implements instancing as either on or off. When off (the default), all scene elements are flattened into a single block, and here there *might* be a savings in texture size because unused parts of the texture would be eliminated. However, this is only conjecture, and it's possible the hidden areas of the texture are still in memory but simply not shown.
Flattening consumes more memory -- the opposite of what you'd think if textures were being trimmed -- but takes less time, sometimes considerably less time. Daz chose to use non-standard terms to reference the effect of this feature in a way that masks what it actually does. Iray supports a per-node instancing scheme that doesn't appear to be implemented in the D|S interface. That would be a nice feature to have.