Choosing a new Nvidia Card for iRay Rendering

Comments

  • barbult Posts: 24,720

    These are the measurements I took today on two different computers. It seems in some cases CPU+GPU can make the render go faster, and in other cases it makes it go slower (than GPU only). In both cases, CPU only was much much slower (not surprising).

    i7-4790
    GTX-980Ti
    Windows 10 Preview
    Scene converged to 90%

    GPU Only: 6 minutes 27.72 seconds (Fastest)
    GPU+CPU:  6 minutes 57.79 seconds
    CPU Only: 1 hour 8.84 seconds

    i7 860 @2.8 GHz
    GTX-760
    Windows 7
    Scene converged to 75% (because I was not patient enough to wait for 90% on this slower system)

    GPU Only: 8 minutes 54.69 seconds
    GPU+CPU:  7 minutes 56.99 seconds (fastest)
    CPU Only: 44 minutes 50.47 seconds
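
    To compare the three modes within each machine, the timings above can be converted to seconds and expressed relative to the GPU-only run. This is only a sketch: the helper names are made up for illustration, the CPU-only time on the first machine is assumed to mean 1 hour, 0 minutes and 8.84 seconds, and the two machines stopped at different convergence levels, so the ratios should not be compared across systems.

```python
# Timings copied from the post above; function and variable names are illustrative.
def to_seconds(hours=0, minutes=0, seconds=0.0):
    """Convert an hours/minutes/seconds reading to seconds."""
    return hours * 3600 + minutes * 60 + seconds


systems = {
    "i7-4790 + GTX 980 Ti (90% converged)": {
        "GPU only": to_seconds(minutes=6, seconds=27.72),
        "GPU+CPU": to_seconds(minutes=6, seconds=57.79),
        # Assumption: "1 hour 8.84 seconds" means 1 h 0 min 8.84 s.
        "CPU only": to_seconds(hours=1, seconds=8.84),
    },
    "i7 860 + GTX 760 (75% converged)": {
        "GPU only": to_seconds(minutes=8, seconds=54.69),
        "GPU+CPU": to_seconds(minutes=7, seconds=56.99),
        "CPU only": to_seconds(minutes=44, seconds=50.47),
    },
}


def relative_to_gpu_only(timings):
    """Return each mode's time as a multiple of the GPU-only time."""
    base = timings["GPU only"]
    return {mode: t / base for mode, t in timings.items()}


for name, timings in systems.items():
    print(name)
    for mode, ratio in relative_to_gpu_only(timings).items():
        print(f"  {mode}: {timings[mode]:.2f} s ({ratio:.2f}x the GPU-only time)")
```

    On these figures, adding the CPU cost roughly 8% on the 980 Ti machine and saved roughly 11% on the GTX 760 machine.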

     

  • mjc1016 Posts: 15,001
    edited July 2015
    fastbike1 said:

    I'm willing to bet that a substantial part of this population isn't trying to do big Pixar / LucasArts style scenes. These would be the same folks that are running older machines / are self-described hobbyists / have limited funds for the hobby. Perhaps I'm wrong. We all have some variable capacity for self-delusion, often no more apparent than in our hobbies.

    Video memory disappears quickly using lots of large hi-res texture maps. Iray uses about 3 bytes/pixel (average) for images (all images...diffuse, normal, bump, other controls...all of them) so when you start loading a decent poly figure, with hair/clothes and 'nice' textures, you will pass 1 and 2 GB rather quickly. Not all 'big' scenes are really that big...just a nearly naked Vicky, with a sword (of course), battling a lone dragon, in a temple (of course) can drop to CPU because it ran out of room.

    For a 4k skin texture...that comes in around 150 MB per SURFACE! (How many surfaces are there on G3F?)

    For those really big scenes  you need 8 or more GB (the nosebleed Quadros and Titans)  or a networked solution.
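
    As a rough check on the 150 MB figure, the arithmetic can be sketched in Python. The 3 bytes/pixel average comes from the post; the count of three 4k maps per surface (for example diffuse, bump and specular) is an assumption made here for illustration, and actual usage will vary with the renderer's own overhead.

```python
BYTES_PER_PIXEL = 3  # average quoted in the post, not a measured constant
MIB = 1024 * 1024


def texture_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Estimated memory for one uncompressed texture map."""
    return width * height * bytes_per_pixel


def surface_bytes(width, height, map_count):
    """Estimated memory for one surface carrying several same-sized maps."""
    return texture_bytes(width, height) * map_count


one_map = texture_bytes(4096, 4096)
# map_count=3 is an assumption for illustration, not a figure from the post.
one_surface = surface_bytes(4096, 4096, map_count=3)

print(f"one 4k map:    {one_map / MIB:.0f} MiB")
print(f"three 4k maps: {one_surface / MIB:.0f} MiB ({one_surface / 1e6:.0f} MB)")
```

    Three 4k maps come to 144 MiB, which is in line with the "around 150 MB per surface" estimate.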

  • namffuak Posts: 4,170
    mjc1016 said:
    fastbike1 said:

    I'm willing to bet that a substantial part of this population isn't trying to do big Pixar / LucasArts style scenes. These would be the same folks that are running older machines / are self-described hobbyists / have limited funds for the hobby. Perhaps I'm wrong. We all have some variable capacity for self-delusion, often no more apparent than in our hobbies.

    Video memory disappears quickly using lots of large hi-res texture maps. Iray uses about 3 bytes/pixel (average) for images (all images...diffuse, normal, bump, other controls...all of them) so when you start loading a decent poly figure, with hair/clothes and 'nice' textures, you will pass 1 and 2 GB rather quickly. Not all 'big' scenes are really that big...just a nearly naked Vicky, with a sword (of course), battling a lone dragon, in a temple (of course) can drop to CPU because it ran out of room.

    For a 4k skin texture...that comes in around 150 MB per SURFACE! (How many surfaces are there on G3F?)

    For those really big scenes  you need 8 or more GB (the nosebleed Quadros and Titans)  or a networked solution.

    OK - my experience on VRAM usage. One G2F or G3F figure, clothing, hair, a simple prop or two (table, chair, lamp or some such) and Inane Glory's photo studio backdrop and Iray light set - and I hit just over 1.8 GB of VRAM on the card. It looks to be VERY easy to blow past 2 GB (I have a separate card for the monitors - if the monitors were on the render card it would exceed 2 GB).

  • kyoto kid Posts: 41,198

    ...exactly what I have been saying all along. Some of my scenes could probably even make a Titan-X choke.

  • mjc1016 Posts: 15,001
    namffuak said:
    mjc1016 said:
    fastbike1 said:

    I'm willing to bet that a substantial part of this population isn't trying to do big Pixar / LucasArts style scenes. These would be the same folks that are running older machines / are self-described hobbyists / have limited funds for the hobby. Perhaps I'm wrong. We all have some variable capacity for self-delusion, often no more apparent than in our hobbies.

    Video memory disappears quickly using lots of large hi-res texture maps. Iray uses about 3 bytes/pixel (average) for images (all images...diffuse, normal, bump, other controls...all of them) so when you start loading a decent poly figure, with hair/clothes and 'nice' textures, you will pass 1 and 2 GB rather quickly. Not all 'big' scenes are really that big...just a nearly naked Vicky, with a sword (of course), battling a lone dragon, in a temple (of course) can drop to CPU because it ran out of room.

    For a 4k skin texture...that comes in around 150 MB per SURFACE! (How many surfaces are there on G3F?)

    For those really big scenes  you need 8 or more GB (the nosebleed Quadros and Titans)  or a networked solution.

    OK - my experience on VRAM usage. One G2F or G3F figure, clothing, hair, a simple prop or two (table, chair, lamp or some such) and Inane Glory's photo studio backdrop and Iray light set - and I hit just over 1.8 GB of VRAM on the card. It looks to be VERY easy to blow past 2 GB (I have a separate card for the monitors - if the monitors were on the render card it would exceed 2 GB).

    I'm up to a few dozen Iray renders...none of them are very complex scenes. The most complex was a car, with procedural textures (Iray paint, etc.) and that barely fit on my 1 GB card. Anything more complex than that car/HDRI would blow past the 1 GB as if it wasn't even there.

  • kyoto kid Posts: 41,198
    edited July 2015

    ...and just when we thought things were going to settle down.

    http://www.guru3d.com/news-story/nvidia-talks-about-pascal-will-be-fast-and-has-3d-stacked-memory.html

    32 GB VRAM.

    ...the tech curve never sleeps.

  • Zilvergrafix Posts: 1,385
    kyoto kid said:

    ...and just when we thought things were going to settle down.

    http://www.guru3d.com/news-story/nvidia-talks-about-pascal-will-be-fast-and-has-3d-stacked-memory.html

    32 GB VRAM.

    ...the tech curve never sleeps.

    It will be named the Kyoto Kid signature series; finally your mega scenes will fit entirely in VRAM!

  • kyoto kid Posts: 41,198
    edited July 2015

    ...heh, finally. 

    Though it will be a bit longer before this technology finds its way into enthusiast systems, as at the outset it will only be available on enterprise systems intended for heavy-duty GPU-based computing using the new IBM Power8 based CPUs. Currently Nvidia has not even started work on an x86-based NVLink interface.

    According to what I have been seeing in various articles, a Pascal GPU can also be accessed through a standard PCIe 3.0 x16 slot; however, it loses the speed advantage afforded by NVLink, which employs "fat pipes" that are two-way (4 x 8 lane) connections between devices (such as GPU to GPU or GPU to CPU). Another benefit will be a smaller form factor similar to AMD's Fury-X. However, unlike AMD (which uses four 250MB memory modules in each stack on the Fury-X), Nvidia has settled on the newly developed HBM-2 chip, which has a capacity of 2 GB.

    Yes, sounds fantastic and mind boggling, but again, it may be a while before we see a consumer Pascal architecture GPU on sale at Newegg or TigerDirect.

  • DAZ_Spooky Posts: 3,100
    mjc1016 said:
    fastbike1 said:

    I'm willing to bet that a substantial part of this population isn't trying to do big Pixar / LucasArts style scenes. These would be the same folks that are running older machines / are self-described hobbyists / have limited funds for the hobby. Perhaps I'm wrong. We all have some variable capacity for self-delusion, often no more apparent than in our hobbies.

    Video memory disappears quickly using lots of large hi-res texture maps. Iray uses about 3 bytes/pixel (average) for images (all images...diffuse, normal, bump, other controls...all of them) so when you start loading a decent poly figure, with hair/clothes and 'nice' textures, you will pass 1 and 2 GB rather quickly. Not all 'big' scenes are really that big...just a nearly naked Vicky, with a sword (of course), battling a lone dragon, in a temple (of course) can drop to CPU because it ran out of room.

    For a 4k skin texture...that comes in around 150 MB per SURFACE! (How many surfaces are there on G3F?)

    For those really big scenes  you need 8 or more GB (the nosebleed Quadros and Titans)  or a networked solution.

    We did a bit of testing with a GTX 770 (2GB) card before making the 4GB recommendation. A single Genesis 2 Female, with HD Morphs, high res textures, clothing and hair may or may not fit on a 2GB card (It is definitely borderline.). You can, on average, fit 2-4 G2F with clothing and hair plus an environment onto 4GB.

     

    Note that displacement tessellation, if you are not careful with it, can and will spike a video card's VRAM and throw it off the render, even with a 12GB card.

  • mjc1016 Posts: 15,001

    We did a bit of testing with a GTX 770 (2GB) card before making the 4GB recommendation. A single Genesis 2 Female, with HD Morphs, high res textures, clothing and hair may or may not fit on a 2GB card (It is definitely borderline.). You can, on average, fit 2-4 G2F with clothing and hair plus an environment onto 4GB.

    And there it is...depending on the texture set and hair, HD morphs don't necessarily need to be included, especially if you only have a single card.

  • DAZ_Spooky Posts: 3,100
    edited July 2015
    mjc1016 said:

    We did a bit of testing with a GTX 770 (2GB) card before making the 4GB recommendation. A single Genesis 2 Female, with HD Morphs, high res textures, clothing and hair may or may not fit on a 2GB card (It is definitely borderline.). You can, on average, fit 2-4 G2F with clothing and hair plus an environment onto 4GB.

    And there it is...depending on the texture set and hair, HD morphs don't necessarily need to be included, especially if you only have a single card.

    HD Morphs don't usually take up significant video ram, unless you set rendertime sub-D higher than normal. That combination almost always pushed the render over 2GB, but even without HD it is still a borderline condition.  

  • Joe Cotter Posts: 3,259

    Ok, historical note for those that find this kind of thing interesting. VRAM originally was a specific type of memory that introduced a dual-ported configuration to video memory. SDRAM and DDR replaced VRAM; they are actually only single-ported, but they both incorporate features from, and have advantages over, the older (once revolutionary) VRAM. That's the original, historic use of the term. Language, however, is dynamic. At first, referring to video memory as VRAM was basically incorrect, but over time it has become so common that 'common usage' has come to adopt VRAM as shorthand for any video memory. This is how language evolves over time, so it's no longer necessarily incorrect, just a newer usage that has taken over. Again, just here for entertainment value for anyone who cares. ;)

  • Demiurgent Posts: 97

    For the record, don't be afraid of eBay. I got a shockingly inexpensive Tesla M1060 (obsolete? Sure, but for $20 I'll take it gladly) that added 4GB of VRAM and 240 Cudas to my rendering machine. The improvement was stunning, and it was dirt cheap.

  • DAZ_Spooky Posts: 3,100

    For the record, don't be afraid of eBay. I got a shockingly inexpensive Tesla M1060 (obsolete? Sure, but for $20 I'll take it gladly) that added 4GB of VRAM and 240 Cudas to my rendering machine. The improvement was stunning, and it was dirt cheap.

    Due to potential driver conflicts, NVIDIA recommends not mixing GeForce cards with either Tesla Cards or Quadro Cards. (Though the Tesla and Quadro cards use the same drivers so you can mix them with each other.) 

  • Demiurgent Posts: 97

    For the record, don't be afraid of eBay. I got a shockingly inexpensive Tesla M1060 (obsolete? Sure, but for $20 I'll take it gladly) that added 4GB of VRAM and 240 Cudas to my rendering machine. The improvement was stunning, and it was dirt cheap.

    Due to potential driver conflicts, NVIDIA recommends not mixing GeForce cards with either Tesla Cards or Quadro Cards. (Though the Tesla and Quadro cards use the same drivers so you can mix them with each other.) 

    I'm well aware -- and should have mentioned ;) -- so I pulled my old 620 at the same time as I slotted this in. I've got an obsolete card purely for monitor access and then the Tesla doing the heavy lifting. I actually only use this machine for the render phase, so I can set those things going in Remote Desktop without needing to actually look at the connected monitor.

    (I actually just ordered a deep discount surplus S1070 which will happily slot in next to the existing Tesla. 20 GB VRAM, over a thousand Cudas... I'm going to be doing the happy rendering dance next week.)

  • StratDragon Posts: 3,249

    The 20GB VRAM is going to balance down to the available RAM on one single card. You can combine the cores over multiple cards but not the RAM with Iray. So if you have 5 cards, each with 1000 cores and 4GB VRAM (and a mobo that can fit and power them), you would have 5000 cores and 4GB VRAM that can be utilized by Iray.

  • kyoto kid Posts: 41,198

    ...from what I have been reading about DX12 and Mantle, this could change.

  • Demiurgent Posts: 97

    The 20GB VRAM is going to balance down to the available RAM on one single card. You can combine the cores over multiple cards but not the RAM with Iray. So if you have 5 cards, each with 1000 cores and 4GB VRAM (and a mobo that can fit and power them), you would have 5000 cores and 4GB VRAM that can be utilized by Iray.

    You may be right, but my understanding is the S1070 will be recognized as one (or at most two) unit, rather than the four cards that are inside it. That would imply 8 or 16GB.

    I'm not going to sweat it either way. It was a good deal and will up my rendering game by a ton.

  • StratDragon Posts: 3,249

    The 20GB VRAM is going to balance down to the available RAM on one single card. You can combine the cores over multiple cards but not the RAM with Iray. So if you have 5 cards, each with 1000 cores and 4GB VRAM (and a mobo that can fit and power them), you would have 5000 cores and 4GB VRAM that can be utilized by Iray.

    You may be right, but my understanding is the S1070 will be recognized as one (or at most two) unit, rather than the four cards that are inside it. That would imply 8 or 16GB.

    I'm not going to sweat it either way. It was a good deal and will up my rendering game by a ton.

     

    I'm curious to see that too. Let us know if it sees all of it in one clip or not. In the meantime I found one on eBay for $150. Hopefully what you save vs. a new GPU is not what you pay to keep that thing running on your electric bill.

  • DAZ_Spooky Posts: 3,100

    The 20GB VRAM is going to balance down to the available RAM on one single card. You can combine the cores over multiple cards but not the RAM with Iray. So if you have 5 cards, each with 1000 cores and 4GB VRAM (and a mobo that can fit and power them), you would have 5000 cores and 4GB VRAM that can be utilized by Iray.

    You may be right, but my understanding is the S1070 will be recognized as one (or at most two) unit, rather than the four cards that are inside it. That would imply 8 or 16GB.

    I'm not going to sweat it either way. It was a good deal and will up my rendering game by a ton.

    4 cards, seen as 4 cards. Even a VCA is individual cards (8 cards, usually, each with 12 GB of RAM) Now, in theory you could fill a VCA with K80 cards, so that would be the equivalent of 15 or 16 cards, depending on if you actually need a Graphics card in a VCA. LOL. 

  • StratDragon Posts: 3,249

    The 20GB VRAM is going to balance down to the available RAM on one single card. You can combine the cores over multiple cards but not the RAM with Iray. So if you have 5 cards, each with 1000 cores and 4GB VRAM (and a mobo that can fit and power them), you would have 5000 cores and 4GB VRAM that can be utilized by Iray.

    You may be right, but my understanding is the S1070 will be recognized as one (or at most two) unit, rather than the four cards that are inside it. That would imply 8 or 16GB.

    I'm not going to sweat it either way. It was a good deal and will up my rendering game by a ton.

    4 cards, seen as 4 cards. Even a VCA is individual cards (8 cards, usually, each with 12 GB of RAM) Now, in theory you could fill a VCA with K80 cards, so that would be the equivalent of 15 or 16 cards, depending on if you actually need a Graphics card in a VCA. LOL. 

    4 cards as 4 cards, so RAM is whatever the maximum of a single card is, not any combination of all the cards combined?

  • DAZ_Spooky Posts: 3,100
    edited July 2015

    The 20GB VRAM is going to balance down to the available RAM on one single card. You can combine the cores over multiple cards but not the RAM with Iray. So if you have 5 cards, each with 1000 cores and 4GB VRAM (and a mobo that can fit and power them), you would have 5000 cores and 4GB VRAM that can be utilized by Iray.

    You may be right, but my understanding is the S1070 will be recognized as one (or at most two) unit, rather than the four cards that are inside it. That would imply 8 or 16GB.

    I'm not going to sweat it either way. It was a good deal and will up my rendering game by a ton.

    4 cards, seen as 4 cards. Even a VCA is individual cards (8 cards, usually, each with 12 GB of RAM) Now, in theory you could fill a VCA with K80 cards, so that would be the equivalent of 15 or 16 cards, depending on if you actually need a Graphics card in a VCA. LOL. 

    4 cards as 4 cards, so RAM is whatever the maximum of a single card is, not any combination of all the cards combined?

    Each card has to fit the scene in its own RAM in order to be used in the render. This is the case for 4 cards in a case, SLI, and even two cards on the same PCIe slot (Titan-Z, K80 and any of the GTX _90 cards as examples).

     

    So, for example, if your case has a K2200 (4GB), a K5200 (8GB), an M6000 (12GB) and a K80 (24GB as 2x12GB) and your scene fits on a 4GB card, then all the cards plus the CPU(s) will participate in the render.

    If your scene requires 7GB of video RAM, then the CPU(s) and all but the K2200 will participate in the render.

    If your scene requires 9GB of video RAM, then the CPU(s), the M6000 and both halves of the K80 will participate in the render, but the K2200 and the K5200 will not.

    If your scene requires more than 12GB of video RAM, then only the CPU will participate in the render, as the K80 is 2xK40 on the same PCIe slot (12GB per half).
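
    The participation rule in this example can be sketched as a simple filter. The `Gpu` class and `participating_gpus` function are hypothetical names, not part of Iray or DAZ Studio, and the sketch treats each GPU's nominal VRAM as fully available, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    vram_gb: float  # memory of ONE GPU; a dual-GPU board is listed as two entries


# The example system from the post; the K80 is split into its two 12GB halves.
DEVICES = [
    Gpu("K2200", 4),
    Gpu("K5200", 8),
    Gpu("M6000", 12),
    Gpu("K80 (half 1)", 12),
    Gpu("K80 (half 2)", 12),
]


def participating_gpus(devices, scene_gb):
    """Return the names of GPUs whose own VRAM can hold the whole scene.

    VRAM is never pooled across cards: each GPU is checked on its own.
    """
    return [gpu.name for gpu in devices if gpu.vram_gb >= scene_gb]


for scene_gb in (4, 7, 9, 13):
    gpus = participating_gpus(DEVICES, scene_gb)
    # In the post's example the CPU always takes part, so it is reported separately.
    print(f"{scene_gb} GB scene: CPU + {gpus if gpus else 'no GPUs'}")
```

    Listing a dual-GPU board as two entries keeps the rule uniform: nothing in the check ever adds memory from two devices together.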

  • Zilvergrafix Posts: 1,385
    edited July 2015
    We did a bit of testing with a GTX 770 (2GB) card before making the 4GB recommendation. A single Genesis 2 Female, with HD Morphs, high res textures, clothing and hair may or may not fit on a 2GB card (It is definitely borderline.). You can, on average, fit 2-4 G2F with clothing and hair plus an environment onto 4GB.

    That's why there are options to remove textures on the Surfaces tab. For example, you don't need the specular or bump maps in Iray, nor the SSS map, that are commonly applied on Genesis figures.

    And getting 4096x4096 textures for the eyes is...not! I've seen spectacular renderings on CGSociety, where all the users create their own works from zero, and if you look at their workflow, it's just a single texture and it's 1024x1024!

  • DAZ_Spooky Posts: 3,100
    We did a bit of testing with a GTX 770 (2GB) card before making the 4GB recommendation. A single Genesis 2 Female, with HD Morphs, high res textures, clothing and hair may or may not fit on a 2GB card (It is definitely borderline.). You can, on average, fit 2-4 G2F with clothing and hair plus an environment onto 4GB.

    That's why there are options to remove textures on the Surfaces tab. For example, you don't need the specular or bump maps in Iray, nor the SSS map, that are commonly applied on Genesis figures.

    And getting 4096x4096 textures for the eyes is...not! I've seen spectacular renderings on CGSociety, where all the users create their own works from zero, and if you look at their workflow, it's just a single texture and it's 1024x1024!

    You could also reduce all your texture sizes, or use texture atlas or similar utility. That was not the point of the exercise. 

    The recommended size is 4GB because that works well with a reasonable amount of content in the scene, as seen in various online galleries, without modifying the content as loaded.

    2GB cards drop out more often than they participate in the render.

  • mjc1016 Posts: 15,001
     

    2GB cards drop out more often than they participate in the render.

    And 1 GB cards will seldom start...unless it's a very simple scene.

  • chorse Posts: 163
    edited July 2015
    I was just wondering: why couldn't the Iray rendering memory issue be solved programmatically? E.g., if the video card only has 2GB of RAM and the scene is 8GB, why couldn't we break the scene down programmatically into 4 separate 2GB chunks, and render each chunk separately? So basically a quarter of the scene would be rendered at a time. If you had a 4GB card, then half the scene would be rendered at a time, etc. This would seem to be a solution that could allow users to render with Iray without having to upgrade their existing systems.
  • Richard Haseltine Posts: 102,231
    I was just wondering: why couldn't the Iray rendering memory issue be solved programmatically? E.g., if the video card only has 2GB of RAM and the scene is 8GB, why couldn't we break the scene down programmatically into 4 separate 2GB chunks, and render each chunk separately? So basically a quarter of the scene would be rendered at a time. If you had a 4GB card, then half the scene would be rendered at a time, etc. This would seem to be a solution that could allow users to render with Iray without having to upgrade their existing systems.

    Presumably because all parts of the scene need to interact to get the lighting, reflections etc. right.

  • Joe Cotter Posts: 3,259
    edited July 2015

    In relation to what Richard said, it would require an entirely different design to do that, which would probably* cause a significant hit to render time, potentially even when only using the video card. By having a simple go/no-go decision on using the render card or the CPU, the code can be much more streamlined and optimized. NVIDIA is investing a not insubstantial amount of money in developing this render engine and they are doing it for a purpose. They want to showcase what a better video card can do, i.e., sales of video cards are what fund this render engine, so for their purpose, optimizing this way makes sense. It also makes sense for people who want a render engine that is optimized to run in the video card space, as it should perform better than a hybrid.

    *The use of probably/possibly is because while I do have a programming background it isn't in render engines specifically and even if I did have, I would have had to go through the code to be sure. Having said that, the comments are based on general programming constructs and their efficiencies. There are always tradeoffs.

  • kyoto kid Posts: 41,198

    ...Octane's hybrid GPU/CPU render mode separates the Geometry from the Texture so the Geometry stays in Video memory while the Texture data goes to CPU/Physical Memory.  As I have read in other threads here it still is pretty fast in comparison to Iray's or LuxRender's (now abandoned) hybrid mode.

  • Tobor Posts: 2,300

    I don't know if any of these solutions are in nVidia's best interest. Their goal is to sell more higher-end cards. As it turns out, though, even 2GB can render a usable scene -- I regularly keep my 620 2GB card in the render list, and while it only has about 100 cores, the more the merrier. HDR dome, three lights, G2F with std def texture, clothes, geometric hair, and some jewelry: 1500 MB, and that includes the VRAM used for the monitor. (I will add that most of the clothes are procedural, using Iray-based shaders.)

    They could turn CUDAs from an otherwise unused core into worker threads to assist the CPU with the floating-point math necessary for ray tracing, but again, they'd like you to buy a TITAN, or sign up to a render cloud service that has invested $500K in nVidia hardware!

     
