Impact of the storage subsystem on DAZ Studio

GLE Posts: 52
edited August 2021 in Daz Studio Discussion

Hello everybody.
In this thread I will discuss aspects of storage solutions for DAZ Studio, based on my experience with various implementations through the years.

***EDIT: PLEASE NOTICE: CPU and GPU performance are far more important for the DAZ Studio workflow. Do not assume that storage can decrease your render times, unless your render window starts showing something only after a very long time. Storage performance can only affect load time, not execution time, unless your program needs to handle massive amounts of data, which is not the case for DAZ Studio. I'm discussing storage because it's not very well documented in this forum, unlike CPU and GPU performance***

Please keep in mind the following conventions:
- asset: a figure or prop complete with all its attributes, as shown in the viewport
- item: a single file contained in the library folder, irrespective of it being a morph, a texture, a geometry baseline, a pose preset, etc.

GENERAL CONSIDERATIONS
As with most programs, DAZ Studio's performance depends on the underlying hardware. This is also true for the storage subsystem: throwing more hardware at the problem will eventually fix it.
The good news is that as of today, we don't need fancy enterprise grade storage solutions to make Studio snappy.

What most of us see when we look at our Studio libraries is the space required to hold them. That's easy to measure.
What we don't consider is the distribution of files in the library and what Studio does when it loads an asset, especially a figure with many morph dials available (even if they are not dialed).
In a typical library, textures will make up the bulk of occupied space. Geometries and morph delta files will be numerous, but their size is going to be negligible.
When an asset is loaded, Studio needs to read all the relevant items in order to make it appear in the viewport. Therefore Studio will read a lot of small files (geometries), and a few large files (textures).
Also when we're browsing the library, the content pane will need to list all compatible files in each folder and load all the miniature icons, so while browsing we're always requesting many small files. The same scenario applies when we're performing "Content DB Maintenance".

We can conclude that Studio is not only in need of ever expanding storage capacity, but it will be extremely sensitive to storage latency (which affects the loading time of many small files) and relatively unaffected by raw storage throughput (needed to transfer large files).
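To see why latency dominates, here is a rough micro-benchmark (Python; the file names and sizes are made up for illustration) comparing many small reads, like geometry and morph items, against one large sequential read, like a texture. Note the OS page cache will flatter repeated runs; use fresh files for honest numbers.

```python
import os
import tempfile
import time

def time_reads(paths, chunk=1 << 20):
    """Read every file fully and return the elapsed seconds."""
    start = time.perf_counter()
    for p in paths:
        with open(p, "rb") as f:
            while f.read(chunk):
                pass
    return time.perf_counter() - start

# A throwaway "library": 500 small files (~4 KB each, like morph or
# geometry items) and one large file (~16 MB, like a texture).
root = tempfile.mkdtemp()
small = []
for i in range(500):
    path = os.path.join(root, f"morph_{i}.dsf")
    with open(path, "wb") as f:
        f.write(os.urandom(4096))
    small.append(path)
big = os.path.join(root, "texture.tif")
with open(big, "wb") as f:
    f.write(os.urandom(16 * 1024 * 1024))

t_small = time_reads(small)  # dominated by per-file latency (open/seek)
t_big = time_reads([big])    # dominated by sequential throughput
print(f"500 x 4 KB: {t_small:.4f}s   1 x 16 MB: {t_big:.4f}s")
```

On a spinning disk with a cold cache, the 500 tiny reads typically lose badly despite moving only ~2 MB of data, which is exactly the access pattern Studio generates when loading a figure.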
To the best of my knowledge, Studio doesn't allow for manual adjustment of storage behavior, and 4.15 is set up to wait for a long time before giving up or sending another request for the slow or stuck item.
If the storage subsystem cannot cope with the demand or is malfunctioning, the result is going to be abnormally long load times accompanied by "not responding & whiteout" window status. If left unattended, 4.15 will usually not crash and in the end load the asset or throw an error message. The log file will not record much information, and we'll have to infer performance by analysing logged load times for each item and comparing gaps in the timings with the filesize of items.
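That gap analysis can be scripted. The sketch below is illustrative only: the timestamp regex is an assumption, so adjust it to whatever format your log file actually uses, and the sample lines are invented.

```python
import re

# Assumed line shape: an "HH:MM:SS.mmm" timestamp near the start of each
# line -- adjust TS_RE to match your actual log format.
TS_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})")

def load_gaps(lines, threshold=1.0):
    """Yield (gap_seconds, line) whenever consecutive timestamped lines
    are more than `threshold` seconds apart -- the slow items."""
    prev = None
    for line in lines:
        m = TS_RE.search(line)
        if not m:
            continue
        h, mi, s, ms = map(int, m.groups())
        t = h * 3600 + mi * 60 + s + ms / 1000.0
        if prev is not None and t - prev > threshold:
            yield (t - prev, line.strip())
        prev = t

# Invented sample: a big gap before the texture line points at the
# item (or the storage path serving it) that stalled.
sample = [
    "12:00:01.000 Loading file: morph_a.dsf",
    "12:00:01.050 Loading file: morph_b.dsf",
    "12:00:09.400 Loading file: skin_8k.tif",
]
for gap, line in load_gaps(sample):
    print(f"{gap:6.2f}s before: {line}")
```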

The last thing to consider is that Studio doesn't multithread. After all items are retrieved from storage, a single process will sequentially load each asset in the scene, and will give back control of the window after everything is done and visible in the viewport.
Why Studio cannot load each asset in parallel, adding them to the viewport immediately, one by one, when they are ready, without locking the program, is beyond me. It would be such a time saver to be able to work on a figure while another is loading, or adjusting a figure pose while its clothing assets are loading, especially when opening a large scene.

STORAGE OPTIONS
Now that we know what Studio is doing with the storage, let's go through some configurations and see how they will affect performance.

A: Single traditional hard disk
This is probably one of the most common configurations, and it is bound to give the lowest possible performance. Traditional disks with platters have high latency because of how they are built. Low-RPM drives, such as the ones found in laptops, are particularly bad. In this case we're making things worse by mixing the library reads with all the other I/O occurring in the OS and other programs.

B: Dedicated single traditional hard disk
Putting the library on a dedicated storage device is the logical step up. We will usually need to buy a new drive, as most PCs come with only one factory installed; let's aim for a high-RPM unit with the lowest possible latency. Avoid "green", "compute"/"blue" and "surveillance"/"purple" drives: they're not built for our purposes. We'll avoid SMR drives at all costs; they are only good for archival roles. "Professional NAS" and "Enterprise" drives are going to be our preferred consumer options. If two disks look similar, we'll buy the one with lower latency and higher random IOPS, not the one with higher transfer rates in MBps.
This solution also comes with hidden advantages, such as the ability to backup the library without affecting the whole system performance. If external storage is required, we'll go with powered USB 3.0 or better enclosures.

C: SSD
As the tech savvy would expect, this option yields the best performance, thanks to the high speed and low latency of SSDs. Given the availability of multi-terabyte SSDs from reputable manufacturers at quite reasonable price points, it's an easy option.
DAZ Studio does not generate enough I/O to clog a single SSD in normal scenarios, so we can choose a dedicated or shared storage model without much impact on Studio performance.

D: Hybrid
We could split the library and host textures on a traditional hard disk, and the other smaller items on an SSD, but Studio works with relative paths and expects to find all the asset items in the same library. So this would not work. [EDIT: as discussed in subsequent posts, this should work fine as long as the assets are created correctly by their authors.]
We could split the library and host frequently used assets on an SSD, and rarely used ones on a hard disk, but this is cumbersome to maintain.
[EDIT: as suggested by PerttiA and thenoobducky, using mount-to-folder will allow us to forgo the multiple-library-folders functionality built into Studio and automate the split between multiple disks. We will present Studio with a single library directory where various subfolders are mounted to various drives fit for purpose.]
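The mount-to-folder idea can be sketched in a few lines. This is an illustration only: the paths and the data/Runtime mapping are made up, and on Windows you would mount each drive to a folder in Disk Management or create junctions with `mklink /J`, since `os.symlink` needs elevation or Developer Mode there. On POSIX systems plain symlinks demonstrate the same trick:

```python
import os
import tempfile

def link_library(library_root, mapping):
    """Build one logical library whose subfolders live on other drives.
    POSIX sketch using symlinks; on Windows use Disk Management mount
    points or `mklink /J <link> <target>` junctions instead."""
    os.makedirs(library_root, exist_ok=True)
    for sub, target in mapping.items():
        os.makedirs(target, exist_ok=True)
        link = os.path.join(library_root, sub)
        if not os.path.exists(link):
            os.symlink(target, link, target_is_directory=True)

base = tempfile.mkdtemp()
lib = os.path.join(base, "DazLibrary")  # the single folder Studio sees
link_library(lib, {
    "data": os.path.join(base, "ssd", "data"),        # small files -> SSD
    "Runtime": os.path.join(base, "hdd", "Runtime"),  # textures -> HDD
})
# A file written through the logical path lands on the backing "drive".
with open(os.path.join(lib, "data", "probe.txt"), "w") as f:
    f.write("ok")
print(os.path.exists(os.path.join(base, "ssd", "data", "probe.txt")))
```

Studio only ever sees `DazLibrary`; which physical drive serves each subtree is decided entirely at the OS level.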

E: Multiple disks and storage arrays
This can be done, but we will need enterprise grade hardware and careful sizing by an expert to compete with solution C. If not intimidated by the complexity of the setup, the price and the high power consumption, we can get impressive performance, reliability and flexibility going this way, especially if the network is 10Gbps or better and we plan to work with Studio from multiple computers.
Studio performance will not improve noticeably when jumping from a single SSD to an array of SSDs. Consumer SSDs are not designed to be used in arrays. Let's forget about RAID options provided by mainboards lacking a dedicated controller: latency would be too high.

TROUBLESHOOTING
To anyone seeing loading times of over 5 minutes for very complex scenes, I recommend checking the disks and motherboard, and trying to replace the disk data cables.
Other symptoms of storage trouble are missing miniatures when searching the content pane, and "not responding & whiteout" after right-clicking to refresh.
DAZ Studio is very sensitive to library performance and will begin to stutter well before any other program would misbehave. Files might transfer to and from the library directory just fine, but if Studio is too slow at loading, then it's time to look for storage problems.

CONCLUSIONS
A single large SSD is going to provide the best experience with Studio, without being difficult to set up or too costly.
Of course enterprise storage can be leveraged with great success if already installed or if peculiar requirements are in place.

Post edited by GLE on

Comments

  • thenoobducky Posts: 68
    edited August 2021

    GLE said:

    Hello everybody.
    In this thread I will discuss aspects of storage solutions for DAZ Studio, based on my experience with various implementations through the years.
    Please keep in mind the following conventions:
    - asset: a figure or prop complete with all its attributes, as shown in the viewport
    - item: a single file contained in the library folder, irrespective of it being a morph, a texture, a geometry baseline, a pose preset, etc.

    GENERAL CONSIDERATIONS
    As with most programs, DAZ Studio's performance depends on the underlying hardware. This is also true for the storage subsystem: throwing more hardware at the problem will eventually fix it.
    The good news is that as of today, we don't need fancy enterprise grade storage solutions to make Studio snappy.
    What most of us see when we look at our Studio libraries is the space required to hold them. That's easy to measure.
    What we don't consider is the distribution of files in the library and what Studio does when it loads an asset, especially a figure with many morph dials available (even if they are not dialed).
    In a typical library, textures will make up the bulk of occupied space. Geometries and morph delta files will be numerous, but their size is going to be negligible.
    When an asset is loaded, Studio needs to read all the relevant items in order to make it appear in the viewport. Therefore Studio will read a lot of small files (geometries), and a few large files (textures).

    Only the first time a figure is loaded, and again whenever the underlying files have changed. The Daz cache results in a single file (by default on the C drive, which hopefully is your fastest drive). The cache mainly contains the formulas from the small files.

    Geometry data is only loaded on demand. Daz checks the timestamps of files for changes. If a file has changed, Daz will rebuild the cache, incurring an additional lengthy read process and a lengthy write process. Frequently installing/removing items might actually make things slower.

    A faster drive would still help with the timestamp check step, but not nearly as much.
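    The timestamp check described above is easy to picture. This is not Daz's actual cache code, just a sketch of the mechanism, with made-up file names:

```python
import os
import tempfile

def cache_is_stale(cache_path, source_paths):
    """Rebuild is needed if the cache file is missing or any source
    file has a newer modification time than the cache."""
    if not os.path.exists(cache_path):
        return True
    cache_mtime = os.path.getmtime(cache_path)
    return any(os.path.getmtime(p) > cache_mtime for p in source_paths)

d = tempfile.mkdtemp()
src = os.path.join(d, "morph.dsf")       # hypothetical library item
cache = os.path.join(d, "figure.cache")  # hypothetical cache file
open(src, "w").close()
open(cache, "w").close()

os.utime(src, (0, 0))                 # source older than cache
print(cache_is_stale(cache, [src]))   # False: cache still valid
os.utime(src, None)                   # "reinstall" the item: bump mtime
os.utime(cache, (0, 0))
print(cache_is_stale(cache, [src]))   # True: lengthy rebuild follows
```

    This also shows why frequent installs/removals hurt: every bumped timestamp forces the expensive rebuild path.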

    Also, geometry files can actually be huge for HD morphs, though it also depends on whether they are compressed. The data files contain a lot of miscellaneous items that can take up a lot of room.

    Also when we're browsing the library, the content pane will need to list all compatible files in each folder and load all the miniature icons, so while browsing we're always requesting many small files. The same scenario applies when we're performing "Content DB Maintenance".

    We can conclude that Studio is not only in need of ever expanding storage capacity, but it will be extremely sensitive to storage latency (which affects the loading time of many small files) and relatively unaffected by raw storage throughput (needed to transfer large files).

    There are multiple reports that NVMe doesn't benefit Daz Studio as much, but it doesn't hurt if you have extra money for it. The other thing is that Daz will check for the existence of a file in every base directory you have added, making multiple filesystem inquiries. I suspect there might be a small impact if you have added a huge number of base directories.

    see: https://unofficial3dforums.com/t/idea-for-solving-slow-load-times/236
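    The per-base-directory lookup is essentially first-hit path resolution. A sketch of the idea (illustrative only, not Daz's actual code):

```python
import os
import tempfile

def resolve(relative_path, base_dirs):
    """Probe each mapped base directory in order and return the first
    hit -- one filesystem inquiry per base directory, as described."""
    for base in base_dirs:
        candidate = os.path.join(base, relative_path)
        if os.path.exists(candidate):
            return candidate
    return None

# Demo: two mapped libraries; the asset exists only in the second,
# so the lookup pays for a miss in the first before finding it.
base_a = tempfile.mkdtemp()
base_b = tempfile.mkdtemp()
rel = os.path.join("data", "morph.dsf")
os.makedirs(os.path.join(base_b, "data"))
open(os.path.join(base_b, rel), "w").close()
print(resolve(rel, [base_a, base_b]))
```

    Every extra base directory adds one more existence check per item, which is why a huge list of mapped folders could add a small amount of per-file overhead.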

    To the best of my knowledge, Studio doesn't allow for manual adjustment of storage behavior, and 4.15 is set up to wait for a long time before giving up or sending another request for the slow or stuck item.
    If the storage subsystem cannot cope with the demand or is malfunctioning, the result is going to be abnormally long load times accompanied by "not responding & whiteout" window status. If left unattended, 4.15 will usually not crash and in the end load the asset or throw an error message. The log file will not record much information, and we'll have to infer performance by analysing logged load times for each item and comparing gaps in the timings with the filesize of items.

    I have observed slow loading times when the drive was occupied with other read/write tasks, such as copying files at the same time. Something to avoid when you want to load something in Daz.

    The last thing to consider is that Studio doesn't multithread. After all items are retrieved from storage, a single process will sequentially load each asset in the scene, and will give back control of the window after everything is done and visible in the viewport.
    Why Studio cannot load each asset in parallel, adding them to the viewport immediately, one by one, when they are ready, without locking the program, is beyond me. It would be such a time saver to be able to work on a figure while another is loading, or adjusting a figure pose while its clothing assets are loading, especially when opening a large scene.

    Daz is multithreaded, but the file loading process appears to be single-threaded. Multithreading might make it 2-5x faster, which is something I hope the Daz devs will one day get around to doing, but alas, it is not happening right now. What is not acceptable IMO is the interface freeze with no way to cancel it, when that is something easy to implement.

    STORAGE OPTIONS
    Now that we know what Studio is doing with the storage, let's go through some configurations and see how they will affect performance.

    A: Single traditional hard disk
    This is probably one of the most common configurations, and it is bound to give the lowest possible performance. Traditional disks with platters have high latency because of how they are built. Low-RPM drives, such as the ones found in laptops, are particularly bad. In this case we're making things worse by mixing the library reads with all the other I/O occurring in the OS and other programs.

    B: Dedicated single traditional hard disk
    Putting the library on a dedicated storage device is the logical step up. We will usually need to buy a new drive, as most PCs come with only one factory installed; let's aim for a high-RPM unit with the lowest possible latency. Avoid "green", "compute"/"blue" and "surveillance"/"purple" drives: they're not built for our purposes. We'll avoid SMR drives at all costs; they are only good for archival roles. "Professional NAS" and "Enterprise" drives are going to be our preferred consumer options. If two disks look similar, we'll buy the one with lower latency and higher random IOPS, not the one with higher transfer rates in MBps.
    This solution also comes with hidden advantages, such as the ability to backup the library without affecting the whole system performance. If external storage is required, we'll go with powered USB 3.0 or better enclosures.

    C: SSD
    As the tech savvy would expect, this option yields the best performance, thanks to the high speed and low latency of SSDs. Given the availability of multi-terabyte SSDs from reputable manufacturers at quite reasonable price points, it's an easy option.
    DAZ Studio does not generate enough I/O to clog a single SSD in normal scenarios, so we can choose a dedicated or shared storage model without much impact on Studio performance.

    D: Hybrid
    We could split the library and host textures on a traditional hard disk, and the other smaller items on an SSD, but Studio works with relative paths and expects to find all the asset items in the same library. So this would not work.
    We could split the library and host frequently used resources on an SSD, and rarely used ones on a hard disk, but this is cumbersome to maintain.

    No, this should work, since Daz only cares about the relative path. As I mentioned previously, Daz checks for files in every base directory and treats each file separately, thus I can't see why it wouldn't work. The only annoying part would be getting DIM to install to one folder and then having to manually copy the textures over to the other folder. Although you can set up a symbolic link to redirect file access at the OS level to make it appear as if it is one folder.

    E: Multiple disks and storage arrays
    This can be done, but we will need enterprise grade hardware and careful sizing by an expert to compete with solution C. If not intimidated by the complexity of the setup, the price and the high power consumption, we can get impressive performance, reliability and flexibility going this way, especially if the network is 10Gbps or better and we plan to work with Studio from multiple computers.
    Studio performance will not improve noticeably when jumping from a single SSD to an array of SSDs. Consumer SSDs are not designed to be used in arrays. Let's forget about RAID options provided by mainboards lacking a dedicated controller: latency would be too high.

    TROUBLESHOOTING
    To anyone seeing loading times of over 5 minutes for very complex scenes, I recommend checking the disks and motherboard, and trying to replace the disk data cables.
    Other symptoms of storage trouble are missing miniatures when searching the content pane, and "not responding & whiteout" after right-clicking to refresh.
    DAZ Studio is very sensitive to library performance and will begin to stutter well before any other program would misbehave. Files might transfer to and from the library directory just fine, but if Studio is too slow at loading, then it's time to look for storage problems.

    Storage is not really the thing slowing down the load process, though, unless your drive is really busy. If the slowdown is a sudden new behaviour, conflicting morphs/formulas seem to be the major cause, especially if you have recently installed new content. Or Daz could be doing a cache rebuild. If anything, investing in a better CPU with the best single-core performance possible might be worth considering.

    The other solution to slow load times is to reduce the number of files Daz has to load, especially the files that have the most impact on loading time.

    CONCLUSIONS
    A single large SSD is going to provide the best experience with Studio, without being difficult to set up or too costly.
    Of course enterprise storage can be leveraged with great success if already installed or if peculiar requirements are in place.

    Post edited by thenoobducky on
  • PerttiA Posts: 10,024

    thenoobducky said:

    GLE said:

    D: Hybrid
    We could split the library and host textures on a traditional hard disk, and the other smaller items on an SSD, but Studio works with relative paths and expects to find all the asset items in the same library. So this would not work.
    We could split the library and host frequently used resources on an SSD, and rarely used ones on a hard disk, but this is cumbersome to maintain.

    No, this should work, since Daz only cares about the relative path. As I mentioned previously, Daz checks for files in every base directory and treats each file separately, thus I can't see why it wouldn't work. The only annoying part would be getting DIM to install to one folder and then having to manually copy the textures over to the other folder. Although you can set up a symbolic link to redirect file access at the OS level to make it appear as if it is one folder.

    "Hybrid" has no problems working.

    On my system, I have used mounted drives and junctions for 4-5 years already to divide data files onto SSDs and textures onto external USB drives - DS still sees just one intact Content Library.

    Mounting drives to folders on other drives can be done with the standard Windows disk tools.
    Junctions require SysInternals Junction, which can be downloaded from MS: https://docs.microsoft.com/en-us/sysinternals/downloads/junction

  • GLE Posts: 52

    Thanks thenoobducky and PerttiA for your insight. Of course for the vast majority of users storage is not going to be the most important bottleneck in Studio. As you say, other components impact overall performance much more, when considering the whole project & render workflow.

    I'm not sure about what Studio does with the cache. I've had the DSON cache and Temporary files directories on an NVMe drive for quite some time. Still, load times went down significantly when I moved my library from a single hard disk to an SSD. You can tell from task manager that some I/O happens at the library when you load your scene, way before the big burst of traffic at the time of texture loading.

    After seeing the benefit from the new SSD, out of curiosity I then moved the lib to my server storage (SAS hard disks, 10Gbps LAN). I noticed the performance went back to levels similar to the single SATA hard disk I used previously. After some investigation I discovered a small flaw in one CAT6 cable termination: one of the 8 copper wires was partially cut where the wire jacket was removed. All other applications were running fine, but Studio hated the increased latency. After repairing the flaw, performance became similar to the SSD.

    About splitting the library, I've tried some configurations and it will not always work as intended. I didn't document my findings, but IIRC you can for example set up a secondary library containing only \data\DAZ3D\people\Genesis 8\Male\morphs or whatever the correct path to the morphs is. If you fool around with the Runtime folder instead, things won't go as smoothly. I wasn't very thorough in these tests; maybe I did something incorrectly. PerttiA also seems to be happy with a split library, so that's more evidence "against me".

    I'll update the main post to make it clear that the dissertation is about storage performance per se, not "overall Studio performance magically improved with this one simple trick".

  • GLE said:

    Thanks thenoobducky and PerttiA for your insight. Of course for the vast majority of users storage is not going to be the most important bottleneck in Studio. As you say, other components impact overall performance much more, when considering the whole project & render workflow.

    I'm not sure about what Studio does with the cache. I've had the DSON cache and Temporary files directories on an NVMe drive for quite some time. Still, load times went down significantly when I moved my library from a single hard disk to an SSD. You can tell from task manager that some I/O happens at the library when you load your scene, way before the big burst of traffic at the time of texture loading.

    After seeing the benefit from the new SSD, out of curiosity I then moved the lib to my server storage (SAS hard disks, 10Gbps LAN). I noticed the performance went back to levels similar to the single SATA hard disk I used previously. After some investigation I discovered a small flaw in one CAT6 cable termination: one of the 8 copper wires was partially cut where the wire jacket was removed. All other applications were running fine, but Studio hated the increased latency. After repairing the flaw, performance became similar to the SSD.

    About splitting the library, I've tried some configurations and it will not always work as intended. I didn't document my findings, but IIRC you can for example set up a secondary library containing only \data\DAZ3D\people\Genesis 8\Male\morphs or whatever the correct path to the morphs is. If you fool around with the Runtime folder instead, things won't go as smoothly. I wasn't very thorough in these tests; maybe I did something incorrectly. PerttiA also seems to be happy with a split library, so that's more evidence "against me".

    I'll update the main post to make it clear that the dissertation is about storage performance per se, not "overall Studio performance magically improved with this one simple trick".

    Daz being sensitive to latency is believable; it could also explain my observations with HDD contention. PerttiA is talking about a hybrid approach using an OS-level feature that makes folders in multiple locations appear as a single folder to Daz. It is transparent to Daz, similar to a network drive. I don't know why manually splitting the folder wouldn't work, unless you have an absolute path to a file. I use a similar system with materials on HDD and data on SSD, except data/daz_3d is on SSD, and everything else on SSD.
  • PerttiA Posts: 10,024

    GLE said:

    About splitting the library, I've tried some configurations and it will not always work as intended. I didn't document my findings, but IIRC you can for example set up a secondary library containing only \data\DAZ3D\people\Genesis 8\Male\morphs or whatever the correct path to the morphs is. If you fool around with the Runtime folder instead, things won't go as smoothly. I wasn't very thorough in these tests; maybe I did something incorrectly.

    You don't have to "fool" around with the folder structure, you could even keep DS/DIM/DAZ Central/DAZ Connect believing that the one and only Content Library was on a 250GB C-drive, while in reality all the files would be located on several other drives.

    I no longer replace drives unless they are way smaller than the others; I just add more drives and use junctions to move stuff onto them. This allows me to use smaller (1TB SSD / 4TB USB) drives instead of having to pay through the nose for one big enough to hold everything, only to find that in a year I would need to replace it with an even bigger and more expensive one.

  • GLE Posts: 52

    If I understand correctly, you are referring to this functionality: https://docs.microsoft.com/en-us/windows-server/storage/disk-management/assign-a-mount-point-folder-path-to-a-drive
    It's an elegant solution, the only caveat I can think of is keeping track of the various disks and mount points if the number of disks increases.

    Another approach that might yield good performance is Storage Spaces. The problem with this solution is that consumer grade hardware doesn't alert you of a failing drive. You will just see deteriorated performance from the drive pool, with all green on the status dashboard. You'll need to infer that there's a problem and manually test and check the drives, possibly one by one and hooked up to a different PC. Server HBAs will, in contrast, monitor disk health and disable faulty disks, allowing the drive pool to become degraded and reclaim some performance if there is enough redundancy: the pool no longer has to wait for the failed drive and will run faster until you replace the dead disk, assign it to the pool and start "resilvering". Performance will then severely deteriorate until resilvering is complete and the pool is healthy again. If redundancy is insufficient, the pool will be brought offline and you'll need to restore from a valid backup after repairing the pool.

  • PerttiA Posts: 10,024

    GLE said:

    If I understand correctly, you are referring to this functionality: https://docs.microsoft.com/en-us/windows-server/storage/disk-management/assign-a-mount-point-folder-path-to-a-drive
    It's an elegant solution, the only caveat I can think of is keeping track of the various disks and mount points if the number of disks increases.

    I have stored the drive letter in the name of the drive, and only need to check that the drive letter assigned to a drive matches the one in the name and everything works.
    Drive mounting can be done without any command-line tools, in "Computer Management->Storage->Disk Management"

    The only problem with mounting a drive is that once you have mounted ...\Runtime\Textures\ on another drive, the next level of texture folders can be thousands of individual folders and none of them have enough stuff to warrant a whole drive - That is where junctions come in. I can take 50 of the biggest subfolders and move them to another drive and the junctions keep the original folder structure still intact.
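    Picking the 50 biggest subfolders can be scripted rather than eyeballed. A Python sketch (the folder names below are invented; after moving a folder with e.g. robocopy, `mklink /J` recreates it in place as a junction):

```python
import os
import tempfile

def folder_size(path):
    """Total size in bytes of every file under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def biggest_subfolders(parent, n=50):
    """Immediate subfolders of `parent` (e.g. Runtime\\Textures),
    largest first -- candidates for moving to another drive."""
    subs = [os.path.join(parent, d) for d in os.listdir(parent)
            if os.path.isdir(os.path.join(parent, d))]
    return sorted(subs, key=folder_size, reverse=True)[:n]

# Demo with a throwaway texture tree of made-up vendor folders.
textures = tempfile.mkdtemp()
for name, size in [("SkinVendorA", 3000), ("TinyProp", 10), ("BigSetB", 900)]:
    os.makedirs(os.path.join(textures, name))
    with open(os.path.join(textures, name, "tex.dat"), "wb") as f:
        f.write(b"x" * size)
print([os.path.basename(p) for p in biggest_subfolders(textures, 2)])
# -> ['SkinVendorA', 'BigSetB']
```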

    Another approach that might yield good performance is Storage Spaces. The problem with this solution is that consumer grade hardware doesn't alert you of a failing drive. You will just see deteriorated performance from the drive pool, with all green on the status dashboard. You'll need to infer that there's a problem and manually test and check the drives, possibly one by one and hooked up to a different PC. Server HBAs will, in contrast, monitor disk health and disable faulty disks, allowing the drive pool to become degraded and reclaim some performance if there is enough redundancy: the pool no longer has to wait for the failed drive and will run faster until you replace the dead disk, assign it to the pool and start "resilvering". Performance will then severely deteriorate until resilvering is complete and the pool is healthy again. If redundancy is insufficient, the pool will be brought offline and you'll need to restore from a valid backup after repairing the pool.

    I wouldn't go there... The problem is that once you create the pool, the individual drives are probably not accessible by themselves anymore (not 100% sure, as I haven't tried it). But I do have bad experiences from days gone by, when the big thing was drive compression that could double the size of your 40MB HD, and again around the turn of the century, when I was pushing the limits of read/write speeds with a 4x7200rpm RAID setup cooled with 3 PSU fans.
