P4 Dork is out of the Graveyard - Carrara to SD workflow - warning AI discussion

DiomedeDiomede Posts: 15,169
edited November 2023 in Carrara Discussion

Have been experimenting with Stable Diffusion.  Carrara is ideal for creating quick simple starting images for Stable Diffusion's image2image processor.  Here, I reached back into the depths of my content libraries and pulled out 'Dork,' also known as the Poser 4 man.  Quickly and easily blocked out a scene using a variety of Carrara tools.  10 minutes, max.  I used the spline modeler for the arch.  The path and grass are primitive planes.  The trees are default carrara plant editor trees.  The moon is a primitive cylinder.  The gravestones are vertex grids with added thickness, and duplicated.  I inserted a spotlight by the 'moon' to get the shadows.  That is it.

Here is the raw render.

Then I took the raw render and put it in SD's image2image processor.  I used simple prompts for a man running out of a graveyard under an arch at night with a full moon.  And the result is not just a random image, but one with the details that I wanted.  That was just one pass.  An actual workflow would be to iteratively improve the elements of the image with masking specific sections of the image, more prompts, etc.  Dork is back!

00 Carrara and Stable Diffusion.png
800 x 800 - 571K
Screenshot 2023-11-15 052245.png
947 x 942 - 1M
Post edited by Diomede on
«13

Comments

  • DiomedeDiomede Posts: 15,169
    edited November 2023

    I am putting this post here instead of in the Art Forum because I am emphasizing how simple this particular Carrara scene is, and how easily Carrara can edit all types of 'block out' elements.

    Screenshot 2023-11-15 053728.png
    1818 x 990 - 296K
    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169
    edited November 2023

    Particularity!

    One could easily put in the prompts without the Carrara render and get a different scene, a 'better' scene, out of Stable Diffusion or Midjourney or Dall-E or whatever.  That is not the point here.  The point here is that you can get this particular scene and setting.  This particular graveyard, this particular path, this particular placement of the moon, etc.  There are other ways to accomplish the same end, but hopefully one can see how quickly and easily Carrara can be used in this way.  I am emphasizing the editing tools, one example being the spline modeler to make the arch.

    Post edited by Diomede on
  • Steve KSteve K Posts: 3,234

    "That must be wonderful, I don't understand it at all"  (Moliere)  But the final product is impressive.  yes

  • DartanbeckDartanbeck Posts: 21,568

    That turned out Fantastic!!! I must say, however... I didn't recall P4 Man looking so good in a raw render. I remember him as looking pretty bad. Must've been that I was so new to 3d back then. Wow. SD really made him look Great!!!

  • DartanbeckDartanbeck Posts: 21,568

    It probably won't be too long before they figure out how to do hands, paws, etc., better.

  • GreybroGreybro Posts: 2,502

    I like content like this. I do play a little SD myself. Very cool.

  • StezzaStezza Posts: 8,060

    I haven't used  Stable Diffusion so wouldn't know where to start.. this means you @Diomede will have to start a thread tutorial from 'Carrara to Stable Diffusion for dummies' wink 

    but wow.... Dork came out well yes

  • DartanbeckDartanbeck Posts: 21,568

    laugh

  • DiomedeDiomede Posts: 15,169

    Stezza said:

    I haven't used  Stable Diffusion so wouldn't know where to start.. this means you @Diomede will have to start a thread tutorial from 'Carrara to Stable Diffusion for dummies' wink 

    but wow.... Dork came out well yes

    Might start such a thread, although I think my current interest in Stable Diffusion will be brief. 

     

  • DiomedeDiomede Posts: 15,169
    edited November 2023

    One more illustration.  Posing, composition, and similar factors drive my current interest in a 3D-to-Stable Diffusion workflow.  A plugin called Controlnet is key.  I will use the Poser 4 Dork Businessman and the Controlnet functions for 'Canny' and 'OpenPose' to demonstrate.

    Oversimplified summary.  Stable Diffusion starts with noisy pixels, just random spots, and then uses fancy math to converge on an image defined by a set of words or another image.  A dataset correlates the words with elements of other images which have been tagged with the words.  Describe in words what you want the image to be, and the program uses the labels/tags in the dataset/model to converge the noisy image to what correlates to the words you listed.  You can also use a second image as a prompt instead of words.  Controlnet cn help with that.  Controlnet has functions which can take elements from one image and use them as guideposts for a second image.  The 'canny' function identifies the edges in one image.  The 'open pose' function identifies the angles of human limbs in one image.  So, a 3D program like Carrara or Daz Studio or Blender or Poser or... can be used to quickly specify a pose, or arrange a composition, or... in one image to be used to influence a second image.  Here, I am going to use the ancient Poser 4 Dork businessman in Controlnet to direct Stable Diffusion on how to interpret the verbal prompts.  Clear as mud?

    A - I want a man in a suit sitting in an office, and I want to see the whole man, not just from thighs up.

    a1. Stable Diffusion with No Help from Carrara - old man in suit

    I list a bunch of positive prompts in the first big box, such as detective sitting in office, gray hair, etc.  And the second big box has negative prompts, such as low quality, bad anatomy, etc.  I set it to give me 4 choices, seen in the bottom right of the attached screengrab.  In the attached screengrab, see how the four choices have a variety of poses but I want something specific.  

    a2. A Carrara render of P4 Dork Businessman in sitting pose.

    a3.  Place Carrara render of sitting businessman in Controlnet area of Stable Diffusion, check 'enable,' choose the function (canny in this case), and click the fireball looking thing to preprocess.  A render with black background and white lines appears next to the Controlnet imported Carrara render.  Use stable diffusion to generate again with the same prompts as (a1).  See how the resulting image has a man matching the same pose as the Canny white lines.

    a4.  Repeat the same process with the 'openpose' function.  See how the white outline of canny has been replaced by a multi-colored stick figure matching the sitting pose of the Carrara render.  Running Stable Diffusion with the same verbal prompts again gives a result of a man in a suit in the correct pose.  So either function might work depending on the situation.

    a5 Repeat but instead of sitting,make the old man stand and see the full figure, not just thighs up.  I have a second render of the P4 Dork businessman.

    a6.  Here is the sitting pose render example again, but this time instead of gray hair, the verbal prompts specify long blonde hair.  The background description is changed to a bench in the park.

     

    .

     

     

    dd02 initial pass with no controlnet.png
    1838 x 992 - 363K
    dd01 screenshot sitting.png
    1573 x 983 - 264K
    dd03 pass with controlnet canny.png
    1688 x 961 - 344K
    dd03 pass with controlnet openpose.png
    1576 x 1022 - 401K
    dd08 results of long blonde hair park bench.png
    1571 x 888 - 403K
    dd04 pass with controlnet walking openpose.png
    1598 x 915 - 368K
    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169

    Bottom line - in my view, 3D programs and 3D skills can be partnered with Stable Diffusion, especially with Controlnet functions like Canny and OpenPose.

  • I am doing this a lot, but gasp, using DAZ studio with Filament blush, so cannot share here

    there is a thread in Art Studio I do though so no problems yes https://www.daz3d.com/forums/discussion/591121/remixing-your-art-with-ai#latest

     

    this video sort of can be shared here as a Carrara Howie Farkes Tea Garden video was used in Stable Diffusion Deforum for the background

    Bunny is DAZ studio iray however

  • DiomedeDiomede Posts: 15,169

    Looking good.  I still am not animating in Stable diffusion.  The animaiton related extensions have conflicts somehow for my install.  Can't identify the source.

  • wsterdanwsterdan Posts: 2,344

    Stezza said:

    I haven't used  Stable Diffusion so wouldn't know where to start.. this means you @Diomede will have to start a thread tutorial from 'Carrara to Stable Diffusion for dummies' wink 

    but wow.... Dork came out well yes

     Yes, absolutely amazing. This is what I think about when someone says, "remixing your art". Very impressive.

  • DiomedeDiomede Posts: 15,169

    wsterdan said:

    Stezza said:

    I haven't used  Stable Diffusion so wouldn't know where to start.. this means you @Diomede will have to start a thread tutorial from 'Carrara to Stable Diffusion for dummies' wink 

    but wow.... Dork came out well yes

     Yes, absolutely amazing. This is what I think about when someone says, "remixing your art". Very impressive.

    Thanks, Walt.  Always more to learn.

  • DiomedeDiomede Posts: 15,169

    Another Example. Here is the Stable Diffusion AI alteration of a Carrara render.

  • DiomedeDiomede Posts: 15,169
    edited November 2023

    Here is the original Carrara render using the Poser 7 Simon and Sydney models.  I also rendered out a Depth pass, which I used in the Controlnet extension for Stable Diffusion.

    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169
    edited November 2023

    The Carrara scene and render took about 15 minutes to generate my starter scene.  I was playing around for a couple hours nefore I settled on the Stable Diffusion image.  I had decent alternatives within 20 minutes of getting started but it can be fun to play around with different settings.  I ended up using two of Controlnet's extensions at the same time, one is the Carrara depth pass. and the other was 'canny.' to find the outlines.  And it is possible to mask out only part of the image within Stable Diffusion for further manipulation.  Here is an example of a image I ultimately rejected.

    .

     

     

    Goerge Dragon Carrara Output 01_Depth.png
    512 x 768 - 72K
    13 using inpaint mask out the foreground woman and pole and edit the prompt to emphasise foreground and remove armor references.png
    1084 x 1008 - 445K
    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169
    edited November 2023

    One more.  Fifteen minutes for a quick Cary Grant North by Northwest highlight.  Poser 4 Dork Businessman and the Wright brothers plane from the Carrara content browser.

    Poser 4 / Carrara 5 level scene.

    Run through Stable Diffusion Controlnet 'canny' adjustment one time with simple Cary Grant running toward camera with biplane over shouder in a farm field as the prompt., no other adjustments.

     

     

     

    Cary Grant larger.png
    1024 x 1024 - 938K
    Cary Grant Carrara Output 01.png
    512 x 512 - 70K
    Post edited by Diomede on
  • I am curious how Brash and Moxie would look realistic now

  • DiomedeDiomede Posts: 15,169

    WendyLuvsCatz said:

    I am curious how Brash and Moxie would look realistic now

    Don't tell anyone, but I have been reviewing how to make a LoRa for each.  yes

  • Diomede said:

    WendyLuvsCatz said:

    I am curious how Brash and Moxie would look realistic now

    Don't tell anyone, but I have been reviewing how to make a LoRa for each.  yes

    yes I should try a LoRa, is probably easier than an Embedding 

  • DiomedeDiomede Posts: 15,169
    edited November 2023

    I don't know enough to have any idea if it is better to make a LoRa or do embedding. I've only tried one embedding that I downloaded.  The effect was too strong for my taste but given my beginner level of skill the issue could easily be that I did not understand how to control it.

     

     

     

    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169
    edited November 2023

    King Kong climbing the Eifel Tower. 

    Limiting myself to using verbal prompts, I had a very hard time generating King Kong doing the Paris thing.  I could get very good renders that included both Kong and the tower - sitting on a ledge of the tower, standing next to the tower, etc, etc, etc.  I even tried putting the word climbing in ((())) triple parantheses!  I attach an example 4 images of the kind of results I got from merely using verbal prompts in Stable Diffusion.  They are really good images of Kong, just not what I am specifically looking for.

    aa05 results without Carrara setup.png
    944 x 933 - 1M
    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169
    edited November 2023

    Ghoulish Objects from the Carrara Native Content Browser Graveyard

    There is a very rudimentary gorilla model in the native content browser.  I rigged it and posed it in a climbing position.  There is also a simple Eifel Tower model.  I stuck it on a Carrara terrain with a plateau filter.  I placed the Gorilla on the side of the tower. 

     

     

    aa03 eiffel tower from browser.png
    1721 x 992 - 362K
    Doc1.pnKong Paris 02g.png
    600 x 600 - 130K
    Post edited by Diomede on
  • DiomedeDiomede Posts: 15,169
    edited November 2023

     This time, I started with the Carrara render posted above.  This time, I used the standard render in Controlnet using the 'canny' function, not a depth pass. 

    The result

     

    aa01 gorilla from content browser rigged with carrara rigging.png
    1541 x 854 - 327K
    aa02 carrara scene setup.png
    1621 x 947 - 266K
    aa03 eiffel tower from browser.png
    1721 x 992 - 362K
    aa04 after stable diffusion.jpg
    512 x 406 - 62K
    Post edited by Diomede on
  • you are really getting the hang of this yes

    in a year we will be watching Tom and Jerry cartoons  AI filtered with real fur etc

    already possible for single frames

    I imagine video games, old films will all get the AI treatment

  • DiomedeDiomede Posts: 15,169

    Thanks Wendy.  You are correct about the likely application to old films and similar.  Under American law and most other places by treaty, films before 1928 are now public domain.  Someone could AI process the silent works of Buster Keaton and Charlie Chaplin, for example.

  • DiomedeDiomede Posts: 15,169

    Brash and Moxie?

    As an experiment, I tried a simple scene with my custom Brash and Moxie figures.  I loaded Brash 2 and Moxie 3 with their formal uniforms in Carrara.  Composed props and set using simple primitives.  Rendered out a base photoreal render and a depth pass.  Placed the Photoreal render in the 'canny' channel, the depth pass in the depth channel, reduced each channel to 0.5, and prompted Stable Diffusion to generate an image in 1940s comic style, but put anime among the negative prompts.

    Base Render

    Result

     

    SD output hi 2.png
    1024 x 1024 - 966K
  • DiomedeDiomede Posts: 15,169
    edited November 2023

    Here is the result of using the same verbal prompt and the same image prompts, but choosing a different model.

    Post edited by Diomede on
Sign In or Register to comment.