Image generation models

Waifu Diffusion 1.4 Epoch 2 + Tohoku Zunko PJ official illustrations: LoRA training test

Type: LoRA
Base model: Waifu Diffusion 1.4 Epoch 2
Generation examples
Generation example 1
<lora:tohoku_zunko-20230404.1-epoch-000010:1:OUTD>, (tohoku zunko girl:1), 1girl, solo, hairband, long hair, green hairband, japanese clothes, yellow eyes, smile, green hair, open mouth, upper body, white background, kimono, simple background, ahoge, very long hair, short kimono, looking at viewer, masterpiece, best quality, ultra-detailed
Negative prompt: lowres, ((bad anatomy)), ((bad hands)), text, missing finger, extra digits, fewer digits, blurry, ((mutated hands and fingers)), (poorly drawn face), ((mutation)), ((deformed face)), (ugly), ((bad proportions)), ((extra limbs)), extra face, (double head), (extra head), ((extra feet)), monster, logo, cropped, worst quality, jpeg, humpbacked, long body, long neck, ((jpeg artifacts)), deleted, old, oldest, ((censored)), ((bad aesthetic)), (mosaic censoring, bar censor, blur censor)
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 4052543271, Size: 512x512, Model hash: 1f108d4ceb, Model: wd-1-4-anime_e2
Generation example 2 ...
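The parameter line and the LoRA tag in generation example 1 follow a regular text layout, so they can be read back into structured values when comparing runs. The following is a minimal sketch, not the Web UI's own parser: the helper names `parse_params` and `parse_lora_tag` are hypothetical, and it assumes that no parameter value contains ", ". The third field of the tag (`OUTD`) is returned as-is; reading it as a block-weight preset name for the sd-webui-lora-block-weight extension is an assumption.

```python
import re

# Parameter line copied from generation example 1.
PARAMS_LINE = (
    "Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, "
    "Seed: 4052543271, Size: 512x512, Model hash: 1f108d4ceb, "
    "Model: wd-1-4-anime_e2"
)

# LoRA tag copied from the start of the prompt in generation example 1.
LORA_TAG = "<lora:tohoku_zunko-20230404.1-epoch-000010:1:OUTD>"


def parse_params(line):
    """Split a 'Key: value, Key: value' line into a dict of strings.

    Assumes no value contains ', ' (true for the line above, not in general).
    """
    params = {}
    for item in line.split(", "):
        key, _, value = item.partition(": ")
        params[key] = value
    return params


def parse_lora_tag(tag):
    """Split '<lora:NAME:WEIGHT[:EXTRA]>' into its colon-separated fields."""
    match = re.fullmatch(r"<lora:([^:>]+):([^:>]+)(?::([^:>]+))?>", tag)
    if match is None:
        raise ValueError(f"not a LoRA tag: {tag!r}")
    name, weight, extra = match.groups()
    return {"name": name, "weight": float(weight), "extra": extra}


params = parse_params(PARAMS_LINE)
width, height = (int(n) for n in params["Size"].split("x"))
lora = parse_lora_tag(LORA_TAG)

print(params["Sampler"], params["Seed"], width, height)
print(lora)
```

All values stay as strings except the image size and the LoRA weight, which are converted because they are the ones most likely to be compared numerically.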

Notes on Stable Diffusion

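Stable Diffusion is an open-source image generation AI model released in August 2022 jointly by three parties: the CompVis research group at the University of Munich, the startup Stability AI, and Runway. It is a text-to-image model, generating an image according to a "prompt," a sequence of words.
CompVis/stable-diffusion: A latent text-to-image diffusion model
It is distributed under CreativeML OpenRAIL-M, a custom open-source license that permits reuse subject to several restrictions prohibiting undesirable uses.
CreativeML OpenRAIL-M license text - CompVis/stable-diffusion - GitHub
Similar image generation AIs released around the same time include DALL·E 2 and Midjourney. DreamStudio is offered as the official paid service.
Because Stable Diffusion is open source, including its trained model, inference can be run in a user's local environment, on Colaboratory (the GPU environment provided by Google), or on the open-source community Hugging Face. Models modified by fine-tuning or other means can also be published.
Models derived from Stable Diffusion include Waifu Diffusion, which specializes in anime-style and moe illustrations, and the model behind NovelAI's image generation service.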

Research notes on the NovelAI leaked models

In October 2022, NovelAI released an image generation service based on its own Stable Diffusion derivative model.
Image Generation has arrived, NovelAI Diffusion is here! | Medium
However, the generative models believed to be used in this service were leaked.
animefull-final-pruned: sha256 925997e9
animesfw-final-pruned: sha256 1d4a34af
animevae.pt: sha256 f921fb3f
Models believed to be "NovelAI leak derivatives" are now in circulation. They were produced by various techniques, including model merging, which combines different Stable Diffusion derivative models.
Examples of models believed to be NovelAI leak derivatives
Anything V3/V5
Anything V4
OrangeMixs
Related links
Welcome to sdupdates Discussions! · questianon/sdupdates · Discussion #1 · GitHub
New (suspected) NAI model leak - AnythingV3.0 and VAE for it · AUTOMATIC1111/stable-diffusion-webui · Discussion #4516 · GitHub
Emulate NovelAI · AUTOMATIC1111/stable-diffusion-webui · Discussion #2017
Anything v4.5 VAE swapped · AUTOMATIC1111/stable-diffusion-webui · Discussion #7044
AI copyright issues grow more complex - Weekly ASCII
高杉　光一🦋 @14:59 on Twitter: "A lineage survey of the models commonly used in the image generation AI community, the so-called 'Asuka challenge.' It was originally used to investigate the relationship between NovelAI and the leaked NAI model. Here it is applied to every major SD1.4-family model I have on hand. Please excuse the image quality. #AIart #waifudiffusion #anythingv3 #ACertainThing https://t.co/6aP8g0q4pz" / Twitter

Notes on Stable Diffusion + LoRA

LoRA (Low-Rank Adaptation) is a method for efficiently fine-tuning large language models, proposed by Edward Hu et al. in 2021.
https://github.com/microsoft/LoRA
https://arxiv.org/abs/2106.09685
An important paradigm of natural language processing consists of large-scale pretraining on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example – deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pretrained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than finetuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at https://github.com/microsoft/LoRA.
...
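The idea in the abstract, freezing the pretrained weights and training only a low-rank update, can be illustrated with a small sketch in plain Python. For a frozen weight matrix W of shape d x k, LoRA trains two small matrices B (d x r) and A (r x k), and the adapted weight is W + BA. The helper names and all numeric values below are illustrative assumptions, not taken from the notes, and the scaling factor used in the paper is omitted for brevity.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w, b, a):
    """Return W + B @ A, the adapted weight with the low-rank update folded in."""
    delta = matmul(b, a)
    return [
        [w[i][j] + delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def trainable_params(d, k, r):
    """Return (full fine-tuning count, LoRA count) for one d x k weight matrix."""
    return d * k, r * (d + k)


# Illustrative toy values (not from the source): d = 3, k = 2, r = 1.
W = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # frozen pretrained weight, 3 x 2
B = [[1.0], [0.0], [-1.0]]                # trainable, 3 x 1
A = [[0.5, 2.0]]                          # trainable, 1 x 2

print(merge_lora(W, B, A))
print(trainable_params(4096, 4096, 8))
```

Because BA can be added into W once training is done, inference uses a single matrix of the original shape, which is consistent with the abstract's point about no additional inference latency. The parameter count shows why the approach is cheap: a rank-8 update to a 4096 x 4096 matrix trains 65,536 values instead of 16,777,216.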

Notes on Stable Diffusion Web UI and related tools and models

Stable Diffusion Web UI (AUTOMATIC1111)
The basic Web UI used for inference. Dedicated extensions can be installed. Some extensions add inference features, and others can be used to prepare training data.
https://github.com/AUTOMATIC1111/stable-diffusion-webui
Current Commit Hash: 22bcc7be
https://github.com/aoirint/stable-diffusion-webui-docker
Extensions
Generate-TransparentIMG
https://github.com/hunyaramoke/Generate-TransparentIMG
Current Commit Hash: 6059579e
stable-diffusion-webui-wd14-tagger
https://github.com/toriato/stable-diffusion-webui-wd14-tagger
Current Commit Hash: 3ba3a735
sd-webui-lora-block-weight
https://github.com/hako-mikan/sd-webui-lora-block-weight
Current Commit Hash: 9bd7fa16
a1111-sd-webui-locon
https://github.com/KohakuBlueleaf/a1111-sd-webui-locon
Current Commit Hash: 8e0ebd76
a1111-sd-webui-tagcomplete
https://github.com/DominikDoom/a1111-sd-webui-tagcomplete
Current Commit Hash: 15538336
Not working?
Models
Waifu Diffusion
A trained model based on Stable Diffusion and specialized for anime-style illustrations. Created and published by haru and others.
Model | Base model | Release notes
Waifu Diffusion 1.3 | Stable Diffusion 1.4 | Gist
Waifu Diffusion 1.4 | Stable Diffusion 2.1 | Gist
Waifu Diffusion 1.5 Beta 2 | Stable Diffusion 2.1 | Notion
Anything
A trained model based on Stable Diffusion and specialized for anime-style illustrations. Believed to be a NovelAI leak derivative.
...