
Notes on Diffusion-GAN: Training GANs with Diffusion

Editor: rootadmin

Diffusion-GAN: training a GAN together with diffusion

paper: https://arxiv.org/abs/2206.02262

code: Zhendong-Wang/Diffusion-GAN on GitHub (official PyTorch implementation of the paper)

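In Figure 1, the first row, read left to right, shows the forward diffusion process applied repeatedly to a real image. The third row shows the same forward diffusion applied to a generated (fake) image; as the caption states, it is not a reverse process that recovers an image from noise. The second row is the discriminator D, which discriminates at every timestep.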

 Figure 1: Flowchart for Diffusion-GAN. The top-row images represent the forward diffusion process of a real image, while the bottom-row images represent the forward diffusion process of a generated fake image. The discriminator learns to distinguish a diffused real image from a diffused fake image at all diffusion steps.

The Diffusion-GAN framework is illustrated in Figure 1. In Diffusion-GAN, the input to the diffusion process is either a real or a generated image, and the diffusion process consists of a series of steps that gradually add noise to the image. The number of diffusion steps is not fixed, but depends on the data and the generator. We also design the diffusion process to be differentiable, which means that we can compute the derivative of the output with respect to the input. This allows us to propagate the gradient from the discriminator to the generator through the diffusion process, and update the generator accordingly.

Unlike vanilla GANs, which compare the real and generated images directly, Diffusion-GAN compares noisy versions of them, obtained by sampling from the Gaussian mixture distribution over the diffusion steps, with the help of our timestep-dependent discriminator. The components of this distribution have different noise-to-data ratios, so some components add more noise than others. Sampling from this distribution brings two benefits. First, it stabilizes training by easing the vanishing-gradient problem, which occurs when the data and generator distributions are too different. Second, it augments the data by creating different noisy versions of the same image, which can improve the data efficiency and the diversity of the generator.

We provide a theoretical analysis to support our method, and show that the min-max objective function of Diffusion-GAN, which measures the difference between the data and generator distributions, is continuous and differentiable everywhere. This means that the generator in theory can always receive a useful gradient from the discriminator, and improve its performance. [Note: G can receive useful gradients from D, which improves G's performance.]
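To make the noising step concrete, here is a minimal NumPy sketch of sampling diffused real and fake images at random timesteps. It assumes a standard DDPM-style forward process with a linear beta schedule; the schedule, the helper names `make_alpha_bar` and `diffuse`, and the image shapes are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def make_alpha_bar(T, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal coefficients for a linear beta schedule (an assumption)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def diffuse(x, t, alpha_bar, rng):
    """Sample y = sqrt(a_t) * x + sqrt(1 - a_t) * eps, one timestep per batch row."""
    a = alpha_bar[t].reshape(-1, *([1] * (x.ndim - 1)))  # broadcast over image dims
    eps = rng.standard_normal(x.shape)
    return np.sqrt(a) * x + np.sqrt(1.0 - a) * eps

rng = np.random.default_rng(0)
T = 100
alpha_bar = make_alpha_bar(T)

# Random arrays stand in for a batch of real images and generator outputs.
real = rng.uniform(-1.0, 1.0, size=(8, 3, 4, 4))
fake = rng.uniform(-1.0, 1.0, size=(8, 3, 4, 4))

# Drawing t at random per sample gives a mixture over noise levels.
t = rng.integers(0, T, size=8)
y_real = diffuse(real, t, alpha_bar, rng)
y_fake = diffuse(fake, t, alpha_bar, rng)

# Noise-to-data ratio of each mixture component; it grows with t.
noise_to_data = (1.0 - alpha_bar) / alpha_bar
```

Because `diffuse` is linear in `x`, the gradient of the discriminator's output with respect to the generated image passes straight through it, which is the differentiability property described above. A timestep-dependent discriminator would then receive both the diffused sample and `t`.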

Main contributions:

1) We show both theoretically and empirically how the diffusion process can be utilized to provide a model- and domain-agnostic differentiable augmentation, enabling data-efficient and leaking-free stable GAN training. [Note: this stabilizes GAN training.]
2) Extensive experiments show that Diffusion-GAN boosts the stability and generation performance of strong baselines, including StyleGAN2, Projected GAN, and InsGen, achieving state-of-the-art results in synthesizing photo-realistic images, as measured by both the Fréchet Inception Distance (FID) and Recall score. [Note: diffusion improves the performance of frameworks originally built from GANs alone, such as StyleGAN2 and Projected GAN.]


Figure 2: The toy example inherited from Arjovsky et al. [2017]. The first row plots the distributions of data with diffusion noise injected for t. The second row shows the JS divergence and the optimal discriminator value with and without our noise injection. 
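The effect shown in Figure 2 can be reproduced in a simplified one-dimensional form. The sketch below places the data at 0 and the generator output at `theta`, adds Gaussian noise of standard deviation `sigma` to both, and integrates the Jensen-Shannon divergence numerically on a grid. The 1-D reduction, the function names, and the use of a single fixed `sigma` in place of a full diffusion schedule are simplifying assumptions for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def js_divergence(theta, sigma, grid):
    """JS divergence between N(0, sigma^2) and N(theta, sigma^2) by grid integration."""
    p = gaussian_pdf(grid, 0.0, sigma)
    q = gaussian_pdf(grid, theta, sigma)
    m = 0.5 * (p + q)
    dx = grid[1] - grid[0]

    def kl(a, b):
        # Skip points where the density has underflowed; their mass is negligible.
        mask = a > 1e-300
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) * dx

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

grid = np.linspace(-6.0, 6.0, 24001)

# Almost no noise: the divergence saturates near log 2 once theta != 0,
# so it carries almost no gradient with respect to theta.
js_sharp = js_divergence(1.0, 0.05, grid)

# Heavier noise: the divergence varies smoothly with theta.
js_smooth_far = js_divergence(1.0, 1.0, grid)
js_smooth_near = js_divergence(0.5, 1.0, grid)
```

With little noise the divergence is essentially a step function of `theta`, which is the vanishing-gradient situation of the original toy example; with more noise it decreases gradually as `theta` approaches 0.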

Figure 4: Plot of adaptively adjusted maximum diffusion steps T and discriminator outputs of Diffusion-GANs. 

To investigate how the adaptive diffusion process works during training, we illustrate in Figure 4 the convergence of the maximum timestep T in our adaptive diffusion and discriminator outputs. We see that T is adaptively adjusted: the T for Diffusion StyleGAN2 increases as training proceeds, while the T for Diffusion ProjectedGAN first goes up and then goes down. Note that T is adjusted according to the overfitting status of the discriminator. The second panel shows that, when trained with the diffusion-based mixture distribution, the discriminator is always well-behaved and provides useful learning signals for the generator, which validates our analysis in Section 3.4 and Theorem 1.

As the left panel of Figure 4 shows, the maximum diffusion timestep T changes adaptively during training (T is adjusted according to how strongly the discriminator D overfits). As the right panel shows, a discriminator trained with the diffusion-based mixture distribution always behaves well and provides a useful learning signal for the generator G.
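One plausible way to adjust T from the discriminator's overfitting status is sketched below. The text above only says that T follows the overfitting status of D; the specific overfitting estimate `r_d`, the target `d_target`, the step size, and the bounds are hypothetical choices made here for illustration.

```python
import numpy as np

def update_max_timestep(T, d_real_outputs, d_target=0.6, step=8, t_min=5, t_max=1000):
    """Raise T when D looks overconfident on real data, lower it otherwise.

    d_real_outputs: discriminator probabilities in [0, 1] on diffused real samples.
    All default values are illustrative assumptions.
    """
    d_real_outputs = np.asarray(d_real_outputs, dtype=float)
    # Overfitting estimate: close to 1 when D confidently labels real data as real.
    r_d = float(np.mean(np.sign(d_real_outputs - 0.5)))
    direction = int(np.sign(r_d - d_target))
    T_new = int(np.clip(T + direction * step, t_min, t_max))
    return T_new, r_d

# A confident discriminator pushes T up (more noise, harder task).
T_up, r_up = update_max_timestep(100, np.full(64, 0.9))

# An uncertain discriminator pulls T down (less noise, easier task).
T_down, r_down = update_max_timestep(100, np.full(64, 0.5))
```

Under this rule T rises while the discriminator separates real data too easily and falls once it no longer does, which matches the up-then-down curve described for Diffusion ProjectedGAN.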

Effectiveness of Diffusion-GAN for domain-agnostic augmentation

25-Gaussians Example.

We conduct experiments on the popular 25-Gaussians generation task. The 25-Gaussians dataset is a 2-D toy dataset, generated by a mixture of 25 two-dimensional Gaussian distributions. Each data point is a 2-dimensional feature vector. We train a small GAN model, whose generator and discriminator are both parameterized by multilayer perceptrons (MLPs), with two 128-unit hidden layers and LeakyReLU nonlinearities.
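For readers who want to try the toy task, the following sketch builds a 25-Gaussians dataset and counts how many modes a set of samples covers. The 5 x 5 grid layout, the spacing, the standard deviation, and the helper `covered_modes` are assumptions chosen for illustration; the text does not specify them.

```python
import numpy as np

def sample_25_gaussians(n, rng, spacing=2.0, std=0.05):
    """Draw n points from an equal-weight mixture of 25 Gaussians on a 5 x 5 grid."""
    centers_1d = spacing * np.arange(-2, 3)
    centers = np.array([(cx, cy) for cx in centers_1d for cy in centers_1d], dtype=float)
    labels = rng.integers(0, 25, size=n)
    points = centers[labels] + std * rng.standard_normal((n, 2))
    return points, labels, centers

def covered_modes(samples, centers, radius=0.5):
    """Count the modes that have at least one sample within `radius` of the center."""
    dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    close = dists[np.arange(len(samples)), nearest] < radius
    return len(np.unique(nearest[close]))

rng = np.random.default_rng(0)
points, labels, centers = sample_25_gaussians(5000, rng)

# Emulate mode collapse by keeping only samples from three of the modes.
collapsed = points[labels < 3]

full_coverage = covered_modes(points, centers)
collapsed_coverage = covered_modes(collapsed, centers)
```

A mode-coverage count like this is a quick way to tell a collapsed generator from one that samples all 25 modes.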

Figure 5: The 25-Gaussians example. We show the true data samples, the generated samples from vanilla GANs, the discriminator outputs of the vanilla GANs, the generated samples from our Diffusion-GAN, and the discriminator outputs of Diffusion-GAN. 

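(1) The ground-truth data distribution, spread evenly over the 25 Gaussian modes.
(2) Samples from the vanilla GAN show mode collapse: data are generated on only a few modes.
(3) The vanilla GAN's discriminator outputs on real and generated samples quickly separate from each other. This indicates strong discriminator overfitting, which makes the discriminator stop providing a useful learning signal to the generator.
(4) Samples from Diffusion-GAN are spread evenly over all 25 modes, meaning that it has learned to sample from every mode.
(5) The Diffusion-GAN discriminator outputs: D keeps providing a useful learning signal to G.

We explain this improvement from two angles. First, non-leaking augmentation helps provide more information about the data space. Second, with adaptively adjusted diffusion-based noise injection, the discriminator stays well-behaved.

On differentiable augmentation.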

As Diffusion-GAN transforms both the data and generated samples before sending them to the discriminator, we can also relate it to differentiable augmentation proposed for data-efficient GAN training. Karras et al. [2020a] introduce a stochastic augmentation pipeline with 18 transformations and develop an adaptive mechanism for controlling the augmentation probability. Zhao et al. [2020] propose to use Color + Translation + Cutout as differentiable augmentations for both generated and real images.

While providing good empirical results on some datasets, these augmentation methods are developed with domain-specific knowledge and have the risk of leaking augmentation into generation [Karras et al., 2020a]. As observed in our experiments, they sometimes worsen the results when applied to a new dataset, likely because the risk of augmentation leakage overpowers the benefits of enlarging the training set, which could happen especially if the training set size is already sufficiently large. (Note: when the dataset is already large enough, data augmentation may do more harm than good.)

By contrast, Diffusion-GAN uses a differentiable forward diffusion process to stochastically transform the data and can be considered as both a domain-agnostic and a model-agnostic augmentation method. In other words, Diffusion-GAN can be applied to non-image data or even latent features, for which appropriate data augmentation is difficult to define, and easily plugged into an existing GAN to improve its generation performance. Moreover, we prove in theory and show in experiments that augmentation leakage is not a concern for Diffusion-GAN. Tran et al. [2021] provide a theoretical analysis for deterministic non-leaking transformation with differentiable and invertible mapping functions. Bora et al. [2018] show theorems similar to ours for specific stochastic transformations, such as Gaussian Projection, Convolve+Noise, and stochastic Block-Pixels, while our Theorem 2 includes more satisfying possibilities as discussed in Appendix B.
