Semantic Segmentation Series 25: BiSeNetV2 (PyTorch Implementation)

Published: 2024-01-17

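Following BiSeNetV1 (see Semantic Segmentation Series 16: BiSeNetV1), BiSeNetV2 was published in IJCV in 2021.

Paper: BiSeNetV2

Compared with V1, V2 makes a number of improvements to the downsampling strategy, the types of convolution used, and feature fusion.

This article covers:

• How BiSeNetV2 designs the Semantic Branch and the Detail Branch.
• How BiSeNetV2 designs the Aggregation Layer to fuse features.
• How BiSeNetV2 designs the Auxiliary Loss to help training.
• A code implementation of BiSeNetV2 and its application.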

Contents

Paper

Introduction

Model

Backbone-Detail Branch

Backbone-Semantic Branch

Aggregation Layer

Segmentation head (SegHead)

Booster (Auxiliary Loss)

BiSeNetV2 implementation and application on CamVid

BiSeNetV2 implementation

CamVid dataset

Train

Result

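Paper

Introduction

The two-pathway structure of BiSeNetV1 performed well on real-time segmentation. It preserves both low-level detail and high-level semantics without hurting inference speed, striking a good balance between segmentation accuracy and inference speed.

Building on this, the authors proposed BiSeNetV2, a two-pathway segmentation network for real-time semantic segmentation.

Compared with the original BiSeNetV1:

• V2 simplifies the original structure, making the network more efficient.
• It uses a more compact architecture with carefully designed components: the Semantic Branch is deepened, and lightweight depthwise separable convolutions speed up the model.
• It introduces a more effective Aggregation Layer to strengthen the connection between the Semantic Branch and the Detail Branch.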

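Model

First, the overall structure of the model:

Figure 1. BiSeNetV2 model structure

BiSeNetV2 consists of the following parts:

• The two branches inside the purple box (backbone): the Detail Branch on top and the Semantic Branch below.
• The Aggregation Layer inside the orange box.
• The Auxiliary Loss branches inside the yellow box (Booster).

We start with the backbone in the purple box.

Backbone-Detail Branch

The Detail Branch still uses a VGG-like structure. It is fairly simple: it downsamples quickly and produces a fine-grained feature map.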
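To make "fast downsampling" concrete, here is a minimal sketch of the spatial-size arithmetic. It assumes the configuration used in the implementation in this article (three stages, each opening with a 3x3 convolution with stride 2 and padding 1, followed by stride-1 convolutions that keep the size). The helper names conv_out_size and detail_branch_size are hypothetical and exist only for this illustration.

```python
def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution, using the usual floor formula."""
    return (size + 2 * padding - kernel) // stride + 1


def detail_branch_size(size, num_stages=3):
    """Follow one spatial dimension through the stride-2 convolution of each stage."""
    for _ in range(num_stages):
        # Only the first conv of a stage has stride 2; the rest keep the size.
        size = conv_out_size(size, stride=2)
    return size


print(detail_branch_size(224))  # 28
print(detail_branch_size(448))  # 56
```

With three stride-2 stages the output is 1/8 of the input resolution, which is the scale the Aggregation Layer later expects from this branch.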

The Detail Branch code is as follows:

    import torch
    import torch.nn as nn


    class DetailBranch(nn.Module):
        """VGG-like branch: three stride-2 stages, so the output is 1/8 of the input size."""

        def __init__(self, detail_channels=(64, 64, 128), in_channels=3):
            super(DetailBranch, self).__init__()
            self.detail_branch = nn.ModuleList()
            for i in range(len(detail_channels)):
                if i == 0:
                    # Stage 1: one stride-2 conv followed by one stride-1 conv.
                    self.detail_branch.append(
                        nn.Sequential(
                            nn.Conv2d(in_channels, detail_channels[i], 3, stride=2, padding=1),
                            nn.BatchNorm2d(detail_channels[i]),
                            nn.ReLU(),
                            nn.Conv2d(detail_channels[i], detail_channels[i], 3, stride=1, padding=1),
                            nn.BatchNorm2d(detail_channels[i]),
                            nn.ReLU(),
                        )
                    )
                else:
                    # Later stages: one stride-2 conv followed by two stride-1 convs.
                    self.detail_branch.append(
                        nn.Sequential(
                            nn.Conv2d(detail_channels[i - 1], detail_channels[i], 3, stride=2, padding=1),
                            nn.BatchNorm2d(detail_channels[i]),
                            nn.ReLU(),
                            nn.Conv2d(detail_channels[i], detail_channels[i], 3, stride=1, padding=1),
                            nn.BatchNorm2d(detail_channels[i]),
                            nn.ReLU(),
                            nn.Conv2d(detail_channels[i], detail_channels[i], 3, stride=1, padding=1),
                            nn.BatchNorm2d(detail_channels[i]),
                            nn.ReLU(),
                        )
                    )

        def forward(self, x):
            for stage in self.detail_branch:
                x = stage(x)
            return x


    if __name__ == "__main__":
        x = torch.randn(3, 3, 224, 224)
        net = DetailBranch(detail_channels=(64, 64, 128), in_channels=3)
        out = net(x)
        print(out.shape)  # torch.Size([3, 128, 28, 28])

Backbone-Semantic Branch

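The Semantic Branch runs in parallel with the Detail Branch and mainly captures high-level semantic information. It uses fewer channels, because the Detail Branch can supply the remaining information. Capturing high-level semantics requires contextual dependencies and a large receptive field, so this branch downsamples quickly to enlarge the receptive field and uses global average pooling to embed context.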
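As an illustration of what "embedding context with global average pooling" means, the sketch below adds the per-channel global mean back onto every spatial position. It leaves out the learned convolutions that the real block applies around this step, and the function name embed_global_context is an assumption made for this example only.

```python
import torch


def embed_global_context(x):
    """Add each channel's global mean to every spatial position of that channel."""
    context = x.mean(dim=(2, 3), keepdim=True)  # shape (N, C, 1, 1)
    return x + context                          # broadcast over H and W


x = torch.arange(8, dtype=torch.float32).reshape(1, 2, 2, 2)
out = embed_global_context(x)
print(out.shape)  # torch.Size([1, 2, 2, 2])
print(out[0, 0])  # channel 0 had mean 1.5, so every value is shifted by 1.5
```

Because the pooled tensor has spatial size 1x1, the addition relies on broadcasting; every pixel receives the same image-level summary of its channel.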

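The authors put considerable design effort into this branch, which has three main parts:

• Stem Block, for fast downsampling.
• Gather-and-Expansion Layer (GE Layer), which uses convolutions to extract feature information.
• Context Embedding Block (CE Block), which embeds contextual information.

Stem Block and CE Block structure

The Stem Block and CE Block are both fairly simple.

Figure 2. Stem Block and CE Block structure

Code implementation: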

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class StemBlock(nn.Module):
        """Downsamples by 4: a stride-2 conv, then a conv path and a max-pool path that are concatenated."""

        def __init__(self, in_channels=3, out_channels=16):
            super(StemBlock, self).__init__()
            self.conv_in = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU()
            )
            self.conv_branch = nn.Sequential(
                nn.Conv2d(out_channels, out_channels // 2, 1),
                nn.BatchNorm2d(out_channels // 2),
                nn.ReLU(),
                nn.Conv2d(out_channels // 2, out_channels, 3, stride=2, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU()
            )
            self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=False)
            self.fusion = nn.Sequential(
                nn.Conv2d(2 * out_channels, out_channels, 3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU()
            )

        def forward(self, x):
            x = self.conv_in(x)
            x_branch = self.conv_branch(x)
            x_downsample = self.pool(x)
            out = torch.cat([x_branch, x_downsample], dim=1)
            out = self.fusion(out)
            return out


    if __name__ == "__main__":
        x = torch.randn(3, 3, 224, 224)
        net = StemBlock()
        out = net(x)
        print(out.shape)  # torch.Size([3, 16, 56, 56])


    class CEBlock(nn.Module):
        def __init__(self, in_channels=16, out_channels=16):
            super(CEBlock, self).__init__()
            self.in_channels = in_channels
            self.out_channels = out_channels
            self.gap = nn.Sequential(
                nn.AdaptiveAvgPool2d((1, 1)),
                # AdaptiveAvgPool2d reshapes the tensor to (batch size, N, 1, 1).
                # With batch size 1 this cannot pass through BatchNorm2d in training
                # mode, while batch size > 1 works, so BatchNorm is left disabled here.
                # nn.BatchNorm2d(self.in_channels)
            )
            self.conv_gap = nn.Sequential(
                nn.Conv2d(self.in_channels, self.out_channels, 1, stride=1, padding=0),
                # nn.BatchNorm2d(self.out_channels),  # disabled for the same reason
                nn.ReLU()
            )
            # Note: in the paper this is a plain conv2d, with no BN-ReLU
            self.conv_last = nn.Conv2d(
                in_channels=self.out_channels,
                out_channels=self.out_channels,
                kernel_size=3,
                stride=1,
                padding=1)

        def forward(self, x):
            identity = x
            x = self.gap(x)
            x = self.conv_gap(x)
            x = identity + x  # the (N, C, 1, 1) context is broadcast over H and W
            x = self.conv_last(x)
            return x


    if __name__ == "__main__":
        x = torch.randn(1, 16, 224, 224)
        net = CEBlock()
        out = net(x)
        print(out.shape)  # torch.Size([1, 16, 224, 224])

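GE Block structure

Figure 3. GE Block structure (b, c)

The GE Block comes in two variants: one without downsampling (b) and one with downsampling (c). The design borrows the inverted bottleneck from MobileNetV2 and, to reduce computation, uses a depthwise separable convolution in the middle.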
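To see why a depthwise separable convolution reduces computation, the sketch below compares weight counts for a standard 3x3 convolution and its depthwise separable counterpart. Biases are ignored, the function names are hypothetical, and 96 channels is chosen only because it matches the expanded width for 16 input channels with an expansion ratio of 6.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a k x k depthwise conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


c = 96  # 16 input channels x expansion ratio 6
print(standard_conv_params(c, c))        # 82944
print(depthwise_separable_params(c, c))  # 10080
```

At this width the separable version needs roughly one eighth of the weights, which is what makes the 6x channel expansion inside the block affordable.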

The GE Block code is given below:

    import torch
    import torch.nn as nn


    class depthwise_separable_conv(nn.Module):
        """3x3 depthwise conv (one filter per channel) followed by a 1x1 pointwise conv."""

        def __init__(self, in_channels, out_channels, stride):
            super(depthwise_separable_conv, self).__init__()
            self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=stride, padding=1, groups=in_channels)
            self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

        def forward(self, x):
            out = self.depthwise(x)
            out = self.pointwise(out)
            return out


    class GELayer(nn.Module):
        def __init__(self, in_channels, out_channels, exp_ratio=6, stride=1):
            super(GELayer, self).__init__()
            mid_channel = in_channels * exp_ratio
            self.conv1 = nn.Sequential(
                nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm2d(in_channels),
                nn.ReLU()
            )
            if stride == 1:
                # Variant (b): no downsampling, identity shortcut.
                self.dwconv = nn.Sequential(
                    # ReLU in ConvModule not shown in paper
                    nn.Conv2d(in_channels, mid_channel, 3, stride=stride, padding=1, groups=in_channels),
                    nn.BatchNorm2d(mid_channel),
                    nn.ReLU(),
                    depthwise_separable_conv(mid_channel, mid_channel, stride=1),
                    nn.BatchNorm2d(mid_channel),
                )
                self.shortcut = None
            else:
                # Variant (c): downsampling, with a learned shortcut at the same stride.
                self.dwconv = nn.Sequential(
                    nn.Conv2d(in_channels, mid_channel, 3, stride=1, padding=1, groups=in_channels, bias=False),
                    nn.BatchNorm2d(mid_channel),
                    nn.ReLU(),
                    # ReLU in ConvModule not shown in paper
                    depthwise_separable_conv(mid_channel, mid_channel, stride=stride),
                    nn.BatchNorm2d(mid_channel),
                    depthwise_separable_conv(mid_channel, mid_channel, stride=1),
                    nn.BatchNorm2d(mid_channel),
                )
                self.shortcut = nn.Sequential(
                    depthwise_separable_conv(in_channels, out_channels, stride=stride),
                    nn.BatchNorm2d(out_channels),
                    nn.Conv2d(out_channels, out_channels, 1),
                    nn.BatchNorm2d(out_channels),
                )
            self.conv2 = nn.Sequential(
                nn.Conv2d(mid_channel, out_channels, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(out_channels)
            )
            self.act = nn.ReLU()

        def forward(self, x):
            identity = x
            x = self.conv1(x)
            x = self.dwconv(x)
            x = self.conv2(x)
            if self.shortcut is not None:
                shortcut = self.shortcut(identity)
                x = x + shortcut
            else:
                x = x + identity
            x = self.act(x)
            return x


    if __name__ == "__main__":
        x = torch.randn(3, 16, 224, 224)
        net = GELayer(in_channels=16, out_channels=16, stride=2)
        out = net(x)
        print(out.shape)  # torch.Size([3, 16, 112, 112])

Code for the Semantic Branch:

    class SemanticBranch(nn.Module):
        def __init__(self, semantic_channels=(16, 32, 64, 128), in_channels=3, exp_ratio=6):
            super(SemanticBranch, self).__init__()
            self.in_channels = in_channels
            self.semantic_channels = semantic_channels
            self.semantic_stages = nn.ModuleList()

            for i in range(len(semantic_channels)):
                if i == 0:
                    # First stage: Stem Block (1/4 resolution).
                    self.semantic_stages.append(StemBlock(self.in_channels, semantic_channels[i]))
                elif i == (len(semantic_channels) - 1):
                    # Last stage: one downsampling GE layer followed by three plain ones.
                    self.semantic_stages.append(
                        nn.Sequential(
                            GELayer(semantic_channels[i - 1], semantic_channels[i], exp_ratio, 2),
                            GELayer(semantic_channels[i], semantic_channels[i], exp_ratio, 1),
                            GELayer(semantic_channels[i], semantic_channels[i], exp_ratio, 1),
                            GELayer(semantic_channels[i], semantic_channels[i], exp_ratio, 1)
                        )
                    )
                else:
                    # Middle stages: one downsampling GE layer followed by one plain one.
                    self.semantic_stages.append(
                        nn.Sequential(
                            GELayer(semantic_channels[i - 1], semantic_channels[i], exp_ratio, 2),
                            GELayer(semantic_channels[i], semantic_channels[i], exp_ratio, 1)
                        )
                    )

            self.semantic_stages.append(CEBlock(semantic_channels[-1], semantic_channels[-1]))

        def forward(self, x):
            # Keep the output of every stage; the earlier ones feed the auxiliary heads.
            semantic_outs = []
            for semantic_stage in self.semantic_stages:
                x = semantic_stage(x)
                semantic_outs.append(x)
            return semantic_outs


    if __name__ == "__main__":
        x = torch.randn(3, 3, 224, 224)
        net = SemanticBranch()
        out = net(x)
        print(out[0].shape)
        print(out[1].shape)
        print(out[2].shape)
        print(out[3].shape)
        print(out[4].shape)
        # from torchsummary import summary
        # summary(net.cuda(), (3, 224, 224))

Aggregation Layer

The Aggregation Layer takes the outputs of the Detail Branch and the Semantic Branch and fuses them through the sequence of operations shown in Figure 4.

Figure 4. Aggregation Layer structure
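The core idea of the fusion can be sketched without any learned layers: each branch is gated by a sigmoid of the other branch at a matching resolution, and the two gated results are summed. The sketch below is a simplification under stated assumptions, not the full layer: all convolutions and batch norms are omitted, a plain 4x average pool stands in for the stride-2 convolution plus pooling on the detail path, and the name guided_fusion is hypothetical.

```python
import torch
import torch.nn.functional as F


def guided_fusion(detail, semantic):
    """Fuse a 1/8-scale detail map with a 1/32-scale semantic map (learned layers omitted)."""
    # High-resolution path: upsample the semantic map and use it as a gate on the detail map.
    gate_hi = torch.sigmoid(
        F.interpolate(semantic, scale_factor=4, mode="bilinear", align_corners=True))
    hi = detail * gate_hi  # element-wise product

    # Low-resolution path: shrink the detail map and gate it with the semantic map.
    detail_lo = F.avg_pool2d(detail, kernel_size=4, stride=4)
    lo = detail_lo * torch.sigmoid(semantic)
    lo = F.interpolate(lo, scale_factor=4, mode="bilinear", align_corners=True)

    return hi + lo  # summation of the two guided paths


detail = torch.randn(2, 8, 32, 48)   # non-square on purpose
semantic = torch.randn(2, 8, 8, 12)
fused = guided_fusion(detail, semantic)
print(fused.shape)  # torch.Size([2, 8, 32, 48])
```

The non-square input is deliberate: element-wise products work for any height and width, whereas a matrix product over the last two dimensions would only line up for square feature maps.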

Code implementation of the Aggregation Layer:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class AggregationLayer(nn.Module):
        def __init__(self, in_channels, out_channels):
            super(AggregationLayer, self).__init__()
            # Detail path kept at 1/8 resolution.
            self.Conv_DetailBranch_1 = nn.Sequential(
                depthwise_separable_conv(in_channels, out_channels, stride=1),
                nn.BatchNorm2d(out_channels),
                nn.Conv2d(out_channels, out_channels, 1)
            )
            # Detail path reduced by 4 (stride-2 conv + stride-2 pooling) to 1/32 resolution.
            self.Conv_DetailBranch_2 = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1),
            )
            # Semantic path upsampled by 4 to 1/8 resolution and turned into a gate.
            self.Conv_SemanticBranch_1 = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.Upsample(scale_factor=4, mode="bilinear", align_corners=True),
                nn.Sigmoid()
            )
            # Semantic path kept at 1/32 resolution and turned into a gate.
            self.Conv_SemanticBranch_2 = nn.Sequential(
                depthwise_separable_conv(in_channels, out_channels, stride=1),
                nn.BatchNorm2d(out_channels),
                nn.Conv2d(out_channels, out_channels, kernel_size=1),
                nn.Sigmoid()
            )
            self.conv_out = nn.Sequential(
                nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
            )

        def forward(self, Detail_x, Semantic_x):
            DetailBranch_1 = self.Conv_DetailBranch_1(Detail_x)
            DetailBranch_2 = self.Conv_DetailBranch_2(Detail_x)
            SemanticBranch_1 = self.Conv_SemanticBranch_1(Semantic_x)
            SemanticBranch_2 = self.Conv_SemanticBranch_2(Semantic_x)

            # The fusion uses element-wise products and a final sum. torch.matmul would
            # perform a matrix product over the last two dimensions, which is a different
            # operation and only has compatible shapes for square feature maps.
            out_1 = DetailBranch_1 * SemanticBranch_1
            out_2 = DetailBranch_2 * SemanticBranch_2
            out_2 = F.interpolate(out_2, scale_factor=4, mode="bilinear", align_corners=True)
            out = out_1 + out_2
            out = self.conv_out(out)
            return out


    if __name__ == "__main__":
        Detail_x = torch.randn(3, 56, 224, 224)
        Semantic_x = torch.randn(3, 56, 224 // 4, 224 // 4)
        net = AggregationLayer(in_channels=56, out_channels=122)
        out = net(Detail_x, Semantic_x)
        print(out.shape)  # torch.Size([3, 122, 224, 224])

Segmentation head (SegHead)

The segmentation head is simple to implement.

    class SegHead(nn.Module):
        def __init__(self, channels, num_classes):
            super().__init__()
            self.cls_seg = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(),
                # 1x1 conv that maps features to per-class logits
                nn.Conv2d(channels, num_classes, 1),
            )

        def forward(self, x):
            return self.cls_seg(x)

Booster (Auxiliary Loss)

The authors attach several Auxiliary Loss branches to the Semantic Branch and compare the performance of several combinations of them.
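How the auxiliary losses enter training can be sketched as a weighted sum: the loss of the main head plus a down-weighted loss for each auxiliary head. The weight of 0.2 below is the value used in the training loop later in this article; the function name total_loss and the sample loss values are made up for illustration.

```python
def total_loss(main_loss, aux_losses, aux_weight=0.2):
    """Main segmentation loss plus a down-weighted sum of the auxiliary losses."""
    return main_loss + aux_weight * sum(aux_losses)


# One main head and four auxiliary heads (illustrative loss values).
combined = total_loss(1.0, [0.5, 0.5, 1.0, 2.0])
print(combined)
```

The auxiliary heads only shape the gradients during training. At inference time only the output of the main head is needed.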

BiSeNetV2 implementation and application on CamVid

BiSeNetV2 implementation

The complete model reuses the modules defined above without changes (StemBlock, depthwise_separable_conv, GELayer, CEBlock, DetailBranch, SemanticBranch, AggregationLayer, and SegHead) and adds the BiSeNetV2 class below. One note on CEBlock: BatchNorm2d is commented out there because the (N, C, 1, 1) tensor produced by AdaptiveAvgPool2d cannot pass through BatchNorm2d in training mode when the batch size is 1. To enable it, train with a batch size greater than 1 and call model.eval() at test time, which avoids the error.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class BiSeNetV2(nn.Module):
        def __init__(self, in_channels=3,
                     detail_channels=(64, 64, 128),
                     semantic_channels=(16, 32, 64, 128),
                     semantic_expansion_ratio=6,
                     aggregation_channels=128,
                     out_indices=(0, 1, 2, 3, 4),
                     num_classes=3):
            super(BiSeNetV2, self).__init__()

            self.in_channels = in_channels
            self.detail_channels = detail_channels
            self.semantic_expansion_ratio = semantic_expansion_ratio
            self.semantic_channels = semantic_channels
            self.aggregation_channels = aggregation_channels
            self.out_indices = out_indices
            self.num_classes = num_classes

            self.detail = DetailBranch(detail_channels=self.detail_channels, in_channels=self.in_channels)
            self.semantic = SemanticBranch(semantic_channels=self.semantic_channels, in_channels=self.in_channels, exp_ratio=self.semantic_expansion_ratio)
            # Both branches must end with aggregation_channels channels (128 by default).
            self.AggregationLayer = AggregationLayer(in_channels=self.aggregation_channels, out_channels=self.aggregation_channels)

            # Main head on the aggregated features (the Aggregation Layer outputs
            # aggregation_channels channels), then one auxiliary head per semantic stage.
            self.seg_head_aggre = SegHead(self.aggregation_channels, self.num_classes)
            self.seg_heads = nn.ModuleList()
            self.seg_heads.append(self.seg_head_aggre)
            for channel in semantic_channels:
                self.seg_heads.append(SegHead(channel, self.num_classes))

        def forward(self, x):
            _, _, h, w = x.size()
            x_detail = self.detail(x)
            x_semantic_lst = self.semantic(x)
            # The last semantic output (after CEBlock) is fused with the detail features.
            x_head = self.AggregationLayer(x_detail, x_semantic_lst[-1])
            outs = [x_head] + x_semantic_lst[:-1]
            outs = [outs[i] for i in self.out_indices]
            out = tuple(outs)

            # seg_out[0] is the main prediction; the rest are auxiliary predictions.
            seg_out = []
            for index, stage in enumerate(self.seg_heads):
                seg_out.append(F.interpolate(stage(out[index]), size=(h, w), mode="bilinear", align_corners=True))
            return seg_out

CamVid dataset

    # Imports
    import os
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch.utils.data import DataLoader
    import warnings
    warnings.filterwarnings("ignore")
    from PIL import Image
    import numpy as np
    import albumentations as A
    from albumentations.pytorch.transforms import ToTensorV2

    torch.manual_seed(17)


    # Custom dataset: CamVidDataset
    class CamVidDataset(torch.utils.data.Dataset):
        """CamVid Dataset. Read images, apply augmentation and preprocessing transformations.

        Args:
            images_dir (str): path to images folder
            masks_dir (str): path to segmentation masks folder
                (mask files must have the same names as the images)
        """

        def __init__(self, images_dir, masks_dir):
            self.transform = A.Compose([
                A.Resize(448, 448),
                A.HorizontalFlip(),
                A.VerticalFlip(),
                A.Normalize(),
                ToTensorV2(),
            ])
            self.ids = os.listdir(images_dir)
            self.images_fps = [os.path.join(images_dir, image_id) for image_id in self.ids]
            self.masks_fps = [os.path.join(masks_dir, image_id) for image_id in self.ids]

        def __getitem__(self, i):
            # read data
            image = np.array(Image.open(self.images_fps[i]).convert('RGB'))
            mask = np.array(Image.open(self.masks_fps[i]).convert('RGB'))
            image = self.transform(image=image, mask=mask)

            # The class index is stored in every channel of the mask; keep the first one.
            return image['image'], image['mask'][:, :, 0]

        def __len__(self):
            return len(self.ids)


    # Dataset paths
    DATA_DIR = r'database/camvid/camvid/'  # adjust to your own path
    x_train_dir = os.path.join(DATA_DIR, 'train_images')
    y_train_dir = os.path.join(DATA_DIR, 'train_labels')
    x_valid_dir = os.path.join(DATA_DIR, 'valid_images')
    y_valid_dir = os.path.join(DATA_DIR, 'valid_labels')

    train_dataset = CamVidDataset(
        x_train_dir,
        y_train_dir,
    )
    val_dataset = CamVidDataset(
        x_valid_dir,
        y_valid_dir,
    )

    train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, drop_last=True)
    val_loader = DataLoader(val_dataset, batch_size=16, shuffle=True, drop_last=True)

Train

    model = BiSeNetV2(num_classes=33)

    from d2l import torch as d2l
    import pandas as pd
    import monai

    # Train for 100 epochs
    epochs_num = 100
    # SGD optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    schedule = monai.optimizers.LinearLR(optimizer, end_lr=0.05, num_iter=int(epochs_num * 0.75))
    # Multi-class cross-entropy loss
    lossf = nn.CrossEntropyLoss(ignore_index=255)

    # Create the output folders used below so that saving does not fail.
    os.makedirs("savefile", exist_ok=True)
    os.makedirs("checkpoints", exist_ok=True)


    def evaluate_accuracy_gpu(net, data_iter, device=None):
        if isinstance(net, nn.Module):
            net.eval()  # Set the model to evaluation mode
            if not device:
                device = next(iter(net.parameters())).device
        # No. of correct predictions, no. of predictions
        metric = d2l.Accumulator(2)

        with torch.no_grad():
            for X, y in data_iter:
                if isinstance(X, list):
                    X = [x.to(device) for x in X]
                else:
                    X = X.to(device)
                y = y.to(device)
                output = net(X)
                pred = output[0]  # main head only
                metric.add(d2l.accuracy(pred, y), d2l.size(y))
        return metric[0] / metric[1]


    # Training function. The original signature had a parameter swa_start=swa_start,
    # which raised a NameError and was never used, so it is removed here.
    def train_ch13(net, train_iter, test_iter, loss, optimizer, num_epochs, schedule, devices=d2l.try_all_gpus()):
        timer, num_batches = d2l.Timer(), len(train_iter)
        animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0, 1], legend=['train loss', 'train acc', 'test acc'])
        net = nn.DataParallel(net, device_ids=devices).to(devices[0])
        # Lists for logging training statistics
        loss_list = []
        train_acc_list = []
        test_acc_list = []
        epochs_list = []
        time_list = []
        lr_list = []

        for epoch in range(num_epochs):
            # Sum of training loss, sum of training accuracy, no. of examples,
            # no. of predictions
            metric = d2l.Accumulator(4)
            for i, (X, labels) in enumerate(train_iter):
                timer.start()

                if isinstance(X, list):
                    X = [x.to(devices[0]) for x in X]
                else:
                    X = X.to(devices[0])
                gt = labels.long().to(devices[0])

                net.train()
                optimizer.zero_grad()
                result = net(X)
                pred = result[0]
                # Main loss plus four auxiliary losses, each weighted by 0.2
                seg_loss = loss(result[0], gt)
                aux_loss_1 = loss(result[1], gt)
                aux_loss_2 = loss(result[2], gt)
                aux_loss_3 = loss(result[3], gt)
                aux_loss_4 = loss(result[4], gt)

                loss_sum = seg_loss + 0.2 * aux_loss_1 + 0.2 * aux_loss_2 + 0.2 * aux_loss_3 + 0.2 * aux_loss_4
                l = loss_sum
                loss_sum.sum().backward()
                optimizer.step()

                acc = d2l.accuracy(pred, gt)
                metric.add(l, acc, labels.shape[0], labels.numel())

                timer.stop()
                if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
                    animator.add(epoch + (i + 1) / num_batches, (metric[0] / metric[2], metric[1] / metric[3], None))

            # Decay the learning rate until it reaches 0.05
            if optimizer.state_dict()['param_groups'][0]['lr'] > 0.05:
                schedule.step()

            test_acc = evaluate_accuracy_gpu(net, test_iter)
            animator.add(epoch + 1, (None, None, test_acc))
            print(f"epoch {epoch+1}/{num_epochs} --- loss {metric[0] / metric[2]:.3f} --- train acc {metric[1] / metric[3]:.3f} --- test acc {test_acc:.3f} --- lr {optimizer.state_dict()['param_groups'][0]['lr']} --- cost time {timer.sum()}")

            # --------- Save training statistics ---------
            df = pd.DataFrame()
            loss_list.append(metric[0] / metric[2])
            train_acc_list.append(metric[1] / metric[3])
            test_acc_list.append(test_acc)
            epochs_list.append(epoch + 1)
            time_list.append(timer.sum())
            lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])

            df['epoch'] = epochs_list
            df['loss'] = loss_list
            df['train_acc'] = train_acc_list
            df['test_acc'] = test_acc_list
            df["lr"] = lr_list
            df['time'] = time_list
            df.to_excel("savefile/BiseNetv2_camvid.xlsx")

            # --------- Save the model every 5 epochs ---------
            if np.mod(epoch + 1, 5) == 0:
                torch.save(net.state_dict(), f'checkpoints/BiseNetv2_{epoch+1}.pth')

        # Save the final model
        torch.save(net.state_dict(), 'checkpoints/BiseNetv2_last.pth')


    train_ch13(model, train_loader, val_loader, lossf, optimizer, epochs_num, schedule=schedule)

Result
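The training loop above tracks pixel accuracy only. Segmentation quality is also commonly summarised as mean intersection over union (mIoU), so here is a minimal pure-Python sketch of that metric over flattened label lists. The function name mean_iou is hypothetical, the tiny inputs are made up, and pixels labelled 255 are skipped to mirror the ignore_index used for the loss.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over flat label sequences, skipping pixels labelled ignore_index."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A wrong pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that appear in the prediction or the target.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))  # 0.7222
```

In this toy case the per-class IoUs are 1/2, 2/3 and 1, which average to 13/18. For real predictions the same counts would be accumulated over all validation images before dividing.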
