VOT

2024-07-09 19:51:01


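1. Annotation boxes: Rectangle and Polygon

The classic OTB annotation format is usually (x, y, w, h): the position of the top-left corner, then the width and the height.

The current VOT datasets refine the object representation by annotating objects with a rotated rectangle. A line of the ground-truth (gt) file therefore usually holds 8 numbers, the x and y coordinates of each of the four corner points.

The analysis below follows the code of the 2016 version of the VOT toolkit.

The original source is Python 2; it is run here under Python 3, with some modifications.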

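Wrapping the elements

collections.namedtuple is a Python factory function that defines a subclass of tuple. This lets the program access elements by field name as well as by index.

Here the Rectangle class stores a rectangular annotation, Polygon stores a polygon annotation, and Point stores a single point, the basic element of a Polygon.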

import collections

Rectangle = collections.namedtuple('Rectangle', ['x', 'y', 'width', 'height'])
Point = collections.namedtuple('Point', ['x', 'y'])
Polygon = collections.namedtuple('Polygon', ['points'])
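As a small illustration of what namedtuple provides, the sketch below reads the same Point by field name, by index, and by unpacking. The variable names (p, by_name, by_index, first_x) are only for illustration and do not come from the toolkit.

```python
import collections

Point = collections.namedtuple('Point', ['x', 'y'])
Polygon = collections.namedtuple('Polygon', ['points'])

p = Point(237.59, 104.16)

by_name = (p.x, p.y)      # access by field name
by_index = (p[0], p[1])   # access by position, like any tuple
x, y = p                  # tuple unpacking also works

# A Polygon has a single field, `points`, holding a list of Point objects.
poly = Polygon([p, Point(303.42, 36.05)])
first_x = poly.points[0].x   # equivalent to poly[0][0][0]

is_tuple = isinstance(p, tuple)  # namedtuple classes are tuple subclasses
```

The purely index-based form poly[0][0][0] is the style that later code in this article uses when it reads polygon corners.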

Encoding and decoding annotations

def parse_region(string):
    # Split one annotation line on commas and convert each field to float.
    tokens = list(map(float, string.split(',')))
    if len(tokens) == 4:
        return Rectangle(tokens[0], tokens[1], tokens[2], tokens[3])
    elif len(tokens) % 2 == 0 and len(tokens) > 4:
        # Every consecutive (x, y) pair becomes one Point.
        return Polygon([Point(tokens[i], tokens[i + 1]) for i in range(0, len(tokens), 2)])
    return None

def encode_region(region):
    if isinstance(region, Polygon):
        return ','.join(['{},{}'.format(p.x, p.y) for p in region.points])
    elif isinstance(region, Rectangle):
        return '{},{},{},{}'.format(region.x, region.y, region.width, region.height)
    else:
        return ""

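parse_region:

Decodes an annotation. Each line read from the file is split on commas. If there are 4 elements, they are wrapped in a Rectangle object; if the number of elements is even and greater than 4, every two elements are wrapped in a Point object, and the list of all Point objects is wrapped in a Polygon object. Any other element count returns None.

encode_region:

Converts an annotation back into a single line of data, with the values separated by commas.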
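A short round-trip sketch may make the two functions concrete. It repeats compact copies of the definitions so that it runs on its own; the sample strings other than the motocross1 line are made up for illustration.

```python
import collections

Rectangle = collections.namedtuple('Rectangle', ['x', 'y', 'width', 'height'])
Point = collections.namedtuple('Point', ['x', 'y'])
Polygon = collections.namedtuple('Polygon', ['points'])

def parse_region(string):
    tokens = list(map(float, string.split(',')))
    if len(tokens) == 4:
        return Rectangle(*tokens)
    elif len(tokens) % 2 == 0 and len(tokens) > 4:
        return Polygon([Point(tokens[i], tokens[i + 1]) for i in range(0, len(tokens), 2)])
    return None

def encode_region(region):
    if isinstance(region, Polygon):
        return ','.join('{},{}'.format(p.x, p.y) for p in region.points)
    elif isinstance(region, Rectangle):
        return '{},{},{},{}'.format(region.x, region.y, region.width, region.height)
    return ""

line = '237.59,104.16,303.42,36.05,381.48,109.48,315.65,177.59'
region = parse_region(line)                       # 8 numbers -> Polygon of 4 Points
round_trip_ok = parse_region(encode_region(region)) == region

# Values are stored as floats, so integer input comes back with a ".0" suffix.
rect_text = encode_region(parse_region('10,20,30,40'))   # '10.0,20.0,30.0,40.0'

# An odd number of fields matches neither format.
invalid = parse_region('1,2,3')                   # None
```

The round trip preserves the values, but not necessarily the exact text of the input line.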

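Format conversion

The inputs are the annotation and the target type.

From Polygon to Rectangle

This is fairly simple: the core idea is to find the topmost, bottommost, leftmost and rightmost coordinates of the rotated rectangle, draw the enclosing axis-aligned rectangle, and store it as (x, y, w, h).

From Rectangle to Polygon

This simply turns (x, y, w, h) into the coordinates of the 4 corner points.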

import copy
import sys

def convert_region(region, to):
    if to == 'rectangle':
        if isinstance(region, Rectangle):
            return copy.copy(region)
        elif isinstance(region, Polygon):
            # Seed the running maxima with -float_info.max: sys.float_info.min is
            # the smallest positive float and would give wrong results for
            # negative coordinates.
            top = sys.float_info.max
            bottom = -sys.float_info.max
            left = sys.float_info.max
            right = -sys.float_info.max
            for point in region.points:
                top = min(top, point.y)
                bottom = max(bottom, point.y)
                left = min(left, point.x)
                right = max(right, point.x)
            return Rectangle(left, top, right - left, bottom - top)
        else:
            return None

    if to == 'polygon':
        if isinstance(region, Rectangle):
            # Corners in order: top-left, top-right, bottom-right, bottom-left.
            points = []
            points.append(Point(region.x, region.y))
            points.append(Point(region.x + region.width, region.y))
            points.append(Point(region.x + region.width, region.y + region.height))
            points.append(Point(region.x, region.y + region.height))
            return Polygon(points)
        elif isinstance(region, Polygon):
            return copy.copy(region)
        else:
            return None

    return None
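One detail matters when a bounding box is computed with running min/max values: in Python, sys.float_info.min is the smallest positive normalized float, not the most negative float. The sketch below uses a hypothetical helper, bounding_box, that is not part of the toolkit, to show how the choice of seed for the running maxima changes the result when all coordinates are negative.

```python
import sys

# About 2.2e-308: tiny, but greater than zero.
smallest_positive = sys.float_info.min

def bounding_box(points, max_seed):
    """Return (x, y, w, h) enclosing the (x, y) points.

    max_seed is the initial value of the running maxima (hypothetical parameter,
    present only to compare the two seeding choices).
    """
    left = top = sys.float_info.max
    right = bottom = max_seed
    for x, y in points:
        left, right = min(left, x), max(right, x)
        top, bottom = min(top, y), max(bottom, y)
    return (left, top, right - left, bottom - top)

# A 20 x 15 box lying entirely at negative coordinates.
points = [(-30.0, -20.0), (-10.0, -20.0), (-10.0, -5.0), (-30.0, -5.0)]

wrong = bounding_box(points, sys.float_info.min)    # maxima stay stuck near 0
right_box = bounding_box(points, -sys.float_info.max)
```

With the positive seed, the right and bottom edges never drop below roughly zero, so the box comes out too large. Seeding with -sys.float_info.max (or float('-inf')) avoids this. For annotations whose coordinates are all positive, both seeds give the same result.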

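2. Visualization

Note that the visualization here uses plt.plot, so it differs from cv2: the y axis points in the opposite direction (upward in the plot, downward in image coordinates).

Plot function

A hand-written function for drawing data in Polygon format.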

import matplotlib.pyplot as plt

def pltPolygon(polygon, color='r'):
    # The parameter is lower-case so that it does not shadow the Polygon class.
    plt.axis('equal')
    points = polygon[0]
    if len(points) < 2:
        return
    # Draw one segment per edge; the last point connects back to the first.
    for i in range(len(points)):
        p1 = points[i]
        if i == len(points) - 1:
            p2 = points[0]
        else:
            p2 = points[i + 1]
        plt.plot([p1[0], p2[0]], [p1[1], p2[1]], color=color)

Visualizing the gt (Polygon)

The data is taken from the motocross1 sequence.

gt = '237.59,104.16,303.42,36.05,381.48,109.48,315.65,177.59'
poly = parse_region(gt)
pltPolygon(poly)
plt.show()

Printing poly gives the parsed annotation:

Polygon(points=[Point(x=237.59, y=104.16), Point(x=303.42, y=36.05), Point(x=381.48, y=109.48), Point(x=315.65, y=177.59)])

Converting the gt to a rectangle and visualizing it:

rect = convert_region(poly, 'rectangle')
prect = convert_region(rect, 'polygon')
pltPolygon(poly, 'r')
pltPolygon(prect, 'y')
plt.show()

Printing rect gives the axis-aligned rectangle:

Rectangle(x=237.59, y=36.05, width=143.89000000000001, height=141.54000000000002)

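3. Numerical computation (updatenet code)

axis_aligned_bbox

A look at the code shows that it uses an axis-aligned bbox as the target position.

The format is (cx, cy, w, h): the x and y coordinates of the center point, then the width and the height.

So how is the axis-aligned bbox obtained from a rotated-rectangle annotation?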

import numpy as np

def get_axis_aligned_bbox(region):
    # region is a Polygon; flatten its four corners into [x0, y0, ..., x3, y3].
    region = np.array([region[0][0][0], region[0][0][1],
                       region[0][1][0], region[0][1][1],
                       region[0][2][0], region[0][2][1],
                       region[0][3][0], region[0][3][1]])
    cx = np.mean(region[0::2])
    cy = np.mean(region[1::2])
    x1 = min(region[0::2])
    x2 = max(region[0::2])
    y1 = min(region[1::2])
    y2 = max(region[1::2])
    # A1: product of two adjacent side lengths of the rotated rectangle.
    A1 = np.linalg.norm(region[0:2] - region[2:4]) * np.linalg.norm(region[2:4] - region[4:6])
    # A2: area of the enclosing axis-aligned rectangle.
    A2 = (x2 - x1) * (y2 - y1)
    s = np.sqrt(A1 / A2)
    w = s * (x2 - x1) + 1
    h = s * (y2 - y1) + 1
    return cx, cy, w, h

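Here region[0::2] is slice notation, arr[start::step]: starting at index start, it walks through the array with stride step. It works the same way on NumPy arrays and on plain Python sequences. region[0::2] therefore selects all x coordinates and region[1::2] all y coordinates.

The center point is simply the mean of the coordinates.

x1, x2, y1, y2 are the extreme coordinates of the annotation in the four directions.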
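To make the slicing concrete, here is a minimal sketch on a plain Python list holding the motocross1 corner coordinates from earlier in this article. The variable names xs and ys are only for illustration.

```python
# Flattened corners: [x0, y0, x1, y1, x2, y2, x3, y3]
region = [237.59, 104.16, 303.42, 36.05, 381.48, 109.48, 315.65, 177.59]

xs = region[0::2]   # every second value starting at index 0: the x coordinates
ys = region[1::2]   # every second value starting at index 1: the y coordinates

# Center point: the mean of the corner coordinates.
cx = sum(xs) / len(xs)
cy = sum(ys) / len(ys)

# Extremes of the annotation in the four directions.
x1, x2 = min(xs), max(xs)
y1, y2 = min(ys), max(ys)
```

Here xs is [237.59, 303.42, 381.48, 315.65] and ys is [104.16, 36.05, 109.48, 177.59].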

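This produces something like a scale factor, which is a little puzzling at first:

$$ s = \sqrt{A_1 / A_2} $$

In the end the width and height are

$$ w = s\,(x_2 - x_1) + 1, \qquad h = s\,(y_2 - y_1) + 1 $$

Ignoring the 1-pixel boundary term, we have:

$$ w \cdot h = s^2 (x_2 - x_1)(y_2 - y_1) = \frac{A_1}{A_2} \cdot A_2 = A_1, \qquad \frac{w}{h} = \frac{x_2 - x_1}{y_2 - y_1} $$

Seen this way it starts to make sense. The original annotation is itself a rectangle, so A1 (the product of two adjacent side lengths) is the area of the original annotation, and A2 is the area of the (x, y, w, h) bounding rectangle. In other words, the ratio of the axis-aligned bbox area to the original annotation area equals 1, and its aspect ratio equals that of the (x, y, w, h) rectangle.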
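The area and aspect-ratio relations can be checked numerically. The sketch below re-implements the same arithmetic with the standard math module on the motocross1 corners. The function name axis_aligned_bbox and the extra return value a1 are assumptions made for this check and are not part of the updatenet code.

```python
import math

def axis_aligned_bbox(points):
    """points: four (x, y) corners of a rotated rectangle, in order.

    Returns (cx, cy, w, h, a1); a1 is returned only so it can be compared below.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx, cy = sum(xs) / 4, sum(ys) / 4
    x1, x2, y1, y2 = min(xs), max(xs), min(ys), max(ys)
    # Lengths of two adjacent sides of the rotated rectangle.
    side_a = math.hypot(points[0][0] - points[1][0], points[0][1] - points[1][1])
    side_b = math.hypot(points[1][0] - points[2][0], points[1][1] - points[2][1])
    a1 = side_a * side_b              # area of the rotated rectangle
    a2 = (x2 - x1) * (y2 - y1)        # area of the enclosing rectangle
    s = math.sqrt(a1 / a2)
    return cx, cy, s * (x2 - x1) + 1, s * (y2 - y1) + 1, a1

corners = [(237.59, 104.16), (303.42, 36.05), (381.48, 109.48), (315.65, 177.59)]
cx, cy, w, h, a1 = axis_aligned_bbox(corners)

# Dropping the +1 pixel offset, the area matches A1 and the aspect ratio
# matches the enclosing rectangle.
area_without_offset = (w - 1) * (h - 1)
aspect_without_offset = (w - 1) / (h - 1)
enclosing_aspect = (381.48 - 237.59) / (177.59 - 36.05)
```

Up to floating-point rounding, area_without_offset equals a1 and aspect_without_offset equals enclosing_aspect.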

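When visualized, the resulting box can be seen to differ from both of the other annotations, the rotated rectangle and its enclosing rectangle.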

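Overlap_ratio:

Intersection over union (IoU) is a metric commonly used in object detection and tracking. For two boxes A and B, the formula is:

$$ \mathrm{IoU}(A, B) = \frac{\mathrm{area}(A \cap B)}{\mathrm{area}(A \cup B)} $$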

import numpy as np

def overlap_ratio(rect1, rect2):
    '''
    Compute overlap ratio between two rects
    - rect: 1d array of [x,y,w,h] or
            2d array of N x [x,y,w,h]
    '''
    if rect1.ndim == 1:
        rect1 = rect1[None, :]
    if rect2.ndim == 1:
        rect2 = rect2[None, :]

    left = np.maximum(rect1[:, 0], rect2[:, 0])
    right = np.minimum(rect1[:, 0] + rect1[:, 2], rect2[:, 0] + rect2[:, 2])
    top = np.maximum(rect1[:, 1], rect2[:, 1])
    bottom = np.minimum(rect1[:, 1] + rect1[:, 3], rect2[:, 1] + rect2[:, 3])

    intersect = np.maximum(0, right - left) * np.maximum(0, bottom - top)
    union = rect1[:, 2] * rect1[:, 3] + rect2[:, 2] * rect2[:, 3] - intersect
    iou = np.clip(intersect / union, 0, 1)
    return iou

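Note that the input format is (x, y, w, h), and that the function can compute a whole batch at once.

The idea is fairly simple: all that is needed is the area of the intersection rectangle.

Take the larger of the two left edges and the smaller of the two right edges, then the larger of the two top edges and the smaller of the two bottom edges.

This gives the width and height of the intersection rectangle, right - left and bottom - top (if there is an intersection at all).

np.maximum(0, right - left) then acts as a clamp: when the width or height would be negative, the output is 0.

The union is the sum of the two areas minus the intersection, and the IoU follows from the formula.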
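As a worked example of these steps, here is a scalar, pure-Python sketch of the same computation for a single pair of boxes. The function name iou_xywh and the sample boxes are made up for illustration; the batch version above does the same arithmetic on whole arrays.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as (x, y, w, h) with (x, y) the top-left corner."""
    left = max(a[0], b[0])                       # larger of the left edges
    right = min(a[0] + a[2], b[0] + b[2])        # smaller of the right edges
    top = max(a[1], b[1])                        # larger of the top edges
    bottom = min(a[1] + a[3], b[1] + b[3])       # smaller of the bottom edges
    # Clamp at 0 so that non-overlapping boxes get zero intersection.
    intersect = max(0, right - left) * max(0, bottom - top)
    union = a[2] * a[3] + b[2] * b[3] - intersect
    return intersect / union

# Two 2x2 boxes sharing a 1x1 corner: intersection 1, union 4 + 4 - 1 = 7.
overlapping = iou_xywh((0, 0, 2, 2), (1, 1, 2, 2))   # 1/7

# Far apart: right - left is negative and gets clamped to 0.
disjoint = iou_xywh((0, 0, 2, 2), (5, 5, 2, 2))      # 0.0

# A box compared with itself.
identical = iou_xywh((3, 4, 10, 5), (3, 4, 10, 5))   # 1.0
```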
