(6) Visualizing the 2D Detection Boxes from KITTI Annotations

tech, 2022-07-29

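I. KITTI Annotation File Format

Download the pre-annotated label files from the address below. The archive contains label files for 20 scenes; pick the one that matches your own scene. In my case that is 0004.txt.

The files can be downloaded from https://blog.csdn.net/qq_29931083/article/details/106460698

The README in the Tracking devkit describes the format of each record: https://blog.csdn.net/qq_29931083/article/details/106504415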

Values  Name        Description
------------------------------------------------------------------------------
1       frame       Frame within the sequence where the object appears
1       track id    Unique tracking id of this object within this sequence
1       type        Describes the type of object: 'Car', 'Van', 'Truck',
                    'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',
                    'Misc' or 'DontCare'
1       truncated   Integer (0,1,2) indicating the level of truncation.
                    Note that this is in contrast to the object detection
                    benchmark where truncation is a float in [0,1].
1       occluded    Integer (0,1,2,3) indicating occlusion state:
                    0 = fully visible, 1 = partly occluded,
                    2 = largely occluded, 3 = unknown
1       alpha       Observation angle of object, ranging [-pi, pi]
4       bbox        2D bounding box of object in the image (0-based index):
                    contains left, top, right, bottom pixel coordinates
3       dimensions  3D object dimensions: height, width, length (in meters)
3       location    3D object location x,y,z in camera coordinates (in meters)
1       rotation_y  Rotation ry around Y-axis in camera coordinates [-pi, pi]
1       score       Only for results: Float, indicating confidence in
                    detection, needed for p/r curves, higher is better.
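To make the layout concrete, here is a minimal sketch that splits one label line into the 17 named ground-truth fields from the table. The helper name `parse_tracking_line` and the sample line are made up for illustration and are not part of the KITTI devkit. The column names follow the spelling used in the code later in this post (including `bbox_lift` for the left edge).

```python
# Field names in file order. The 4 bbox values, 3 dimensions and 3 location
# values each get their own column, giving 17 columns for ground-truth files.
FIELDS = [
    "frame", "track_id", "type", "truncated", "occluded", "alpha",
    "bbox_lift", "bbox_top", "bbox_right", "bbox_bottom",
    "height", "width", "length", "pos_x", "pos_y", "pos_z", "rot_y",
]
INT_FIELDS = {"frame", "track_id", "truncated", "occluded"}


def parse_tracking_line(line):
    """Hypothetical helper: turn one space-separated label line into a dict."""
    values = line.split()
    if len(values) < len(FIELDS):
        raise ValueError(
            "expected at least %d values, got %d" % (len(FIELDS), len(values))
        )
    record = {}
    for name, raw in zip(FIELDS, values):
        if name == "type":
            record[name] = raw          # class name stays a string
        elif name in INT_FIELDS:
            record[name] = int(raw)
        else:
            record[name] = float(raw)
    return record


# Made-up sample line, only to show the layout.
sample = "0 2 Car 0 0 -1.79 296.74 161.75 455.23 292.37 1.50 1.60 3.50 -4.55 1.86 13.41 -2.12"
record = parse_tracking_line(sample)
print(record["type"], record["bbox_lift"], record["bbox_bottom"])
```

Result files carry an extra trailing `score` value; this sketch simply ignores anything after the 17th column.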

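II. Drawing the 2D Detection Boxes

I debug in a Jupyter notebook. Load the file and inspect its contents: each row holds the frame index `frame`, the id unique to each object `track_id`, and the 2D detection box data `bbox_lift`, `bbox_top`, `bbox_right`, `bbox_bottom`. (The code in this post spells the left-edge column `bbox_lift`; it holds the left pixel coordinate of the box.)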

    import pandas as pd

    Columns_name = ['frame', 'track_id', 'type', 'truncated', 'occluded', 'alpha',
                    'bbox_lift', 'bbox_top', 'bbox_right', 'bbox_bottom',
                    'height', 'width', 'length', 'pos_x', 'pos_y', 'pos_z', 'rot_y']
    df = pd.read_csv('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt',
                     header=None, sep=' ')
    df.columns = Columns_name
    df.head()

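The `frame` values correspond to the image files in the dataset. For example, at frame 0 the corresponding image shows four cars and one van. To simplify later steps, every 'Van', 'Truck' and 'Tram' in `type` is relabelled as 'Car'. `df.type.isin()` first builds a boolean mask marking the rows whose type is 'Van', 'Truck' or 'Tram'; `df.loc[mask, 'type'] = 'Car'` then selects the rows marked True and overwrites their type with 'Car'.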

df.loc[df.type.isin(['Van','Truck','Tram']),'type']='Car'

After this change the remaining classes are 'Car', 'Cyclist' and 'Pedestrian', plus 'DontCare', 'Misc' and similar. To simplify the data, keep only the rows whose type is 'Car', 'Cyclist' or 'Pedestrian'.

df=df[df.type.isin(['Car','Cyclist','Pedestrian'])]
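The two `isin()` steps are easier to follow on a tiny table. The sketch below uses a made-up six-row DataFrame (not KITTI data) to show the mask, the relabelling and the final filter.

```python
import pandas as pd

# Toy stand-in for the label table; only the 'type' column matters here.
toy = pd.DataFrame({
    "frame": [0, 0, 0, 1, 1, 1],
    "type": ["Car", "Van", "DontCare", "Truck", "Pedestrian", "Misc"],
})

# Step 1: boolean mask of the rows whose type is a vehicle class to merge.
vehicle_mask = toy["type"].isin(["Van", "Truck", "Tram"])
toy.loc[vehicle_mask, "type"] = "Car"

# Step 2: keep only the three classes that will be drawn.
kept = toy[toy["type"].isin(["Car", "Cyclist", "Pedestrian"])]

print(vehicle_mask.tolist())
print(kept["type"].tolist())
```

The mask is computed before the assignment, so it still marks the rows that were originally 'Van' or 'Truck' even after they have been relabelled.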

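After filtering, the number of rows drops from 2012 to 1113, which makes the later steps easier.

In the tracking file, `bbox_lift`, `bbox_top`, `bbox_right` and `bbox_bottom` are the 2D data. They give the edges of the bounding box as pixel distances from the image origin, which is the top-left corner of the image. In KITTI's image coordinate system the origin is 0, x increases to the right and y increases downward. Two x values and two y values are therefore enough to fix the box: its top-left corner is x = bbox_lift, y = bbox_top, and its bottom-right corner is x = bbox_right, y = bbox_bottom.

So first extract, for the chosen frame, the 2D box columns `bbox_lift`, `bbox_top`, `bbox_right`, `bbox_bottom` and the class `type`. Select the raw rows, then convert them with np.array() into arrays that are easy to work with.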

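    import numpy as np

    frame = 0  # index of the frame to visualize
    boxs = np.array(df.loc[df.frame.isin([frame])][['bbox_lift', 'bbox_top', 'bbox_right', 'bbox_bottom']])
    types = np.array(df.loc[df.frame.isin([frame])]['type'])

PS: The plain pandas expression df[df.frame==0][['bbox_lift','bbox_top','bbox_right','bbox_bottom']] should also work for this extraction, but it raised an error on my machine, so I used the form above instead.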
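Another way to write the same selection is a single `.loc` call that takes the row mask and the column list together, followed by `to_numpy()`. This is only a sketch on a made-up three-row table; it is not a diagnosis of the error mentioned above, whose cause is not recorded here.

```python
import pandas as pd

BOX_COLUMNS = ["bbox_lift", "bbox_top", "bbox_right", "bbox_bottom"]

# Made-up rows: two objects in frame 0, one in frame 1.
toy = pd.DataFrame({
    "frame": [0, 0, 1],
    "type": ["Car", "Pedestrian", "Car"],
    "bbox_lift": [10.5, 200.0, 12.0],
    "bbox_top": [20.5, 150.0, 22.0],
    "bbox_right": [110.5, 230.0, 112.0],
    "bbox_bottom": [80.5, 240.0, 82.0],
})

frame = 0
in_frame = toy["frame"] == frame   # boolean mask, one entry per row

# One .loc call with (row mask, columns) avoids chained indexing.
boxs = toy.loc[in_frame, BOX_COLUMNS].to_numpy()
types = toy.loc[in_frame, "type"].to_numpy()

print(boxs.shape, list(types))
```

Both arrays come from the same mask, so row i of `boxs` always belongs to entry i of `types`.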

Once `boxs` has been extracted, the top-left and bottom-right points of one row `box` can be written as:

    top_lift = int(box[0]), int(box[1])
    bottom_right = int(box[2]), int(box[3])
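The labels store the box edges as floats, and a box may extend past the image border. The sketch below converts one box to integer corner points and clamps them to the image. The helper name `box_corners` and the image size of 1242 x 375 pixels are assumptions made for this example; check the size of your own images.

```python
def box_corners(box, image_width, image_height):
    """Hypothetical helper: float box -> integer (x, y) corners inside the image."""
    left, top, right, bottom = box
    # Clamp x to [0, width - 1] and y to [0, height - 1], then truncate to int.
    x1 = int(min(max(left, 0), image_width - 1))
    y1 = int(min(max(top, 0), image_height - 1))
    x2 = int(min(max(right, 0), image_width - 1))
    y2 = int(min(max(bottom, 0), image_height - 1))
    return (x1, y1), (x2, y2)


# A box fully inside the image (made-up values).
inside = box_corners([296.74, 161.75, 455.23, 292.37], 1242, 375)

# A box that sticks out on every side (made-up values).
clipped = box_corners([-5.2, 10.0, 1300.0, 400.0], 1242, 375)

print(inside, clipped)
```

OpenCV draws rectangles that are partly outside the image without complaint, so clamping is optional for drawing. It matters mainly when the corners are later used to slice the image array, where negative indices would wrap around.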

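Once `types` has been extracted, the type tells whether a row is a 'Car', a 'Cyclist' or a 'Pedestrian'. This is used to label the box with its class name and to give each class its own bounding box colour. Build a dict for the classes:

    # Build a dict so that each class is drawn in its own colour (BGR order)
    DETECTION_COLOR_DICT = {'Car': (255, 255, 0), 'Pedestrian': (0, 255, 255), 'Cyclist': (141, 40, 255)}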
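Indexing the dict directly raises a KeyError for any class it does not list, for example 'Misc' or 'DontCare' if the filtering step is skipped. A minimal sketch of a lookup with a fallback follows; the helper `color_for` and the white default colour are choices made for this example. OpenCV expects colours in BGR order with each channel in 0..255.

```python
DETECTION_COLOR_DICT = {
    "Car": (255, 255, 0),         # cyan in BGR
    "Pedestrian": (0, 255, 255),  # yellow in BGR
    "Cyclist": (141, 40, 255),    # rose red in BGR
}
DEFAULT_COLOR = (255, 255, 255)   # white; an arbitrary choice for this sketch


def color_for(typ):
    """Hypothetical helper: BGR colour for a class, white if the class is unknown."""
    color = DETECTION_COLOR_DICT.get(typ, DEFAULT_COLOR)
    # Every channel has to fit into 8 bits for an 8-bit image.
    if not all(0 <= channel <= 255 for channel in color):
        raise ValueError("colour channel out of range: %r" % (color,))
    return color


print(color_for("Car"), color_for("Misc"))
```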

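To draw every object in the frame, loop over them with a for loop:

    for typ, box in list(zip(types, boxs)):
        top_lift = int(box[0]), int(box[1])
        bottom_right = int(box[2]), int(box[3])
        cv2.rectangle(image, top_lift, bottom_right, DETECTION_COLOR_DICT[typ], 2)

PS: list(zip(types, boxs)) pairs the two arrays so that each class is read together with its box. zip() differs between Python 2 and Python 3: in Python 3 it returns an iterator, so list() is needed to display its contents, while Python 2 returns a list directly. The for loop itself would also accept zip(types, boxs) without list(). cv2.rectangle(image, top-left point, bottom-right point, BGR colour, line thickness) draws the box onto `image`.

In the result, Car is drawn in cyan, Cyclist in rose red and Pedestrian in yellow.

The complete test code for a Python 3 environment is as follows: Tracking_test.py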
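The Python 3 behaviour of zip() can be checked without any image data. The sketch below uses made-up lists and shows that a loop consumes the iterator directly, that the iterator is then empty, and that list() gives a reusable copy.

```python
types = ["Car", "Pedestrian"]
boxs = [[10, 20, 110, 80], [200, 150, 230, 240]]

pairs = zip(types, boxs)   # a lazy iterator in Python 3

# A loop (here a comprehension) consumes the iterator; no list() is required.
seen = [(typ, box[0]) for typ, box in pairs]

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(pairs)

# list() is useful when the pairs have to be printed or reused.
reusable = list(zip(types, boxs))

print(seen, leftover, len(reusable))
```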

    import cv2
    import numpy as np
    import pandas as pd

    Columns_name = ['frame', 'track_id', 'type', 'truncated', 'occluded', 'alpha',
                    'bbox_lift', 'bbox_top', 'bbox_right', 'bbox_bottom',
                    'height', 'width', 'length', 'pos_x', 'pos_y', 'pos_z', 'rot_y']
    df = pd.read_csv('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt',
                     header=None, sep=' ')
    df.columns = Columns_name
    frame = 200

    # Merge all vehicle classes in type into the single class Car
    df.loc[df.type.isin(['Van', 'Truck', 'Tram']), 'type'] = 'Car'
    # Drop every row that is not Car, Cyclist or Pedestrian
    df = df[df.type.isin(['Car', 'Cyclist', 'Pedestrian'])]

    # Select the rows of this frame. df[df.frame==0][[...]] raised an error
    # on my machine, so isin() is used instead.
    boxs = np.array(df.loc[df.frame.isin([frame])][['bbox_lift', 'bbox_top', 'bbox_right', 'bbox_bottom']])
    types = np.array(df.loc[df.frame.isin([frame])]['type'])

    image = cv2.imread('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/image_02/data/%010d.png' % frame)

    # Build a dict so that each class is drawn in its own colour (BGR order)
    DETECTION_COLOR_DICT = {'Car': (255, 255, 0), 'Pedestrian': (0, 255, 255), 'Cyclist': (141, 40, 255)}

    # In Python 3 zip() returns an iterator; list() materializes the pairs
    for typ, box in list(zip(types, boxs)):
        top_lift = int(box[0]), int(box[1])
        bottom_right = int(box[2]), int(box[3])
        cv2.rectangle(image, top_lift, bottom_right, DETECTION_COLOR_DICT[typ], 2)

    cv2.imshow('img', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Porting it to ROS: data_utils.py

    def read_tracking(path):
        df = pd.read_csv(path, header=None, sep=' ')
        # Tracking_Columns_name: module-level list of column names (not shown here)
        df.columns = Tracking_Columns_name
        # Merge all vehicle classes in type into the single class Car
        df.loc[df.type.isin(['Van', 'Truck', 'Tram']), 'type'] = 'Car'
        # Drop every row that is not Car, Cyclist or Pedestrian
        df = df[df.type.isin(['Car', 'Cyclist', 'Pedestrian'])]
        return df

In kitti.py, add the code that reads the tracking file:

    # Read the tracking file
    df_tracking = read_tracking('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt')

publish.py

    def publish_camera(cam_pub, bridge, image, types, boxs):
        # In Python 3 zip() returns an iterator; list() materializes the pairs
        for typ, box in list(zip(types, boxs)):
            top_lift = int(box[0]), int(box[1])
            bottom_right = int(box[2]), int(box[3])
            cv2.rectangle(image, top_lift, bottom_right, DETECTION_COLOR_DICT[typ], 2)
        # Convert the OpenCV image to a ROS Image message with CvBridge and publish it
        cam_pub.publish(bridge.cv2_to_imgmsg(image, "bgr8"))
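`publish_camera` expects the `types` and `boxs` of the current frame, so the node has to look them up once per published image. One possible way, sketched here on a made-up table, is to group the DataFrame by frame once at start-up and keep the arrays in a dict. The helper `index_by_frame` is hypothetical and is not part of the original kitti.py.

```python
import pandas as pd

BOX_COLUMNS = ["bbox_lift", "bbox_top", "bbox_right", "bbox_bottom"]


def index_by_frame(df):
    """Hypothetical helper: map frame number -> (types array, boxes array)."""
    lookup = {}
    for frame, rows in df.groupby("frame"):
        lookup[int(frame)] = (rows["type"].to_numpy(), rows[BOX_COLUMNS].to_numpy())
    return lookup


# Made-up rows standing in for the output of read_tracking().
toy = pd.DataFrame({
    "frame": [0, 0, 1],
    "type": ["Car", "Pedestrian", "Car"],
    "bbox_lift": [10.5, 200.0, 12.0],
    "bbox_top": [20.5, 150.0, 22.0],
    "bbox_right": [110.5, 230.0, 112.0],
    "bbox_bottom": [80.5, 240.0, 82.0],
})

lookup = index_by_frame(toy)
types, boxs = lookup[0]

# Frames without any kept object are absent from the dict; fall back to empty.
empty_types, empty_boxs = lookup.get(5, ([], []))

print(sorted(lookup), list(types), boxs.shape)
```

With this layout the publishing loop would call something like `publish_camera(cam_pub, bridge, image, *lookup.get(frame, ([], [])))`, and a frame without objects simply draws no boxes.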
