Download the pre-annotated files from the link below; the archive contains annotation files for 20 scenes. Pick the file that matches your own scene; in my case it is 0004.txt. The files can be downloaded from https://blog.csdn.net/qq_29931083/article/details/106460698, and the format of the text records is described in the README inside Tracking: https://blog.csdn.net/qq_29931083/article/details/106504415
#Values  Name         Description
1        frame        Frame within the sequence where the object appears
1        track_id     Unique tracking id of this object within this sequence
1        type         Describes the type of object: 'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc' or 'DontCare'
1        truncated    Integer (0,1,2) indicating the level of truncation. Note that this is in contrast to the object detection benchmark, where truncation is a float in [0,1].
1        occluded     Integer (0,1,2,3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown
1        alpha        Observation angle of object, ranging [-pi..pi]
4        bbox         2D bounding box of object in the image (0-based index): contains left, top, right, bottom pixel coordinates
3        dimensions   3D object dimensions: height, width, length (in meters)
3        location     3D object location x, y, z in camera coordinates (in meters)
1        rotation_y   Rotation ry around Y-axis in camera coordinates [-pi..pi]
1        score        Only for results: float indicating confidence in detection, needed for p/r curves; higher is better.
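To make the format concrete, the sketch below splits one label line into named fields. The sample line is invented for illustration (it is not taken from 0004.txt), and the field names match the Columns_name list used in the code that follows.

fields = ['frame','track_id','type','truncated','occluded','alpha',
          'bbox_lift','bbox_top','bbox_right','bbox_bottom',
          'height','width','length','pos_x','pos_y','pos_z','rot_y']
# One made-up label line with the 17 whitespace-separated values
line = '0 0 Van 0 0 -1.79 296.74 161.75 455.23 292.37 2.00 1.82 4.43 -4.55 1.86 13.41 -2.12'
label = dict(zip(fields, line.split(' ')))
print(label['type'], label['bbox_lift'], label['bbox_bottom'])  # Van 296.74 292.37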
Use a Jupyter notebook for debugging; load the file and inspect its contents. It contains, among other things, the frame index frame of every frame, the unique per-object id track_id, and the 2D bounding-box fields bbox_lift, bbox_top, bbox_right, bbox_bottom.
import pandas as pd

Columns_name = ['frame','track_id','type','truncated','occluded','alpha',
                'bbox_lift','bbox_top','bbox_right','bbox_bottom',
                'height','width','length','pos_x','pos_y','pos_z','rot_y']
df = pd.read_csv('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt', header=None, sep=' ')
df.columns = Columns_name
df.head()

The frame column lines up with the files in the dataset: at frame 0, for instance, it matches the image below, which contains 4 Cars and one Van.
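Before converting anything, a quick sanity check (just a sketch; it assumes df has been loaded as above) confirms the frame-0 contents:

print(df.loc[df.frame.isin([0]), 'type'].value_counts())
# expected: Car 4, Van 1 (before the Van/Truck/Tram -> Car conversion below)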
For later convenience, convert every 'Van', 'Truck' and 'Tram' entry in type to 'Car'. First, df.type.isin() marks the rows whose type is 'Van', 'Truck' or 'Tram'; the rows that come back True are then selected with df.loc[..., 'type'] = 'Car', which rewrites their type to Car:

df.loc[df.type.isin(['Van','Truck','Tram']), 'type'] = 'Car'

After this change the remaining categories are 'Car', 'Cyclist' and 'Pedestrian', plus 'DontCare', 'Misc' and the like. To simplify the data, keep only the rows whose type is 'Car', 'Cyclist' or 'Pedestrian':
df = df[df.type.isin(['Car','Cyclist','Pedestrian'])]

After filtering, the data shrinks from 2012 rows to 1113, which makes the later processing easier.
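A length check (a one-line sketch) confirms the reduction:

print(len(df))  # 1113 rows after filtering, down from 2012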
In the tracking file, bbox_lift, bbox_top, bbox_right and bbox_bottom are the 2D annotations: each is the pixel distance of one box edge from pixel 0, the top-left corner of the image. In KITTI's image coordinate system, x starts at 0 and grows to the right while y starts at 0 and grows downward, so two x values and two y values are enough to pin down a fixed rectangle in the 2D image. The bounding box's top-left point is therefore x = bbox_lift, y = bbox_top, and its bottom-right point is x = bbox_right, y = bbox_bottom.
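As a small illustration of this convention, the box width and height follow directly from the four edges (a hypothetical helper, not part of the original code):

def box_size(bbox_lift, bbox_top, bbox_right, bbox_bottom):
    # x grows rightward and y grows downward, so the size is a plain difference
    return bbox_right - bbox_lift, bbox_bottom - bbox_top

w, h = box_size(296.74, 161.75, 455.23, 292.37)  # the made-up values from earlier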
So, for frame number frame, first extract the 2D box columns bbox_lift, bbox_top, bbox_right, bbox_bottom together with the type column. Pull them out as raw lists first, then convert them with np.array() into a type that is easy to process:
boxs = np.array(df.loc[df.frame.isin([frame])][['bbox_lift','bbox_top','bbox_right','bbox_bottom']])
types = np.array(df.loc[df.frame.isin([frame])]['type'])

PS: plain pandas boolean indexing, df[df.frame==0][['bbox_lift','bbox_top','bbox_right','bbox_bottom']], would also work for this extraction, but that form raised an error on my machine, so the isin() form is used instead.
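If you want to avoid the chained-indexing form entirely, an equivalent extraction (a sketch; .to_numpy() requires pandas >= 0.24) is:

mask = df.frame == frame
boxs = df.loc[mask, ['bbox_lift','bbox_top','bbox_right','bbox_bottom']].to_numpy()
types = df.loc[mask, 'type'].to_numpy()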
Once boxs has been extracted, the top-left and bottom-right points can be written as:
top_lift = int(box[0]), int(box[1])
bottom_right = int(box[2]), int(box[3])

Once types has been extracted as well, the type value tells us whether a record is a 'Car', 'Cyclist' or 'Pedestrian'. Use it both to attach the type name and to pick a different bounding-box colour per class; build a DICT over the categories:
# Build a DICT: each class gets its own marker colour (BGR)
DETECTION_COLOR_DICT = {'Car': (255,255,0), 'Pedestrian': (0,255,255), 'Cyclist': (141,40,255)}

To detect and draw every object in each frame, iterate with a for loop:
for typ, box in list(zip(types, boxs)):
    top_lift = int(box[0]), int(box[1])
    bottom_right = int(box[2]), int(box[3])
    cv2.rectangle(image, top_lift, bottom_right, DETECTION_COLOR_DICT[typ], 2)

PS: list(zip(types, boxs)) pairs the two arrays so they can be read together element by element. zip() behaves differently in Python 2 and Python 3: Python 3 needs list(zip(types, boxs)) to materialise the contents, while Python 2 does not. cv2.rectangle(image, top-left point, bottom-right point, colour (BGR), thickness) draws the box onto image.
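The text above also mentions attaching the type name; a sketch of that extra step with cv2.putText (the placement 5 px above the top-left corner is my choice, not from the original code) could look like:

for typ, box in list(zip(types, boxs)):
    top_lift = int(box[0]), int(box[1])
    # draw the class name just above the box's top-left corner
    cv2.putText(image, typ, (top_lift[0], top_lift[1] - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, DETECTION_COLOR_DICT[typ], 1)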
The result is shown in the figure below: here Car is drawn in aqua, Cyclist in rose red, and Pedestrian in yellow.
The complete test code for a Python 3 environment is as follows: Tracking_test.py
import numpy as np
import pandas as pd
import cv2

Columns_name = ['frame','track_id','type','truncated','occluded','alpha',
                'bbox_lift','bbox_top','bbox_right','bbox_bottom',
                'height','width','length','pos_x','pos_y','pos_z','rot_y']
df = pd.read_csv('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt', header=None, sep=' ')
df.columns = Columns_name
frame = 200

# Fold all vehicle types in type into the single class Car
df.loc[df.type.isin(['Van','Truck','Tram']), 'type'] = 'Car'
# Drop everything except Car, Cyclist and Pedestrian
df = df[df.type.isin(['Car','Cyclist','Pedestrian'])]

# Select the rows of the current frame; df[df.frame==frame][[...]] raised an
# error here, so the isin() form is used instead
boxs = np.array(df.loc[df.frame.isin([frame])][['bbox_lift','bbox_top','bbox_right','bbox_bottom']])
types = np.array(df.loc[df.frame.isin([frame])]['type'])

image = cv2.imread('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/image_02/data/%010d.png' % frame)

# Build a DICT: each class gets its own marker colour (BGR)
DETECTION_COLOR_DICT = {'Car': (255,255,0), 'Pedestrian': (0,255,255), 'Cyclist': (141,40,255)}

# zip() differs between Python 2 and 3; Python 3 needs list(zip(types, boxs))
for typ, box in list(zip(types, boxs)):
    top_lift = int(box[0]), int(box[1])
    bottom_right = int(box[2]), int(box[3])
    cv2.rectangle(image, top_lift, bottom_right, DETECTION_COLOR_DICT[typ], 2)

cv2.imshow('img', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
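As an optional extension (a sketch reusing the same df, colour dict and image path as the script above), the single-frame preview can be stepped through all frames to get a short clip:

for frame in range(df.frame.max() + 1):
    image = cv2.imread('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/image_02/data/%010d.png' % frame)
    rows = df.loc[df.frame.isin([frame])]
    boxs = np.array(rows[['bbox_lift','bbox_top','bbox_right','bbox_bottom']])
    types = np.array(rows['type'])
    for typ, box in list(zip(types, boxs)):
        cv2.rectangle(image, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])),
                      DETECTION_COLOR_DICT[typ], 2)
    cv2.imshow('img', image)
    if cv2.waitKey(100) & 0xFF == 27:  # roughly 10 fps; press Esc to stop early
        break
cv2.destroyAllWindows()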
Now port this into ROS. In data_utils.py:

# data_utils.py is assumed to already import pandas as pd and to define
# Tracking_Columns_name at module level (the same 17-column list as Columns_name above)
def read_tracking(path):
    df = pd.read_csv(path, header=None, sep=' ')
    df.columns = Tracking_Columns_name
    # Fold all vehicle types in type into the single class Car
    df.loc[df.type.isin(['Van','Truck','Tram']), 'type'] = 'Car'
    # Drop everything except Car, Cyclist and Pedestrian
    df = df[df.type.isin(['Car','Cyclist','Pedestrian'])]
    return df

In kitti.py, add reading of the Tracking file:
# Read the Tracking file
df_tracking = read_tracking('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt')

publish.py:
# publish.py is assumed to import cv2 and to define DETECTION_COLOR_DICT as above
def publish_camera(cam_pub, bridge, image, types, boxs):
    # zip() differs between Python 2 and 3; Python 3 needs list(zip(types, boxs))
    for typ, box in list(zip(types, boxs)):
        top_lift = int(box[0]), int(box[1])
        bottom_right = int(box[2]), int(box[3])
        cv2.rectangle(image, top_lift, bottom_right, DETECTION_COLOR_DICT[typ], 2)
    # Use CvBridge to convert the OpenCV image into a ROS Image message
    cam_pub.publish(bridge.cv2_to_imgmsg(image, "bgr8"))
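Finally, for orientation, a sketch of how this publisher might be driven from the kitti.py loop. The names frame, bridge and cam_pub are assumptions carried over from the earlier parts of this series and are not defined here:

# Inside the kitti.py publishing loop (sketch)
boxs = np.array(df_tracking.loc[df_tracking.frame.isin([frame])][['bbox_lift','bbox_top','bbox_right','bbox_bottom']])
types = np.array(df_tracking.loc[df_tracking.frame.isin([frame])]['type'])
image = cv2.imread('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/image_02/data/%010d.png' % frame)
publish_camera(cam_pub, bridge, image, types, boxs)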