在以下网址下载已经标注好的文件,里面有20个场景的标注文件。寻找与自己对应场景的文件。比如我对应的为文档里的0004.txt 。 可以从https://blog.csdn.net/qq_29931083/article/details/106460698 这里下载 通过Tracking里的README可以看到文本记录的格式https://blog.csdn.net/qq_29931083/article/details/106504415
1 frame Frame within the sequence where the object appearers 1 track id Unique tracking id of this object within this sequence 1 type Describes the type of object: ‘Car’, ‘Van’, ‘Truck’, ‘Pedestrian’, ‘Person_sitting’, ‘Cyclist’, ‘Tram’, ‘Misc’ or ‘DontCare’ 1 truncated Integer (0,1,2) indicating the level of truncation.Note that this is in contrast to the object detection benchmark where truncation is a float in [0,1]. 1 occluded Integer (0,1,2,3) indicating occlusion state: 0 = fully visible, 1 = partly occluded 2 = largely occluded, 3 = unknown 1 alpha Observation angle of object, ranging [-pi…pi] 4 bbox 2D bounding box of object in the image (0-based index): contains left, top, right, bottom pixel coordinates 3 dimensions 3D object dimensions: height, width, length (in meters) 3 location 3D object location x,y,z in camera coordinates (in meters) 1 rotation_y Rotation ry around Y-axis in camera coordinates [-pi…pi] 1 score Only for results: Float, indicating confidence in detection, needed for p/r curves, higher is better.
使用jupyter notebook进行调试,将文件导入查看文件内容:
import pandas as pd Columns_name = ['frame','track_id','type','truncated','occluded','alpha','bbox_lift','bbox_top','bbox_right','bbox_bottom','height','width','length','pos_x','pos_y','pos_z','rot_y'] df = pd.read_csv('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync/training/label_02/0004.txt',header = None,sep = ' ') df.columns = Columns_name df.head()其中包括了每一帧的帧序号frame;每一个物体特有的id track_id;3D 检测框记录的数据:height,width,bbox_right,length,pos_x,pos_y,pos_z,rot_y;
其中’height’,‘width’,'length’为3D Bouding Box 分别的高 宽 长;其中’pos_x’,‘pos_y’,'pos_z’代表3D Bouding Box 下底面中心点的坐标;其中’rot_y’代表3D Bouding Box 绕着yaw轴旋转角度,可以计算旋转矩阵R。yaw即笛卡尔坐标系下绕y轴旋转的角度。具体公式如下:注:此使用的yaw=rot_y 而不是IMU中的yaw 3D旋转矩阵的推倒可以通过此处查看https://blog.csdn.net/mightbxg/article/details/793636993DBoundingBox的资料参考此处: https://github.com/navoshta/KITTI-Dataset/blob/master/kitti-dataset.ipynb
The reference point for the 3D bounding box for each object is centered on the bottom face of the box. The corners of bounding box are computed as follows with respect to the reference point and in the object coordinate system: x_corners = [l/2, l/2, -l/2, -l/2, l/2, l/2, -l/2, -l/2]^T y_corners = [0, 0, 0, 0, -h, -h, -h, -h ]^T z_corners = [w/2, -w/2, -w/2, w/2, w/2, -w/2, -w/2, w/2 ]^T with l=length, h=height, and w=width.
在KITTI的数据集的Tracking文件中,其中与激光雷达即3D检测相关的参数有七个,使用这七个参数通过以上公式可以计算出3DBoundingBox的8个顶点的x,y,z。再根据rot_y进行旋转即可得到3D Bounding Box在点云数据上的具体位置;但需要注意的是,这是在Camera2 坐标系下的数据,因此需要进行坐标转换,将其转换到Voledyne Lider坐标系下。 将3D Bouding Box 的8个顶点坐标与旋转矩阵相乘,即可得到cam2坐标系下,旋转变换后的点坐标,子函数代码如下:
# cam2坐标下的 3D box def compute_3D_box_cam2(h,w,l,x,y,z,yaw): ''' Return:3Xn in cam2 coordinate ''' # 建立旋转矩阵R R = np.array([[np.cos(yaw), 0, np.sin(yaw)],[0,1,0],[-np.sin(yaw),0,np.cos(yaw)]]) # 计算8个顶点坐标 x_corners = [l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2] y_corners = [0,0,0,0,-h,-h,-h,-h] z_corners = [w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2] # 使用旋转矩阵变换坐标 corners_3d_cam2 = np.dot(R, np.vstack([x_corners, y_corners, z_corners])) # 最后在加上中心点 corners_3d_cam2 += np.vstack([x, y, z]) return corners_3d_cam2如果是使用jupyter notebook 可以用以下代码画出3D box进行调试:
def draw_box(ax, vertices, axes=[0, 1, 2], color='black'): """ Draws a bounding 3D box in a pyplot axis. Parameters ---------- pyplot_axis : Pyplot axis to draw in. vertices : Array 8 box vertices containing x, y, z coordinates. axes : Axes to use. Defaults to `[0, 1, 2]`, e.g. x, y and z axes. color : Drawing color. Defaults to `black`. """ vertices = vertices[axes, :] connections = [ [0, 1], [1, 2], [2, 3], [3, 0], # Lower plane parallel to Z=0 plane [4, 5], [5, 6], [6, 7], [7, 4], # Upper plane parallel to Z=0 plane [0, 4], [1, 5], [2, 6], [3, 7] # Connections between upper and lower planes ] for connection in connections: ax.plot(*vertices[:, connection], c=color, lw=0.5) # 现在所处的为cam坐标系 corners_3d_cam2 = compute_3D_box_cam2(*df_tracking.loc[2,['height','width','length','pos_x','pos_y','pos_z','rot_y']]) # 画图程序 fig = plt.figure(figsize = (20,10)) ax = fig.add_subplot(111, projection = '3d') ax.view_init(40,150) draw_box(ax,(corners_3d_cam2))通过以上代码可以画出cam2坐标系下,其中一个frmae = 2的3D Box如下图:
从cam2坐标系下的3D Box可以看出,这个Box存在一定的变形,是因为激光雷达velodyne的坐标系与cam2坐标系不同,需要将在相机坐标系下的此标注文件的数据,转化到velodyne激光雷达坐标系上,进行坐标系之间的变换。 坐标变换首先需要知道原点的位置变换关系,即原点的位置变换(x,y,z)一组3X1的矩阵,此外晗需要知道三个坐标轴的旋转变换关系,即3X3的旋转矩阵R。 为了把缩放变换、旋转变换、平移变换统一成矩阵乘法的形式,将变换矩阵补足为齐次坐标,这样不管进行多少次变换,都可以表示成矩阵连乘的形式,将极大的方便计算和降低运算量。保证了形式上的线性一致性。 KITTI数据集中也做了这一工作,在跟随数据集下载下来的文件Clibration中,其中包含三个txt文件分别是cam to cam的标定变换文件,imu to velo的标定文件和velo to cam的标定文件,在后续的工作中会使用这几个文件进行坐标的变换。
齐次坐标https://blog.csdn.net/zhuiqiuzhuoyue583/article/details/95230246坐标转换文件可以参考https://github.com/charlesq34/frustum-pointnets其中KITTI下的kitti_utils.py文件(此github是一篇关于使用图片和3D点云进行目标检测的论文) 下载上述Github的KITTI下的kitti_utils.py文件使用其中的calib.project_rect_to_velo() 函数对其坐标进行进行转换 from kitti_utils import * calib = Calibration('/home/liqi/dev/catkin_ws/src/KITTI_tutorials/2011_09_26_drive_0014_sync',from_video = True) # 从cam2坐标转换到velo坐标系 .T转置操作 corners_3d_velo = calib.project_rect_to_velo(corners_3d_cam2.T).T corners_3d_velo fig = plt.figure(figsize = (20,10)) ax = fig.add_subplot(111, projection = '3d') ax.view_init(40,150) draw_box(ax,(corners_3d_velo))通过以上代码就可以将cam2坐标系下的Bounding Box转换到velo坐标系下,如下图所示即为frame = 2时,cam2坐标系下的Box的转换到velo坐标系下的结果。可以看到3DBoundingBox显示较为正常,而且具有一定的旋转角度,可以将其放到点云图中比较显示。 为了更好的看到效果,在jupyter notebook下进行单帧的调试: 3D视角下: 鸟瞰图视角下:
在ROS里画出Bounding Box ,依然使用Marker类中的LineList ,可以将各个点按照指定的顺序进行连接。 在车辆行驶坐标系上,Box其各顶点顺序ROS定义如下: 所以连点可以表示为:
LINES =[[0, 1], [1, 2], [2, 3], [3, 0]] #Low face LINES +=[[4, 5], [5, 6], [6, 7], [7, 4]] #upper face LINES +=[[4, 0], [5, 1], [6, 2], [7, 3]] #connect lower and upperface LINES +=[[4, 1], [5, 0]] # front faceROS下坐标转换: kitti.py
boxs_3d= np.array(df_tracking_frame[['height','width','length','pos_x','pos_y','pos_z','rot_y']]) track_ids = np.array(df_tracking_frame[['track_id']]) # 将Tracking文件中的每一个box从cam2转换到velo坐标系下 corners_3d_velos =[] for box_3d in boxs_3d: corners_3d_box_cam2 = compute_3D_box_cam2(* box_3d) # 使用*将array展开 # kitti_utils中坐标转换工具 corners_3d_velo = calib.project_rect_to_velo(corners_3d_box_cam2.T) corners_3d_velos += [corners_3d_velo] publish_3dbox(box3d_pub, corners_3d_velos, types, track_ids)示例代码: publish_utils.py
def publish_3dbox(box3d_pub, corners_3d_velos, object_types): marker_array = MarkerArray() # 显示3D Box for i, corners_3d_velo in enumerate(corners_3d_velos): # Line_marker marker = Marker() marker.header.frame_id = FRAME_ID marker.header.stamp = rospy.Time.now() # 设定marker的ID marker.id = i marker.action = Marker.ADD # 设定Bounding Box存在时间和帧率相同即0.1s marker.lifetime = rospy.Duration(LIFETIME) marker.type = Marker.LINE_LIST if object_types is None : marker .color.r= 0.0 marker.color.g = 1.0 marker.color.b = 1.0 else : b , g, r = DETECTION_COLOR_DICT[object_types[ i ]] marker .color.r = r/255.0 # py2只有制定.0才会小数除法 marker .color.g = g/255.0 marker .color.b = b/255.0 marker.color.a = 1.0 marker.scale.x = 0.1 marker.points = [] for l in LINES: p1 =corners_3d_velo[ l[0] ] marker.points.append (Point(p1[0], p1[1], p1[2])) p2 = corners_3d_velo[ l[1]] marker.points.append (Point(p2[0], p2[1], p2[2])) marker_array.markers.append(marker) box3d_pub.publish(marker_array)引入了之前使用过的DETECTION_COLOR_DICT ,使不同的types显示不同的颜色框,后得到的效果如下图: