Randomly Sampling Multiple Patches from an Image in Python

2024-04-02 19:55

Many image tasks require cropping fixed-size patches from a large image for training. Several issues commonly arise:

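Patch positions should be as random as possible; otherwise the data may lack diversity, which can lead to overfitting.
If the source image is large, the I/O cost of reading it can be very high and slow down training, so it is best to crop multiple patches per read.
We usually do not want randomness to leave some regions of the image uncovered, so the coverage of the patch positions also needs attention.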

Given these issues, the following strategy can be used to obtain multiple randomly positioned patches from an image:

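Compute the top-left coordinates of all patches on a grid with a fixed stride.
Apply a random perturbation (jitter) to the top-left coordinates.
Add the patch width and height to the top-left coordinates to obtain the bottom-right coordinates.
Check whether any patch extends beyond the image border; if so, shift it back inside, keeping the patch size unchanged.
Support an ROI (Region Of Interest): patches need not be taken from the whole image, and a specific ROI can be given instead.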
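
As a one-dimensional illustration of steps 1 to 4, here is a minimal sketch. The helper name `jittered_starts_1d` and the NumPy `Generator` argument are assumptions made for this example and are not part of the article's code; the sketch also assumes the patch is no longer than the axis.

```python
import numpy as np


def jittered_starts_1d(length, patch, stride, jitter, rng):
    """Return start offsets of fixed-size patches along one axis."""
    # Step 1: evenly spaced start positions.
    starts = np.arange(0, length, stride)
    # Step 2: random perturbation in [-jitter, jitter].
    starts = starts + rng.integers(-jitter, jitter + 1, size=starts.shape)
    # Steps 3-4: clamping the start to [0, length - patch] has the same effect
    # as pulling the end back inside the border while keeping the size fixed.
    return np.clip(starts, 0, length - patch)


rng = np.random.default_rng(0)
starts = jittered_starts_1d(length=1000, patch=64, stride=128, jitter=32, rng=rng)
ends = starts + 64
```

Clamping shifts an out-of-range patch instead of shrinking it, so every patch keeps the requested size.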

The implementation and an example follow:

Note that the code below only computes the patch bounding boxes; it does not crop the patches themselves.

# -*- coding: utf-8 -*-
import cv2
import numpy as np


def get_random_patch_bboxes(image, bbox_size, stride, jitter, roi_bbox=None):
    """
    Generate random patch bounding boxes for an image around the ROI region

    Parameters
    ----------
    image: image data read by opencv, shape is [H, W, C]
    bbox_size: size of patch bbox, one digit or a list/tuple containing two
        digits, defined by (width, height)
    stride: stride between adjacent bboxes (before jitter), one digit or a
        list/tuple containing two digits, defined by (x, y)
    jitter: jitter size for evenly distributed bboxes, one digit or a
        list/tuple containing two digits, defined by (x, y)
    roi_bbox: roi region, defined by [xmin, ymin, xmax, ymax], default is whole
        image region

    Returns
    -------
    patch_bboxes: randomly distributed patch bounding boxes, n x 4 numpy array.
        Each bounding box is defined by [xmin, ymin, xmax, ymax]
    """
    height, width = image.shape[:2]
    bbox_size = _process_geometry_param(bbox_size, min_value=1)
    stride = _process_geometry_param(stride, min_value=1)
    jitter = _process_geometry_param(jitter, min_value=0)

    if bbox_size[0] > width or bbox_size[1] > height:
        raise ValueError('bbox_size must be <= image size')

    if roi_bbox is None:
        roi_bbox = [0, 0, width, height]

    # tl is for top-left, br is for bottom-right
    tl_x, tl_y = _get_top_left_points(roi_bbox, bbox_size, stride, jitter)
    br_x = tl_x + bbox_size[0]
    br_y = tl_y + bbox_size[1]

    # shrink bottom-right points to avoid exceeding image border
    br_x[br_x > width] = width
    br_y[br_y > height] = height
    # shrink top-left points to avoid exceeding image border
    tl_x = br_x - bbox_size[0]
    tl_y = br_y - bbox_size[1]
    tl_x[tl_x < 0] = 0
    tl_y[tl_y < 0] = 0
    # compute bottom-right points again
    br_x = tl_x + bbox_size[0]
    br_y = tl_y + bbox_size[1]

    patch_bboxes = np.concatenate((tl_x, tl_y, br_x, br_y), axis=1)
    return patch_bboxes


def _process_geometry_param(param, min_value):
    """
    Process and check param, which must be one digit or a list/tuple containing
    two digits, and its value must be >= min_value

    Parameters
    ----------
    param: parameter to be processed
    min_value: min value for param

    Returns
    -------
    param: param after processing
    """
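    # np.integer / np.floating cover NumPy scalars such as np.int64, which is
    # not a subclass of Python int
    if isinstance(param, (int, float, np.integer, np.floating)) or \
            isinstance(param, np.ndarray) and param.size == 1:
        # np.ravel handles both scalars and size-1 arrays of any dimension
        param = int(np.round(np.ravel(param)[0]))
        param = [param, param]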
    else:
        if len(param) != 2:
            raise ValueError('param must be one digit or two digits')
        param = [int(np.round(param[0])), int(np.round(param[1]))]

    # check data range using min_value
    if not (param[0] >= min_value and param[1] >= min_value):
        raise ValueError('param must be >= min_value (%d)' % min_value)
    return param


def _get_top_left_points(roi_bbox, bbox_size, stride, jitter):
    """
    Generate top-left points for bounding boxes

    Parameters
    ----------
    roi_bbox: roi region, defined by [xmin, ymin, xmax, ymax]
    bbox_size: size of patch bbox, a list/tuple containing two digits, defined
        by (width, height)
    stride: stride between adjacent bboxes (before jitter), a list/tuple
        containing two digits, defined by (x, y)
    jitter: jitter size for evenly distributed bboxes, a list/tuple containing
        two digits, defined by (x, y)

    Returns
    -------
    tl_x: x coordinates of top-left points, n x 1 numpy array
    tl_y: y coordinates of top-left points, n x 1 numpy array
    """
    xmin, ymin, xmax, ymax = roi_bbox
    roi_width = xmax - xmin
    roi_height = ymax - ymin

    # get the offset between the first top-left point of patch box and the
    # top-left point of roi_bbox
    offset_x = np.arange(0, roi_width, stride[0])[-1] + bbox_size[0]
    offset_y = np.arange(0, roi_height, stride[1])[-1] + bbox_size[1]
    offset_x = (offset_x - roi_width) // 2
    offset_y = (offset_y - roi_height) // 2

    # get the coordinates of all top-left points
    tl_x = np.arange(xmin, xmax, stride[0]) - offset_x
    tl_y = np.arange(ymin, ymax, stride[1]) - offset_y
    tl_x, tl_y = np.meshgrid(tl_x, tl_y)
    tl_x = np.reshape(tl_x, [-1, 1])
    tl_y = np.reshape(tl_y, [-1, 1])

    # jitter the coordinates of all top-left points
    tl_x += np.random.randint(-jitter[0], jitter[0] + 1, size=tl_x.shape)
    tl_y += np.random.randint(-jitter[1], jitter[1] + 1, size=tl_y.shape)
    return tl_x, tl_y


if __name__ == '__main__':
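    image = cv2.imread('1.bmp')
    # cv2.imread returns None instead of raising when the file cannot be read
    if image is None:
        raise IOError("failed to read image '1.bmp'")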
    patch_bboxes = get_random_patch_bboxes(
        image,
        bbox_size=[64, 96],
        stride=[128, 128],
        jitter=[32, 32],
        roi_bbox=[500, 200, 1500, 800])

    colors = [
        (255, 0, 0),
        (0, 255, 0),
        (0, 0, 255),
        (255, 255, 0),
        (255, 0, 255),
        (0, 255, 255)]
    color_idx = 0

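    for bbox in patch_bboxes:
        color_idx = color_idx % len(colors)
        # convert NumPy integers to Python ints, which some OpenCV versions
        # require for point arguments
        pt1 = (int(bbox[0]), int(bbox[1]))
        pt2 = (int(bbox[2]), int(bbox[3]))
        cv2.rectangle(image, pt1, pt2, color=colors[color_idx], thickness=2)
        color_idx += 1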

    cv2.namedWindow('image', cv2.WINDOW_NORMAL)
    cv2.imshow('image', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    cv2.imwrite('image.png', image)
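
Because the code above only returns bounding boxes, the patches still have to be cropped. A minimal sketch of that step is shown below; `crop_patches` is a hypothetical helper written for this example, and a synthetic array stands in for an image read with `cv2.imread`.

```python
import numpy as np


def crop_patches(image, patch_bboxes):
    """Crop each [xmin, ymin, xmax, ymax] box out of an H x W (x C) image."""
    patches = []
    for xmin, ymin, xmax, ymax in patch_bboxes:
        # NumPy images are indexed as [row, column], i.e. [y, x].
        # .copy() detaches the patch from the (possibly large) source image.
        patches.append(image[ymin:ymax, xmin:xmax].copy())
    return patches


# Synthetic stand-in for an image read with cv2.imread
image = np.zeros((200, 300, 3), dtype=np.uint8)
bboxes = np.array([[0, 0, 64, 96], [236, 104, 300, 200]])
patches = crop_patches(image, bboxes)
```

Each patch has shape (height, width, channels), so a box of size (64, 96) given as (width, height) yields an array of shape (96, 64, 3).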

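In practice, a few simple features can be added:

1. Position-based filtering. For example, discard patches that are too close to the border: some algorithms have strong edge effects, so we may not want border data in training.

2. Filtering based on a simple heuristic. For example, in a task such as super-resolution we usually do not care much about large flat regions, such as a solid-color wall or a large expanse of sky; variance can be used to filter these out.

3. A cap on the number of patches kept. Source images can vary greatly in size, so the number of patches produced by the method above varies accordingly. To keep the training data balanced, set a maximum number to keep. To preserve coverage, shuffle the patches before truncating, or compute a suitable stride.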
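
The three additions above could be sketched roughly as follows. The function names, the `margin` and `min_var` parameters, and the threshold values are all assumptions chosen for illustration; suitable values depend on the task. The boxes are assumed to be an n x 4 array of [xmin, ymin, xmax, ymax], as returned by `get_random_patch_bboxes`.

```python
import numpy as np


def filter_by_margin(bboxes, image_shape, margin):
    """Drop boxes that come within `margin` pixels of the image border."""
    height, width = image_shape[:2]
    keep = ((bboxes[:, 0] >= margin) & (bboxes[:, 1] >= margin) &
            (bboxes[:, 2] <= width - margin) &
            (bboxes[:, 3] <= height - margin))
    return bboxes[keep]


def filter_by_variance(image, bboxes, min_var):
    """Drop boxes whose pixel variance is below `min_var` (flat regions)."""
    keep = [image[y0:y1, x0:x1].var() >= min_var for x0, y0, x1, y1 in bboxes]
    return bboxes[np.array(keep, dtype=bool)]


def limit_count(bboxes, max_count, rng):
    """Shuffle the boxes, then keep at most `max_count` of them."""
    order = rng.permutation(len(bboxes))
    return bboxes[order[:max_count]]


rng = np.random.default_rng(0)
# Synthetic grayscale image: flat left half, noisy right half
image = np.zeros((100, 100), dtype=np.uint8)
image[:, 50:] = rng.integers(0, 256, size=(100, 50))
bboxes = np.array([[0, 0, 20, 20], [10, 10, 30, 30],
                   [60, 60, 80, 80], [70, 10, 90, 30]])

kept_margin = filter_by_margin(bboxes, image.shape, margin=5)
kept_var = filter_by_variance(image, kept_margin, min_var=10.0)
kept_final = limit_count(kept_var, max_count=1, rng=rng)
```

Shuffling before truncating means the kept boxes are drawn from the whole image and not only from its first rows, which is the coverage concern mentioned in item 3.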
