A General Method for Loading a Local Dataset
Published: 2019-03-01

??????????

1. ?????????

???????????????????????????????????????????

????????????????????????????????????????2????????????????????????????

2. Importing the module

Add the directory that contains load_data.py to sys.path, then import the functions from it:

import sys
sys.path.append(r'E:\Pycharm\project\yeah&ok\load_data')

from load_data import load_data_func, test_image, augment
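As a minimal sketch of the same idea, the snippet below writes a throwaway module into a temporary directory, adds that directory to `sys.path`, and imports from it. The module name `demo_loader` and its function `hello` are hypothetical stand-ins for `load_data` and its functions; nothing here depends on the paths used in this post.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a temporary directory holding a tiny module
# (hypothetical stand-in for the real load_data.py).
module_dir = Path(tempfile.mkdtemp())
(module_dir / "demo_loader.py").write_text(
    "def hello():\n    return 'loaded'\n", encoding="utf-8"
)

# Add the directory to the import search path only if it is not there yet.
if str(module_dir) not in sys.path:
    sys.path.append(str(module_dir))

importlib.invalidate_caches()  # make sure the freshly written file is seen
demo_loader = importlib.import_module("demo_loader")
result = demo_loader.hello()
print(result)
```

Checking for the entry before appending keeps `sys.path` from growing when the script is run repeatedly in the same interpreter session.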

This makes the load_data_func and test_image functions available.

3. Calling the loader

Set the dataset directory and the batch size, call load_data_func, and preview the result with test_image:

data_dir = r'E:\Pycharm\project\yeah&ok\dataset'
Batch_size = 32  # batch size
train_dataset, test_dataset = load_data_func(data_dir, batch_size=Batch_size)
test_image(train_dataset)  # show 9 images
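Windows paths are full of backslashes, so it may help to see why a raw string is the safer way to write them. The dataset path in this post happens to work without the `r` prefix only because none of `\P`, `\p`, `\y`, `\d` is a recognized escape sequence. The path in the sketch below is hypothetical and chosen so that the difference is visible.

```python
from pathlib import PureWindowsPath

# Hypothetical path, chosen because \n and \t ARE escape sequences.
plain = 'E:\new\table'   # \n becomes a newline, \t becomes a tab
raw = r'E:\new\table'    # raw string: backslashes are kept literally

print(len(plain), len(raw))

# pathlib splits the raw string into the expected components.
path = PureWindowsPath(raw)
print(path.name)
```

The plain string is two characters shorter because each two-character escape collapses into a single control character, which silently yields a different path.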

????????????????????????

4. Importing the libraries

The load_data module uses the following libraries:

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import pathlib
import random
import tensorflow_datasets as tfds

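5. The loading function

load_data_func collects the image paths, derives the labels from the folder names, and returns a training dataset and a test dataset: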

def load_data_func(data_dir, batch_size):
    # Collect every file one level below the class folders and shuffle the order.
    data_root = pathlib.Path(data_dir)
    all_image_path = list(data_root.glob('*/*'))
    all_image_path = [str(path) for path in all_image_path]
    random.shuffle(all_image_path)

    # Folder names are the class names; map each one to an integer index.
    label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
    label_to_index = {name: idx for idx, name in enumerate(label_names)}
    print(label_to_index)

    all_image_label = [label_to_index[pathlib.Path(p).parent.name] for p in all_image_path]
    print(len(all_image_label))

    # Show one sample, rescaled from [-1, 1] back to [0, 1] for display.
    image_path = all_image_path[5]
    image_show = (1 + load_preprocess_image(image_path)) / 2.
    plt.imshow(image_show)
    plt.show()

    # Build a dataset of (image, label) pairs.
    path_ds = tf.data.Dataset.from_tensor_slices(all_image_path)
    image_dataset = path_ds.map(load_preprocess_image)
    label_dataset = tf.data.Dataset.from_tensor_slices(all_image_label)
    dataset = tf.data.Dataset.zip((image_dataset, label_dataset))

    # Reserve the first 20% of the samples for testing.
    image_count = len(all_image_path)
    test_count = int(image_count * 0.2)
    train_count = image_count - test_count
    print(test_count, train_count)

    train_dataset = dataset.skip(test_count).shuffle(buffer_size=150).repeat(3).batch(batch_size)
    test_dataset = dataset.take(test_count).batch(batch_size)

    # Augmentation is applied to the training batches only.
    train_dataset = train_dataset.map(augment)
    return train_dataset, test_dataset
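The labelling and splitting logic in load_data_func can be tried without TensorFlow. The sketch below builds a tiny, hypothetical dataset (class folders `cat` and `dog` filled with empty files) in a temporary directory and applies the same steps; the folder names and file counts are assumptions made purely for illustration.

```python
import pathlib
import tempfile

# Build a tiny hypothetical dataset: two class folders with empty files.
root = pathlib.Path(tempfile.mkdtemp())
for class_name, count in [("cat", 3), ("dog", 2)]:
    (root / class_name).mkdir()
    for i in range(count):
        (root / class_name / f"{i}.jpg").touch()

# Same idea as in load_data_func: folder names become integer labels.
# Paths are sorted here (instead of shuffled) to keep the output stable.
all_paths = sorted(str(p) for p in root.glob("*/*"))
label_names = sorted(p.name for p in root.glob("*") if p.is_dir())
label_to_index = {name: idx for idx, name in enumerate(label_names)}
all_labels = [label_to_index[pathlib.Path(p).parent.name] for p in all_paths]

# 20% of the samples go to the test split, the rest to training.
image_count = len(all_paths)
test_count = int(image_count * 0.2)
train_count = image_count - test_count
print(label_to_index, all_labels, test_count, train_count)
```

Because `int()` truncates, a very small dataset can end up with an empty test split, for example four images give `int(0.8) == 0`.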

6. Image preprocessing

Each image is read, decoded, resized to 160×160, and scaled to the range [-1, 1]:

def load_preprocess_image(img_path):
    img_raw = tf.io.read_file(img_path)
    img_tensor = tf.image.decode_jpeg(img_raw, channels=3)  # decode as 3-channel RGB
    img_tensor = tf.image.resize(img_tensor, [160, 160])
    img_tensor = tf.cast(img_tensor, tf.float32)
    img = img_tensor / 127.5 - 1  # scale [0, 255] to [-1, 1]
    return img
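The scaling in load_preprocess_image and the inverse used before plt.imshow are plain arithmetic, so they can be checked on single values. The helper names `normalize` and `to_display` below are hypothetical; they only restate the two expressions from the code.

```python
def normalize(pixel):
    """Map a pixel value from [0, 255] to [-1, 1], as load_preprocess_image does."""
    return pixel / 127.5 - 1


def to_display(value):
    """Map a normalized value from [-1, 1] back to [0, 1] for plt.imshow."""
    return (1 + value) / 2.


for pixel in (0, 127.5, 255):
    print(pixel, normalize(pixel), to_display(normalize(pixel)))
```

This is why the display code computes `(1 + image) / 2.` first: plt.imshow expects float images in [0, 1], and the preprocessed tensors live in [-1, 1].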

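7. Data augmentation

The augment function applies random flips and random color changes to the image and leaves the label unchanged: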

def augment(image, label):
    # Random geometric and color perturbations; the label is returned as-is.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_contrast(image, lower=0.0, upper=1.0)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_brightness(image, max_delta=0.5)
    image = tf.image.random_hue(image, max_delta=0.3)
    image = tf.image.random_saturation(image, lower=0.3, upper=0.5)
    return image, label
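To illustrate the shape of such an augmentation function without TensorFlow, here is a pure-Python sketch of a random horizontal flip on a nested-list "image". It is not how tf.image implements the operation; the names `random_flip_left_right_sketch` and `augment_sketch` and the explicit `rng` argument are assumptions made for this example.

```python
import random


def random_flip_left_right_sketch(image, rng):
    """Flip a row-major image (list of rows) horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


def augment_sketch(image, label, rng):
    # Only the image is transformed; the label passes through unchanged.
    return random_flip_left_right_sketch(image, rng), label


rng = random.Random(0)  # seeded so that runs are repeatable
image = [[1, 2, 3], [4, 5, 6]]
flipped = [row[::-1] for row in image]
outputs = [augment_sketch(image, 7, rng) for _ in range(20)]
print(sum(img == flipped for img, _ in outputs), "of 20 samples were flipped")
```

The point to carry over is the signature: the function takes an (image, label) pair and returns a pair, which is what Dataset.map expects for a dataset of pairs.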

8. Previewing images

test_image plots the first nine images of the first batch in a 3×3 grid:

def test_image(train_dataset):
    plt.figure(figsize=(12, 12))
    for batch in tfds.as_numpy(train_dataset):
        for i in range(9):
            # Rescale the image from [-1, 1] to [0, 1] for display.
            image, label = (1 + batch[0][i]) / 2., batch[1][i]
            plt.subplot(3, 3, i + 1)
            plt.imshow(image)
            plt.grid(False)
        break  # only the first batch is shown
    plt.show()
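Since test_image indexes the first nine images of the batch, it assumes the batch holds at least nine samples. One possible guard, shown here as a sketch with the hypothetical helper `subplot_indices`, is to cap the number of plotted images at the batch length.

```python
def subplot_indices(batch_len, rows=3, cols=3):
    """Return the 1-based subplot indices to fill for a batch of the given size."""
    count = min(rows * cols, batch_len)
    return [i + 1 for i in range(count)]


print(subplot_indices(32))  # a full batch fills all nine cells
print(subplot_indices(4))   # a short batch fills only the first four
```

In test_image this would amount to looping over `range(min(9, len(batch[0])))` instead of `range(9)`.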

9. ?????

  • ???????????????????????????
  • ???????????????sys.path????????
  • ?????????load_data_func??????????????
  • ???????????????????????????????
  • ???????????????????????????????????
  • ?????????test_image????????????

Reposted from: http://whcv.baihongyu.com/
