A General Method for Loading Local Datasets
Published: 2019-03-01


1. Overview

This post describes a general way to load a local image dataset with TensorFlow. The loader expects the dataset directory to contain one subfolder per class; each image's label is taken from the name of its parent folder.

2. Importing the module

Add the directory that contains the load_data module to sys.path, then import its functions:

import sys
sys.path.append(r'E:\Pycharm\project\yeah&ok\load_data')
from load_data import load_data_func, test_image, augment

This makes the load_data_func and test_image functions available.
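The sys.path approach can be tried end to end with a throwaway module. The following is a minimal sketch, not the author's setup: the module name demo_loader, its hello function, and the helper import_from_dir are all hypothetical, and a temporary directory stands in for the real module folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_dir(module_dir, module_name):
    """Add module_dir to sys.path (only once) and import module_name from it."""
    module_dir = str(Path(module_dir).resolve())
    if module_dir not in sys.path:  # avoid duplicate entries on repeated runs
        sys.path.append(module_dir)
    importlib.invalidate_caches()  # make sure freshly created files are found
    return importlib.import_module(module_name)


# Hypothetical stand-in for the real load_data.py file.
demo_dir = tempfile.mkdtemp()
Path(demo_dir, "demo_loader.py").write_text("def hello():\n    return 'loaded'\n")

demo_loader = import_from_dir(demo_dir, "demo_loader")
print(demo_loader.hello())
```

Checking for the entry before appending keeps sys.path from growing each time the script is rerun in the same interpreter, for example in a notebook.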

3. Loading the dataset

Set the dataset directory and the batch size, then call load_data_func:

data_dir = r'E:\Pycharm\project\yeah&ok\dataset'  # raw string keeps the backslashes literal
Batch_size = 32  # batch size
train_dataset, test_dataset = load_data_func(data_dir, batch_size=Batch_size)
test_image(train_dataset)  # show 9 images

load_data_func returns the training and test datasets, and test_image displays nine images from the training set.
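The sizes of the two returned datasets can be worked out by hand. The sketch below mirrors the arithmetic used by load_data_func (a 20% test split, and the training data repeated 3 times before batching); the helper name split_sizes and the figure of 1000 images are assumptions made for illustration.

```python
import math


def split_sizes(image_count, batch_size, test_fraction=0.2, repeats=3):
    """Return sample and batch counts for the train/test split."""
    test_count = int(image_count * test_fraction)
    train_count = image_count - test_count
    return {
        "test_count": test_count,
        "train_count": train_count,
        # repeat() runs before batch(), so batches may span epoch boundaries.
        "train_batches": math.ceil(train_count * repeats / batch_size),
        "test_batches": math.ceil(test_count / batch_size),
    }


sizes = split_sizes(image_count=1000, batch_size=32)
print(sizes)
```

With 1000 images and a batch size of 32, this gives 800 training and 200 test images, 75 training batches, and 7 test batches, the last of which is only partly filled.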

4. Required imports

The load_data module depends on the following libraries:

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import pathlib
import random
import tensorflow_datasets as tfds

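5. The loading function

load_data_func collects the image paths, derives the labels from the folder names, and builds the training and test datasets: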

def load_data_func(data_dir, batch_size):
    data_root = pathlib.Path(data_dir)
    # Collect every file one level below the class folders and shuffle the order.
    all_image_path = list(data_root.glob('*/*'))
    all_image_path = [str(path) for path in all_image_path]
    random.shuffle(all_image_path)
    # Map each class folder name to an integer label.
    label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
    label_to_index = {name: idx for idx, name in enumerate(label_names)}
    print(label_to_index)
    all_image_label = [label_to_index[pathlib.Path(p).parent.name] for p in all_image_path]
    print(len(all_image_label))
    # Preview one sample, mapped back from [-1, 1] to [0, 1] for display.
    image_path = all_image_path[5]
    image_show = (1 + load_preprocess_image(image_path)) / 2.
    plt.imshow(image_show)
    plt.show()
    # Build a dataset of (image, label) pairs.
    path_ds = tf.data.Dataset.from_tensor_slices(all_image_path)
    image_dataset = path_ds.map(load_preprocess_image)
    label_dataset = tf.data.Dataset.from_tensor_slices(all_image_label)
    dataset = tf.data.Dataset.zip((image_dataset, label_dataset))
    # Hold out the first 20% of the samples for testing; train on the rest.
    image_count = len(all_image_path)
    test_count = int(image_count * 0.2)
    train_count = image_count - test_count
    print(test_count, train_count)
    train_dataset = dataset.skip(test_count).shuffle(buffer_size=150).repeat(3).batch(batch_size)
    test_dataset = dataset.take(test_count).batch(batch_size)
    # Augmentation is applied to the training batches only.
    train_dataset = train_dataset.map(augment)
    return train_dataset, test_dataset
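The path and label bookkeeping in load_data_func can be checked without TensorFlow. The following is a minimal sketch under stated assumptions: the helper name index_labels, the class names cats and dogs, and the empty placeholder files are hypothetical. It sorts the paths instead of shuffling them so the result is reproducible, and it uses iterdir in place of glob('*/') to list the class folders.

```python
import pathlib
import tempfile


def index_labels(data_dir):
    """Return image paths, integer labels, and the folder-name-to-label mapping."""
    data_root = pathlib.Path(data_dir)
    image_paths = sorted(str(p) for p in data_root.glob('*/*'))
    label_names = sorted(p.name for p in data_root.iterdir() if p.is_dir())
    label_to_index = {name: idx for idx, name in enumerate(label_names)}
    labels = [label_to_index[pathlib.Path(p).parent.name] for p in image_paths]
    return image_paths, labels, label_to_index


# Build a tiny hypothetical dataset: one subfolder per class.
root = pathlib.Path(tempfile.mkdtemp())
for class_name, n_files in [("cats", 2), ("dogs", 3)]:
    (root / class_name).mkdir()
    for i in range(n_files):
        (root / class_name / f"{i}.jpg").touch()

paths, labels, mapping = index_labels(root)
print(mapping)
print(labels)
```

Because the label is looked up from each path's parent folder, paths and labels stay aligned no matter how the path list is ordered.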

6. Image preprocessing

Each image is read from disk, decoded, resized to 160×160, and scaled to the range [-1, 1]:

def load_preprocess_image(img_path):
    img_raw = tf.io.read_file(img_path)
    # Decode as a 3-channel (RGB) image.
    img_tensor = tf.image.decode_jpeg(img_raw, channels=3)
    img_tensor = tf.image.resize(img_tensor, [160, 160])
    img_tensor = tf.cast(img_tensor, tf.float32)
    # Scale pixel values from [0, 255] to [-1, 1].
    img = img_tensor / 127.5 - 1
    return img
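The scaling in load_preprocess_image and the (1 + x) / 2 expression used before plt.imshow are two halves of the same mapping. This small sketch works through the arithmetic on single pixel values; the function names normalize and to_display are made up for illustration.

```python
def normalize(pixel):
    """Map a pixel value from [0, 255] to [-1, 1], as in load_preprocess_image."""
    return pixel / 127.5 - 1


def to_display(value):
    """Map a normalized value from [-1, 1] to [0, 1] for plt.imshow."""
    return (1 + value) / 2.0


for pixel in (0, 127.5, 255):
    value = normalize(pixel)
    print(pixel, value, to_display(value))
```

Black (0) maps to -1 and back to 0 for display, mid-gray (127.5) maps to 0 and then 0.5, and white (255) maps to 1 and stays 1.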

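7. Data augmentation

Random flips and random changes to contrast, brightness, hue, and saturation are applied to the training images: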

def augment(image, label):
    # Random flips and color changes; the label is passed through unchanged.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_contrast(image, lower=0.0, upper=1.0)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_brightness(image, max_delta=0.5)
    image = tf.image.random_hue(image, max_delta=0.3)
    image = tf.image.random_saturation(image, lower=0.3, upper=0.5)
    return image, label
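One side effect of augmenting images that are already scaled to [-1, 1] is that a brightness shift can push values outside that range. The pure-Python sketch below illustrates this on a handful of pixel values; it is a stand-in for the idea rather than TensorFlow's implementation, and the helper names shift_brightness and clip are hypothetical. In a TensorFlow pipeline the equivalent clamp would be tf.clip_by_value, and whether to apply it is a design choice the original code leaves open.

```python
import random


def shift_brightness(pixels, max_delta, rng):
    """Add one random offset drawn from [-max_delta, max_delta] to every pixel."""
    delta = rng.uniform(-max_delta, max_delta)
    return [p + delta for p in pixels]


def clip(pixels, low=-1.0, high=1.0):
    """Clamp every pixel back into [low, high]."""
    return [min(max(p, low), high) for p in pixels]


rng = random.Random(0)  # fixed seed so the run is repeatable
pixels = [-1.0, -0.2, 0.4, 1.0]
shifted = shift_brightness(pixels, max_delta=0.5, rng=rng)
clipped = clip(shifted)
print(shifted)
print(clipped)
```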

8. Displaying sample images

test_image plots the first nine images of one training batch in a 3×3 grid:

def test_image(train_dataset):
    plt.figure(figsize=(12, 12))
    # Only the first batch is needed, so stop after one iteration.
    for batch in tfds.as_numpy(train_dataset):
        for i in range(9):
            # Map the image from [-1, 1] back to [0, 1] for display.
            image, label = (1 + batch[0][i]) / 2., batch[1][i]
            plt.subplot(3, 3, i + 1)
            plt.imshow(image)
            plt.grid(False)
        break
    plt.show()
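test_image always indexes the first nine images of the batch, so it assumes the batch holds at least nine. A small guard along the following lines would let it cope with smaller batches; the helper name images_to_plot is hypothetical and not part of the original module.

```python
def images_to_plot(batch_len, rows=3, cols=3):
    """Return (image_index, subplot_number) pairs that fit in a rows x cols grid."""
    count = min(batch_len, rows * cols)
    # plt.subplot numbers its cells from 1, hence the i + 1.
    return [(i, i + 1) for i in range(count)]


print(images_to_plot(32))  # a full batch fills all nine cells
print(images_to_plot(4))   # a short batch fills only the first four
```

Looping over these pairs instead of range(9) would avoid an index error when the batch size is below nine.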

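9. Summary

  • Organize the dataset with one subfolder per class.
  • Add the module's directory to sys.path before importing it.
  • Call load_data_func to build the training and test datasets.
  • Images are resized to 160×160 and scaled to [-1, 1] during preprocessing.
  • Random flips and changes to contrast, brightness, hue, and saturation augment the training set.
  • Call test_image to check a few training images visually.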

Repost address: http://whcv.baihongyu.com/
