Visual search is offered by most of the major search engines, and it is also useful in e-commerce and in-store experiences: show a picture of something you are wearing and get recommendations for similar items you are looking for.
This project has two main parts: first, finding the images most similar to a given image within the data we have; then returning only the most similar products to the user.
We will index the image dataset by computing an embedding for every image, using a pre-trained network as a featurizer. We can then query the dataset with a K-NN algorithm.
As shown in the diagram above, an index is built over all input images. When a user then queries with a given image, its features are extracted and the closest images in the index are returned.
Dependencies:
Install mxnet
hnswlib (follow the guide here: https://github.com/nmslib/hnsw)
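If you would rather not build hnswlib from source, both packages are also published on PyPI, so installing them from a notebook cell is usually enough (a minimal sketch, assuming a standard Python 3 environment with pip available):
# Install the MXNet and hnswlib Python packages from PyPI
# (building hnswlib from source, as in the linked guide, works just as well)
!pip install mxnet
!pip install hnswlib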
The original data comes from: http://jmcauley.ucsd.edu/data/amazon/
Getting started:
Import the modules:
import mxnet as mx
from mxnet import gluon, nd
from mxnet.gluon.model_zoo import vision
import multiprocessing
from mxnet.gluon.data.vision.datasets import ImageFolderDataset
from mxnet.gluon.data import DataLoader
import numpy as np
import wget
import imghdr
import json
import pickle
import hnswlib
import glob, os, time
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from urllib.parse import urlparse
import urllib.request
import gzip
%matplotlib inline
Download the data:
This is a subset of the Amazon data, provided by Stanford University.
data_path = 'metadata.json'
images_path = '/data/amazon_images_subset'
if not os.path.isfile(data_path):
    # Downloading the metadata, 3.1GB, unzipped 9GB
    !wget -nv https://s3.us-east-2.amazonaws.com/mxnet-public/stanford_amazon/metadata.json.gz
    !gzip -d metadata.json.gz

if not os.path.isdir(images_path):
    os.makedirs(images_path)
Process the data:
The full set of downloaded images is around 9 GB, but to start we can work with a much smaller subset of the data.
subset_num = 1000
num_lines = sum(1 for line in open(data_path))
assert num_lines >= subset_num, "Subset needs to be smaller than or equal to the total number of examples"
The downloaded metadata contains the URL of each image, and can be parsed as follows:
def parse(path, num_cpu, modulo):
    g = open(path, 'r')
    for i, l in enumerate(g):
        if (i >= num_lines - subset_num and i % num_cpu == modulo):
            yield eval(l)
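To get a feel for the records, you can pull a single entry out of the generator and look at the fields the rest of the code relies on ('asin', 'imUrl', 'categories'); a quick check along these lines (the exact output depends on your subset):
# Peek at one metadata record (num_cpu=1, modulo=0 walks the subset sequentially)
sample = next(parse(data_path, 1, 0))
print(sample.get('asin'), sample.get('imUrl'), sample.get('categories'))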
Fetch the images referenced in the data:
NUM_CPU = multiprocessing.cpu_count()*10
def download_files(modulo):
    for data in parse(data_path, NUM_CPU, modulo):
        if 'imUrl' in data and data['imUrl'] is not None and 'categories' in data and data['imUrl'].split('.')[-1] == 'jpg':
            url = data['imUrl']
            try:
                path = os.path.join(images_path, data['asin']+'.jpg')
                if not os.path.isfile(path):
                    file = urllib.request.urlretrieve(url, path)
            except:
                print("Error downloading {}".format(url))
Save the images and remove the fake ones:
pool = multiprocessing.Pool(processes=NUM_CPU)
results = pool.map(download_files, list(range(NUM_CPU)))

# Removing all the fake jpegs
list_files = glob.glob(os.path.join(images_path, '**.jpg'))
for file in list_files:
    if imghdr.what(file) != 'jpeg':
        print('Removed {} it is a {}'.format(file, imghdr.what(file)))
        os.remove(file)
Generate the image embeddings:
BATCH_SIZE = 256
EMBEDDING_SIZE = 512
SIZE = (224, 224)
MEAN_IMAGE= mx.nd.array([0.485, 0.456, 0.406])
STD_IMAGE = mx.nd.array([0.229, 0.224, 0.225])
Featurizer
We use a pre-trained model from the model zoo:
ctx = mx.cpu()
net = vision.resnet18_v2(pretrained=True, ctx=ctx)
net = net.features
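As a quick sanity check, you can push a dummy batch through the truncated network and confirm that it produces one EMBEDDING_SIZE-dimensional vector per image (any trailing singleton dimensions are removed later by .squeeze()); a small sketch:
# Dummy batch of one 224x224 RGB image, just to inspect the embedding shape
dummy = mx.nd.zeros((1, 3, 224, 224), ctx=ctx)
print(net(dummy).shape)  # expect (1, 512), possibly with extra singleton dimensions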
Data transformation
Transform the images into the shape and value range expected by the network:
def transform(image, label):
    resized = mx.image.resize_short(image, SIZE[0]).astype('float32')
    cropped, crop_info = mx.image.center_crop(resized, SIZE)
    cropped /= 255.
    normalized = mx.image.color_normalize(cropped,
                                          mean=MEAN_IMAGE,
                                          std=STD_IMAGE)
    transposed = nd.transpose(normalized, (2,0,1))
    return transposed, label
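Before wiring the transform into a DataLoader, it is worth applying it to a single downloaded image and checking that the output is a (3, 224, 224) float tensor; a quick check, assuming at least one image has been downloaded:
# Apply the transform to one image and inspect the resulting shape
sample_path = glob.glob(os.path.join(images_path, '**.jpg'))[0]
img = mx.image.imread(sample_path)
x, _ = transform(img, 0)
print(x.shape)  # (3, 224, 224)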
Data loading
import os, tempfile, glob
empty_folder = tempfile.mkdtemp()
# Create an empty image Folder Data Set
dataset = ImageFolderDataset(root=empty_folder, transform=transform)
list_files = glob.glob(os.path.join(images_path, '**.jpg'))
dataset.items = list(zip(list_files, [0]*len(list_files)))
dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, last_batch='keep', shuffle=False, num_workers=multiprocessing.cpu_count())
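A single batch from the loader should come out with shape (BATCH_SIZE, 3, 224, 224) (the last batch may be smaller because of last_batch='keep'); you can verify this before running the full featurization:
# Fetch one batch to confirm the loader and transform work together
data, label = next(iter(dataloader))
print(data.shape, label.shape)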
Featurization
features = np.zeros((len(dataset), EMBEDDING_SIZE), dtype=np.float32)

%%time
tick = time.time()
n_print = 100
net.hybridize()
for i, (data, label) in enumerate(dataloader):
    data = data.as_in_context(ctx)
    if i % n_print == 0 and i > 0:
        print("{0} batches, {1} images, {2:.3f} img/sec".format(i, i*BATCH_SIZE, BATCH_SIZE*n_print/(time.time()-tick)))
        tick = time.time()
    output = net(data)
    features[i*BATCH_SIZE:i*BATCH_SIZE+len(output), :] = output.asnumpy().squeeze()
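Since featurization is the slow step, it can be worth persisting the embeddings and the list of image paths, so the index can be rebuilt or queried later without recomputing everything. A small sketch using numpy and the already-imported pickle module (the file names are just examples):
# Persist the embeddings and the image paths that map index labels back to files
np.save('features.npy', features)
with open('dataset_items.pkl', 'wb') as f:
    pickle.dump(dataset.items, f)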
Create the search index:
# Number of elements in the index
num_elements = len(features)
labels_index = np.arange(num_elements)
%%time
# Declaring index
p = hnswlib.Index(space = 'l2', dim = EMBEDDING_SIZE) # possible options are l2, cosine or ip
# Initializing the index - the maximum number of elements should be known beforehand
p.init_index(max_elements = num_elements, ef_construction = 100, M = 16)
# Element insertion (can be called several times):
int_labels = p.add_items(features, labels_index)
# Controlling the recall by setting ef:
p.set_ef(100) # ef should always be > k
p.save_index('index.idx')
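The saved index can be reloaded later without re-inserting the vectors: declare an index with the same space and dimension and call load_index (a sketch, assuming 'index.idx' was written as above):
# Reload the persisted index in a new session
p2 = hnswlib.Index(space='l2', dim=EMBEDDING_SIZE)
p2.load_index('index.idx', max_elements=num_elements)
p2.set_ef(100)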
Testing
We test the results by picking random images from the dataset and searching for their K nearest neighbors:
def plot_predictions(images):
    gs = gridspec.GridSpec(3, 3)
    fig = plt.figure(figsize=(15, 15))
    gs.update(hspace=0.1, wspace=0.1)
    for i, (gg, image) in enumerate(zip(gs, images)):
        gg2 = gridspec.GridSpecFromSubplotSpec(10, 10, subplot_spec=gg)
        ax = fig.add_subplot(gg2[:, :])
        ax.imshow(image, cmap='Greys_r')
        ax.tick_params(axis='both',
                       which='both',
                       bottom='off',
                       top='off',
                       left='off',
                       right='off',
                       labelleft='off',
                       labelbottom='off')
        ax.axes.set_title("result [{}]".format(i))
        if i == 0:
            plt.setp(ax.spines.values(), color='red')
            ax.axes.set_title("SEARCH")
def search(N, k):
    # Query dataset, k - number of closest elements (returns 2 numpy arrays)
    q_labels, q_distances = p.knn_query([features[N]], k=k)
    images = [plt.imread(dataset.items[label][0]) for label in q_labels[0]]
    plot_predictions(images)
Some tests:
%%time
index = np.random.randint(0,len(features))
k = 6
search(index, k)
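The same machinery also works for an image that is not in the dataset: run it through the transform and the featurizer, then query the index with the resulting embedding. A minimal sketch ('new_image.jpg' is just a placeholder path):
# Query the index with an arbitrary image that was not part of the indexed set
def search_image(path, k=6):
    img = mx.image.imread(path)
    x, _ = transform(img, 0)
    # Add a batch dimension, featurize, and flatten to a (1, EMBEDDING_SIZE) vector
    embedding = net(x.expand_dims(axis=0).as_in_context(ctx)).asnumpy().reshape(1, -1)
    q_labels, q_distances = p.knn_query(embedding, k=k)
    images = [plt.imread(dataset.items[label][0]) for label in q_labels[0]]
    # Show the query image first, followed by its nearest neighbors
    plot_predictions([plt.imread(path)] + images)

search_image('new_image.jpg', k=6)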