Finding similar 1-token words on OpenAI's CLIP.
Project description
clip_similarwords is an implementation of finding words similar to a given 1-token word in OpenAI's CLIP, in less than one second.
Because OpenAI's CLIP is built on text-image similarity, its text-text similarities may reflect how similar the typical images of two words are, unlike WordNet or other synonym dictionaries.
Note that, for speed and storage reasons (PyPI has a 60MB limit), words composed of two or more tokens are not supported.
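The similarity score reported for each word pair is a cosine similarity. As a minimal sketch of what that number means, the snippet below computes it for toy vectors and ranks words by it. The vectors and the names `toy_embeddings` and `rank_by_similarity` are invented for illustration; whether the package computes its scores in exactly this way is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors, invented for illustration only.
# Real CLIP text embeddings have hundreds of dimensions.
toy_embeddings = {
    "word_a": [1.0, 0.0, 0.0],
    "word_b": [1.0, 1.0, 0.0],
    "word_c": [0.0, 0.0, 1.0],
}

def rank_by_similarity(key, embeddings):
    """Rank all other words by cosine similarity to `key`, highest first."""
    key_vec = embeddings[key]
    scored = [
        (other, cosine_similarity(key_vec, vec))
        for other, vec in embeddings.items()
        if other != key
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A score near 1 means the two vectors point in nearly the same direction, and a score near 0 means they are unrelated.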
Installation
clip_similarwords can be installed with pip:
pip install clip_similarwords
or
pip install git+https://github.com/nazodane/clip_similarwords.git
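To confirm that the installation is visible to the current Python interpreter, one option is to look the module up without importing it. This is only a sketch using the standard library; the helper name `is_installed` is hypothetical and not part of clip_similarwords.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None

if is_installed("clip_similarwords"):
    print("clip_similarwords is available")
else:
    print("clip_similarwords was not found; try one of the pip commands above")
```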
Usage of the command
~/.local/bin/clip-similarwords [ word_fragment | --all ]
Usage of the module
from clip_similarwords import CLIPTextSimilarWords
clipsim = CLIPTextSimilarWords()
for key_token, sim_token, cos_similarity in clipsim("cat"):
    print("%s -> %s ( cos_similarity: %.2f )"%(key_token, sim_token, cos_similarity))
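Since the module yields (key_token, sim_token, cos_similarity) triples, the results can be post-processed with ordinary Python. The sketch below keeps only the strongest matches. The helper `top_matches` is hypothetical, and the `sample` triples are invented placeholders rather than real model output, so that the snippet runs without the package installed.

```python
def top_matches(triples, limit=5, min_cos=0.0):
    """Keep triples with similarity >= min_cos, best first, at most `limit`."""
    kept = [t for t in triples if t[2] >= min_cos]
    kept.sort(key=lambda t: t[2], reverse=True)
    return kept[:limit]

# Placeholder triples standing in for the output of clipsim("cat").
# The words and scores are invented, not real model output.
sample = [
    ("cat", "example_one", 0.91),
    ("cat", "example_two", 0.72),
    ("cat", "example_three", 0.85),
]

for key_token, sim_token, cos_similarity in top_matches(sample, limit=2, min_cos=0.8):
    print("%s -> %s ( cos_similarity: %.2f )" % (key_token, sim_token, cos_similarity))
```

With the package installed, `clipsim("cat")` could presumably be passed in place of `sample`.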
Requirements for using the model
- Linux (should also work on other environments)
Neither PyTorch nor CUDA is required.
Requirements for model generation
- Linux
- Python 3.10 or later
- PyTorch 1.13 or later
- CUDA 11.7 or later
- DRAM 16GB or higher
- RTX 3060 12GB or higher
Patches and information on other environments are very welcome!
License
The code is under the MIT License. The model was converted under Japanese law.
Hashes for clip_similarwords-0.0.4.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | ee5868804402b0c2708ef323b704e0861b50c88e6c546922aa8fef5c983e39a7
MD5 | 267f72fe2c541671b36ce4bf8f1185e2
BLAKE2b-256 | c0cb2dd0e347be71e2f88c076203a24f002773275d8cf1bb6e8960a2469cd6ba
Hashes for clip_similarwords-0.0.4.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ba70a5003c1d547489846d371442e445c1a2355d809cfda67689cdaff85cfb87
MD5 | 4648d43f8619ae4775542e0de4bd14d1
BLAKE2b-256 | f324076e9bf05d4030e97b2b28c280fcac714237e70d1ab57f80293a20e7a82a
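The published digests can be used to check a downloaded file. The following is a minimal sketch using Python's standard hashlib; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is whatever the download was saved as.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved in the current directory, `matches_published_hash("clip_similarwords-0.0.4.1.tar.gz", "ee5868804402b0c2708ef323b704e0861b50c88e6c546922aa8fef5c983e39a7")` should return True for an intact download.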