A py4j-based HDFS client for Python that delivers native HDFS CLI performance.
Project description
pyhdfs-client: Powerful HDFS Client for Python
Why is it fast and powerful?
The native HDFS client performs much better than WebHDFS clients. However, invoking the native client for each Hadoop operation incurs the additional overhead of starting a JVM. pyhdfs-client delivers the performance of the native HDFS client without starting a JVM on every command execution.
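The point about JVM startup can be illustrated with a small helper that sends several commands through one long-lived client. This is only a minimal sketch: `run_batch` is a hypothetical helper, not part of pyhdfs-client, and it assumes nothing beyond the `run(args)` call returning `(ret, out, err)` shown under Sample Usage.

```python
def run_batch(client, commands):
    """Run several HDFS CLI commands through one already-started client.

    `client` is any object with a run(args) method returning (ret, out, err),
    as HDFSClient does in the Sample Usage section. Because the same client
    is reused, no new JVM is started for each command.
    """
    results = []
    for args in commands:
        # Each entry is an argument list such as ['-ls', '/'].
        ret, out, err = client.run(list(args))
        results.append((ret, out, err))
    return results
```

With a real client this might be called as `run_batch(hdfs_client, [['-ls', '/'], ['-ls', '/f1']])`, followed by `hdfs_client.stop()` once all commands are done.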
Features
- HDFS client for Python
- Easy to integrate with Python applications
- Better performance than WebHDFS clients
- Provides native Hadoop client performance without JVM startup overhead on each command
- Supports both UNIX and Windows
What's new in 0.1.3?
- Multiple HDFS client instances are now supported.
- [fix] Temporary folder deletion
- [fix] Java process shutdown issues on UNIX
Installation
pip install pyhdfs-client
Requirements: the Hadoop binaries and py4j must be installed.
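Before creating a client, it may help to confirm that both requirements are present. The sketch below is not part of pyhdfs-client; `check_requirements` is a hypothetical helper, and it assumes that "hadoop binaries" means a `hadoop` executable reachable on `PATH`, which may not match every installation.

```python
import importlib.util
import shutil


def check_requirements():
    """Report whether the stated requirements look satisfied.

    Assumption: the Hadoop binaries are considered installed when a
    `hadoop` executable can be found on PATH.
    """
    return {
        "hadoop": shutil.which("hadoop") is not None,
        "py4j": importlib.util.find_spec("py4j") is not None,
    }


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {'found' if found else 'missing'}")
```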
Sample Usage
>>> from pyhdfs_client.pyhdfs_client import HDFSClient
>>> hdfs_client = HDFSClient()
>>> ret, out, err = hdfs_client.run(['-ls', '/'])
>>> print(out)
Found 1 items
drwxr-xr-x - gp supergroup 0 2021-03-21 01:10 /f1
>>> hdfs_client.stop() # to terminate hdfs client
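Because the client has to be stopped explicitly, a context manager can make sure `stop()` runs even when a command fails. The following is a sketch under stated assumptions: `hdfs_session` and `run_checked` are hypothetical helpers, not part of pyhdfs-client, and treating a return code of 0 as success is an assumption based on the usual hadoop CLI convention.

```python
from contextlib import contextmanager


@contextmanager
def hdfs_session(client_factory):
    """Yield a client and always call stop() on it afterwards.

    `client_factory` is any zero-argument callable returning an object with
    run(args) and stop(), for example the HDFSClient class itself.
    """
    client = client_factory()
    try:
        yield client
    finally:
        client.stop()  # terminate the client even if a command raised


def run_checked(client, args):
    """Run one command and raise if the return code is non-zero.

    Assumption: a return code of 0 means success, as with the hadoop CLI.
    """
    ret, out, err = client.run(args)
    if ret != 0:
        raise RuntimeError(f"hdfs command {args} failed ({ret}): {err}")
    return out


# Possible usage with the real client (requires Hadoop and py4j):
#   from pyhdfs_client.pyhdfs_client import HDFSClient
#   with hdfs_session(HDFSClient) as client:
#       print(run_checked(client, ['-ls', '/']))
```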
Contribution
- Contributions for enhancements and bug fixes are welcome.
Credits
- This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
- 0.1.2 (2021-03-23)
  - Added UNIX support.
- 0.1.1 (2021-03-22)
  - First release on PyPI.
Download files
Source Distribution
pyhdfs_client-0.1.3.tar.gz (13.3 kB)
Built Distribution
Hashes for pyhdfs_client-0.1.3-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5491c5f9070951afb21468bb434c6a0d3ab0258c9f023547f3d51c2ea6df4b36
MD5 | 48c15495b4c3c72f5d611e8829355b59
BLAKE2b-256 | f8b3623f9fee236d1a5a7ebbc0fd8186d8e45b3a51a6758024d0a0663db781f7