Refactor storage operations into separate Backend classes (#348)
Following the discussion in #253 and #325, I've created a first iteration of what a `Backend` interface could look like and how the current file storage operations may be refactored into this interface. It is based on the following principles:
* `app.py` talks only to `core.py` with regards to package operations
* at configuration time, a `Backend` implementation is chosen and created for the lifetime of the configured app
* `core.py` proxies requests for packages to this `Backend()`
* The `Backend` interface/api is defined through three things
  * methods that an implementation must implement
  * methods that an implementation may override if it knows better than the defaults
  * the `PkgFile` class that is (should be) the main carrier of data
* where possible, implementation details must be hidden from concrete `Backend`s to promote extensibility
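
To make the three parts of the interface concrete, here is a minimal sketch. All names in it (`get_all_packages`, `add_package`, `package_count`, `find_project_packages`, `InMemoryBackend`, and the three `PkgFile` fields) are assumptions chosen for illustration and may not match what the PR actually defines.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class PkgFile:
    """Illustrative data carrier; the field names are assumptions."""

    pkgname: str
    version: str
    relfn: str


class Backend(ABC):
    """Hypothetical interface: required methods plus overridable defaults."""

    # Methods that every implementation must provide.
    @abstractmethod
    def get_all_packages(self) -> Iterable[PkgFile]:
        ...

    @abstractmethod
    def add_package(self, pkg: PkgFile) -> None:
        ...

    # Defaults that an implementation may override if it knows better.
    def package_count(self) -> int:
        return sum(1 for _ in self.get_all_packages())

    def find_project_packages(self, project: str) -> Iterator[PkgFile]:
        return (p for p in self.get_all_packages() if p.pkgname == project)


class InMemoryBackend(Backend):
    """Toy implementation that keeps packages in a list."""

    def __init__(self) -> None:
        self._pkgs: List[PkgFile] = []

    def get_all_packages(self) -> Iterable[PkgFile]:
        return list(self._pkgs)

    def add_package(self, pkg: PkgFile) -> None:
        self._pkgs.append(pkg)

    def package_count(self) -> int:
        # Cheaper than the default, which iterates over all packages.
        return len(self._pkgs)
```

The caller only ever sees `Backend` and `PkgFile`, so a different storage mechanism can be swapped in without touching the code that uses it.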
Other things I've done in this PR:
* I've tried to talk about packages and projects, rather than files and prefixes, since these are the domain terms PEP 503 uses, and imho they also make the meaning clearer
* Better testability of the `CacheManager` (no more race conditions when `watchdog` is installed during testing)
* Clean up some more Python 2 code
* Started moving away from `os.path` and `py.path` in favour of `pathlib`
Furthermore, I've created a `plugin.py` with a sample of how I think a plugin system could look. This sample assumes we use `argparse` and allows for the extension of cli arguments that a plugin may need. I think the actual implementation of such a plugin system is beyond the scope of this PR, but I've used it as a target for the Backend refactoring. If requested, I'll remove it from this PR.
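
As a rough illustration of a plugin extending the cli arguments, here is a sketch built on `argparse`. The `Plugin` protocol, `ExampleBackendPlugin`, `build_parser`, and the `--backend` / `--example-root` options are all hypothetical and are not taken from the actual `plugin.py`.

```python
import argparse
from typing import Protocol, Sequence


class Plugin(Protocol):
    """Hypothetical plugin contract: a name plus a hook to extend the parser."""

    name: str

    def update_parser(self, parser: argparse.ArgumentParser) -> None:
        ...


class ExampleBackendPlugin:
    """Made-up plugin that contributes one extra cli option."""

    name = "example"

    def update_parser(self, parser: argparse.ArgumentParser) -> None:
        group = parser.add_argument_group("example backend")
        group.add_argument("--example-root", default="packages")


def build_parser(plugins: Sequence[Plugin]) -> argparse.ArgumentParser:
    """Create the base parser, then let every plugin add its own arguments."""
    parser = argparse.ArgumentParser(prog="server-sketch")
    parser.add_argument(
        "--backend",
        choices=[p.name for p in plugins],
        default=plugins[0].name,
    )
    for plugin in plugins:
        plugin.update_parser(parser)
    return parser
```

Usage could look like `build_parser([ExampleBackendPlugin()]).parse_args(["--example-root", "/srv/pkgs"])`, after which the chosen backend can read its own options from the resulting namespace.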
The following things still need to be done / discussed. These can be part of this PR or moved into their own, separate PRs:
- [ ] Simplify the `PkgFile` class. It currently consists of a number of attributes that don't necessarily belong with it, and not all attributes are aptly named (imho). I would like to minimize the scope of `PkgFile` so that its only concern is being a data carrier between the app and the backends, and make its use clearer.
- [ ] Add a `PkgFile.metadata` that backend implementations may use to store custom data for packages. For example, the current `PkgFile.root` attribute is an implementation detail of the filestorage backends, and other Backend implementations should not be bothered by it.
- [ ] Use `pathlib` wherever possible. This may also result in fewer attributes for `PkgFile`, since some things may be just contained in a single `Path` object, instead of multiple strings.
- [ ] Improve testing of the `CacheManager`.
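
One possible shape for such a slimmed-down `PkgFile` is sketched below. This is only an assumption of where the items above could lead: the field names and the `fn` property are illustrative, and the `root` key inside `metadata` stands in for the backend-specific detail mentioned above.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict


@dataclass
class PkgFile:
    """Hypothetical minimal data carrier between the app and the backends."""

    pkgname: str
    version: str
    # A single Path replaces several separate string attributes.
    path: Path
    # Free-form storage for backend-specific details such as a storage root.
    metadata: Dict[str, Any] = field(default_factory=dict)

    @property
    def fn(self) -> str:
        """File name derived from the path instead of being stored separately."""
        return self.path.name
```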
----
* move some functions around in preparation for backend module
* rename pkg_utils to pkg_helpers to prevent confusion with stdlib pkgutil
* further implement the current filestorage as simple file backend
* rename prefix to project, since that's more descriptive
* add digester func as attribute to pkgfile
* WIP caching backend
* WIP make cache better testable
* better testability of cache
* WIP file backends as plugin
* fix typos, run black
* Apply suggestions from code review
Co-authored-by: Matthew Planchard <mplanchard@users.noreply.github.com>
* add more type hints to pass mypy, fix tox.ini
* add package count method to backend
* add package count method to backend
* minor changes
* bugfix when checking invalid whl file
* check for existing package recursively, bugfix, some more pathlib
* fix unittest
* rm dead code
* exclude bottle.py from coverage
* fix merge mistakes
* fix tab indentation
* backend as a cli argument
* fix cli, add tests
* fix mypy
* fix more silly mistakes
* process feedback
* remove dead code
Co-authored-by: Matthew Planchard <mplanchard@users.noreply.github.com>
2021-02-02 18:44:29 +01:00
import os
from pathlib import WindowsPath, PureWindowsPath

import pytest

from pypiserver.pkg_helpers import guess_pkgname_and_version, is_listed_path

files = [
    ("pytz-2012b.tar.bz2", "pytz", "2012b"),
    ("pytz-2012b.tgz", "pytz", "2012b"),
    ("pytz-2012b.ZIP", "pytz", "2012b"),
    ("pytz-2012a.zip", "pytz", "2012a"),
    ("pytz-2012b.tar.xz", "pytz", "2012b"),
    ("gevent-1.0b1.win32-py2.6.exe", "gevent", "1.0b1"),
    ("gevent-1.0b1.win32-py2.7.msi", "gevent", "1.0b1"),
    ("greenlet-0.3.4-py3.1-win-amd64.egg", "greenlet", "0.3.4"),
    ("greenlet-0.3.4.win-amd64-py3.2.exe", "greenlet", "0.3.4"),
    ("greenlet-0.3.4-py3.2-win32.egg", "greenlet", "0.3.4"),
    ("greenlet-0.3.4-py2.7-linux-x86_64.egg", "greenlet", "0.3.4"),
    ("pep8-0.6.0.zip", "pep8", "0.6.0"),
    ("ABC12-34_V1X-1.2.3.zip", "ABC12", "34_V1X-1.2.3"),
    ("A100-200-XYZ-1.2.3.zip", "A100-200-XYZ", "1.2.3"),
    ("flup-1.0.3.dev-20110405.tar.gz", "flup", "1.0.3.dev-20110405"),
    ("package-1.0.0-alpha.1.zip", "package", "1.0.0-alpha.1"),
    ("package-1.3.7+build.11.e0f985a.zip", "package", "1.3.7+build.11.e0f985a"),
    ("package-v1-8.1.301.ga0df26f.zip", "package-v1", "8.1.301.ga0df26f"),
    ("package-v1.1-8.1.301.ga0df26f.zip", "package-v1.1", "8.1.301.ga0df26f"),
    ("package-2013.02.17.dev123.zip", "package", "2013.02.17.dev123"),
    ("package-20000101.zip", "package", "20000101"),
    ("flup-123-1.0.3.dev-20110405.tar.gz", "flup-123", "1.0.3.dev-20110405"),
    ("package-123-1.0.0-alpha.1.zip", "package-123", "1.0.0-alpha.1"),
    (
        "package-123-1.3.7+build.11.e0f985a.zip",
        "package-123",
        "1.3.7+build.11.e0f985a",
    ),
    ("package-123-v1.1_3-8.1.zip", "package-123-v1.1_3", "8.1"),
    ("package-123-2013.02.17.dev123.zip", "package-123", "2013.02.17.dev123"),
    ("package-123-20000101.zip", "package-123", "20000101"),
    (
        "pyelasticsearch-0.5-brainbot-1-20130712.zip",
        "pyelasticsearch",
        "0.5-brainbot-1-20130712",
    ),
    ("pywin32-217-cp27-none-win32.whl", "pywin32", "217"),
    ("pywin32-217-55-cp27-none-win32.whl", "pywin32", "217-55"),
    ("pywin32-217.1-cp27-none-win32.whl", "pywin32", "217.1"),
    ("package.zip", "package", ""),
    (
        "package-name-0.0.1.dev0.linux-x86_64.tar.gz",
        "package-name",
        "0.0.1.dev0",
    ),
    (
        "package-name-0.0.1.dev0.macosx-10.10-intel.tar.gz",
        "package-name",
        "0.0.1.dev0",
    ),
    (
        "package-name-0.0.1.alpha.1.win-amd64-py3.2.exe",
        "package-name",
        "0.0.1.alpha.1",
    ),
    ("pkg-3!1.0-0.1.tgz", "pkg", "3!1.0-0.1"),  # TO BE FIXED
    ("pkg-3!1+.0-0.1.tgz", "pkg", "3!1+.0-0.1"),  # TO BE FIXED
    ("pkg.zip", "pkg", ""),
    ("foo/pkg.zip", "pkg", ""),
    ("foo/pkg-1b.zip", "pkg", "1b"),
    ("foo/pywin32-217.1-cp27-none-win32.whl", "pywin32", "217.1"),
    (
        "package-name-0.0.1.alpha.1.win-amd64-py3.2.exe",
        "package-name",
        "0.0.1.alpha.1",
    ),
]


def _capitalize_ext(fpath):
    f, e = os.path.splitext(fpath)
    if e != ".whl":
        e = e.upper()
    return f + e


@pytest.mark.parametrize(("filename", "pkgname", "version"), files)
def test_guess_pkgname_and_version(filename, pkgname, version):
    exp = (pkgname, version)
    assert guess_pkgname_and_version(filename) == exp
    assert guess_pkgname_and_version(_capitalize_ext(filename)) == exp


@pytest.mark.parametrize(("filename", "pkgname", "version"), files)
def test_guess_pkgname_and_version_asc(filename, pkgname, version):
    exp = (pkgname, version)
    filename = f"{filename}.asc"
    assert guess_pkgname_and_version(filename) == exp


invalid_files = ["some_file", "some_file.ext", "some_wheel.whl"]


@pytest.mark.parametrize("filename", invalid_files)
def test_guess_pkgname_and_version_invalid_files(filename):
    assert guess_pkgname_and_version(filename) is None


paths = [
    ("/some/path", True),
    (PureWindowsPath(r"c:\some\windows\path"), True),
    ("/.hidden", False),
    (PureWindowsPath(r"c:\.hidden\windows\path"), False),
]


@pytest.mark.parametrize(("pathname", "allowed"), paths)
def test_allowed_path_check(pathname, allowed):
    assert is_listed_path(pathname) == allowed