Compare commits

...

13 Commits

Author SHA1 Message Date
383764fca2 update 2024-02-02 08:31:47 +00:00
3b623d4fdf pyproject.toml 2024-01-14 00:08:22 +00:00
4dc394213c update 2023-12-18 06:46:57 +00:00
48a4b37b76 Corrections 2022-11-18 16:16:26 +00:00
bdedba8d11 add Pipfile 2022-11-17 08:56:04 +00:00
146cd71281 add .github/workflows/test.yml 2022-11-17 08:51:30 +00:00
81a5e66b60 added setup.cfg 2022-11-17 08:07:23 +00:00
97db0946da Fixes 2022-11-16 20:59:26 +00:00
12ff9b924e add setup.py 2022-11-16 18:53:54 +00:00
c6a7d839d9 add setup.py 2022-11-16 18:33:59 +00:00
1d92e0ec65 add setup.py 2022-11-16 17:13:39 +00:00
1cb4e53cce added argparse 2022-11-16 15:33:50 +00:00
71672da7af add lookupdns.py 2022-11-16 13:49:17 +00:00
19 changed files with 801 additions and 149 deletions

47
.github/workflows/test.yml vendored Normal file

@ -0,0 +1,47 @@
name: test
on: [push]
jobs:
  ci:
    name: Python-${{ matrix.python }} ${{ matrix.qt.qt_api }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        qt:
          - package: PyQt5
            qt_api: "pyqt5"
          - package: PyQt6
            qt_api: "pyqt6"
          - package: PySide2
            qt_api: "pyside2"
          - package: PySide6
            qt_api: "pyside6"
        python: [3.6, 3.7, 3.8, 3.9]
    steps:
      - name: Checkout
        uses: actions/checkout@v1
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python }}
          architecture: x64
      - name: Install pipenv
        run: |
          python -m pip install --upgrade pipenv wheel
      - name: Install dependencies
        run: |
          pipenv install --dev
          pipenv run pip install ${{ matrix.qt.package }} pytest
      - name: Install Libxcb dependencies
        run: |
          sudo apt-get update
          sudo apt-get install '^libxcb.*-dev' libx11-xcb-dev libglu1-mesa-dev libxrender-dev libxi-dev libxkbcommon-dev libxkbcommon-x11-dev
      - name: Run headless test
        uses: GabrielBB/xvfb-action@v1
        env:
          QT_API: ${{ matrix.qt.qt_api }}
        with:
          run: pipenv run py.test --forked -v

2
.gitignore vendored

@ -1,6 +1,8 @@
*.pyc
*.pyo
.idea
*.junk
*~
*.iml
*.so

36
Makefile Normal file

@ -0,0 +1,36 @@
# to run the tests, run make PASS=controllerpassword test
PREFIX=/usr/local
PYTHON_EXE_MSYS=${PREFIX}/bin/python3.sh
PIP_EXE_MSYS=${PREFIX}/bin/pip3.sh
LOCAL_DOCTEST=${PREFIX}/bin/toxcore_run_doctest3.bash
DOCTEST=${LOCAL_DOCTEST}
MOD=phantompy
check::
	sh ${PYTHON_EXE_MSYS} -c "import ${MOD}"
lint::
	sh .pylint.sh
install::
	PYTHONPATH=${PWD}/src \
	${PIP_EXE_MSYS} install --target ${PREFIX}/lib/python3.11/site-packages/ --upgrade .
rsync::
	bash .rsync.sh
install::
	PYTHONPATH=${PWD}/src \
	${PIP_EXE_MSYS} --timeout=30 --disable-pip-version-check --proxy http://127.0.0.1:9128 install --only-binary :none: --progress-bar=off --target /usr/local/lib/python3.11/site-packages --upgrade .
# execute these tests as: make test PASS=password
test::
doctest:
veryclean:: clean
	rm -rf build dist __pycache__ .pylint.err .pylint.out
clean::
	find . -name \*~ -delete

View File

@ -4,8 +4,9 @@ A simple replacement for phantomjs using PyQt.
This code is based on a brilliant idea of
[Michael Franzl](https://gist.github.com/michaelfranzl/91f0cc13c56120391b949f885643e974/raw/a0601515e7a575bc4c7d4d2a20973b29b6c6f2df/phantom.py)
that he wrote up in his
[blog](https://blog.michael.franzl.name/2017/10/16/phantom-py/index.html)
that he wrote up in his blog:
* https://blog.michael.franzl.name/2017/10/16/phantomjs-alternative-write-short-pyqt-scripts-instead-phantom-py/
* https://blog.michael.franzl.name/2017/10/16/phantom-py/
## Features
@ -26,17 +27,22 @@ way of knowing when that script has finished doing its work. For this
reason, the external script should execute at the end
```console.log("__PHANTOM_PY_DONE__");``` when done. This will trigger
the PDF generation or the file saving, after which phantompy will exit.
If you do not want to run any JavaScript file, this trigger is provided
in the code by default.
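As a minimal sketch of that convention, the hypothetical helper below (`with_done_marker` is an illustrative name, not part of phantompy) appends the trigger to a script that does not already contain it. It is a plain substring check, so a script that only mentions the marker in a comment would be left unchanged.

```python
DONE_MARKER = "__PHANTOM_PY_DONE__"


def with_done_marker(js_source: str) -> str:
    """Return js_source, ending with the done trigger if it was missing."""
    if DONE_MARKER in js_source:
        # The script already announces completion itself.
        return js_source
    # Append the console.log call that phantompy waits for.
    return js_source.rstrip() + f'\nconsole.log("{DONE_MARKER}");\n'
```

The result can be written to a file and passed as the JavaScript input.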
It is important to remember that since you're just running WebKit, you can
use everything that WebKit supports, including the usual JS client
libraries, CSS, CSS @media types, etc.
Qt picks up proxies from the environment, so this will respect
```https_proxy``` or ```http_proxy``` if set.
## Dependencies
* Python3
* PyQt5 (this should work with PySide2 and PyQt6 - let us know.)
* [qasync](https://github.com/CabbageDevelopment/qasync) for the
standalone program ```qasync_lookup.py```
standalone program ```qasync_phantompy.py```
## Standalone
@ -57,10 +63,10 @@ for the PyQt ```app.exec``` and the exiting of the program.
We've decided to use the best of the shims that merge the Python
```asyncio``` and Qt event loops:
[qasync](https://github.com/CabbageDevelopment/qasync). This is seen as
the successor to the sorta abandoned[quamash](https://github.com/harvimt/quamash).
the successor to the sorta abandoned [quamash](https://github.com/harvimt/quamash).
The code is based on a
[comment](https://github.com/CabbageDevelopment/qasync/issues/35#issuecomment-1315060043)
by [Alex Marcha](https://github.com/hosaka) who's excellent code helped me.
by [Alex March](https://github.com/hosaka) whose excellent code helped me.
As this is my first use of ```asyncio``` and ```qasync``` I may have
introduced some errors and it may be improved on, but it works, and
it is not a monolithic Qt program, so it can be used as a library.
@ -73,9 +79,11 @@ The standalone program is ```quash_phantompy.py```
### Arguments
```
<url> Can be a http(s) URL or a path to a local file
<pdf-file> Path and name of PDF file to generate
[<javascript-file>] (optional) Path and name of a JavaScript file to execute
--js_input <javascript-file> (optional) Path and name of a JavaScript file to execute on the HTML
--html_output <html-file> (optional) Path and name of an HTML output file to generate after JS is applied
--pdf_output <pdf-file> (optional) Path and name of PDF file to generate after JS is applied
--log_level 10=debug 20=info 30=warn 40=error
html_or_url - required argument, a http(s) URL or a path to a local file.
```
Setting ```DEBUG=1``` in the environment will give debugging messages
on ```stderr```.

22
appveyor.yml Normal file

@ -0,0 +1,22 @@
environment:
matrix:
- PYTHON: "C:\\Python36"
- PYTHON: "C:\\Python37"
- PYTHON: "C:\\Python38"
- PYTHON: "C:\\Python39"
init:
- set PATH=%PYTHON%;%PYTHON%\Scripts;%PATH%
install:
- pip install pipenv
- pipenv install --dev
- pipenv run pip install PyQt5 PySide2
# FIX: colorama not installed by pipenv
- pipenv run pip install colorama
build: off
test_script:
- set QT_API=PyQt5&& pipenv run py.test -v
- set QT_API=PySide2&& pipenv run py.test -v

47
pyproject.toml Normal file

@ -0,0 +1,47 @@
[project]
name = "phantompy"
description = "A simple replacement for phantomjs using PyQt"
authors = [{ name = "emdee", email = "emdee@spm.plastiras.org" } ]
requires-python = ">=3.8"
dependencies = [
'qasync',
# PyQt5 PyQt6
]
keywords = ["phantomjs", "python3", "qasync"]
classifiers = [
"License :: OSI Approved",
"Operating System :: POSIX :: BSD :: FreeBSD",
"Operating System :: POSIX :: Linux",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: Implementation :: CPython",
"Topic :: Software Development :: Libraries :: Python Modules",
]
#
dynamic = ["version", "readme", ] # cannot be dynamic ['license']
[project.scripts]
phantompy = "phantompy.qasync_phantompy:iMain"
#[project.license]
#file = "LICENSE.md"
[project.urls]
repository = "https://git.plastiras.org/emdee/phantompy"
[build-system]
requires = ["setuptools >= 61.0"]
build-backend = "setuptools.build_meta"
[tool.setuptools.dynamic]
version = {attr = "phantompy.__version__"}
readme = {file = ["README.md"]}
[tool.setuptools]
packages = ["phantompy"]
#[tool.setuptools.packages.find]
#where = "src"

65
setup.cfg Normal file

@ -0,0 +1,65 @@
[metadata]
classifiers =
License :: OSI Approved
License :: OSI Approved :: BSD 1-clause
Intended Audience :: Web Developers
Operating System :: Microsoft :: Windows
Operating System :: POSIX :: BSD :: FreeBSD
Operating System :: POSIX :: Linux
Programming Language :: Python :: 3 :: Only
Programming Language :: Python :: 3.6
Programming Language :: Python :: 3.7
Programming Language :: Python :: 3.8
Programming Language :: Python :: 3.9
Programming Language :: Python :: 3.10
Programming Language :: Python :: 3.11
Programming Language :: Python :: Implementation :: CPython
Framework :: AsyncIO
[options]
zip_safe = false
python_requires = ~=3.8
include_package_data = false
install_requires =
qasync
cryptography
rsa
stem
package_dir=
=src
packages=find:
[options.packages.find]
where=src
[options.entry_points]
console_scripts =
phantompy = phantompy.__main__:iMain
[easy_install]
zip_ok = false
[flake8]
jobs = 1
max-line-length = 88
ignore =
E111
E114
E128
E225
E261
E302
E305
E402
E501
E502
E541
E701
E702
E704
E722
E741
F508
F541
W503
W601

44
setup.py Normal file

@ -0,0 +1,44 @@
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
import re
from setuptools import setup

with open("qasync/__init__.py") as f:
    version = re.search(r'__version__\s+=\s+"(.*)"', f.read()).group(1)

long_description = "\n\n".join([
    open("README.md").read(),
])

if __name__ == '__main__':
    setup(
        name="phantompy",
        # use the value parsed above; __version__ is not defined in this file
        version=version,
        description="""A simple replacement for phantomjs using PyQt""",
        long_description=long_description,
        author="Michael Franzl (originally)",
        author_email='',
        license="1clause BSD",
        packages=['phantompy'],
        # url="",
        # download_url="https://",
        keywords=['JavaScript', 'phantomjs', 'asyncio'],
        # maybe less - nothing fancy
        python_requires="~=3.6",
        # probably works on PyQt6 and PySide2 but untested
        # https://github.com/CabbageDevelopment/qasync/
        install_requires=['qasync',
                          'PyQt5'],
        entry_points={
            'console_scripts': ['phantompy = phantompy.__main__:iMain', ]},
        classifiers=[
            'Development Status :: 4 - Beta',
            'Environment :: Console',
            'Intended Audience :: Developers',
            'Intended Audience :: Web Developers',
            'Natural Language :: English',
            'Operating System :: OS Independent',
            'Programming Language :: Python :: 3',
            'Topic :: Software Development :: Documentation',
        ],
    )
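
The version lookup in this file can be sketched on its own. This assumes the version is declared as a double-quoted `__version__ = "..."` assignment, as in the package's `__init__.py`; `read_version` is an illustrative name, not something the project defines.

```python
import re

# Match a line such as: __version__ = "0.1.0"
_VERSION_RE = re.compile(r'^__version__\s*=\s*"([^"]*)"', re.MULTILINE)


def read_version(source: str) -> str:
    """Extract the version string from the text of a Python module."""
    match = _VERSION_RE.search(source)
    if match is None:
        # Fail loudly instead of raising AttributeError on None.group().
        raise ValueError("no __version__ assignment found")
    return match.group(1)
```

Checking for a missing match gives a clearer error than calling `.group(1)` on `None` when the file does not contain the assignment.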

44
setup.py.bak Normal file

@ -0,0 +1,44 @@
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
import re
from setuptools import setup
with open("qasync/__init__.py") as f:
version = re.search(r'__version__\s+=\s+"(.*)"', f.read()).group(1)
long_description = "\n\n".join([
open("README.md").read(),
])
if __name__ == '__main__':
setup(
name="phantompy",
version=version,
description="""A simple replacement for phantomjs using PyQt""",
long_description=long_description,
author="Michael Franzl (originally)",
author_email='',
license="1clause BSD",
packages=['phantompy'],
# url="",
# download_url="https://",
keywords=['JavaScript', 'phantomjs', 'asyncio'],
# maybe less - nothing fancy
python_requires="~=3.6",
# probably works on PyQt6 and PySide2 but untested
# https://github.com/CabbageDevelopment/qasync/
install_requires=['qasync',
'PyQt5'],
entry_points={
'console_scripts': ['phantompy = phantompy.__main__:iMain', ]},
classifiers=[
'Development Status :: 4 - Beta',
'Environment :: Console',
'Intended Audience :: Developers',
'Intended Audience :: Web Developers',
'Natural Language :: English',
'Operating System :: OS Independent',
'Programming Language :: Python :: 3',
'Topic :: Software Development :: Documentation',
],
)

View File

@ -0,0 +1,3 @@
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 2; coding: utf-8 -*-
__version__ = "0.1.0"

View File

@ -1,10 +1,13 @@
#!/usr/local/bin/python3.sh
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
from qasync_phantompy import iMain
from __future__ import absolute_import
import os
import sys
from .qasync_phantompy import iMain
try:
from support_phantompy import vsetup_logging
from .support_phantompy import vsetup_logging
d = int(os.environ.get('DEBUG', 0))
if d > 0:
vsetup_logging(10, stream=sys.stderr)
@ -13,4 +16,5 @@ try:
vsetup_logging(log_level, logfile='', stream=sys.stderr)
except: pass
iMain(sys.argv[1:], bgui=False)
if __name__ == '__main__':
iMain(sys.argv[1:])

View File

@ -0,0 +1,84 @@
#!/usr/local/bin/python3.sh
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
"""
Looks up URLs of the form
https://dns.google/resolve?name=domain.name&type=TXT&cd=true&do=true
and parses the responses to extract a magic field.
A good example of how you can parse JSON embedded in HTML with phantompy.
"""
import sys
import os
from phantompy import Render
global LOG
import logging
import warnings
warnings.filterwarnings('ignore')
LOG = logging.getLogger()
class LookFor(Render):
def __init__(self, app, do_print=True, do_save=False):
app.lfps = []
self._app = app
self.do_print = do_print
self.do_save = do_save
self.progress = 0
self.we_run_this_tor_relay = None
Render.__init__(self, app, do_print, do_save)
def _exit(self, val):
Render._exit(self, val)
self.percent = 100
LOG.debug(f"phantom.py: Exiting with val {val}")
i = self.uri.find('name=')
fp = self.uri[i+5:]
i = fp.find('.')
fp = fp[:i]
# threadsafe?
self._app.lfps.append(fp)
def _html_callback(self, *args):
"""print(self, QPrinter, Callable[[bool], None])"""
if type(args[0]) is str:
self._save(args[0])
i = self.ilookfor(args[0])
self._onConsoleMessage(i, "__PHANTOM_PY_SAVED__", 0 , '')
def ilookfor(self, html):
import json
marker = '<pre style="word-wrap: break-word; white-space: pre-wrap;">'
if marker not in html: return -1
i = html.find(marker) + len(marker)
html = html[i:]
assert html[0] == '{', html
i = html.find('</pre')
html = html[:i]
assert html[-1] == '}', html
LOG.debug(f"Found {len(html)} json")
o = json.loads(html)
if "Answer" not in o or not isinstance(o["Answer"], list):
LOG.warning(f"FAIL {self.uri}")
return 1
for elt in o["Answer"]:
assert type(elt) == dict, elt
assert 'type' in elt, elt
if elt['type'] != 16: continue
assert 'data' in elt, elt
if elt['data'] == 'we-run-this-tor-relay':
LOG.info(f"OK {self.uri}")
self.we_run_this_tor_relay = True
return 0
self.we_run_this_tor_relay = False
LOG.warning(f"BAD {self.uri}")
return 2
def _loadFinished(self, result):
LOG.debug(f"phantom.py: Loading finished {self.uri}")
self.toHtml(self._html_callback)
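
The check that `ilookfor` performs on the JSON can be sketched without Qt. This is a minimal standalone version under the assumption that the payload is the JSON object returned by `dns.google/resolve`; `classify_answer` and the constant names are illustrative, and the return codes mirror the 0/1/2 values used above.

```python
import json

TXT_RECORD = 16  # numeric DNS record type for TXT
MAGIC = "we-run-this-tor-relay"


def classify_answer(payload: str) -> int:
    """Return 0 on a matching TXT record, 1 if there is no usable
    Answer list, 2 if the Answer list holds no matching TXT record."""
    doc = json.loads(payload)
    if not isinstance(doc, dict):
        return 1
    answers = doc.get("Answer")
    if not isinstance(answers, list):
        return 1
    for record in answers:
        # Skip anything that is not a TXT record with the magic value.
        if not isinstance(record, dict):
            continue
        if record.get("type") == TXT_RECORD and record.get("data") == MAGIC:
            return 0
    return 2
```

Unlike the `assert` statements in the class, this sketch skips malformed records instead of raising, which may or may not be what the caller wants.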

View File

@ -13,17 +13,18 @@ replacement for other bulky headless browser frameworks.
If you have a display attached:
./phantom.py <url> <pdf-file> [<javascript-file>]
If you don't have a display attached (i.e. on a remote server):
./phantom.py [--pdf_output <pdf-file>] [--js_input <javascript-file>] <url-or-html-file>
xvfb-run ./phantom.py <url> <pdf-file> [<javascript-file>]
If you don't have a display attached (i.e. on a remote server), you can use
xvfb-run, or don't add --show_gui - it should work without a display.
Arguments:
[--pdf_output <pdf-file>] (optional) Path and name of PDF file to generate
[--html_output <html-file>] (optional) Path and name of HTML file to generate
[--js_input <javascript-file>] (optional) Path and name of a JavaScript file to execute
--log_level 10=debug 20=info 30=warn 40=error
<url> Can be a http(s) URL or a path to a local file
<pdf-file> Path and name of PDF file to generate
[<javascript-file>] (optional) Path and name of a JavaScript file to execute
## Features
@ -55,12 +56,15 @@ CSS @media types, etc.
* Python3
* PyQt5
* [qasync](https://github.com/CabbageDevelopment/qasync) for the
standalone program ```qasync_phantompy.py```
* xvfb (optional for display-less machines)
Installation of dependencies in Debian Stretch is easy:
apt-get install xvfb python3-pyqt5 python3-pyqt5.qtwebkit
Finding the equivalent for other OSes is an exercise that I leave to you.
@ -76,16 +80,16 @@ Given the following file /tmp/test.html
document.getElementById('id1').innerHTML = "bar";
</script>
</html>
... and the following file /tmp/test.js:
document.getElementById('id2').innerHTML = "baz";
console.log("__PHANTOM_PY_DONE__");
... and running this script (without attached display) ...
xvfb-run python3 phantom.py /tmp/test.html /tmp/out.pdf /tmp/test.js
... you will get a PDF file /tmp/out.pdf with the contents "foo bar baz".
Note that the second occurrence of "foo" has been replaced by the web page's own
@ -114,23 +118,22 @@ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""
import sys
import importlib
import os
import traceback
import atexit
import time
import sys # noqa
from PyQt5.QtCore import QUrl
from PyQt5.QtCore import QTimer
from PyQt5.QtWidgets import QApplication
from PyQt5.QtPrintSupport import QPrinter
from PyQt5.QtWebEngineWidgets import QWebEnginePage
QtModuleName = os.environ.get('QT_API')
if not QtModuleName:
from qasync import QtModuleName
from qasync import QtCore
from support_phantompy import vsetup_logging
# QtCore is an attribute of the qasync module, not a submodule
QUrl = QtCore.QUrl
# import_module() loads modules only; the classes are attributes of those modules
QPrinter = importlib.import_module(QtModuleName + ".QtPrintSupport").QPrinter
QWebEnginePage = importlib.import_module(QtModuleName + ".QtWebEngineWidgets").QWebEnginePage
global LOG
import logging
import warnings
warnings.filterwarnings('ignore')
LOG = logging.getLogger()
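
`importlib.import_module` imports modules, so a class such as `QPrinter` has to be read as an attribute of the imported module rather than named in the dotted path. A minimal sketch of that pattern, using standard-library names so it runs without Qt; `load_attr` is a hypothetical helper:

```python
import importlib


def load_attr(module_name: str, attr_name: str):
    """Import module_name and return the attribute attr_name from it."""
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Works: the module is imported, then the class is looked up on it.
OrderedDict = load_attr("collections", "OrderedDict")

# Putting the class name into the dotted path fails, because
# "collections.OrderedDict" is not a module.
try:
    importlib.import_module("collections.OrderedDict")
except ModuleNotFoundError as exc:
    failure = str(exc)
```

With a Qt binding installed, the same helper could presumably be called as `load_attr(QtModuleName + ".QtPrintSupport", "QPrinter")`, provided `QtModuleName` holds the importable package name.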
@ -157,86 +160,90 @@ def prepare(sdir='/tmp'):
</html>
""")
LOG.debug(f"wrote {sfile} ")
class Render(QWebEnginePage):
def __init__(self, app, do_print=False, do_save=True):
app.ldone = []
self._app = app
self.do_print = do_print
self.do_save = do_save
self.percent = 0
self.uri = None
self.jsfile = None
self.outfile = None
QWebEnginePage.__init__(self)
app.ldone = []
self._app = app
self.do_print = do_print
self.do_save = do_save
self.percent = 0
self.uri = None
self.jsfile = None
self.htmlfile = None
self.pdffile = None
QWebEnginePage.__init__(self)
def run(self, url, outfile, jsfile):
def run(self, url, pdffile, htmlfile, jsfile):
self._app.lstart.append(id(self))
self.percent = 10
self.uri = url
self.jsfile = jsfile
self.outfile = outfile
LOG.debug(f"phantom.py: URL={url} OUTFILE={outfile} JSFILE={jsfile}")
qurl = QUrl.fromUserInput(url)
self.htmlfile = htmlfile
self.pdffile = pdffile
self.outfile = pdffile or htmlfile
LOG.debug(f"phantom.py: URL={url} htmlfile={htmlfile} pdffile={pdffile} JSFILE={jsfile}")
qurl = QUrl.fromUserInput(url)
# The PDF generation only happens when the special string __PHANTOM_PY_DONE__
# is sent to console.log(). The following JS string will be executed by
# default, when no external JavaScript file is specified.
self.js_contents = "setTimeout(function() { console.log('__PHANTOM_PY_DONE__') }, 5000);";
self.js_contents = "setTimeout(function() { console.log('__PHANTOM_PY_DONE__') }, 5000);"
if jsfile:
try:
with open(self.jsfile, 'rt') as f:
self.js_contents = f.read()
except Exception as e:
except Exception as e: # noqa
LOG.exception(f"error reading jsfile {self.jsfile}")
self.loadFinished.connect(self._loadFinished)
self.percent = 20
self.load(qurl)
self.javaScriptConsoleMessage = self._onConsoleMessage
LOG.debug(f"phantom.py: loading 10")
def _onConsoleMessage(self, *args):
if len(args) > 3:
level, txt, lineno, filename = args
else:
level = 1
txt, lineno, filename = args
LOG.debug(f"CONSOLE {lineno} {txt} {filename}")
if "__PHANTOM_PY_DONE__" in txt:
self.percent = 40
# If we get this magic string, it means that the external JS is done
if self.do_save:
self.toHtml(self._html_callback)
return
# drop through
txt = "__PHANTOM_PY_SAVED__"
if "__PHANTOM_PY_SAVED__" in txt:
self.percent = 50
if self.do_print:
self._print()
return
txt = "__PHANTOM_PY_PRINTED__"
if "__PHANTOM_PY_PRINTED__" in txt:
self.percent = 60
self._exit(level)
if len(args) > 3:
level, txt, lineno, filename = args
else:
level = 1
txt, lineno, filename = args
LOG.debug(f"CONSOLE {lineno} {txt} {filename}")
if "__PHANTOM_PY_DONE__" in txt:
self.percent = 40
# If we get this magic string, it means that the external JS is done
if self.do_save:
self.toHtml(self._html_callback)
return
# drop through
txt = "__PHANTOM_PY_SAVED__"
if "__PHANTOM_PY_SAVED__" in txt:
self.percent = 50
if self.do_print:
self._print()
return
txt = "__PHANTOM_PY_PRINTED__"
if "__PHANTOM_PY_PRINTED__" in txt:
self.percent = 60
self._exit(level)
def _loadFinished(self, result):
self.percent = 30
LOG.info(f"phantom.py: _loadFinished {result} {self.percent}")
LOG.debug(f"phantom.py: Evaluating JS from {self.jsfile}")
self.runJavaScript("document.documentElement.contentEditable=true")
self.runJavaScript(self.js_contents)
# RenderProcessTerminationStatus ?
self.percent = 30
LOG.info(f"phantom.py: _loadFinished {result} {self.percent}")
LOG.debug(f"phantom.py: Evaluating JS from {self.jsfile}")
self.runJavaScript("document.documentElement.contentEditable=true")
self.runJavaScript(self.js_contents)
def _html_callback(self, *args):
"""print(self, QPrinter, Callable[[bool], None])"""
if type(args[0]) is str:
self._save(args[0])
self._onConsoleMessage(0, "__PHANTOM_PY_SAVED__", 0 , '')
self._onConsoleMessage(0, "__PHANTOM_PY_SAVED__", 0, '')
def _save(self, html):
sfile = self.outfile.replace('.pdf','.html')
sfile = self.htmlfile
# CompleteHtmlSaveFormat SingleHtmlSaveFormat MimeHtmlSaveFormat
with open(sfile, 'wt') as ofd:
ofd.write(html)
@ -244,49 +251,25 @@ class Render(QWebEnginePage):
def _printer_callback(self, *args):
"""print(self, QPrinter, Callable[[bool], None])"""
# print(f"_printer_callback {self.outfile} {args}")
if args[0] is False:
i = 1
else:
i = 0
self._onConsoleMessage(i, "__PHANTOM_PY_PRINTED__", 0 , '')
self._onConsoleMessage(i, "__PHANTOM_PY_PRINTED__", 0, '')
def _print(self):
sfile = self.outfile.replace('.html', '.pdf')
sfile = self.pdffile
printer = QPrinter()
printer.setPageMargins(10, 10, 10, 10, QPrinter.Millimeter)
printer.setPaperSize(QPrinter.A4)
printer.setCreator("phantom.py by Michael Karl Franzl")
printer.setOutputFormat(QPrinter.PdfFormat);
printer.setOutputFormat(QPrinter.PdfFormat)
printer.setOutputFileName(sfile)
self.print(printer, self._printer_callback)
LOG.debug("phantom.py: Printed")
def _exit(self, val):
self.percent = 100
LOG.debug(f"phantom.py: Exiting with val {val}")
# threadsafe?
self._app.ldone.append(self.uri)
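
The marker chain in `_onConsoleMessage` (done, then saved, then printed, then exit) can be sketched without Qt. This is a simplified, synchronous model: the real class saves and prints asynchronously and re-enters through callbacks, and `MarkerChain` and `steps` are illustrative names only.

```python
DONE = "__PHANTOM_PY_DONE__"
SAVED = "__PHANTOM_PY_SAVED__"
PRINTED = "__PHANTOM_PY_PRINTED__"


class MarkerChain:
    def __init__(self, do_save: bool, do_print: bool):
        self.do_save = do_save
        self.do_print = do_print
        self.steps = []  # records which stages ran, in order

    def on_message(self, txt: str) -> None:
        if DONE in txt:
            if self.do_save:
                self.steps.append("save")
                # Stand-in for the toHtml callback reporting completion.
                self.on_message(SAVED)
                return
            txt = SAVED  # nothing to save: fall through to the next stage
        if SAVED in txt:
            if self.do_print:
                self.steps.append("print")
                # Stand-in for the printer callback reporting completion.
                self.on_message(PRINTED)
                return
            txt = PRINTED  # nothing to print: fall through
        if PRINTED in txt:
            self.steps.append("exit")
```

Console messages that carry none of the three markers pass through without any effect, which is why ordinary page logging does not end the run.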
def omain(app, largs):
if (len(largs) < 2):
LOG.info("USAGE: ./phantom.py <url> <pdf-file> [<javascript-file>]")
return -1
url = largs[0]
outfile = largs[1]
jsfile = largs[2] if len(largs) > 2 else None
ilen = 1
r = Render(app, do_print=False, do_save=True)
r.run(url, outfile, jsfile)
for i in range(1, 120):
app.processEvents()
print(f"{app.ldone} {i}")
if len(app.ldone) == ilen:
print(f"{app.ldone} found {ilen}")
app.exit()
return r
time.sleep(1)
return r

View File

@ -1,25 +1,30 @@
#!/usr/local/bin/python3.sh
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
import sys
import os
import qasync
import asyncio
import time
import random
import os
import sys
from PyQt5 import QtWidgets
from PyQt5.QtWidgets import (QProgressBar, QWidget, QVBoxLayout)
# let qasync figure out what Qt we are using - we dont care
from qasync import QApplication, QEventLoop, QtWidgets
from phantompy import Render
# if you want an example of looking for things in downloaded HTML:
# from lookupdns import LookFor as Render
from support_phantompy import omain_argparser, vsetup_logging
global LOG
import logging
import warnings
warnings.filterwarnings('ignore')
LOG = logging.getLogger()
try:
import shtab
except:
shtab = None
class Widget(QtWidgets.QWidget):
def __init__(self):
QtWidgets.QWidget.__init__(self)
@ -27,7 +32,7 @@ class Widget(QtWidgets.QWidget):
box = QtWidgets.QHBoxLayout()
self.setLayout(box)
box.addWidget(self._label)
self.progress = QProgressBar()
self.progress = QtWidgets.QProgressBar()
self.progress.setRange(0, 99)
box.addWidget(self.progress)
@ -35,15 +40,19 @@ class Widget(QtWidgets.QWidget):
i = len(asyncio.all_tasks())
self._label.setText(str(i))
self.progress.setValue(int(text))
class ContextManager:
def __init__(self) -> None:
self._seconds = 0
async def __aenter__(self):
LOG.debug("ContextManager enter")
return self
async def __aexit__(self, *args):
LOG.debug("ContextManager exit")
async def tick(self):
await asyncio.sleep(1)
self._seconds += 1
@ -63,12 +72,25 @@ async def main(widget, app, ilen):
app.exit()
# raise asyncio.CancelledError
return
LOG.debug(f"{app.ldone} {perc} {seconds}")
except asyncio.CancelledError as ex:
LOG.debug(f"{app.ldone} {seconds}")
except asyncio.CancelledError as ex: # noqa
LOG.debug("Task cancelled")
def iMain(largs, bgui=True):
app = QtWidgets.QApplication([])
def iMain(largs):
parser = omain_argparser()
if shtab:
shtab.add_argument_to(parser, ["-s", "--print-completion"]) # magic!
oargs = parser.parse_args(largs)
bgui = oargs.show_gui
try:
d = int(os.environ.get('DEBUG', 0))
if d > 0:
oargs.log_level = 10
vsetup_logging(oargs.log_level, logfile='', stream=sys.stderr)
except: pass
app = QApplication([])
app.lstart = []
if bgui:
widget = Widget()
@ -76,22 +98,24 @@ def iMain(largs, bgui=True):
widget.show()
else:
widget = None
loop = qasync.QEventLoop(app)
loop = QEventLoop(app)
asyncio.set_event_loop(loop)
largs = sys.argv[1:]
url = largs[0]
outfile = largs[1]
jsfile = largs[2] if len(largs) > 2 else None
url = oargs.html_url
htmlfile = oargs.html_output
pdffile = oargs.pdf_output
jsfile = oargs.js_input
# run only starts the url loading
r = Render(app, do_print=False, do_save=True)
r = Render(app,
do_print=True if pdffile else False,
do_save=True if htmlfile else False)
uri = url.strip()
r.run(uri, outfile, jsfile)
r.run(uri, pdffile, htmlfile, jsfile)
LOG.debug(f"{r.percent} {app.lstart}")
LOG.info(f"queued {len(app.lstart)} urls")
task = loop.create_task(main(widget, app, 1))
loop.run_forever()
@ -101,15 +125,4 @@ def iMain(largs, bgui=True):
loop.run_until_complete(asyncio.gather(*tasks))
if __name__ == '__main__':
try:
from exclude_badExits import vsetup_logging
d = int(os.environ.get('DEBUG', 0))
if d > 0:
vsetup_logging(10, stream=sys.stderr)
else:
vsetup_logging(20, stream=sys.stderr)
vsetup_logging(log_level, logfile='', stream=sys.stderr)
except: pass
iMain(sys.argv[1:], bgui=False)
iMain(sys.argv[1:])
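
The body of `main()` is not shown in this diff, so the following is only a sketch of the polling idea that `ContextManager.tick()` suggests: an async context manager that sleeps in small steps while a shared list fills up. It uses plain `asyncio` rather than `qasync`, and `Ticker`, `wait_until_done`, `max_ticks` and `interval` are assumed names.

```python
import asyncio


class Ticker:
    """Async context manager that counts elapsed sleep intervals."""

    def __init__(self, interval: float = 0.01) -> None:
        self._interval = interval
        self.ticks = 0

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return False  # do not swallow exceptions

    async def tick(self) -> int:
        await asyncio.sleep(self._interval)
        self.ticks += 1
        return self.ticks


async def wait_until_done(done, expected, max_ticks=100, interval=0.01):
    """Poll until `done` holds `expected` items; False on timeout."""
    async with Ticker(interval) as ticker:
        while len(done) < expected:
            if await ticker.tick() >= max_ticks:
                return False
    return True
```

Because each `tick()` awaits, other tasks on the same loop keep running between checks, which is the property the Qt event loop integration relies on.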

View File

@ -0,0 +1,143 @@
#!/usr/local/bin/python3.sh
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
import sys
import os
import importlib
import atexit
import traceback
import functools
import asyncio
import time
import qasync
import threading
QtModuleName = os.environ.get('QT_API')
if not QtModuleName:
from qasync import QtModuleName
QtWidgets = importlib.import_module(QtModuleName + ".QtWidgets", package=QtModuleName)
# from PyQt5.QtWidgets import (QProgressBar, QWidget, QVBoxLayout)
from qasync import QEventLoop, QThreadExecutor
from qasync import asyncSlot, asyncClose, QApplication
from phantompy import Render
from lookupdns import LookFor
global LOG
import logging
import warnings
warnings.filterwarnings('ignore')
LOG = logging.getLogger()
class MainWindow(QtWidgets.QWidget):
"""Main window."""
def __init__(self):
super().__init__()
self.setLayout(QtWidgets.QVBoxLayout())
self.progress = QtWidgets.QProgressBar()
self.progress.setRange(0, 99)
self.layout().addWidget(self.progress)
async def main(app):
def close_future(future, loop):
loop.call_later(10, future.cancel)
future.cancel()
loop = asyncio.get_running_loop()
future = asyncio.Future()
app.ldone = []
getattr(app, "aboutToQuit").connect(
functools.partial(close_future, future, loop)
)
if False:
progress = QtWidgets.QProgressBar()
progress.setRange(0, 99)
progress.show()
else:
mw = MainWindow()
progress = mw.progress
mw.show()
# LOG.info(f"calling first_50 {r}")
# await first_50(progress, r)
LOG.info(f"calling last_50 {r}")
o = QThreadExecutor(max_workers=1)
app.o = o
with o as executor:
await loop.run_in_executor(executor, functools.partial(last_50, progress, sys.argv[1:], app), loop)
LOG.info(f" {dir(o)}")
LOG.info(f"awaiting {future}")
await future
return True
async def first_50(progress, r=None):
progress.setValue(5)
LOG.info(f"first_50 {r}")
if r is not None:
# loop = asyncio.get_running_loop()
# LOG.info(f"first_50.r.run {r}")
# loop.call_soon_threadsafe(r.run, r.url, r.outfile, r.jsfile)
# r.run( r.url, r.outfile, r.jsfile)
for i in range(50):
# LOG.info(f"first_50 {r.progress} {i}")
# if r.progress >= 100: break
# progress.setValue(max(r.progress,i))
progress.setValue(i)
await asyncio.sleep(.1)
return
for i in range(50):
LOG.info(f"first_50 {r} {i}")
loop.call_soon_threadsafe(progress.setValue, i)
time.sleep(1)
def last_50(progress, largs, app, loop):
url = largs[0]
outfile = largs[1]
jsfile = largs[2] if len(largs) > 2 else None
r = Render(app, do_print=False, do_save=True)
uri = url.strip()
loop.call_soon_threadsafe(r.run, uri, outfile, jsfile)
time.sleep(1)
for i in range(50, 100):
j = len(app.ldone) # r.progress
if j == 100:
LOG.info(f"last_50 None {i} {j}")
else:
LOG.debug(f"last_50 None {i} {j}")
loop.call_soon_threadsafe(progress.setValue, i)
time.sleep(1)
if __name__ == '__main__':
url = 'https://dns.google/resolve?name=6D6EC2A2E2ED8BFF2D4834F8D669D82FC2A9FA8D.for-privacy.net&type=TXT&cd=true&do=true'
outfile = '/tmp/test1.pdf'
jsfile = '/tmp/test1.js'
from exclude_badExits import vsetup_logging
vsetup_logging(10)
app = QApplication([])
#?
loop = qasync.QEventLoop(app)
#NOT loop = asyncio.get_event_loop()
asyncio._set_running_loop(loop)
asyncio.events._set_running_loop(loop)
r = Render(app, do_print=False, do_save=True)
#loop.call_soon_threadsafe(r.run, url, outfile, jsfile)
r.run(url, outfile, jsfile)
app.rs = [r]
for i in range(20):
for elt in app.rs:
print (elt.percent)
time.sleep(2)
try:
qasync.run(main(app))
except asyncio.exceptions.CancelledError:
sys.exit(0)
except RuntimeError as e:
LOG.debug('Fixme')
sys.exit(0)
except KeyboardInterrupt:
sys.exit(0)
else:
val = 0
sys.exit(val)

View File

@ -0,0 +1,49 @@
#!/usr/local/bin/python3.sh
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
import sys
import os
import traceback
from phantompy import Render
global LOG
import logging
import warnings
warnings.filterwarnings('ignore')
LOG = logging.getLogger()
import sys
import asyncio
import time
from PyQt5.QtWidgets import QApplication, QProgressBar
from quamash import QEventLoop, QThreadExecutor
app = QApplication(sys.argv)
loop = QEventLoop(app)
asyncio.set_event_loop(loop) # NEW must set the event loop
asyncio.events._set_running_loop(loop)
progress = QProgressBar()
progress.setRange(0, 99)
progress.show()
async def master():
await first_50()
with QThreadExecutor(1) as executor:
await loop.run_in_executor(executor, last_50)
# TODO announce completion?
async def first_50():
for i in range(50):
progress.setValue(i)
await asyncio.sleep(.1)
def last_50():
for i in range(50,100):
loop.call_soon_threadsafe(progress.setValue, i)
time.sleep(.1)
with loop: ## context manager calls .close() when loop completes, and releases all resources
loop.run_until_complete(master())
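
The same split (first half as a coroutine, second half in a worker thread that reports back with `call_soon_threadsafe`) can be sketched with the standard library alone. This stand-in replaces the progress bar with a list and `QThreadExecutor` with `ThreadPoolExecutor`, so it illustrates the pattern rather than the Qt integration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


async def master(values: list) -> None:
    loop = asyncio.get_running_loop()

    async def first_half():
        # Runs on the event loop thread, yielding between updates.
        for i in range(5):
            values.append(i)
            await asyncio.sleep(0)

    def second_half():
        # Runs in a worker thread; updates are handed back to the loop.
        for i in range(5, 10):
            loop.call_soon_threadsafe(values.append, i)
            time.sleep(0.001)

    await first_half()
    with ThreadPoolExecutor(max_workers=1) as executor:
        await loop.run_in_executor(executor, second_half)
    # Give the loop one more turn so any queued callbacks have run.
    await asyncio.sleep(0.01)
```

The worker thread never touches `values` directly; every update is scheduled onto the loop thread, which is the same reason the Qt version routes `progress.setValue` through `call_soon_threadsafe`.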

View File

@ -1,20 +1,22 @@
#!/usr/local/bin/python3.sh
# -*-mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*
import sys
import argparse
import os
import sys
try:
if 'COLOREDLOGS_LEVEL_STYLES' not in os.environ:
os.environ['COLOREDLOGS_LEVEL_STYLES'] = 'spam=22;debug=28;verbose=34;notice=220;warning=202;success=118,bold;error=124;critical=background=red'
# https://pypi.org/project/coloredlogs/
import coloredlogs
except ImportError as e:
except ImportError:
coloredlogs = False
global LOG
import logging
import warnings
warnings.filterwarnings('ignore')
LOG = logging.getLogger()
@ -23,7 +25,7 @@ def vsetup_logging(log_level, logfile='', stream=sys.stdout):
add = True
# stem fucks up logging
from stem.util import log
# from stem.util import log
logging.getLogger('stem').setLevel(30)
logging._defaultFormatter = logging.Formatter(datefmt='%m-%d %H:%M:%S')
@ -79,3 +81,37 @@ def vsetup_logging(log_level, logfile='', stream=sys.stdout):
'NOTSET': logging.NOTSET,
}
def omain_argparser(_=None):
try:
from OpenSSL import SSL
lCAfs = SSL._CERTIFICATE_FILE_LOCATIONS
except:
lCAfs = []
CAfs = []
for elt in lCAfs:
if os.path.exists(elt):
CAfs.append(elt)
if not CAfs:
CAfs = ['']
parser = argparse.ArgumentParser(add_help=True,
epilog=__doc__)
parser.add_argument('--https_cafile', type=str,
help="Certificate Authority file (in PEM) (unused)",
default=CAfs[0])
parser.add_argument('--log_level', type=int, default=20,
help="10=debug 20=info 30=warn 40=error")
parser.add_argument('--js_input', type=str, default='',
help="Operate on the HTML file with javascript")
parser.add_argument('--html_output', type=str, default='',
help="Write loaded and javascripted result to a HTML file")
parser.add_argument('--pdf_output', type=str, default='',
help="Write loaded and javascripted result to a PDF file")
parser.add_argument('--show_gui', action='store_true', default=False,
help="show a progress meter that doesn't work")
parser.add_argument('html_url', type=str,
help='html file or url')
return parser
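
A minimal sketch of a parser for the same options, assuming only standard `argparse` behaviour: a boolean flag uses `action='store_true'`, and a positional argument is required by default, so it takes neither `required=` nor `nargs='?'`. `build_parser` is an illustrative name, and the certificate option is left out for brevity.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(add_help=True)
    parser.add_argument('--log_level', type=int, default=20,
                        help="10=debug 20=info 30=warn 40=error")
    parser.add_argument('--js_input', type=str, default='',
                        help="JavaScript file to run on the loaded page")
    parser.add_argument('--html_output', type=str, default='',
                        help="write the resulting HTML to this file")
    parser.add_argument('--pdf_output', type=str, default='',
                        help="write the resulting PDF to this file")
    # A flag: present means True, absent means False.
    parser.add_argument('--show_gui', action='store_true',
                        help="show a progress meter")
    # Positional arguments are required unless nargs says otherwise.
    parser.add_argument('html_url', type=str,
                        help="html file or url")
    return parser
```

For example, `build_parser().parse_args(['--pdf_output', '/tmp/out.pdf', 'https://example.org'])` yields a namespace whose `show_gui` is `False` and whose `html_output` is the empty string.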

22
tests/conftest.py Normal file

@ -0,0 +1,22 @@
# -*- mode: python; indent-tabs-mode: nil; py-indent-offset: 4; coding: utf-8 -*-
# (c) 2018 Gerard Marull-Paretas <gerard@teslabs.com>
# (c) 2014 Mark Harviston <mark.harviston@gmail.com>
# (c) 2014 Arve Knudsen <arve.knudsen@gmail.com>
# BSD License
# phantompy test - just test qasync for now
import os
import logging
from pytest import fixture
logging.basicConfig(
level=logging.DEBUG, format="%(asctime)s - %(levelname)s - %(name)s - %(message)s"
)
@fixture(scope="session")
def application():
from phantompy.qasync_phantompy import QApplication
return QApplication([])